This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Re: Including URL-encoded query string in XHTML document
- To: "Kiyko, Yelena" <Yelena dot Kiyko at gs dot com>
- Subject: Re: [xsl] Including URL-encoded query string in XHTML document
- From: Jeni Tennison <mail at jenitennison dot com>
- Date: Thu, 11 Jan 2001 18:07:52 +0000
- CC: "'XSL-List at lists dot mulberrytech dot com'" <XSL-List at lists dot mulberrytech dot com>
- Organization: Jeni Tennison Consulting Ltd
- References: <E594FE85086FD411901000D0B7BA3454C2A762@gsny45e.et.gs.com>
- Reply-To: xsl-list at lists dot mulberrytech dot com
Hi Yelena,
> I'm trying to process an XML data feed that contains URL-encoded query
> strings, like the following:
>
> <item url="research.exe?ticker=GS&type=1" date="01/01/2000">goldman
> sachs</item>
That isn't well-formed XML and the XML parser that you're using should
complain when it sees it. In XML, it's illegal to have a '&' character
that doesn't mark the start of a general entity reference. The XML you
need to use is:
<item url="research.exe?ticker=GS&amp;type=1"
date="01/01/2000">goldman sachs</item>
The XML that you see in a file is just a *serialisation* of a node
tree. In the node tree, entity references are substituted for whatever
they reference. So the node tree for the above looks like:
+- (element) item
| +- (attribute) url = research.exe?ticker=GS&type=1
| +- (attribute) date = 01/01/2000
+- (text) goldman sachs
Note the url attribute has a value with the character '&' in it rather
than the entity reference.
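To make the round trip concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The choice of parser is an assumption made purely for illustration (the toolchain behind the original feed isn't known), but any conforming XML parser should behave the same way: the bare '&' is rejected, the escaped form parses to a plain '&' in the tree, and the '&' is escaped again on serialisation.

```python
import xml.etree.ElementTree as ET

# Well-formed serialisation: the ampersand is written as an entity reference.
good = ('<item url="research.exe?ticker=GS&amp;type=1" '
        'date="01/01/2000">goldman sachs</item>')
# The feed as originally posted: a bare '&' that starts no entity reference.
bad = ('<item url="research.exe?ticker=GS&type=1" '
       'date="01/01/2000">goldman sachs</item>')

item = ET.fromstring(good)
# In the node tree the attribute value holds the character '&'.
url_in_tree = item.get("url")

# Serialising the tree escapes the '&' again.
reserialised = ET.tostring(item, encoding="unicode")

# The unescaped version is not well-formed, so the parser refuses it.
try:
    ET.fromstring(bad)
    bad_is_well_formed = True
except ET.ParseError:
    bad_is_well_formed = False

print(url_in_tree)
print(reserialised)
print(bad_is_well_formed)
```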
> Any advice on what is the best way to pass a URL-encoded string through the
> XSLT transformation?
> I substituted "&" with "&amp;" in the original data, but then the output
> XSLT document also contains &amp; and there seems to be no way to print "&"
> as it is.
> Using <xsl:output method="html" > or "disable-output-escape" directives did
> not seem to help.
When you create some output with XSLT, if it's creating XML it sticks
to XML rules. So because XML doesn't allow a '&' that isn't the start
of an entity reference, the XSLT processor outputs '&amp;' instead.
When you tell it to output in HTML with <xsl:output method="html" />,
it still sticks with this rule because you can have entity references
in HTML as well, and you need to know when an '&' is an ampersand
character and when it's the start of an entity reference. Almost
always, an '&' in an HTML node tree will be serialised as '&amp;' when
it's written to a file.
But this shouldn't be a problem. Whatever program looks at the HTML
and reads it should interpret the '&amp;' correctly and Do The Right
Thing. You shouldn't have to worry about it. Obviously it is causing
you a problem though - is it really the case that if you create an
HTML document with the following links in it:
<p>
<a href="research.exe?ticker=GS&amp;type=1">goldman sachs
(entity)</a>;
<a href="research.exe?ticker=GS&type=1">goldman sachs
(character)</a>;
</p>
that the second works and the first doesn't? If so, you've got a
dodgy browser.
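As a rough check of what a reader of the HTML should see, here is a small sketch using Python's standard html.parser module. It stands in for a browser only as an assumption for illustration, and HrefCollector is a hypothetical helper name. It feeds in one link written with the entity reference and one written with the bare character, and collects the href value each one yields.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append(value)


page = (
    '<p>'
    '<a href="research.exe?ticker=GS&amp;type=1">goldman sachs (entity)</a>; '
    '<a href="research.exe?ticker=GS&type=1">goldman sachs (character)</a>;'
    '</p>'
)

collector = HrefCollector()
collector.feed(page)
collector.close()

# Both links should come out as the same URL, with a plain '&' in it.
print(collector.hrefs)
```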
> and use the stylesheet below to construct an href tag for each item
> element:
>
> <xsl:template match="item">
> <a>
> <xsl:attribute name="href">
> <xsl:value-of select="@url" />
> </xsl:attribute>
> <xsl:value-of select="." />
> </a>
> </xsl:template>
It's not directly relevant, but this is equivalent to:
<xsl:template match="item">
<a href="{@url}"><xsl:value-of select="." /></a>
</xsl:template>
I hope that helps,
Jeni
---
Jeni Tennison
http://www.jenitennison.com/
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list