This is the mail archive of the xsl-list@mulberrytech.com mailing list .



Re: url encoding of ampersands


Sivan Mozes wrote:
> Question: if the xslt processor was passed this character from the xml
> parser, how is it selected for conversion as opposed to other characters, by
> range?

Yes, by range, although the exact behavior is up to the processor
implementation. For compatibility with Netscape, processors tend to emit
entity references for characters 128-255 and numeric character references
for 256 and higher. This applies only when the output method is HTML.
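
To make the range rule concrete, here is a rough sketch in Python of how
a serializer might pick an escape for each character. It is not taken
from Xalan or any other processor; the function name escape_html_text is
made up for illustration, and a real implementation may draw the
boundaries differently.

```python
from html.entities import codepoint2name


def escape_html_text(text: str) -> str:
    """Hypothetical sketch of range-based escaping for HTML text output."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            out.append("&gt;")
        elif cp < 128:
            # Plain ASCII passes through untouched.
            out.append(ch)
        elif cp <= 255 and cp in codepoint2name:
            # Latin-1 characters that have an HTML entity name.
            out.append("&" + codepoint2name[cp] + ";")
        else:
            # Everything else becomes a numeric character reference.
            out.append("&#" + str(cp) + ";")
    return "".join(out)


sample = escape_html_text("caf\u00e9 & \u03a9")
```

Note that code points 128-159 have no HTML entity names, so in this
sketch they fall through to numeric character references.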

> Shouldn't there be a way to be more specific in HTML output mode in regards
> to ampersand handling? (using xerces/xalan).

The idea is that you shouldn't have to think about such things, and can
count on your output, be it XML or HTML, being well-formed (or HTML's
equivalent of well-formed).

That is why the disable-output-escaping attribute on xsl:text and
xsl:value-of is held in low regard; it was added to the spec at the last
minute and makes it possible to force a processor to emit something that
can't be read back in by a conforming XML parser or HTML user agent.
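
As a small illustration of "can't be read back in", the Python sketch
below feeds an escaped and an unescaped ampersand in an href to a
conforming XML parser. The URL and the helper name is_well_formed are
invented for the example; the unescaped form stands in for what
disabling output escaping could produce.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a conforming XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# What a serializer normally emits: the ampersand is escaped.
escaped = '<a href="search?q=xsl&amp;page=2">next</a>'
# What unescaped output could look like: a bare ampersand.
raw = '<a href="search?q=xsl&page=2">next</a>'

escaped_ok = is_well_formed(escaped)
raw_ok = is_well_formed(raw)
# After parsing, the attribute value holds a plain ampersand again.
href = ET.fromstring(escaped).get("href")
```

The escaped form parses and the href value comes back with a literal
ampersand, so the escaping costs nothing once the document is read.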

> I eventually wrapped all my entities in CDATA so I can later on encode the
> ampersands, and did another assignment pass w/ disable-output-escaping to
> parse these entities for displaying link content.
> 
> Although it works, I don't like the idea of introducing an exception into
> the xml itself, which is being handled by non-techies.

...and of course, CDATA sections strip whatever is in them of any logical
structure other than just being a run of character data. For example,
&foo; in a CDATA section means the 5 characters & f o o ; rather than an
entity reference to a general parsed entity named foo. You're saying you
want them to be characters, not markup, so ideally you don't want your
serializer to ever emit them in such a way that they would be
misinterpreted as being markup. But of course, that *is* what you are
asking for. I'm just explaining why you don't have the level of control
you want.
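
The CDATA point can be checked with any conforming parser. A minimal
sketch using Python's standard xml.etree.ElementTree follows; the
element name link is only a placeholder.

```python
import xml.etree.ElementTree as ET

# Inside CDATA, "&foo;" is five literal characters, not an entity reference.
elem = ET.fromstring("<link><![CDATA[&foo;]]></link>")
cdata_text = elem.text

# A conforming serializer escapes the ampersand so the text is read back
# as characters rather than as markup.
serialized = ET.tostring(elem, encoding="unicode")

# Outside CDATA, the same five characters are an entity reference, and the
# parser rejects it because no entity named foo is declared.
try:
    ET.fromstring("<link>&foo;</link>")
    undeclared_accepted = True
except ET.ParseError:
    undeclared_accepted = False
```

Once parsed, nothing records that the text came from a CDATA section,
which is why the serializer writes it back out with &amp; instead.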

> Only for specific elements CDATA needs to be used for entities while
> everywhere else, entities are handled in a standard manner. I don't
> think this can be specified in the DTD.

That's correct.

   - Mike
____________________________________________________________________
Mike J. Brown, software engineer at            My XML/XSL resources: 
webb.net in Denver, Colorado, USA              http://skew.org/xml/

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

