This is the mail archive of the xsl-list@mulberrytech.com mailing list.



Re: & in SGML vs XML


At 12:48 4-11-2000 +0100, Matthias Häußer wrote:
>I have another tricky &-related question:

It's not XSL-related, and is better suited for XML-L (mail 
listserv@listserv.heanet.ie, no subject, body "subscribe xml-l") or 
comp.text.xml.

>I have SGML documents which can easily be converted to XML by just
>exchanging the declaration in the first line(s),
>except for that they contain &'s standing alone, as in
><line>you & me</line>.
>
>This is legal in SGML, but XML parsers and XT do not accept it.
>Is there a way of getting this right except for string replacement
>(& -> &amp;)? (Which is tricky because "real" entities like &Ccaron;
>must not be destroyed.)
>James Clark's sx does it alright, but I'd prefer a Java solution
>(ideally, one line of declaration either in the stylesheets or the XML).
>
>In other words: Is there a way of treating an XML document like
><line>you & me</line>?

An ampersand is recognized only as a "delimiter in context": it is treated 
as markup only when it is followed by a name start character (see 
production [59] of ISO 8879).  Assuming your SGML used the reference 
concrete syntax, you could do something like

s/&\([^a-zA-Z]\)/\&amp;\1/g # ampersand followed by innocuous character
                             # is replaced by &amp; and character
s/&$/\&amp;/                # ampersand at end of line is replaced by
                             # &amp;

See <URL:http://www.oreilly.com/%7Ecrism/sgmldefs.html> for the SGML formal 
productions, but they aren't very useful without the text of the Standard.
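
Since you asked for Java: the two substitutions above could be sketched 
roughly as below.  This is only an illustration under the same assumption 
(reference concrete syntax, so a name start character is an ASCII letter); 
the class and method names are made up for the example.  It uses a negative 
lookahead instead of capturing the following character, so it also covers 
the end-of-line case in one pattern.  Like the sed version, it would also 
escape numeric character references such as &#38;; if your documents 
contain those, adding # to the character class is one possible adjustment.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library: escapes ampersands that
// cannot start an entity reference under the reference concrete syntax.
class AmpersandFixer {
    // An ampersand NOT followed by an ASCII letter. The lookahead also
    // succeeds at end of line or end of input, so no second rule is needed.
    private static final Pattern BARE_AMP = Pattern.compile("&(?![A-Za-z])");

    static String escapeBareAmpersands(String text) {
        // "&" has no special meaning in a Java replacement string.
        return BARE_AMP.matcher(text).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        // prints <line>you &amp; me</line>
        System.out.println(escapeBareAmpersands("<line>you & me</line>"));
        // prints <line>&Ccaron;esky &amp; more</line>
        System.out.println(escapeBareAmpersands("<line>&Ccaron;esky & more</line>"));
    }
}
```

You would run the text through this before handing it to the XML parser; 
it is still string replacement, just one that leaves &Ccaron; and friends 
alone.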

-Chris
--
Christopher R. Maden, Senior XML Analyst, Lexica LLC
222 Kearny St., Ste. 202, San Francisco, CA 94108-4510
+1.415.901.3631 tel./+1.415.477.3619 fax
<URL:http://www.lexica.net/> <URL:http://www.oreilly.com/%7Ecrism/>


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
