This is the mail archive of the xsl-list@mulberrytech.com mailing list .



Re: invalid character was found in text content


At 01/09/11 20:03 -0400, Melvyn Rosengarden wrote:
>The file header I create indicates ISO-8859-1 encoding. When I attempt to
>parse my XML file with the MS SAX interface I get the following error:
>"invalid character was found in text content".

This is a message about your XML characters ... not about your encoding.

>When it first occurred I discovered that an embedded Hex 1E character was
>the culprit so my parsing routine "swallowed" that character. A few days
>later the problem recurred and the culprit was a Hex 05 character.
>I do NOT want to be surprised again tomorrow. Is there a comprehensive list
>of invalid characters for the ISO-8859-1 encoding scheme

This is not what you are looking for, though you don't realize it.

>that I could use to create the necessary pre-process filter ??

You need to filter out non-XML characters ... neither hex 1E nor hex 05 is 
allowed in XML, but both are in the C0 set of ISO-2022, the framework within 
which Latin-1 (ISO-8859-1) can be used in either the GL or GR (typically GR).

The list of valid XML characters is in the XML Recommendation.  According 
to production [2], only tab (hex 09), linefeed (hex 0A) and carriage return 
(hex 0D) are allowed from the C0 set of control characters.  Note these are 
*not* in Latin-1, but in the control set.

Please see the Recommendation to determine which characters are 
allowed.  The allowed set is specified in terms of Unicode, and all 
characters of Latin-1 are in Unicode.  The list I gave you above is the 
complete list of the three allowed control characters, as specified in 
production [2].
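
As a rough illustration of such a pre-process filter, here is a minimal 
sketch in Python.  It assumes XML 1.0 production [2] and that the bytes are 
decoded to Unicode before filtering; the names strip_non_xml_chars and 
decode_and_clean are hypothetical and not part of any parser's API.

```python
import re

# Matches any character NOT allowed by XML 1.0 production [2] (Char):
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_NON_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_non_xml_chars(text: str, replacement: str = "") -> str:
    """Return text with every non-XML character replaced (dropped by default)."""
    return _NON_XML_CHARS.sub(replacement, text)


def decode_and_clean(raw: bytes, encoding: str = "iso-8859-1") -> str:
    """Decode raw bytes (assumed here to be ISO-8859-1), then filter them."""
    return strip_non_xml_chars(raw.decode(encoding))
```

Filtering on the decoded characters rather than on a per-encoding list of 
bytes means the same filter applies whatever encoding the file header 
declares.  Tab, linefeed and carriage return pass through untouched; hex 1E 
and hex 05 are removed along with the rest of the disallowed C0 controls.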

I hope this helps.

........................ Ken

--
Training Blitz: 3-days XSLT/XPath, 2-days XSLFO in Ottawa 2001-10-01/05

G. Ken Holman                      mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd.               http://www.CraneSoftwrights.com/s/
Box 266, Kars, Ontario CANADA K0A-2E0     +1(613)489-0999   (Fax:-0995)
Web site:     XSL/XML/DSSSL/SGML/OmniMark services, training, products.
Book:  Practical Transformation Using XSLT and XPath ISBN 1-894049-06-3
Article: What is XSLT? http://www.xml.com/pub/2000/08/holman/index.html
Next public instructor-led training:      2001-09-18,09-19,10-01,10-04,
-                                         10-22,11-05,12-09,12-10,02-02


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

