This is the mail archive of the xsl-list@mulberrytech.com mailing list.
RE: SAXON and UTF-8
- To: <xsl-list at lists dot mulberrytech dot com>
- Subject: RE: [xsl] SAXON and UTF-8
- From: "Julian Reschke" <julian dot reschke at gmx dot de>
- Date: Thu, 27 Sep 2001 15:47:02 +0200
- Reply-To: xsl-list at lists dot mulberrytech dot com
But then maybe it's the missing support for UTF-8 Byte Order Marks?
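
One way to test that guess is to look at the first bytes of the file. The following is a minimal sketch in Python, not anything SAXON or AElfred provides; the function names and the sample bytes are hypothetical. It checks for the three-byte UTF-8 BOM (EF BB BF) and strips it. If a BOM were present and not recognized, the "<" of the XML declaration would be the fourth byte and the "?" the fifth, which is at least consistent with the reported column 5, though it proves nothing without the actual file.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM; unchanged if there is none."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data

# Hypothetical sample: a BOM followed by an XML declaration.
sample = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="UTF-8"?><doc/>'
print(has_utf8_bom(sample))
print(strip_utf8_bom(sample)[:5])
```

Running the same check on the bytes of dataseq.xml (read in binary mode) would show whether the editor wrote a BOM when saving as UTF-8.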
> -----Original Message-----
> From: owner-xsl-list@lists.mulberrytech.com
> [mailto:owner-xsl-list@lists.mulberrytech.com]On Behalf Of Michael Kay
> Sent: Thursday, September 27, 2001 3:28 PM
> To: xsl-list@lists.mulberrytech.com
> Subject: RE: [xsl] SAXON and UTF-8
>
>
> > Newbie observations: I get the following error when feeding
> > SAXON with an XML document with UTF-8 encoding.
> >
> > --
> > E:\test\sampledocs>saxon dataseq.xml sampledoc.xsl > dataseq.fo
> > Fatal error reported by XML parser: required character (found "?") (expected "<")
> > URL: file:/E:/test/sampledocs/dataseq.xml
> > Line: 1
> > Column: 5
> > Error
> > required character (found "?") (expected "<")
>
> This message suggests that there's no problem with your UTF-8, but there is
> a problem with your XML. Without seeing the file, I can't tell you what the
> problem is.
>
> > Saving in "plain text" triggers the appropriate error message from SAXON:
> >
> > E:\test\sampledocs>saxon dataseq.xml sampledoc.xsl > dataseq.fo
> > Fatal error reported by XML parser: bad continuation of multi-byte UTF-8 sequence (character code: 0x72)
> > URL: file:/E:/test/sampledocs/dataseq.xml
> > Line: -1
> > Column: 1477
> > Error
> > bad continuation of multi-byte UTF-8 sequence (character code: 0x72)
> > Transformation failed
> >
> > That error message could have been better.
>
> Yes. AElfred first tries to decode a buffer-full of bytes into characters,
> and then looks for the newline characters that determine line endings. If it
> fails in the first step, then the line number is -1 and the column number is
> the byte offset where it hit trouble. In general, if the file is not in the
> expected encoding then line boundaries will not be detected correctly.
>
> Mike Kay
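
The decode-first behaviour Mike describes can be imitated outside the parser to find the offending byte. This is a rough sketch in Python, not AElfred's code: the function name is hypothetical, the offset is zero-based and points at the lead byte of the broken sequence, and AElfred's own offset may be counted differently. The sample is the word "père" saved as Latin-1, where the byte 0xE8 looks like the start of a multi-byte UTF-8 sequence and the following "r" (0x72) is not a valid continuation byte.

```python
def locate_decode_error(data: bytes):
    """Try to decode data as UTF-8.

    Returns None if decoding succeeds. On failure returns (-1, offset),
    mirroring the "line -1, column = byte offset" style of report, where
    offset is the zero-based index of the first byte that could not be decoded.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return (-1, err.start)
    return None

# Hypothetical sample: the same word saved in two different encodings.
latin1_bytes = "père".encode("latin-1")   # b'p\xe8re'
utf8_bytes = "père".encode("utf-8")       # b'p\xc3\xa8re'

print(locate_decode_error(latin1_bytes))
print(locate_decode_error(utf8_bytes))
```

Because decoding fails before any newlines are counted, only a byte offset is available, which is why the report carries no usable line number.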
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list