This is the mail archive of the docbook-apps@lists.oasis-open.org mailing list .



Re: Choosing a characterset for DocBook


On Fri, 15 Mar 2002, Christopher R. Maden wrote:

> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> At 01:52 AM 3/15/02, Jens Stavnstrup wrote:
> >The editing have been done on a Unix platform with Emacs. Occasionally,
> >when copying text from a word document,  Saxon protests (actually
> >Aelfred protests), complaining over "bad continuation of multi-byte UTF-8
> >sequence", which have been a problem, since I have chosen the ISO-8859-1
> >encoding (don't remember why).
> 
> The parser obviously is not aware that you have chosen ISO 8859-1.  That is 
> the expected error message if an 8859-1 document contains any high bytes 
> (128+) and the parser is trying to parse it as UTF-8.
> 
> 1) Do all of your entities (i.e., files) have encoding declarations?  What 
> are they?  Remember that UTF-8 is the default unless you explicitly specify 
> a different encoding (or use a byte-order mark, in which case UTF-16 is the 
> default).
> 

The encoding chosen is, as stated above, ISO-8859-1, and yes, it is 
specified in the XML declaration.
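
For illustration, the failure Chris describes can be reproduced outside 
the parser. The following is a minimal Python sketch, not part of the 
build discussed here; the sample text and the helper name 
decodes_as_utf8 are made up for the example. It encodes a string 
containing one high character as ISO-8859-1 and then tries to read 
those bytes back as UTF-8.

```python
# Sketch: why a parser that assumes UTF-8 rejects ISO-8859-1 high bytes.
# The sample text and the helper name are illustrative assumptions.

text = "café au lait"

# In ISO-8859-1 the "é" is the single byte 0xE9 (a "high byte", >= 128).
latin1_bytes = text.encode("iso-8859-1")

# In UTF-8 the same character takes two bytes, 0xC3 0xA9.
utf8_bytes = text.encode("utf-8")


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form a valid UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        # 0xE9 announces a multi-byte sequence, but the next byte (a space)
        # is not a valid continuation byte, so decoding fails.
        return False


print(decodes_as_utf8(latin1_bytes))  # the ISO-8859-1 bytes read as UTF-8
print(decodes_as_utf8(utf8_bytes))    # the UTF-8 bytes read as UTF-8
print(latin1_bytes.decode("iso-8859-1") == text)
```

The first check fails and the other two succeed, which is consistent 
with the parser not seeing (or not honouring) the ISO-8859-1 
declaration for at least one file.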


> 2) How are you invoking the parser?  From within SAXON, obviously - is 
> SAXON being called from the command line, or within another program?  What 
> exactly are the parameters it's being passed?
> 

From Ant, with no specific parameters (which parameters are you 
referring to, BTW?).

I am still using Saxon 6.4.4. Checking the change history for 6.5.1, I 
do not see any specific problem with using ISO-8859-1.

My problem is not so much which encoding I choose (if there are any 
problems, e.g. characters the parser can't accept, I can fix them), but 
rather how to keep my colleagues from running into these issues.
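
One possible way to catch such files before they reach the parser is a 
small pre-build check. The sketch below is an assumption, not something 
from this thread: the function names are hypothetical, and it assumes 
ASCII-compatible files without a byte-order mark. It reads the encoding 
from each file's XML declaration, falls back to UTF-8 when there is 
none (the default Chris mentions), and tests whether the bytes decode.

```python
# Sketch of a pre-build check: does each file decode in the encoding it
# declares? Function names are hypothetical. Assumes ASCII-compatible
# files with no byte-order mark (UTF-16 is not handled).
import re
from pathlib import Path

# Matches an XML declaration at the very start of the file and captures
# the value of its encoding pseudo-attribute, if present.
DECL_RE = re.compile(
    rb"^<\?xml\s[^>]*?encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']"
)


def declared_encoding(data):
    """Return the declared encoding, or "utf-8" if none is declared."""
    match = DECL_RE.match(data)
    return match.group(1).decode("ascii") if match else "utf-8"


def check_bytes(data):
    """Return (encoding, ok), where ok says whether the bytes decode."""
    encoding = declared_encoding(data)
    try:
        data.decode(encoding)
        return encoding, True
    except (UnicodeDecodeError, LookupError):
        # LookupError: Python does not know the declared encoding name.
        return encoding, False


def check_files(paths):
    """Return a list of (path, encoding) for files that fail the check."""
    failures = []
    for path in paths:
        encoding, ok = check_bytes(Path(path).read_bytes())
        if not ok:
            failures.append((str(path), encoding))
    return failures
```

It could be called as check_files(Path("docs").rglob("*.xml")), where 
"docs" is a placeholder directory. Note that every byte sequence 
decodes as ISO-8859-1, so this check only catches files whose 
declaration is missing or names a stricter encoding such as UTF-8; 
those are the files that would produce the "bad continuation" error.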

Regards

Jens


> ~Chris
> - -- 
> Christopher R. Maden, Principal Consultant, crism consulting
> DTDs/schemas - conversion - ebooks - publishing - Web - B2B - training
> <URL: http://crism.maden.org/consulting/ >
> PGP Fingerprint: BBA6 4085 DED0 E176 D6D4  5DFC AC52 F825 AFEC 58DA
> -----BEGIN PGP SIGNATURE-----
> Version: PGP Personal Privacy 6.5.8
> 
> iQA/AwUBPJHHt6xS+CWv7FjaEQLVtACeK8vhtpW0lR1Lglhu7WVezv7JmC4AoLa9
> W3ZjhuuDFZojp05ANUG/pp56
> =GoOY
> -----END PGP SIGNATURE-----
> 

-- 



