This is the mail archive of the docbook@lists.oasis-open.org mailing list for the DocBook project.



Re: [docbook] Question about prettyprinting Docbook documents and character entities


Taro Ikai <tikai@ABINITIO.COM> writes:

> I am having a few problems prettyprinting my DocBook documents. I am using 
> the Cygwin distribution of Tidy.
> 
> 1) Tidy seems to translate the character entities:
> 
>  &ensp; into a two-byte sequence of 0x20, 0x02, and 
>  &emsp; into a two-byte sequence of 0x20, 0x03
> 
> Is this expected? I want to keep the &entityname; notations in the output. 
> How can I do this?

I don't think you can. For XML output, I think Tidy is hard-coded to
translate the entity names into numeric ones.
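
If the names matter, one possible workaround is to post-process Tidy's
output and put them back. The following is only a sketch, not a Tidy
feature: the function name restore_entities and the NAMED table are
made up for illustration, and it assumes the output has already been
decoded to a Python string and that the document's DTD declares the
two entities. It also shows that the byte pairs reported above match
the UTF-16 (big-endian) encodings of U+2002 and U+2003.

```python
import re

# Hypothetical lookup table: code point -> entity name to restore.
NAMED = {0x2002: "ensp", 0x2003: "emsp"}

# Matches decimal (&#8194;) and hexadecimal (&#x2002;) character references.
_NUMERIC_REF = re.compile(r"&#(x[0-9A-Fa-f]+|[0-9]+);")


def restore_entities(text: str) -> str:
    """Replace numeric references and literal characters listed in NAMED
    with the corresponding &name; entity references."""

    def from_ref(match: re.Match) -> str:
        body = match.group(1)
        code_point = int(body[1:], 16) if body[0] == "x" else int(body)
        name = NAMED.get(code_point)
        # Leave references we do not know about untouched.
        return f"&{name};" if name else match.group(0)

    text = _NUMERIC_REF.sub(from_ref, text)
    for code_point, name in NAMED.items():
        text = text.replace(chr(code_point), f"&{name};")
    return text


# The two-byte sequences from the question are UTF-16BE code units.
assert "\u2002".encode("utf-16-be") == b"\x20\x02"
assert "\u2003".encode("utf-16-be") == b"\x20\x03"
```

Running the prettyprinted text through restore_entities before writing
it back should leave everything else untouched.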

> 2) Tidy fails to produce any output with Japanese UTF-8 encoded documents.

I've seen the same thing with Cygwin Tidy -- no output for any UTF-8
encoded documents. I think its UTF-8 handling is just broken. But it
does seem to handle UTF-16 and Shift-JIS correctly.
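
Given that, a possible workaround is to transcode the document to
UTF-16 before handing it to Tidy. This is a rough sketch under that
assumption, and I have not verified it against Cygwin Tidy itself; the
function name utf8_to_utf16 is made up for illustration. It also
rewrites the encoding named in the XML declaration, if there is one,
so that the declaration agrees with the new bytes.

```python
import re

# Matches the encoding pseudo-attribute inside an XML declaration.
_DECL_ENCODING = re.compile(r'(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2')


def utf8_to_utf16(data: bytes) -> bytes:
    """Transcode UTF-8 bytes to UTF-16 (with a byte order mark)."""
    # "utf-8-sig" also accepts input that starts with a UTF-8 BOM.
    text = data.decode("utf-8-sig")
    # Keep the XML declaration consistent with the new encoding.
    text = _DECL_ENCODING.sub(r"\g<1>\g<2>UTF-16\g<2>", text, count=1)
    # Python's "utf-16" codec writes a BOM followed by native-order units.
    return text.encode("utf-16")
```

The result can be written to a temporary file and passed to Tidy in
place of the original.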
