This is the mail archive of the docbook-tools-discuss@sources.redhat.com mailing list for the docbook-tools project.



Re: multiple bugs and security hole (was: Re: BUG: docbook2html --nochunks)


Hi Mark,

Thanks for your feedback.

> Thanks for the quick response.  I applied that patch directly to
> /usr/bin/jw and it sorta-kinda fixed the problem.  Still, it is a
> kludge rather than a proper bugfix. docbook2html still can't be used
> as a proper filter, for example:
> 
>    <generate_docbook> | docbook2html ... | tidy ... | ...

Well then all the other backends are 'broken', if you take that
attitude.  I think a more useful approach is to have consistent
behaviour across all the backends: that of generating one or more
output files in the current (or a specified) directory.  That's what
the man page says it does.

> This is un*x.  Filters should be able to take input on standard in
> and send output to standard out with errors to standard error.

If jw were to output to stdout, it would (in general) need to send a
tar file!
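
To make the point concrete, here is a minimal sketch (in Python, not
part of jw) of what "chunked output on stdout" would amount to: every
generated file has to be wrapped in an archive stream.  The function
name pack_outputs and the file names are hypothetical, chosen only for
illustration.

```python
import io
import tarfile


def pack_outputs(files, stream):
    """Write a {name: bytes} mapping to `stream` as an uncompressed tar archive."""
    # Mode "w|" produces a sequential tar stream, so the target does not
    # need to be seekable (a pipe such as stdout would do).
    with tarfile.open(fileobj=stream, mode="w|") as archive:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
```

A caller would pass sys.stdout.buffer as the stream, and whatever sits
downstream in the pipe would then have to unpack the archive before it
could touch any HTML.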

> The blow chunks mode is also probably a serious security
> hole in many situations (it creates files on the host system with
> names based on text supplied by the untrustworthy remote user who
> supplied the file).   Don't believe me?  Try this
>      <chapter id="/etc/youarescrewed">

Yes, this is an interesting attack.  The docbook-dsssl package by
default makes up its own names for output files when chunking; the Red
Hat Linux docbook-utils package comes with a default custom stylesheet
which turns on a feature to use IDs as filenames.  We'll be correcting
that shortly.
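
For illustration only, here is a minimal Python sketch of the kind of
check that keeps an untrusted ID from turning into a path.  This is not
how the stylesheets name their chunks; chunk_filename, SAFE_ID and the
fallback argument are hypothetical names, and the accepted character
set is an assumption.

```python
import re

# Assumed whitelist: letters, digits, '.', '_' and '-', starting with a
# letter or digit.  No path separators can match.
SAFE_ID = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def chunk_filename(element_id, fallback, extension=".html"):
    """Return a chunk filename, using the ID only when it is harmless."""
    if element_id and SAFE_ID.fullmatch(element_id) and ".." not in element_id:
        return element_id + extension
    # Otherwise use a generated name such as "c1".
    return fallback + extension
```

With this check, an ID such as "/etc/youarescrewed" is rejected and the
generated fallback name is used instead.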

> Denial of service attack:  Let's suppose that on a system with
> a 65536 inode limit, I process a malicious file which has 65536
> <chapter> elements.

I can say the same thing about tar files (for example).

> On a related note, docbook2html files actually need to be tidy'ed so
> badly that you might consider making a call to tidy (with
> configurable options) a built-in option (or better yet, fix the
> generator - but that is probably jade).  The output is technically
> legal HTML but the formatting violates the spirit of HTML.

The output is determined by the stylesheets.  They are the way they
are because of technical details---significant whitespace is the
reason for '>' being separate to the rest of the element, for example.

I'm sure that Norm would welcome patches that make the HTML output
nicer to read.  How's your DSSSL? ;-)

(On the other hand, who is it that is editing generated output rather
than editing the source?)

> Another question: does either 0.6.9 or the upcoming release fix
> the "URL not supported" problem?   docbook2html chokes on the DOCTYPE
> in files generated by abiword:
> <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
> 	"http://www.oasis-open.org/docbook/xml/4.0/docbookx.dtd">

For a long time the Red Hat Linux openjade package came with HTTP
support disabled.  It is enabled in the current package (in
Skipjack).

But you might want to consider using an XSL processor for DocBook
XML.  Take a look at the xmlto package for a way to start.

> Now, this appears to be at least two bugs:
>    - URL in DOCTYPE is unimplemented feature

(Actually a feature that defaults to 'disabled'.)

>    - failure to use a good catch-all document type where an exact
>      stylesheet match is not found.

This is an unreasonable requirement and would just generate bogus bug
reports.  People should install the DTD for the document they are
processing.

Tim.


