This is the mail archive of the mailing list for the DocBook project.


Re: SV: Strange newlines in HTML output


On Tue, Dec 12, 2000 at 01:38:53PM +0100, Thorbjoern Ravn Andersen wrote:

> Unfortunately tidy has a few shortcomings, and I have found it
> difficult to use in batches because it complains too much.

There are others that have requested a superquiet option in JTidy (I
have made an internal version that does this), so this may be coming
up soon.

> Tidy also can choke on input, so you are not guaranteed that your input
> is processed.

Could you please send examples of documents that choke JTidy (or Tidy)
either to me or to the list? Thanks.

> I looked at jTidy, which has the same problems, but which -- being a
> Java program -- also has hooks in order to be used as a SAX parser for
> the HTML-part.  I could not get massaged input though.
> What in my eyes would be perfect, would be a JTidy frontend to
> SAX-conformant parsers, which allowed us to use HTML files directly.
> This might require work though, as it does not appear to be the goal of
> the current maintainer(s).

If I understand you correctly, this is exactly the current goal. We
are in the process of removing the outdated DOM interface from JTidy and
providing an interface to any SAX2-compliant parser (such as Xerces),
which allows the tool to be integrated into almost any processing chain.


-Sami (JTidy release coordinator)

ICQ:19002710  *************  apt-get a life
