This is the mail archive of the
mailing list for the DocBook project.
Re: SV: Strange newlines in HTML output
On Tue, Dec 12, 2000 at 01:38:53PM +0100, Thorbjoern Ravn Andersen wrote:
> Unfortunately tidy has a few shortcomings, and I have found it
> difficult to use in batches because it complains too much.
Others have also requested a superquiet option in JTidy (I have made
an internal version that does this), so this may be coming.
> Tidy also can choke on input, so you are not guaranteed that your input
> is processed.
Could you please send examples of documents that choke JTidy (or Tidy)
either to me or to the firstname.lastname@example.org list? Thanks.
> I looked at JTidy, which has the same problems, but which -- being a
> Java program -- also has hooks in order to be used as a SAX parser for
> the HTML part. I could not get massaged input, though.
> What would be perfect, in my eyes, is a JTidy frontend to
> SAX-conformant parsers, which would allow us to use HTML files directly.
> This might require work though, as it does not appear to be the goal of
> the current maintainer(s).
If I understand you correctly, this is exactly the current goal. We
are in the process of removing the outdated DOM interface from JTidy and
providing an interface to any SAX2-compliant parser (such as Xerces),
which allows the tool to be integrated into almost any processing chain.
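
To illustrate what plugging into a SAX2 chain means, here is a minimal
sketch of a downstream stage. It is not JTidy code: the class and method
names (SaxChainSketch, ElementCounter, countElements) are made up for
this example, and the JDK's bundled XML parser stands in as the event
source. The assumption is that any SAX2-compliant XMLReader could be
attached to the same handler in its place.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of JTidy or Xerces.
class SaxChainSketch {

    /** Downstream stage: counts start-element events by element name. */
    static class ElementCounter extends DefaultHandler {
        final Map<String, Integer> counts = new TreeMap<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            counts.merge(qName, 1, Integer::sum);
        }
    }

    /** Parses the given markup and returns element counts, sorted by name. */
    static Map<String, Integer> countElements(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            // Assumption: the JDK parser plays the role of the event source.
            // The handler only depends on the SAX2 XMLReader contract.
            XMLReader reader = factory.newSAXParser().getXMLReader();
            ElementCounter counter = new ElementCounter();
            reader.setContentHandler(counter);
            reader.parse(new InputSource(new StringReader(xml)));
            return counter.counts;
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
    }

    public static void main(String[] args) {
        String doc = "<html><body><p>one</p><p>two</p></body></html>";
        System.out.println(countElements(doc)); // prints {body=1, html=1, p=2}
    }
}
```

The handler never sees the parser class itself, only SAX2 events, which
is why the event source can be swapped without touching the rest of the
chain.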
-Sami (JTidy release coordinator)
ICQ:19002710 ************* apt-get a life