This is the mail archive of the docbook-apps@lists.oasis-open.org mailing list.
RE: [docbook-apps] HTML to Docbook
- From: "Peter Ring" <PRI at magnus dot dk>
- To: "Thomas Jones" <admin at buddhalinux dot org>,<docbook-apps at lists dot oasis-open dot org>
- Date: Fri, 4 Mar 2005 09:04:07 +0100
- Subject: RE: [docbook-apps] HTML to Docbook
A few links:
Converting HTML to Docbook SGML/XML Using html2db
"html2db is a small utility to convert HTML to Docbook SGML/XML. It uses
TidyLib for parsing the HTML."
http://www.eecs.umich.edu/~ppadala/projects/tidy/
html2db.xsl
"html2db.xsl converts an XHTML source document into a Docbook output
document. It provides features for customizing the generation of the
output, so that the output can be tuned by annotating the source, rather
than hand-editing the output."
http://osteele.com/software/xslt/html2db/index.html
Html2DocBook
"This project was created by JeffBeal. He has been working with DocBook
since November, 2001, and has so far converted three sets of project
documentation from HTML to DocBook. Due to inconsistencies in HTML
coding and the often many-to-one relationship between DocBook elements
and HTML elements, there has always been a need to review and re-tag
manually, but the following process does minimize that effort somewhat."
http://wiki.docbook.org/topic/Html2DocBook
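
To illustrate the many-to-one problem mentioned in the quote above, here is a
minimal Python sketch. It is not taken from any of the tools listed; the
mapping tables, the convert() function and the "phrase" fallback are all
assumptions made for illustration. It assumes a well-formed, namespace-free
XHTML fragment, maps unambiguous tags directly, and records every ambiguous
or unknown tag so it can be reviewed and re-tagged by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping table: HTML tag -> DocBook element.
TAG_MAP = {
    "p": "para",
    "ul": "itemizedlist",
    "ol": "orderedlist",
    "pre": "programlisting",
    "em": "emphasis",
}

# HTML tags that may correspond to several DocBook elements.
# Value: (default choice, candidates a human should pick from).
AMBIGUOUS = {
    "i": ("emphasis", ["emphasis", "citetitle", "foreignphrase"]),
    "tt": ("literal", ["literal", "filename", "command"]),
}


def convert(node, review):
    """Return a DocBook copy of node; append doubtful cases to review."""
    if node.tag == "li":
        # A DocBook listitem needs a block-level child, so wrap in para.
        out = ET.Element("listitem")
        target = ET.SubElement(out, "para")
    else:
        if node.tag in AMBIGUOUS:
            name, candidates = AMBIGUOUS[node.tag]
            review.append((node.tag, (node.text or "").strip(), candidates))
            out = ET.Element(name)
        elif node.tag in TAG_MAP:
            out = ET.Element(TAG_MAP[node.tag])
        else:
            # Unknown tag: keep the content, remember the original tag name.
            review.append((node.tag, (node.text or "").strip(), []))
            out = ET.Element("phrase", role=node.tag)
        target = out
    target.text = node.text
    for child in node:
        target.append(convert(child, review))
    out.tail = node.tail
    return out


if __name__ == "__main__":
    review = []
    html = "<ul><li>See <i>The Book</i> and run <tt>tidy</tt>.</li></ul>"
    docbook = convert(ET.fromstring(html), review)
    print(ET.tostring(docbook, encoding="unicode"))
    for tag, text, candidates in review:
        print("review:", tag, repr(text), candidates)
```

The review list is the point of the sketch: the conversion picks a default for
<i> and <tt>, but only a human can tell whether "The Book" is really a
citetitle or "tidy" a command.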
The Tidy patch (the first link above) has not been maintained for a while.
It worked quite well for me some years ago. If you decide to go with it,
start with the Tidy source from March 2003.
Please report your findings!
Kind regards
Peter Ring
For DocBook tools in general, always start here:
DocBookTools
http://wiki.docbook.org/topic/DocBookTools
and here
Docbook tools
http://www.dpawson.co.uk/docbook/tools.html
> -----Original Message-----
> From: Thomas Jones [mailto:admin@buddhalinux.org]
> Sent: 4 March 2005 04:16
> To: docbook-apps@lists.oasis-open.org
> Subject: [docbook-apps] HTML to Docbook
>
>
> Does anyone know of a utility to convert HTML to Docbook?
>
> I found a few utilities mentioned via Google, but they are either
> unavailable or have not been maintained since 2000.
>
> I thought of building a stylesheet; but what an undertaking!
> ;)
>
> Thanks,
> Thomas
>