RE: DOCBOOK: MS files included with elements?
- To: DocBook-apps mailing list <docbook-apps at lists dot oasis-open dot org>
- Subject: DOCBOOK-APPS: RE: DOCBOOK: MS files included with elements?
- From: "Prikryl,Petr" <PRIKRYLP at skil dot cz>
- Date: Fri, 11 May 2001 12:00:17 +0200
(first, sorry to Norman Walsh -- this should go here, not explicitly
to you ;-)
/ Galen Boyer <firstname.lastname@example.org> was heard to say:
| Oh God, I'll probably get killed for this question.
| Is there some tag which can be used to include a word doc or
| excel file or other element?
I suppose that this would be extremely difficult. I guess that you
would rather convert the document into XML. The following may help
you only if you need to convert a Word document just once.
I am very new to XML/SGML and DocBook, but I did convert a Word
document of about 150 pages into XML. I did it by exporting the
document to HTML, and then I did a lot of Perl fiddling... Now I have
well-formed XML, but not the DocBook markup yet.
The process was rather painful -- because I did not know about the
HTML Tidy program before!!! (My thanks to Dave Raggett,
who wrote it, and to Jirka Kosek, who mentioned it in his book.)
So, if I were forced to do it again, I would do it this way:
1. Export the Word document to HTML (manually).
2. Use HTML Tidy (off line) to convert the <font ...> and similar
tags into markup that uses CSS (automatically) and to
output the result as XML.
3. Use ImageMagick to convert the images into the desired
format (off line).
4. Use some XSLT processor and write an XSL file that describes
the conversion of that XML to DocBook XML (off line).
5. Perl may still be needed.
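
As a rough illustration of step 4, here is a minimal sketch written in
Python rather than XSLT (so it is a swapped-in technique, not the approach
listed above). It assumes that HTML Tidy has already produced well-formed
XHTML. The tag mapping, the function names, and the sample input are all
hypothetical, and a real Word export would need many more rules.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical, deliberately tiny mapping from XHTML names to DocBook names.
TAG_MAP = {
    "p": "para",
    "ul": "itemizedlist",
    "ol": "orderedlist",
    "em": "emphasis",
    "i": "emphasis",
}


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.split("}", 1)[1] if tag.startswith("{") else tag


def convert(elem):
    """Recursively copy an XHTML element into a DocBook-named element."""
    name = local_name(elem.tag)
    if name == "li":
        # DocBook's listitem cannot hold text directly, so wrap it in a para.
        # Assumption: the list item contains inline content only.
        out = ET.Element("listitem")
        target = ET.SubElement(out, "para")
    else:
        # Unmapped elements keep their original name so that they are easy
        # to find and handle later; they are NOT valid DocBook as they stand.
        out = ET.Element(TAG_MAP.get(name, name))
        target = out
    target.text = elem.text
    out.tail = elem.tail
    for child in elem:
        target.append(convert(child))
    return out


def xhtml_to_docbook(xhtml_text, title):
    """Wrap the converted children of <body> in a DocBook <article>."""
    root = ET.fromstring(xhtml_text)
    body = root.find(XHTML_NS + "body")
    if body is None:
        body = root.find("body")
    if body is None:
        raise ValueError("no <body> element found")
    article = ET.Element("article")
    ET.SubElement(article, "title").text = title
    for child in body:
        article.append(convert(child))
    return ET.tostring(article, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
        "<p>Hello <em>world</em>!</p><ul><li>one</li></ul>"
        "</body></html>"
    )
    print(xhtml_to_docbook(sample, "Demo"))
```

An XSL file would express the same idea as one template per source
element; the dictionary above plays the role of those templates.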
Well, I never did the fourth step (being very new to XSL), nor do I know
whether this is the best approach. I guess that there could be some
easier way. Anyway, I think that "Word to HTML" is the first step
to follow, and I do not think that it can be done off-line.
Any comments? (I want to learn something better ;-)
Petr Prikryl, SKIL, spol. s r.o., email@example.com