This is the mail archive of the docbook-apps@lists.oasis-open.org mailing list.



Re: [docbook-apps] RE: HTML -> DocBook Conversion?


Roger Kitain <Roger.Kitain@Sun.COM> writes:

> Here's what I've tried - all with no success..  :-(

[...]

> Attempt 2:
> Export Frame 6 document to xml
> use "fm2doc.xsl" stylesheet (found off the web) to attempt to produce
> docbook:
>     xsltproc -v -o JSF-db.xml fm2doc.xsl JSF.xml
> ** xslt processing stuck in infinite loop **

I just did a quick search and found this site:

  http://www.brownhen.com/XML/fm2doc.html

I'm guessing that's the same place you found the fm2doc.xsl
stylesheet you mention.

But even if that worked as expected and didn't send the
processing into an infinite loop, it wouldn't do you much good
unless your Frame document is made up entirely of headings of a
single level followed by body paragraphs, because that is all it
seems to be designed to process. If you have other paragraph
formats in your source (bulleted lists, for example), it won't do
anything at all with those.

In fact, that stylesheet seems to have been developed based on a
misunderstanding about Frame XML output; on the site, it says:

  The XML that comes out of FrameMaker is flat: it doesn't have
  the sort of hierarchical structure that Docbook would represent
  with parent <sect1> elements, for example.

That's not true. Frame 6 is actually smarter than that. What it
can do is generate DIV sectioning/wrapper elements in the XML
output based on information in a mappings table on one of the
reference pages in the doc. But you need to set up the mappings
table to make it work. Then the XML output will in fact include
DIV elements that you can easily transform to DocBook sections.

One problem is that, in my experience at least, if you manually do
a "Save as XML" from within Frame (instead of using dzbatcher), it
corrupts that mappings table, and the next time you try to
generate XML from the document, it won't contain any of those DIVs.
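
To give a rough idea of how little is needed once the DIVs are
there, here is a minimal Python sketch of the DIV-to-section step.
It is only an illustration: the element names in the sample (XML,
DIV, Heading1, Heading2, Body) are assumptions made up for this
example, not actual Frame 6 output, so you would need to adjust
them to whatever your mappings table produces.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: these element names are assumed for illustration
# and are not taken from real FrameMaker output.
SAMPLE = """<XML>
  <DIV>
    <Heading1>Intro</Heading1>
    <Body>First paragraph.</Body>
    <DIV>
      <Heading2>Details</Heading2>
      <Body>Nested paragraph.</Body>
    </DIV>
  </DIV>
</XML>"""


def convert(node, parent):
    """Recursively copy the children of `node` into DocBook elements under `parent`."""
    for child in node:
        if child.tag == "DIV":
            # Each DIV wrapper becomes a DocBook section.
            section = ET.SubElement(parent, "section")
            convert(child, section)
        elif child.tag.startswith("Heading"):
            # Any heading paragraph becomes the section title.
            ET.SubElement(parent, "title").text = child.text
        else:
            # Everything else is treated as a plain para; lists, tables,
            # and inline markup would each need their own mapping.
            ET.SubElement(parent, "para").text = child.text


def frame_to_docbook(xml_text):
    """Parse the XML text and return an <article> element tree."""
    root = ET.fromstring(xml_text)
    article = ET.Element("article")
    convert(root, article)
    return article
```

Calling ET.tostring(frame_to_docbook(SAMPLE), encoding="unicode")
gives you the nested sections as a string. The nesting comes for
free because the DIVs already carry it; the script only renames
elements.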

> At this point, I am not very hopeful of getting the conversion to work
> with any automated process.  I'm not an xslt guru.
> Looks like this is going to be a *very* painful (and manual) process
> (sigh)....

Before you give up, here are some other options:

  - WebWorks Publisher

    http://webworks.com/

    Has a GUI-driven interface for creating custom Frame-to-XML
    mappings, and a pretty powerful macro language. It can
    definitely do what you need, but does have a learning curve.

  - Mif2Go

    http://www.omsys.com/

    I've never tried it, but I think it _might_ be able to do
    something similar to what you can do with WebWorks.

  - Frame 7

    Mentioned already in this thread.

The thing is, if you need to do "smart" automated transformation
of your Frame source to DocBook, it is going to take a significant
amount of work no matter what you use.

There isn't some magic out there somewhere that you just haven't
found yet. What you need to do is something that Norm Walsh
has described as "dragging markup uphill", and it is never easy.

The least painful way to do it is probably going from Frame to
HTML to DocBook, if you can make it work. But even if you can,
it's only going to give you a very simplistic sort of DocBook
output, and you're going to need to go in and do a bunch of manual
work on it afterwards to get your content into "real" DocBook.
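
As a rough illustration of what "simplistic" means here, the
following Python sketch turns a flat run of headings and
paragraphs into nested DocBook sections. It assumes the HTML is
well-formed XHTML with the headings and paragraphs as direct
children of <body>, which real Frame HTML export may not give you;
the sample input and the function name are made up for this
example.

```python
import xml.etree.ElementTree as ET

# Heading tags mapped to nesting depth; extend as needed.
HEADING_LEVELS = {"h1": 1, "h2": 2, "h3": 3}

# Hypothetical, well-formed input; real exported HTML is rarely this clean.
SAMPLE = (
    "<body>"
    "<h1>Intro</h1><p>One.</p>"
    "<h2>Details</h2><p>Two.</p>"
    "<h1>Next</h1><p>Three.</p>"
    "</body>"
)


def html_to_docbook(xhtml_text):
    """Nest flat headings and paragraphs into DocBook sections."""
    body = ET.fromstring(xhtml_text)
    article = ET.Element("article")
    # Stack of (level, element) for the open sections; the article is level 0.
    stack = [(0, article)]
    for el in body:
        text = "".join(el.itertext()).strip()
        level = HEADING_LEVELS.get(el.tag)
        if level is not None:
            # Close every open section at this level or deeper.
            while stack[-1][0] >= level:
                stack.pop()
            section = ET.SubElement(stack[-1][1], "section")
            ET.SubElement(section, "title").text = text
            stack.append((level, section))
        else:
            # Everything else is flattened to a plain para, so lists,
            # tables, and inline markup are lost at this step.
            ET.SubElement(stack[-1][1], "para").text = text
    return article
```

Everything that isn't a heading ends up as a para, which is
exactly the part you'd have to go back and fix by hand (or with
more mapping rules) afterwards.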

  --Mike

> Michael Smith wrote:
> 
> >Steve Whitlatch <swhitlat@getnet.net> writes:
> >
> > 
> >
> >>Hello Roger,
> >>
> >>Is the FrameMaker document already in structured form in
> >>FrameMaker? If not, then this information may not help. I
> >>have webbed a detailed record of my experience with
> >>DocBook+FrameMaker at:
> >>
> >>http://www.getnet.net/~swhitlat
> >>
> >>Follow the DocBook link on the left.
> >>   
> >>
> >
> >Do you have a summary written up about any problems or limitations
> >you ran into getting valid DocBook output from Frame 7? A couple
> >years back Bob Stayton wrote up a list of some problems he found,
> >and I did the same. Summary is at:
> >
> > http://groups.yahoo.com/group/xml-doc/message/3257
> >
> >Did you run into those same problems? If so, how did you work
> >around them? Post-processing, maybe? Or some custom programming?
> >
> >[...]
> >
> > 
> >
> >>For a single document, I would probably do the work
> >>manually. However, there could be a solution going to MIF
> >>and then sending the MIF file through some type of
> >>tag-mapping process via Perl or another text manipulation
> >>tool. To learn how to do that would probably take some
> >>people (me) much, much longer than the manual process.
> >>
> >>So, for a single document, it's just work. For hundreds or
> >>thousands of documents, a MIF expert who knows the text
> >>conversion tools would be the solution. Some consultants who
> >>fit that category often participate on the various
> >>FrameMaker mailing lists.
> >>   
> >>
> >
> >One thing about working with MIF is, there is no free open-source
> >MIF parser for Frame 5 (or 6 or 7). There was one once that could
> >handle Frame 4 files, I think. So if you were really to build your
> >own system for working with MIF in Perl or whatever, you'd first
> >need to create a MIF parser.
> >
> >If (and this is a big If, I know) you don't need to preserve the
> >content of your Frame markers (index markers, hypertext links,
> >etc.) on conversion to DocBook, I think going from Frame's "plain"
> >XML output through a custom XSLT stylesheet to generate DocBook
> >works pretty well.
> >
> >And one big advantage of it is you don't need to learn any
> >proprietary application-specific language/system (e.g., Frame 7's
> >stuff or WebWorks Publisher's macro language).  All you need to
> >learn is some basic XSLT, and of course learning that will end up
> >being useful for a lot more than just converting Frame content.
> >
> >(And Steve, I don't mean you personally, because I know you
> >already know XSLT -- I'm just using "you" in the general sense).
> >
> > --Mike
> > 
> >
> 
