This is the mail archive of the docbook-apps@lists.oasis-open.org mailing list .
FW: [docbook-apps] RE: HTML -> DocBook Conversion?

From: "Steve Whitlatch" <swhitlat at getnet dot net>
To: <docbook-apps at lists dot oasis-open dot org>
Date: Thu, 7 Oct 2004 09:02:23 -0700
Subject: FW: [docbook-apps] RE: HTML -> DocBook Conversion?
 > Do you have a summary written up about any 
 > problems or limitations
 > you ran into getting valid DocBook output from 
 > Frame 7? A couple
 > years back Bob Stayton wrote up a list of some 
 > problems he found,
 > and I did the same. Summary is at:
 > 
 >   http://groups.yahoo.com/group/xml-doc/message/3257
 
 Thank you to you and Bob for that summary. I 
 found it and read it before I attempted any 
 DocBook work in FrameMaker. It was an excellent 
 heads up. I'm still new to DocBook and XSL, but 
 I'd been using FrameMaker (unstructured) for 
 years. So, I had some confidence in the venture 
 and with the summary from you and Bob I went 
 ahead. The problem that I and many others have 
 when things go wrong is that we don't know 
 whether the problem is caused by us (mistakes) or 
 by the software (deficiencies, bugs). A heads up 
 beforehand can really help with the learning. 
 
 I guess that the entire README for my 
 FrameMaker+DocBook project can't be considered a 
 summary. It is really long and detailed. The most 
 important bit of info, just my opinion, is that 
 virtually all the structured FrameMaker trouble 
 is attributable to FrameMaker's need to translate 
 XML to its FrameMaker format, including the 
 DTD-to-EDD translation, which can be (and is) a 
 source of trouble in its own right. Add to that 
 the scuttlebutt on the Internet that Adobe let go 
 its entire FrameMaker programming staff after 
 releasing FrameMaker version 7 and sent the 
 remaining maintenance off to India, and what do 
 we have? Adobe 
 announced that the Mac version would be phased 
 out; the experimental Linux version died long ago. 
 
 Again, this is just my opinion, and I do pay 
 close attention to what more experienced people 
 say on this list and other lists. I think what's 
 needed is a real-time WYSIWYG XML authoring tool 
 that formats both on-screen display and output 
 according to legal, standard XSL. If someone 
 does not want the WYSIWYG part, they can turn it 
 off. FrameMaker partially provides this, just not 
 with XSL. I should add that the XSL from the 
 mythical tool I describe must be portable, with 
 no lock-in.
 
 Maybe Arbortext's Styler provides functionality 
 that is just as good. I may have the chance to learn 
 some Arbortext tools, and I don't really know 
 what Styler is/does, so I am open to hearing 
 about it from others.
 
 In the free world, my own little project, DocBook 
 XSL Configurator (far from a WYSIWYG tool), 
 provides some help for creating XSL customization 
 layers for DocBook:
 https://sourceforge.net/projects/db-xsl-cfg/   
 
 OK, I've become long-winded. You asked for a 
 summary about the problems I may have had getting 
 valid DocBook XML output from FrameMaker. In the 
 FrameMaker world, they call it "round-tripping." 
 Here is an excerpt from the README:
 
 *****************************
   Public/System Identifiers.
     To get FrameMaker to write a public identifier in output XML, I used
     the following read/write rule:
     *******
     writer external dtd is public "-//OASIS//DTD DocBook XML V4.2//EN"
     "/usr/share/docbook-xml42/docbookx.dtd";
     *******
     FrameMaker 7.0 cannot correctly write out a URL used as a system
     identifier. For example,
     "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd";
     is always changed to
     "http:/www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
     Note the missing forward slash.
 
   Upon XML export.
     1) Lots of fairly ordinary ascii characters were changed to
        internal general entities (such as hyphens, etc.).
     2) "imagedata" element "fileref" attributes became external
        parameter entities
     3) Each structured file in the FrameMaker book became an external
        file referenced by an external parameter entity, but so did the
        unstructured files (TOC, LOF, LOT, and Index). The unstructured
        files had to be removed from the exported XML tree before
        validating.
     4) Many (all?) default attribute values not explicit in the input
        XML were made explicit in the output XML.

     I suppose all import/export behavior could be changed with a
     "custom client," but that is like selling someone a piece of
     software as "able to do _anything_ you want," and adding "you just
     have to do some programming." What is this fabulous piece of
     software that is so versatile it can do anything I want? Well, of
     course, it's a compiler!
  *****************************   
 
 It's an exaggeration to say this, but I think the 
 point comes across. With structured FrameMaker, 
 everything can be cured with a custom API client, 
 which one typically writes in C using the 
 FrameMaker Developer Kit and some other 
 programming packages from Adobe. Personally, I 
 found that to be more work than I was willing to put in. 
  
 > Did you run into those same problems? If so, how 
 > did you work around them? 
 
 Yes, the same problems.
 
 > Post-processing, maybe? Or some 
 > custom programming.
 
 I manually made the necessary adjustments to the 
 XML that FrameMaker output in order to get the 
 output XML to validate. But then I only had one 
 document to do that with. In large enterprise 
 production environments, I'm sure they use 
 automated processes to do the same or similar. 
 However, it's really not good that any of the 
 adjustments to output XML need to be done at all.
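 
 As one illustration of what such automated 
 post-processing might look like, here is a minimal 
 sketch in Python. It only repairs the single-slash 
 system identifier described in the README excerpt. 
 The function name fix_system_identifier is made up 
 for this example, and the sketch assumes the 
 exported file has one DOCTYPE declaration near the top.

```python
import re

# A URL scheme followed by exactly one slash, e.g. "http:/www...".
_SINGLE_SLASH = re.compile(r'\b(https?):/(?!/)')


def fix_system_identifier(xml_text):
    """Restore the missing slash in the DOCTYPE system identifier.

    Hypothetical helper: only the DOCTYPE declaration is touched, so
    similar-looking text in the document body is left alone.
    """
    def repair(match):
        return _SINGLE_SLASH.sub(r'\1://', match.group(0))

    # Stop at '>' or '[' so an internal subset is not rewritten.
    return re.sub(r'<!DOCTYPE[^>\[]*', repair, xml_text, count=1)
```

 Running it on already-correct output changes 
 nothing, so it should be safe to apply to every 
 exported file. The other export problems (entity 
 references to the unstructured files, and so on) 
 would need their own handling.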
 
 > One thing about working with MIF is, there is no 
 > free open-source MIF parser for Frame 5 (or 6 
 > or 7). There was one 
 > once that could handle Frame 4 files, I think. 
 > So if you were 
 > really to build your own system for working 
 > with MIF in Perl or 
 > whatever, you'd first need to create a MIF parser.
 
 I did not know that. That's interesting info. 
 
 Obviously, when we eliminate the unnecessary 
 intermediate formats, many problems disappear. 
 Just plain DocBook XML authored in GNU Emacs and 
 then processed by libxml2 tools takes me a long 
 way with fewer opportunities for trouble. But I'm 
 just one guy at home on my computer. I know that 
 what works for me may not work for large enterprises.  
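 
 The libxml2 step can be scripted. A minimal 
 sketch, assuming xmllint (the command-line tool 
 that ships with libxml2) is on the PATH. The 
 function names and the file name book.xml are 
 made up for illustration.

```python
import shutil
import subprocess


def build_xmllint_command(xml_path):
    """Build the xmllint call that validates a file against its DTD."""
    # --valid: validate against the DTD named in the DOCTYPE
    # --noout: report errors only, do not echo the parsed document
    return ["xmllint", "--noout", "--valid", xml_path]


def validate(xml_path):
    """Return True if xmllint exits cleanly for the given file."""
    if shutil.which("xmllint") is None:
        raise RuntimeError("xmllint not found; install the libxml2 tools")
    result = subprocess.run(build_xmllint_command(xml_path),
                            capture_output=True, text=True)
    if result.returncode != 0:
        print(result.stderr)
    return result.returncode == 0
```

 Calling validate("book.xml") runs the check on 
 one file. Whether the DTD is fetched over the 
 network or resolved locally depends on the system 
 identifier and on how XML catalogs are set up on 
 the machine.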
 
 > (And Steve, I don't mean you personally, 
 > because I know you
 > already know XSLT 
 
 Not very well. I am heavily dependent on others, 
 like Bob Stayton, for the DocBook XSL 
 stylesheets. I don't know, but there may be 
 literally thousands like me who are similarly 
 dependent on the DocBook XSL stylesheets.
 
 
 Steve Whitlatch