This is the mail archive of the xsl-list@mulberrytech.com mailing list.



Re: XSLT and Text Processing Languages


Thorbjørn,

I think the biggest memory hog is the following command:

	do xml-parse document scan file "%g(DTDFile)" || 
		file "%g(inputFileName)"

The concatenation operator (i.e. || ) builds its result in memory, which
means you are loading the entire input document into RAM before the
parse begins.

You would be better off passing both the DTD and the file in on the
command line as follows:

	omnimark -s myprogram.xom mydtd.dtd myfile.xml

then changing your parse action to the following:

	do xml-parse document scan #main-input

This will allow for efficient stream processing of the document. 
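
To illustrate the difference outside OmniMark, here is a minimal Python
sketch of the same idea. It is an analogy only, not OmniMark code: the
function names are hypothetical, and counting "<" characters merely stands
in for a parser that consumes input as it arrives.

```python
import io
from typing import Iterable, Iterator, TextIO


def concatenated_in_memory(sources: Iterable[TextIO]) -> str:
    """Join every source into one string; peak memory grows with total input size."""
    return "".join(src.read() for src in sources)


def streamed_chunks(sources: Iterable[TextIO], chunk_size: int = 4096) -> Iterator[str]:
    """Yield the sources back to back, holding at most one chunk at a time."""
    for src in sources:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            yield chunk


def count_open_tags(chunks: Iterable[str]) -> int:
    """Stand-in for a parser: consume chunks one by one and count '<' characters."""
    return sum(chunk.count("<") for chunk in chunks)


if __name__ == "__main__":
    # Hypothetical stand-ins for the DTD and the document.
    dtd = io.StringIO("<!DOCTYPE rowset SYSTEM 'x.dtd'>\n")
    doc = io.StringIO("<rowset><dataitem/></rowset>\n")
    print(count_open_tags(streamed_chunks([dtd, doc])))
```

Both functions hand the consumer the same characters in the same order;
only the streamed version avoids holding the DTD and the whole document in
memory at once.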

Rick Geimer
National Semiconductor
rick.geimer@nsc.com

Thorbjørn Ravn Andersen wrote:
> 
> > This sounds like you were using some features of OmniMark
> > that don't exist in
> > XSLT - like shelves/arrays and probably referents for
> > forward processing of
> > the document without processing it twice or requiring the
> > entire source document
> > in memory.
> 
> Shouldn't think so.  I'm no Omnimark expert so I have enclosed the source at the end (it isn't long) so you can see for yourself.
> 
> > released. To fix the
> > underscore issue all you have to do is provide an SGML
> > Declaration that allows
> > '_' to be used in names, by default SGML didn't allow this. A
> > 2 minute fix
> > at best.
> 
> As usual, if you know how to do it.  I don't.
> 
> > Yeah and how many times do you want to go back in and try to
> > remember what you
> > did in that regular expression? What is nice about OmniMark
> 
> Comments are allowed in source.  This applies for Perl as well as XSLT and Omnimark.
> 
> [XSLT]
> 
> > data. Yeah you may want or need access to the network and
> > ODBC connections for
> > other uses and more complete solutions but when you have a
> > basic XML -> ?
> > issue, XSLT is by far easier than any of the other XML
> > solutions in Perl and
> > Java and OmniMark is at least in the SGML a complete solution
> > for DTD based
> > processing. Soon to have support for DTD-less work.
> 
> "Soon to have" is too late for us.
> 
> All your stated solutions have advantages and disadvantages in one way or another.
> 
> You just appear to be in an environment where memory usage is not a problem -- unfortunately I'm not.
> 
> The source:
> 
> ;  down-translate
> 
> global stream XmlDclFile
> global stream DTDFile
> global stream inputFileName
> global stream DataItemFile
> global stream DataLinkFile
> global stream DataItemFileName
> global stream DataLinkFileName
> 
> global stream DataitemKey
> global stream DataitemType
> global stream Category
> global stream DescDisplay
> global stream DescIndex
> global stream FromDataitem
> global stream FromDatatype
> global stream ToDataitem
> global stream ToDatatype
> global stream DataLinktype
> 
> DEFINE FUNCTION OpenFiles
> AS
>         open DataItemFile as "%g(DataItemFileName)"
>         open DataLinkFile as "%g(DataLinkFileName)"
> 
> DEFINE FUNCTION CloseFiles
> AS
>         close DataItemFile
>         close DataLinkFile
> 
> DEFINE FUNCTION PrintDataitem
> AS
>         put DataItemFile "%"%g(DataitemKey)%";%"%g(DataitemType)%";%"%g(DescDisplay)%";%"%g(DescIndex)%"%n"
> 
> DEFINE FUNCTION PrintDatalink
> AS
>         put DataLinkFile "%g(FromDataitem)%";%"%g(FromDatatype)%";%"%g(ToDataitem)%";%"%g(ToDatatype)%";%"%g(DataLinktype)%"%n"
> 
> process
> ; do xml-parse document scan "test.xml"
> ;  do xml-parse document scan file "%g(DTDFile)" || file "%g(inputFileName)"
>   do xml-parse document scan file "%g(DTDFile)" || file "%g(inputFileName)"
> ;  do xml-parse document scan file  "%g(inputFileName)"
>     output "%c"
>   done
> 
> element rowset
>         OpenFiles
>         OUTPUT "%c"
>         CloseFiles
> 
> element dataitem
>     set DataitemKey to ""
>     set DataitemType to ""
>     set DescDisplay to ""
>     set DescIndex to ""
>         OUTPUT "%c"
>         PrintDataitem
> 
> element datalink
>     set FromDataitem to ""
>     set FromDatatype to ""
>     set ToDataitem to ""
>     set ToDatatype to ""
>     set DataLinktype to ""
>         OUTPUT "%c"
>         PrintDatalink
> 
> element fromdataitem
>         set FromDataitem to "%c"
> 
> element fromdatatype
>         set FromDatatype to "%c"
> 
> element todataitem
>         set ToDataitem to "%c"
> 
> element todatatype
>         set ToDatatype to "%c"
> 
> element datalinktype
>         set DataLinktype to "%c"
> 
> element dataitemkey
>         set DataitemKey to "%c"
> 
> element dataitemtype
>         set DataitemType to "%c"
> 
> element category
>         set Category to "%c"
> 
> element descdspl
>         set DescDisplay to "%c"
> 
> element descindx
>         set DescIndex to "%c"
> --
>   Thorbjørn Ravn Andersen             "...and...Tubular Bells!"
>   http://bigfoot.com/~thunderbear
> 
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


