This is the mail archive of the xsl-list@mulberrytech.com mailing list .



RE: Re[2]: correct use of keys?


I made a similar performance discovery recently.

The XML is about 1.5MB and looks something like this:

    <page>
        <document category="catOne" url="someUrl" ... more attribute />
        <document category="catOne" url="someUrl" .../>
        <document category="catTwo" url="someUrl" .../>
        <document category="catTwo" url="someUrl" .../>
        <document category="catThree" url="someUrl" .../>
        <document category="catThree" url="someUrl" .../>
        ..........
    </page>

There are approx. 3400 document nodes in the file. The applied XSL writes
out an HTML table. Then marketing wanted a separator between the different
categories. So I added this test to the template:

    <xsl:if test="not(@category = preceding::document/@category)">
        <!-- category is an attribute, so it is selected with @ -->
        <xsl:value-of select="@category"/>
    </xsl:if>

Up until then performance differences between Xalan 2, MSXML 3 and Saxon
6.4 were minimal, but this test brought both the MS and Xalan processors to a
halt: just over 2 minutes each to do the job, while Saxon had the job done in
12 seconds. To make sure I wasn't using an underpowered machine I tried it
again at home on a dual-processor machine with 512MB RAM, resulting in faster
times but the same differences.
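
My guess at the cause: the preceding:: test compares each document against
every document before it, so with about 3400 nodes a processor that does not
optimise it does work that grows roughly with the square of the node count.
A key-based test for "first document of its category" is the usual way
around that. The following is only a sketch, not my production stylesheet:
the key name docs-by-category and the table markup are made up for
illustration, and only the category and url attributes come from the sample
above.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Index every document element by its category attribute.
       The key name is hypothetical. -->
  <xsl:key name="docs-by-category" match="document" use="@category"/>

  <xsl:template match="page">
    <table>
      <xsl:apply-templates select="document"/>
    </table>
  </xsl:template>

  <xsl:template match="document">
    <!-- True only for the first document of each category in document
         order, so the separator row is written once per category. -->
    <xsl:if test="generate-id() =
                  generate-id(key('docs-by-category', @category)[1])">
      <tr><th><xsl:value-of select="@category"/></th></tr>
    </xsl:if>
    <tr><td><xsl:value-of select="@url"/></td></tr>
  </xsl:template>
</xsl:stylesheet>
```

Given the xsl:key performance bug mentioned in the MSXML 4.0 preview notes
below, this may not help on that release. If the documents are already
grouped by category, as they appear to be in the sample, comparing against
only the previous sibling with
not(@category = preceding-sibling::document[1]/@category) avoids both the
key and the long axis scan.
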
Perry


Inflexions (WA) Pty Ltd
PO Box 57
Inglewood WA 6052
Australia
t: +61 08 9371 2140
m: 0401 677 453
e: perry@inflexions.com


____________________________________________________________________________


Microsoft XML Parser 4.0 July 2001 Technology Preview

http://msdn.microsoft.com/downloads/default.asp?url=/downloads/sample.asp?url=/msdn-files/027/001/677/msdncompositedoc.xml

Bullet point 4:
'Substantially faster XSLT engine. Our tests show about x4, and for some
scenarios x8, acceleration (except the known serious performance bug for
xsl:keys).'


> -----Original Message-----
> From: Kevin Burges [mailto:xmldude@burieddreams.com]
> Sent: 13 September 2001 12:34
> To: Michael Kay
> Subject: Re[2]: [xsl] correct use of keys?
>
>
> Michael + Thomas:
>
> >> I have a stylesheet which, when run on a 10MB doc turns it into a 30MB
> >> doc in ~600 seconds.
> >> Even for such a large doc, this seems like a long time given my machine
> >> is a 1.33GHz Athlon, 256MB.
>
> MK> It seems a long time to me, too. Which processor are you using? Are you
> MK> getting thrashing due to shortage of memory?
>
> I'm using the latest MSXML 4 (July?). Toward the end of the
> transformation there is a small amount of swapping going on, but
> certainly not what I'd call thrashing. For the majority of the time
> there is virtually no drive access at all.
>
>
> MK> better off doing a preprocess of the document in which elements whose name
> MK> contains 'field' are given an extra attribute, field="yes", and then use
> MK> this attribute in the second phase. In any case, I suspect that you are not
> MK> interested in all nodes whose name contains 'field', but only in elements
> MK> whose name contains 'field'. Replacing "node()" by "*" will speed things up
> MK> a bit.
>
> I tried this in a couple of stages:
>         Changing "node()" to "*" made no difference
>         Using "field = 'yes'" instead of "contains(....)" made no
>         difference
>
> I also tried using specifically
>   "*[(name() = 'field') or (name() = 'datefield') or (name() =
>   'computedfield')]"
> This also made no difference.
>
>
> In fact, when I used the "field = 'yes'" method, the Win2k task
> manager said my program was using up to 175MB memory, where
> previously I had not seen it above 105MB.
>
>
> TP> Then I suggest that you temporarily change the stylesheet so it
> TP> outputs only one node where it has to use one of the keys, and see
> TP> how long it takes. This will check whether compiling the
> TP> stylesheet and building the key indices is taking an inordinately
> TP> long time.
>
> I tried this, and the index was generated instantly. Presumably
> because the document that is being indexed is fairly small.
>
>
> TP> Another thing is whether you are testing out the transformation in
> TP> an environment where the result is displayed in a browser (like
> TP> XML Cooktop or XML Spy).
>
> No, I'm transforming programmatically in VB so that's not an issue.
>
>
> One thing I did notice is that if the keys are empty (I had made a
> mistake), the transform only takes 60 seconds as opposed to 600.
>
> This suggests surely that there must either be something wrong with
> the way I am using the keys (so they are being ineffectual), or MSXML
> has a very poor implementation of keys. Any other suggestions???
>
> --
> groovy baby,
>  Kevin                    mailto:xmldude@burieddreams.com
>
> ++++++++++++ Cool music - http://burieddreams.com/marshan
> ++++++ Attitude Webzine - http://burieddreams.com/attitude
>
>
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
>
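
For reference, the preprocessing pass Michael describes in the quoted
message (give every element whose name contains 'field' an extra
field="yes" attribute, then test that attribute in the second phase) could
look roughly like the sketch below. It is an assumption about what such a
pass might be, not Kevin's actual stylesheet, which was not posted in this
message.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity rule: copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Elements whose name contains 'field' also get the marker
       attribute. This rule has higher default priority than the
       identity rule, so it wins for those elements. -->
  <xsl:template match="*[contains(name(), 'field')]">
    <xsl:copy>
      <xsl:attribute name="field">yes</xsl:attribute>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

In the second phase the marker is an attribute, so the test would be written
@field = 'yes'. A bare field = 'yes' looks for a child element named field
instead.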

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list




