This is the mail archive of the xsl-list@mulberrytech.com mailing list.
replacing key() with pipe.
- To: xsl-list at mulberrytech dot com
- Subject: replacing key() with pipe.
- From: Paul Tchistopolskii <paul at qub dot com>
- Date: Sat, 05 Aug 2000 05:30:19 -0700
- Organization: The Qub Group
- Reply-To: xsl-list at mulberrytech dot com
Dear Sebastian.
In this letter I'm providing a simple variant of your test6.xsl.
But first, a long (sorry) explanation.
On my box my variant takes twice as long as the key() version
on the 'special' file, which is:
<?xml version="1.0"?><!DOCTYPE cemetery SYSTEM "cem.dtd"
[
<!ENTITY data1 SYSTEM "data1.xml">
<!ENTITY data2 SYSTEM "data2.xml">
]>
<cemetery>
&data1;
&data2;
&data2;
</cemetery>
I had to produce such a strange file because with your 'smallest'
file the difference in speed was hard to measure, while your
'biggest' file caused constant swapping (Windows, 128 MB). So
I produced something 'relatively big, but without swapping'.
saxon + test6 = 1 minute.
saxon + my test6 = 2 minutes.
<realitycheck>
Honestly, I don't care whether this *exotic* activity takes
1 minute or 2 minutes (or even 3 or 4 minutes). It should
all be powered by a repository. A text file is not good
storage for this kind of information if you want to query
it every five minutes, and if you want to run
that query once per week or day, it will not hurt to wait
2 minutes instead of one.
</realitycheck>
I can 'improve' the pipe using a Java extension with side effects.
(The biggest weakness is the 'flat -> hierarchical' shift, which is
based on the weak (but standard for XSLT) 'count-based
recursion'.) It looks like a Java extension emulating 1 (one)
updateable variable could make it significantly faster.
<realitycheck>
But is it worth trying? Do you really care whether it is
1 minute or 3 minutes? Anyway, it seems that
it does not scale, first of all because of memory.
And of course, it is ages behind the scalability provided
by any SQL server (including MySQL).
</realitycheck>
Now, what I did. I'm sorry for explaining so many details, but
I think it may be interesting to see what happened with this task.
1. First (and most important), I started by thinking about the
task itself and the functionality I have to provide (not thinking
about key() or other XSLT stuff at all).
What test6.xsl actually does:
2.A Query.
- It takes all the /cemetery/stone/person elements.
- It pulls out:
person/died/date/yr (year)
person/name/snm (second name)
person/name/fnm (first name)
- Persons should be sorted by: year, second name, first name.
2.B Rendering.
- It then renders the list of persons, but the year is displayed only
once per group.
So here we go.
3. Query part.
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version="1.0">
  <!--JOB: process cemetery file to make a year catalogue, NOT using keys (1) -->
  <xsl:template match="/cemetery">
    <doc>
      <xsl:for-each select="stone/person">
        <xsl:sort select="died/date/yr"/>
        <xsl:sort select="name/snm"/>
        <xsl:sort select="name/fnm"/>
        <person>
          <year><xsl:value-of select="died/date/yr"/></year>
          <snm><xsl:value-of select="name/snm"/></snm>
          <fnm><xsl:value-of select="name/fnm"/></fnm>
        </person>
      </xsl:for-each>
    </doc>
  </xsl:template>
</xsl:stylesheet>
I think it is easy to understand what happens here. We are just
blindly translating the requirements for the Query part into XSLT.
So this component produces the stream:
<doc>
<person><year>123</year><snm>NAME</snm><fnm>NAME</fnm></person>
<person><year>123</year><snm>NAME</snm><fnm>NAME</fnm></person>
....
</doc>
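To make the input side of the Query stage concrete, here is a made-up
input fragment. The element names are inferred only from the XPath
expressions in the stylesheet above (stone/person, died/date/yr,
name/snm, name/fnm); the real cem.dtd is not shown in this letter and
may require more than this, and the three people are invented.

```xml
<?xml version="1.0"?>
<!-- Hypothetical sample; structure guessed from the Query stylesheet's XPaths -->
<cemetery>
  <stone>
    <person>
      <name><fnm>John</fnm><snm>Smith</snm></name>
      <died><date><yr>1850</yr></date></died>
    </person>
    <person>
      <name><fnm>Anna</fnm><snm>Brown</snm></name>
      <died><date><yr>1850</yr></date></died>
    </person>
  </stone>
  <stone>
    <person>
      <name><fnm>Mary</fnm><snm>Jones</snm></name>
      <died><date><yr>1823</yr></date></died>
    </person>
  </stone>
</cemetery>
```

With the three sort keys applied in order, the Query stage would emit
these persons as Jones (1823), then Brown (1850), then Smith (1850),
each flattened into a <person> with year, snm and fnm children.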
Now all we need is to render this 'flat' structure into 'groups' (because
we want the year displayed only once per 'group', as the
Rendering requirement says). I could write this in XSLScript ;-)
But for the sake of conformance, here comes the ugly XSLT call-template.
<side-effect>
In the next version of XSLScript there will be yet another
'meta-construction', a loop compiler, not only 'else' (because I finally
got tired of this loop --> recursion conversion).
</side-effect>
Whatever. This is, again, *typical* recursive XSLT:
- take the first elements from the list by some criterion
- draw them
- recursively call yourself with the rest of the list.
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version="1.0">
  <!--JOB: process cemetery file to make a year catalogue, NOT using keys (2) -->
  <xsl:template match="/doc">
    <html>
      <head>
        <title>Protestant Cemetery Catalogue </title>
      </head>
      <body>
        <xsl:call-template name="draw_year">
          <xsl:with-param name="list" select="/doc/*"/>
        </xsl:call-template>
      </body>
    </html>
  </xsl:template>
  <!-- Draws the first year group of $list, then recurses on the rest.
       Relies on $list being already sorted by year. -->
  <xsl:template name="draw_year">
    <xsl:param name="list"/>
    <xsl:if test="$list">
      <!-- Year of the first person, and how many persons share it -->
      <xsl:variable name="year" select="$list[1]/year"/>
      <xsl:variable name="n_souls" select="count( $list[year = $year ])"/>
      <!-- Everything after the first $n_souls persons -->
      <xsl:variable name="rest" select="$list[ (position() > $n_souls) ]"/>
      <h2><xsl:value-of select="$year"/></h2>
      <ol>
        <xsl:for-each select="$list[ not (position() > $n_souls) ]">
          <li><b><xsl:value-of select="snm"/></b>,
          <xsl:value-of select="fnm"/></li>
        </xsl:for-each>
      </ol>
      <xsl:call-template name="draw_year">
        <xsl:with-param name="list" select="$rest"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
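For comparison, the same rendering can probably be done without the
count-based recursion at all. What follows is only a sketch under the
same assumption as above (the flat <doc> stream is already sorted by
year): a person opens a new group whenever its year differs from that
of the person just before it. It has not been timed, so whether it is
any faster than the recursive version will depend on the processor.

```xml
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version="1.0">
  <!-- Sketch only: groups the sorted, flat /doc stream without recursion -->
  <xsl:template match="/doc">
    <html>
      <head>
        <title>Protestant Cemetery Catalogue</title>
      </head>
      <body>
        <!-- A person starts a group when it is the first person, or when
             its year differs from the year of the person right before it -->
        <xsl:for-each select="person[not(year = preceding-sibling::person[1]/year)]">
          <xsl:variable name="year" select="year"/>
          <h2><xsl:value-of select="$year"/></h2>
          <ol>
            <!-- All persons with this year; they are adjacent and already
                 in name order because the Query stage sorted them -->
            <xsl:for-each select="../person[year = $year]">
              <li><b><xsl:value-of select="snm"/></b>,
              <xsl:value-of select="fnm"/></li>
            </xsl:for-each>
          </ol>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The inner select still scans the whole list once per year, much as
count( $list[year = $year ]) does in the recursive template, so this
mainly trades the recursion and the position() filters for a sibling
comparison.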
Design patterns used.
------------------------------
Ux is about pipes of simple XSLT components. Have you
noticed that there are no HTML tags in the Query
part at all?
Another Ux 'design pattern' is that the Query produces
some kind of 'formatting objects' for the 'renderer'. The renderer
just blindly produces the HTML.
I hope this explains why I'm not using key(). Those
'select from .. dual' constructions could hardly be derived from the
functional specification (the code written above is just a
simple reflection of the functional specification into simple
and general XSLT constructions).
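To show the kind of construction being avoided, here is roughly what a
key()-based renderer over the same flat <doc> stream could look like.
This is a sketch of the common generate-id() grouping idiom, not a copy
of test6.xsl (which is not reproduced in this letter); the key name
by-year is made up for the illustration.

```xml
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version="1.0">
  <!-- Hypothetical key: index every person by the value of its year child -->
  <xsl:key name="by-year" match="person" use="year"/>
  <xsl:template match="/doc">
    <html>
      <head>
        <title>Protestant Cemetery Catalogue</title>
      </head>
      <body>
        <!-- Keep only the first person of each year: the one whose id
             equals the id of the first node the key returns for its year -->
        <xsl:for-each select="person[generate-id() = generate-id(key('by-year', year)[1])]">
          <h2><xsl:value-of select="year"/></h2>
          <ol>
            <!-- key() returns the group in document order, which is the
                 order already produced by the Query stage -->
            <xsl:for-each select="key('by-year', year)">
              <li><b><xsl:value-of select="snm"/></b>,
              <xsl:value-of select="fnm"/></li>
            </xsl:for-each>
          </ol>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The generate-id() test in the outer select is the part that does not
fall out of the functional specification; everything else reads much
like the plain version.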
Yes, I have to admit: unless I pollute this with some
'other' ugly constructions, it takes twice as long (maybe
three times as long) as the key()-based solution.
Should I start polluting this 'plain XSLT' thing with
ugly Java hacks, or can we wait 2 minutes instead
of 1 minute (and keep the code supportable by
anybody)?
Rgds.Paul.
PS. I encountered *crazy* jumps in speed on different
boxes and different versions of the VM. On some boxes
SAXON is (significantly) faster than XT (on some 'other'
stylesheets), apparently because Instant SAXON was
compiled with some tool that works well with the MS VM.
What is the tool? Ah, there are at least 3 Java boosters
out there, and some are specifically Windows oriented.
Benchmarking XSLT is hard, I think.
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list