This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Re: document() merge DISTINCT
- From: Trevor Nash <tcn at melvaig dot co dot uk>
- To: xsl-list at lists dot mulberrytech dot com
- Date: Wed, 19 Dec 2001 11:45:58 +0000
- Subject: Re: [xsl] document() merge DISTINCT
- Organization: Melvaig Software Engineering Limited
- References: <001901c1886f$dc982370$0401020a@ssh.intern>
- Reply-to: xsl-list at lists dot mulberrytech dot com
>I want to merge these files so that I get a list of all <person> that are in
>any <project> but the person/@id should be unique, that is, no <person>
>element should be listed twice.
>
>In the book 'XSLT' by Doug Tidwell (chapter 7) there is an example that
>works but is using a lot of disk reads and deep recursion.
>It goes like this:
>
>1: build a variable var1 as a white-space separated sorted list of all @id .
>(using <xsl:for-each select="document(...)"..../> )
>2: build a variable var2 of unique @id from var1 (by recursion);
>3: with var2 call a template that calls <xsl:for-each select=
>"document(....)"../> for each id in var2 and produces the output.
>
>Is there a better way to do this?
>
Deep recursion and "repeated" calls to document() are not necessarily
a problem: do you say there are a lot of disk reads because you have
observed this, or are you just guessing from looking at the code? An
XSLT processor should only read each input document once, regardless
of how many times you call document().
Also, many processors are able to turn what looks like deep recursion
into a loop. This usually works best when the recursive call is the
last thing in the template.
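
As a rough illustration of what "the recursive call is the last thing in
the template" looks like, here is a minimal sketch of step 2 from the
question: removing duplicates from a sorted, space-separated id list.
The template name unique-ids, its parameter names and the sample ids are
invented for this example and are not taken from the book. Whether a
given processor actually turns this into a loop depends on the
implementation.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: walks a sorted, space-separated id list and
       writes each id only when it differs from the one before it. -->
  <xsl:template name="unique-ids">
    <xsl:param name="ids"/>
    <xsl:param name="previous" select="''"/>
    <xsl:if test="normalize-space($ids)">
      <!-- Normalise and add a trailing space so substring-before
           always finds a separator. -->
      <xsl:variable name="list"
          select="concat(normalize-space($ids), ' ')"/>
      <xsl:variable name="first" select="substring-before($list, ' ')"/>
      <xsl:if test="$first != $previous">
        <xsl:value-of select="concat($first, ' ')"/>
      </xsl:if>
      <!-- Nothing follows this call, so a processor that optimises
           tail calls can reuse the current stack frame. -->
      <xsl:call-template name="unique-ids">
        <xsl:with-param name="ids" select="substring-after($list, ' ')"/>
        <xsl:with-param name="previous" select="$first"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- Demonstration with an assumed, already sorted list. -->
  <xsl:template match="/">
    <xsl:call-template name="unique-ids">
      <xsl:with-param name="ids" select="'a1 a1 b2 c3 c3'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

Run against any input document, this writes "a1 b2 c3 " (with a trailing
space). The output is produced before the recursive call rather than
after it, which is what keeps the call in tail position.
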
If your input documents are large, you might simply be running into
the problem of not having enough real memory.
If the recursion is a problem (i.e. you are using a lot of memory,
causing page swaps to disk) then look for 'divide and conquer' in the
archive. This is a technique for reducing depth of recursion.
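
The idea behind 'divide and conquer' can be sketched as follows: instead
of handling one item and recursing on the remaining n-1, split the list
in half and recurse on each half, so the depth grows with log2(n) rather
than n. The template name join-ids and the use of //person as input are
assumptions made for this example. A plain xsl:for-each would do this
particular job just as well; the point is only the shape of the
recursion.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: writes the @id of every node in $nodes,
       separated by spaces, splitting the node-set in half each time. -->
  <xsl:template name="join-ids">
    <xsl:param name="nodes"/>
    <xsl:variable name="n" select="count($nodes)"/>
    <xsl:choose>
      <xsl:when test="$n = 1">
        <xsl:value-of select="concat($nodes/@id, ' ')"/>
      </xsl:when>
      <xsl:when test="$n &gt; 1">
        <xsl:variable name="half" select="floor($n div 2)"/>
        <!-- First half of the node-set, in document order. -->
        <xsl:call-template name="join-ids">
          <xsl:with-param name="nodes"
              select="$nodes[position() &lt;= $half]"/>
        </xsl:call-template>
        <!-- Second half of the node-set. -->
        <xsl:call-template name="join-ids">
          <xsl:with-param name="nodes"
              select="$nodes[position() &gt; $half]"/>
        </xsl:call-template>
      </xsl:when>
      <!-- An empty node-set produces no output. -->
    </xsl:choose>
  </xsl:template>

  <!-- Assumed input: any document containing <person id="..."/>. -->
  <xsl:template match="/">
    <xsl:call-template name="join-ids">
      <xsl:with-param name="nodes" select="//person"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

With 1000 person elements the calls nest about 10 deep, where the
one-at-a-time version would nest 1000 deep.
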
I do not have the book you quote, so it's hard to say if the algorithm
presented is the 'best' one in your case. But one suggestion would be
to use the node-set extension function, and use variables containing
lists of nodes rather than a space-delimited string. Keys may also
help - look up Muenchian grouping.
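
A minimal sketch of how these two suggestions might combine is given
below. It makes several assumptions that are not in the original
question: the main input is a hypothetical index document of the form
<files><file href="..."/></files>, the processor supports the EXSLT
exsl:node-set() function (other processors provide an equivalent under
a different namespace), and the key name person-by-id and the <people>
wrapper element are invented for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <!-- Index every <person> by its id. A key is evaluated against the
       document that contains the context node. -->
  <xsl:key name="person-by-id" match="person" use="@id"/>

  <xsl:template match="/">
    <!-- Copy every person from every listed file into one temporary
         tree, so that a single key lookup covers all of them. -->
    <xsl:variable name="all">
      <xsl:copy-of select="document(/files/file/@href)//person"/>
    </xsl:variable>

    <people>
      <!-- Make the temporary tree the context so that key() searches
           it rather than the index document. -->
      <xsl:for-each select="exsl:node-set($all)">
        <!-- Muenchian test: keep a person only if it is the first
             node the key returns for its id. -->
        <xsl:for-each select="person[generate-id() =
            generate-id(key('person-by-id', @id)[1])]">
          <xsl:sort select="@id"/>
          <xsl:copy-of select="."/>
        </xsl:for-each>
      </xsl:for-each>
    </people>
  </xsl:template>
</xsl:stylesheet>
```

The copy into a temporary tree is needed because a key only looks inside
one document at a time. The cost is that all the person elements are
held in memory twice.
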
Regards,
Trevor Nash
--
Traditional training & distance learning,
Consultancy by email
Melvaig Software Engineering Limited
voice: +44 (0) 1445 771 271
email: tcn@melvaig.co.uk
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list