This is the mail archive of the xsl-list@mulberrytech.com mailing list.


Re: document() merge DISTINCT

From: Trevor Nash <tcn at melvaig dot co dot uk>
To: xsl-list at lists dot mulberrytech dot com
Date: Wed, 19 Dec 2001 11:45:58 +0000
Subject: Re: [xsl] document() merge DISTINCT
Organization: Melvaig Software Engineering Limited
References: <001901c1886f$dc982370$0401020a@ssh.intern>
Reply-to: xsl-list at lists dot mulberrytech dot com

>I want to merge these files so that I get a list of all <person> that are in
>any <project> but the person/@id should be unique, that is, no <person>
>element should be listed twice.
>
>In the book 'XSLT' from Doug Tidwell (chapter 7) there is an example that
>works but is using a lot of disk reads and deep recursion.
>It goes like this:
>
>1: build a variable var1 as a white-space separated sorted list of all @id .
>(using <xsl:for-each select="document(...)"..../> )
>2: build a variable var2 of unique @id from  var1 (by recursion);
>3: with var2 call a template that  calls <xsl:for-each select=
>"document(....)"../> for each id in var2 and produces the output.
>
>Is there a better way to do this?
>
Deep recursion and "repeated" calls to document() are not necessarily
a problem: do you say there are a lot of disk reads because you have
observed this, or are you just guessing from looking at the code?  An
XSLT processor should only read each input document once, regardless
of how many times you call document().
Also, many processors are able to turn what looks like deep recursion
into a loop.  This usually works best when the recursive call is the
last thing in the template.
If your input documents are large, you might simply be running into
the problem of not having enough real memory.
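
As a minimal sketch of what "the recursive call is the last thing in
the template" looks like, here is one way to strip duplicates from a
sorted, space-separated list of ids. The template name "unique", its
parameters and the sample list are made up for illustration, and
whether the recursion is actually turned into a loop depends on the
processor.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical template: removes adjacent duplicates from a sorted list.
       Assumes every token in $list is followed by exactly one space. -->
  <xsl:template name="unique">
    <xsl:param name="list"/>
    <xsl:param name="last" select="''"/>
    <xsl:if test="$list != ''">
      <xsl:variable name="head" select="substring-before($list, ' ')"/>
      <!-- Emit the token only if it differs from the previous one. -->
      <xsl:if test="$head != $last">
        <xsl:value-of select="concat($head, ' ')"/>
      </xsl:if>
      <!-- The recursive call is the final instruction, so nothing is left
           to do in this invocation once it returns. -->
      <xsl:call-template name="unique">
        <xsl:with-param name="list" select="substring-after($list, ' ')"/>
        <xsl:with-param name="last" select="$head"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- Illustrative driver with a made-up list of ids. -->
  <xsl:template match="/">
    <xsl:call-template name="unique">
      <xsl:with-param name="list"
          select="concat(normalize-space('p1 p1  p2 p3 p3'), ' ')"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

Run against any input document, this should output "p1 p2 p3 " (with
a trailing space).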

If the recursion is a problem (i.e. you are using a lot of memory,
causing page swaps to disk) then look for 'divide and conquer' in the
archive.  This is a technique for reducing depth of recursion.
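
For illustration only, a rough sketch of the idea: instead of peeling
one item off the list per call, split the node-set in half and recurse
on each half, so the depth grows with log2(n) rather than n. The
template name "process" and the //person/@id input structure are
assumptions based on the question, not taken from the book.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical template: emits the @id of each node in $nodes,
       recursing on halves so the call depth is about log2(count). -->
  <xsl:template name="process">
    <xsl:param name="nodes"/>
    <xsl:variable name="n" select="count($nodes)"/>
    <xsl:choose>
      <!-- Base case: a single node is handled directly. -->
      <xsl:when test="$n = 1">
        <xsl:value-of select="concat($nodes/@id, ' ')"/>
      </xsl:when>
      <!-- Recursive case: split into two halves of roughly equal size. -->
      <xsl:when test="$n &gt; 1">
        <xsl:variable name="half" select="floor($n div 2)"/>
        <xsl:call-template name="process">
          <xsl:with-param name="nodes"
              select="$nodes[position() &lt;= $half]"/>
        </xsl:call-template>
        <xsl:call-template name="process">
          <xsl:with-param name="nodes"
              select="$nodes[position() &gt; $half]"/>
        </xsl:call-template>
      </xsl:when>
      <!-- An empty node-set produces no output. -->
    </xsl:choose>
  </xsl:template>

  <!-- Illustrative driver: assumes <person id="..."> elements in the input. -->
  <xsl:template match="/">
    <xsl:call-template name="process">
      <xsl:with-param name="nodes" select="//person"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

The trade-off is that the call is no longer in tail position, but with
10,000 nodes the depth is around 14 rather than 10,000.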

I do not have the book you quote, so it's hard to say if the algorithm
presented is the 'best' one in your case.  But one suggestion would be
to use the node-set extension function, and use variables containing
lists of nodes rather than a space-delimited string.  Keys may also
help - look up Muenchian grouping.
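
A sketch of how the two suggestions might combine, under several
assumptions: the main input is a hypothetical index document of the
form <projects><file href="..."/></projects>, the processor supports
the EXSLT exsl:node-set() function (other processors offer the same
thing under their own namespace), and the key name is invented. Since
key() only searches the document that contains the context node, the
<person> elements are first copied into one temporary tree and grouped
there.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <!-- Hypothetical key: indexes every <person> by its id attribute. -->
  <xsl:key name="person-by-id" match="person" use="@id"/>

  <xsl:template match="/">
    <!-- Copy all <person> elements from every referenced file into one
         result tree fragment. The /projects/file/@href path is an assumed
         input layout, not something from the original question. -->
    <xsl:variable name="all">
      <xsl:copy-of select="document(/projects/file/@href)//person"/>
    </xsl:variable>

    <people>
      <!-- Muenchian test: keep a person only if it is the first node the
           key returns for its id within the temporary tree. -->
      <xsl:for-each select="exsl:node-set($all)/person
          [generate-id() = generate-id(key('person-by-id', @id)[1])]">
        <xsl:sort select="@id"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </people>
  </xsl:template>
</xsl:stylesheet>
```

This avoids both the string handling and the recursion, at the cost of
holding a copy of every <person> in memory.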

Regards,
Trevor Nash

--
Traditional training & distance learning,
Consultancy by email

Melvaig Software Engineering Limited
voice:     +44 (0) 1445 771 271 
email:     tcn@melvaig.co.uk

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

Follow-Ups:
- Simple search help
  - From: Andrew Welch

References:
- document() merge DISTINCT
  - From: Alex Schuetz
