This is the mail archive of the
xsl-list@mulberrytech.com
mailing list .
Divide and Conquer implementations of str-foldl and str-map (Was: Re: big recursive function)
- From: Dimitre Novatchev <dnovatchev at yahoo dot com>
- To: xsl-list at lists dot mulberrytech dot com
- Date: Wed, 28 Nov 2001 08:23:13 -0800 (PST)
- Subject: [xsl] Divide and Conquer implementations of str-foldl and str-map (Was: Re: big recursive function)
- Reply-to: xsl-list at lists dot mulberrytech dot com
> > I haven't seen yet different way to process characters inside a node
> > text (except making an extension function but it's cheating :-). It looks
> > like the right way to do things from an XSLT point of view.
>
> using a divide and conquer method (see posts of dimitre's) should change
> the recursion depth from being O(n) to O(log n) (at the cost of giving
> up tail recursion) so on a system that doesn't implement tail recursion
> elimination this should be better (and may well be faster anyway).
Here's what I think of as a generic way to solve the majority of problems with deep
recursion when processing long lists of characters (strings).
In my previous posts I already introduced a very general list-folding function
foldl, and its string counterpart -- str-foldl.
I mentioned (I think) how the map() function can be defined using the foldl
function:
map :: (a -> b) -> [a] -> [b]
map f = foldl ((:).f) []
The map function takes a function and a list as arguments and returns a new list,
having the same number of elements as the list-argument, every element of which is
the result of applying the function f() on the corresponding element of the
list-argument.
Another, more understandable definition of map is:
map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = (f(x) : map f xs)
For example,
map (+1) [1,2,3] = [2,3,4]
Using foldl and map we may be able to solve 99% of all problems we have with lists.
Using str-foldl and str-map we may be able to solve 99% of all problems we have with
strings.
The only problem could be that when processing very big lists/strings on XSLT
processors that do not optimize tail-recursion with iteration, there would be an
exhaustive increase of the call stack leading to crash.
The solution in such cases is to use ***divide and conquer*** (DVC) implementations
of foldl and map or, respectively, of str-foldl and str-map.
Here's this implementation:
dvc-str-foldl.xsl:
-----------------
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template name="dvc-str-foldl">
<xsl:param name="pFunc" select="/.."/>
<xsl:param name="pA0"/>
<xsl:param name="pStr"/>
<xsl:choose>
<xsl:when test="not($pStr)">
<xsl:copy-of select="$pA0"/>
</xsl:when>
<xsl:otherwise>
<xsl:variable name="vcntList" select="string-length($pStr)"/>
<xsl:choose>
<xsl:when test="$vcntList = 1">
<xsl:apply-templates select="$pFunc[1]">
<xsl:with-param name="arg0" select="$pFunc[position() > 1]"/>
<xsl:with-param name="arg1" select="$pA0"/>
<xsl:with-param name="arg2" select="substring($pStr,1,1)"/>
</xsl:apply-templates>
</xsl:when>
<xsl:otherwise>
<xsl:variable name="vHalfLen"
select="floor($vcntList div 2)"/>
<xsl:variable name="vFunResult1">
<xsl:call-template name="dvc-str-foldl">
<xsl:with-param name="pFunc" select="$pFunc"/>
<xsl:with-param name="pA0" select="$pA0"/>
<xsl:with-param name="pStr" select="substring($pStr,
1,
$vHalfLen
)"/>
</xsl:call-template>
</xsl:variable>
<xsl:call-template name="dvc-str-foldl">
<xsl:with-param name="pFunc" select="$pFunc"/>
<xsl:with-param name="pStr"
select="substring($pStr,$vHalfLen+1)"
/>
<xsl:with-param name="pA0" select="$vFunResult1"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
str-dvc-map.xsl:
---------------
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:map-foldl-func="map-foldl-func"
exclude-result-prefixes="xsl map-foldl-func"
>
<xsl:import href="dvc-str-foldl.xsl"/>
<map-foldl-func:map-foldl-func/>
<xsl:template name="str-dvc-map">
<xsl:param name="pFun" select="/.."/>
<xsl:param name="pStr" select="/.."/>
<xsl:variable name="vFoldlFun" select="document('')/*/map-foldl-func:*[1]"/>
<xsl:variable name="vFuncComposition">
<xsl:copy-of select="$vFoldlFun"/>
<xsl:copy-of select="$pFun"/>
</xsl:variable>
<xsl:variable name="vFComposition"
select="msxsl:node-set($vFuncComposition)/*"/>
<xsl:call-template name="dvc-str-foldl">
<xsl:with-param name="pFunc" select="$vFComposition"/>
<xsl:with-param name="pStr" select="$pStr"/>
<xsl:with-param name="pA0" select="/.."/>
</xsl:call-template>
</xsl:template>
<xsl:template name="mapL" match="*[namespace-uri() = 'map-foldl-func']">
<xsl:param name="arg0" select="/.."/>
<xsl:param name="arg1" select="/.."/>
<xsl:param name="arg2" select="/.."/>
<!-- $arg1 must be A0 -->
<xsl:copy-of select="$arg1"/>
<xsl:apply-templates select="$arg0[1]">
<xsl:with-param name="arg1" select="substring($arg2,1,1)"/>
</xsl:apply-templates>
</xsl:template>
</xsl:stylesheet>
Now, to solve the problem of indenting every new line, we simply map every character
to itself, except for the NL character, which we map to NL + "' '" (new line plus
4 spaces):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:testmap="testmap"
exclude-result-prefixes="xsl testmap"
>
<xsl:import href="str-dvc-map.xsl"/>
<testmap:testmap/>
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="vTestMap" select="document('')/*/testmap:*[1]"/>
<xsl:call-template name="str-dvc-map">
<xsl:with-param name="pFun" select="$vTestMap"/>
<xsl:with-param name="pStr" select="string(/*)"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="indentNL" match="*[namespace-uri() = 'testmap']">
<xsl:param name="arg1"/>
<xsl:value-of select="$arg1"/>
<xsl:if test="$arg1=' '">
<xsl:value-of select="' '"/>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
When the above transformation is applied on the following xml source:
<t>
line0
line1
line2
line3
line4
line5
line6
line7
line8
line9
line10
</t>
it produced this output:
line0
line1
line2
line3
line4
line5
line6
line7
line8
line9
line10
I also tried it on a 10000 line input and it worked OK (but a little bit slow)
without crashing.
Of course, a more efficient solution will be a special DVC algorithm, using the XSLT
functions contains(), substring-before() and substring-after().
The advantages of foldl and map is that they've been already implemented and can be
readily used to solve multitude of problems, without requiring difficult and
error-prone additional programming.
Cheers,
Dimitre Novatchev.
P.S. The best solution to the original problem is just to enclose the text as a
child of a "blockquote" element... :o)
__________________________________________________
Do You Yahoo!?
Yahoo! GeoCities - quick and easy web site hosting, just $8.95/month.
http://geocities.yahoo.com/ps/info1
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list