This is the mail archive of the xsl-list@mulberrytech.com mailing list.



RE: MSXML vs. Saxon: different handling of tabs & newlines


> I am observing an interesting difference in the way MSXML and 
> Saxon are treating tabs and newlines in my XML instance when viewing 
> the resulting HTML.

The difference is that with MSXML3 the input you supply to the XSLT
processor is a DOM, and by default MSXML3 strips extra whitespace when it
builds that DOM (i.e. before the tree gets anywhere near the XSLT
processor). I believe it's possible to suppress this. Views vary on whether
Microsoft is conformant in this area, but since you build the DOM through a
proprietary Microsoft API, it's hard to point to a spec that they are
failing to conform to. The end result certainly defeats the intended effect
of the XSLT whitespace rules.
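
To make the effect concrete, here is a minimal Python sketch using the
standard xml.dom.minidom parser, which keeps whitespace-only text nodes.
The helper strip_whitespace_only_text is hypothetical: it only imitates
what a whitespace-stripping DOM build does and is not MSXML's actual code.
(In MSXML itself, the DOMDocument preserveWhiteSpace property, false by
default, is the switch that governs this.)

```python
from xml.dom import minidom

XML_WS = " \t\r\n"  # the four characters XML treats as whitespace


def text_of(node):
    """Concatenate all text-node descendants in document order."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_of(child) for child in node.childNodes)


def strip_whitespace_only_text(node):
    """Remove whitespace-only text nodes in place.

    Hypothetical helper: imitates a DOM builder that strips such nodes
    before the stylesheet ever sees the tree.
    """
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and child.data.strip(XML_WS) == "":
            node.removeChild(child)
        else:
            strip_whitespace_only_text(child)


# Inline elements separated only by a tab and a newline.
SOURCE = "<p><b>tabs</b>\t<i>and</i>\n<i>newlines</i></p>"

preserved = minidom.parseString(SOURCE)
stripped = minidom.parseString(SOURCE)
strip_whitespace_only_text(stripped.documentElement)

print(repr(text_of(preserved.documentElement)))
print(repr(text_of(stripped.documentElement)))
```

With the separators gone, the words run together in the stripped tree, and
no xsl:preserve-space declaration in the stylesheet can bring them back,
because the nodes were discarded before the transformation started.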

Implementing the whitespace-stripping rules is genuinely awkward when you
take input from a DOM: there's a reasonable expectation that the XSLT
processor shouldn't modify the input tree, and doing the stripping on the
fly as you navigate the tree is likely to be very expensive. If you supply
a DOM as input to Saxon, I copy the whole thing into a new data structure
(which is also expensive).
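
The copy-then-strip approach can be sketched in Python as follows. This is
only an illustration of the general idea, not Saxon's implementation: the
Elem class, the copy_stripped function and the strip_names parameter
(standing in for the element names listed in xsl:strip-space) are all
invented for the example, and attributes, comments, CDATA sections and
xml:space are ignored for brevity.

```python
from dataclasses import dataclass, field
from xml.dom import minidom

XML_WS = " \t\r\n"  # the four characters XML treats as whitespace


@dataclass
class Elem:
    """Hypothetical minimal tree node: an element name plus children."""
    name: str
    children: list = field(default_factory=list)  # Elem instances or str


def copy_stripped(dom_elem, strip_names):
    """Copy a DOM element into a new Elem tree.

    Whitespace-only text nodes are omitted when their parent element is
    named in strip_names ("*" matches every element). The input DOM is
    only read, never modified.
    """
    out = Elem(dom_elem.tagName)
    strip_here = "*" in strip_names or dom_elem.tagName in strip_names
    for child in dom_elem.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            out.children.append(copy_stripped(child, strip_names))
        elif child.nodeType == child.TEXT_NODE:
            if strip_here and child.data.strip(XML_WS) == "":
                continue  # dropped from the copy only
            out.children.append(child.data)
    return out


doc = minidom.parseString(
    "<list>\n  <item>a</item>\n  <para><b>x</b> <i>y</i></para>\n</list>"
)
before = doc.documentElement.toxml()

# Strip whitespace-only text under <list>, but not under <para>.
tree = copy_stripped(doc.documentElement, {"list"})
```

The cost is a full traversal and a second copy of the document in memory,
but the stripping decision is made once per node rather than on every
navigation step, and the caller's DOM is left exactly as it was.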

Mike Kay

 


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
