This is the mail archive of the xsl-list@mulberrytech.com mailing list.
RE: Un-escape and re-transform
- To: <xsl-list at lists dot mulberrytech dot com>
- Subject: RE: [xsl] Un-escape and re-transform
- From: "Robert C. Lyons" <boblyons at unidex dot com>
- Date: Tue, 10 Apr 2001 10:47:02 -0400
- Cc: <bas dot alberts at group2000 dot nl>
- Reply-To: xsl-list at lists dot mulberrytech dot com
Bas writes:
> My Content Provider delivers XML files with partially escaped HTML tags,
> for example:
> <content>
> <web>
> &lt;P>This is text.&lt;/P>
> &lt;P>This is more text.&lt;/P>
> </web>
> </content>
>
> My quest is to replace the "&lt;" by the un-escaped "<" character,
> and then redo the XSLT for that <P>...</P> bit.
Bas,
I would beg the Content Provider to embed well-formed
HTML (or XHTML) directly in the XML documents, rather
than HTML in which the markup is escaped.
A few weeks ago, we had the exact same problem.
We were lucky, since the sender of the XML data was
willing to embed well-formed HTML in the XML document.
I hope that you are as lucky.
If not, then perhaps you could use the following XSLT
stylesheet to unescape the markup:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:output method="xml" indent="yes"/>

  <!-- Write text nodes without output escaping, so that an
       escaped "&lt;" in the input is emitted as a literal "<". -->
  <xsl:template match="text()">
    <xsl:value-of disable-output-escaping="yes" select="."/>
  </xsl:template>

  <!-- Identity transformation: copy everything else unchanged.
       The low priority lets the text() template above win. -->
  <xsl:template priority="-1"
                match="@* | * | text() | processing-instruction() | comment()">
    <xsl:copy>
      <xsl:apply-templates
          select="@* | * | text() | processing-instruction() | comment()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
The problem with this approach is that it
will unescape markup characters that are
not really markup. For example:
<content>
<web>
&lt;P>C'est dommage. :-&lt; &lt;/P>
</web>
</content>
If there's any chance that the escaped
HTML will contain markup characters that are
not really markup, then I think you'll need
to write a more sophisticated unescape
algorithm.
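As a rough illustration of what "more sophisticated" might mean,
here is a minimal sketch in Python rather than XSLT, since XPath 1.0
has no regular-expression functions. It assumes the text has already
been read by an XML parser (so "&lt;" has been decoded to "<"), and
it treats only substrings that look like tags as markup. The names
selective_unescape and TAG are hypothetical, and the pattern is a
guess at what "looks like a tag"; it does not check that tags are
balanced or that attributes are quoted, so the result is not
guaranteed to be well-formed.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical heuristic for "looks like an HTML tag":
# <P>, </P>, <BR/>, <A HREF="...">. Adjust to the real data.
TAG = re.compile(r"</?[A-Za-z][A-Za-z0-9]*(?:\s[^<>]*)?/?>")


def selective_unescape(text):
    """Keep tag-like substrings as markup; re-escape everything else."""
    parts = []
    pos = 0
    for match in TAG.finditer(text):
        # Text between tags: stray "<", ">" and "&" stay escaped.
        parts.append(escape(text[pos:match.start()]))
        # Tag-like substring: emitted as real markup.
        parts.append(match.group(0))
        pos = match.end()
    parts.append(escape(text[pos:]))
    return "".join(parts)


if __name__ == "__main__":
    # The problem case from this message, as decoded by an XML parser.
    print(selective_unescape("<P>C'est dommage. :-< </P>"))
    # prints: <P>C'est dommage. :-&lt; </P>
```

The "<" in the smiley is followed by a space rather than a letter,
so it does not match the tag pattern and stays escaped.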
Hope this helps.
Bob
<sig name = 'Bob Lyons'
title = 'B2B Integration Consultant'
company = 'Unidex, Inc.'
phone = '+1-732-975-9877'
email = 'boblyons@unidex.com'
url = 'http://www.unidex.com/'
product = 'XML Convert: transforms flat files to XML and vice versa' />
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list