This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Re: Problems with characters
- From: Tony Graham <Tony dot Graham at sun dot com>
- To: xsl-list at lists dot mulberrytech dot com
- Date: Wed, 20 Feb 2002 14:13:20 +0000
- Subject: Re: [xsl] Problems with characters
- References: <F175zHu2Mcv9jg22HuP0000f144@hotmail.com>
- Reply-to: xsl-list at lists dot mulberrytech dot com
Ragulf Pickaxe wrote at 20 Feb 2002 07:56:04 +0000:
> The left character now depicts what I intended it to show but in the
> homepage it shows as:
> [ ę ] (ę ) LATIN SMALL LETTER E WITH OGONEK
> >>ø instead of ¸
> The left character also suddenly showed the correct character in this mail,
> but the homepage showed this as:
> (It does not show what I see but I see the character like u with same OGONEK
> as the former character or perhaps as greek character (the one used in
> measuring meaning 1e-6)).
> >>å is depicted as å
> This one is the only one showing the correct result all over (also on the
> HTML page).
You are mixing ISO 8859-1 (Latin1) and ISO 8859-13 (Latin7,
a.k.a. Baltic Rim).
See the code pages at http://www.czyborra.com/charsets/iso8859.html
You want to use:
å LATIN SMALL LETTER A WITH RING ABOVE
æ LATIN SMALL LETTER AE
ø LATIN SMALL LETTER O WITH STROKE
Quoting from iso8859-13.txt from czyborra.com:
------------------------------------------------------------
=B8 U+00F8 LATIN SMALL LETTER O WITH STROKE
...
=BF U+00E6 LATIN SMALL LETTER AE
...
=E5 U+00E5 LATIN SMALL LETTER A WITH RING ABOVE
=E6 U+0119 LATIN SMALL LETTER E WITH OGONEK
...
=F8 U+0173 LATIN SMALL LETTER U WITH OGONEK
------------------------------------------------------------
When you say that you see LATIN SMALL LETTER E WITH OGONEK and LATIN
SMALL LETTER U WITH OGONEK, you are (i.e. your software is)
interpreting your text as being in ISO 8859-13. You are seeing the
ISO 8859-13 characters that sit at the code points (=E6 and =F8)
where ISO 8859-1 puts æ and ø. LATIN SMALL LETTER A WITH RING ABOVE
is at =E5 in both encodings, which is why it is the only one that
displays correctly everywhere.
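
The misreading described above can be reproduced in a few lines. This
is only a sketch, assuming Python 3 and its standard iso8859_1 and
iso8859_13 codecs; the variable names are illustrative and are not
taken from the original post.

```python
import unicodedata

# The three letters of interest as they are stored in ISO 8859-1.
raw = "æøå".encode("iso8859_1")  # bytes E6 F8 E5

# Decode the very same bytes under each encoding.
as_latin1 = raw.decode("iso8859_1")
as_latin7 = raw.decode("iso8859_13")

# Print each byte with the character name it gets in each encoding.
for byte, good, bad in zip(raw, as_latin1, as_latin7):
    print(f"={byte:02X}  8859-1: {unicodedata.name(good)}"
          f"  |  8859-13: {unicodedata.name(bad)}")
```

Only the byte =E5 should come out with the same name in both columns.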
When you send mail and most of the rest of the world sees ¸,
CEDILLA, and ¿, INVERTED QUESTION MARK, the characters of interest
have been encoded at their ISO 8859-13 code points (=B8 for ø and =BF
for æ), but we are interpreting those bytes as ISO 8859-1, where the
same code points hold CEDILLA and INVERTED QUESTION MARK. Some aspect
of how you composed your mail managed to map the characters of
interest to their ISO 8859-13 positions but, as someone already
noted, your email didn't indicate its encoding, so our mail agents
interpreted your email as ISO 8859-1 text.
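
The opposite direction, which is what the recipients of the mail
experienced, can be sketched the same way. Again this assumes Python 3
and its standard codecs, and the names sent and seen are hypothetical.

```python
# What the sender's software appears to have produced: the letters of
# interest encoded at their ISO 8859-13 positions.
text = "æøå"
sent = text.encode("iso8859_13")  # bytes BF B8 E5

# What a mail agent shows when no encoding is declared and it falls
# back to ISO 8859-1.
seen = sent.decode("iso8859_1")

print(sent.hex(" ").upper())
print(seen)
```

Here æ and ø turn into INVERTED QUESTION MARK and CEDILLA, while å
survives because =E5 means the same thing in both encodings.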
This solves the mystery but doesn't solve your problem.
Is it possible that the intermittent correct display is because your
browser is trying to autodetect the encoding and failing? Did the
comment that you deleted contain non-ASCII characters that may have
confused autodetection? You may get consistent ISO 8859-1 results if
you manually select the character set/encoding in your browser.
If you are generating HTML, you can include a META element that
indicates the character set, which may help solve the problem.
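
As an illustration of the META approach, here is a minimal sketch that
builds a page whose declared character set matches the bytes actually
written. It assumes Python 3; build_page and CHARSET are hypothetical
names, and the markup is just one common form of the META element.

```python
CHARSET = "iso-8859-1"  # assumption: the encoding you intend to serve


def build_page(body_text: str, charset: str = CHARSET) -> bytes:
    """Return an HTML page encoded in the charset that it declares."""
    html = (
        "<html>\n<head>\n"
        '<meta http-equiv="Content-Type" '
        f'content="text/html; charset={charset}">\n'
        "<title>Encoding test</title>\n</head>\n"
        f"<body><p>{body_text}</p></body>\n</html>\n"
    )
    # Encode with the same charset named in the META element so that
    # the declaration and the bytes cannot disagree.
    return html.encode(charset)


page = build_page("æøå")
```

The point of the design is that the declaration and the encoding step
share one value, so they cannot drift apart.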
You could also specify UTF-8 as the encoding for the output of your
two stylesheets, but that may cause more problems if the rest of your
software can't really handle UTF-8.
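
To see what "can't really handle UTF-8" tends to look like, the sketch
below (assuming Python 3; the variable names are illustrative) encodes
the same letters as UTF-8 and then reads the bytes back as ISO 8859-1,
as software unaware of UTF-8 might.

```python
# In UTF-8, each of these letters takes two bytes instead of one.
utf8_bytes = "æøå".encode("utf-8")  # C3 A6 C3 B8 C3 A5

# Software that assumes ISO 8859-1 shows every byte as its own character.
misread = utf8_bytes.decode("iso8859_1")

print(len(utf8_bytes), misread)
```

Each letter shows up as a two-character sequence beginning with Ã,
which is a recognisable sign that UTF-8 output is being read as
ISO 8859-1 somewhere downstream.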
Regards,
Tony Graham
------------------------------------------------------------------------
XML Technology Center - Dublin mailto:tony.graham@sun.com
Sun Microsystems Ireland Ltd Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3 x(70)19708
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list