This is the mail archive of the
kawa@sourceware.org
mailing list for the Kawa project.
Escaping of non-ASCII characters in XML
- From: Дмитрий <dmymd at yandex dot ru>
- To: kawa at sources dot redhat dot com
- Date: Mon, 23 Jul 2012 15:57:43 +0400
- Subject: Escaping of non-ASCII characters in XML
Hello!
I find the current functions for creating XML in Kawa practically unusable
for languages with a non-Latin script.
E.g. <p>Перевірка</p> is automatically escaped to a sequence of numeric
character references (&#...;), one per letter.
All non-ASCII characters are escaped.
Does anyone really need this kind of escaping? Kawa's internal HTTP server
escapes strings after this anyway, so in this case it's a mere duplication.
(The server escaping is also not quite adequate for Ukrainian and Russian,
but this is a different issue.)
Is it possible to add "xp.escapeNonAscii = false;" somewhere in the
gnu.kawa.xml.KNode:toString function (gnu\kawa\xml\KNode.java, after line 32)?
[I believe this should turn the escaping off, but I don't have a JDK at hand
to check.] Setting xp.escapeNonAscii to false shouldn't affect control
characters (these are encoded anyway), only characters outside ASCII.
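
To make the intended behaviour concrete, here is a minimal, self-contained
Java sketch. It is not Kawa's code: the class EscapeSketch and its escape
method are hypothetical, and the hexadecimal &#x...; output form is an
assumption. It only shows a flag that leaves non-ASCII characters alone while
control characters are still encoded.

```java
public class EscapeSketch {
    // Hypothetical stand-in for a printer with an escapeNonAscii flag.
    // Markup characters such as '<' are passed through untouched here,
    // because the input is assumed to be already-serialized markup.
    static String escape(String text, boolean escapeNonAscii) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            // Control characters are always encoded, whatever the flag says.
            boolean control =
                (cp < 0x20 && cp != '\t' && cp != '\n' && cp != '\r') || cp == 0x7F;
            boolean nonAscii = cp > 0x7F;
            if (control || (nonAscii && escapeNonAscii)) {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Ukrainian word from the example above, written with Unicode
        // escapes so the source file itself stays ASCII.
        String s = "<p>\u041F\u0435\u0440\u0435\u0432\u0456\u0440\u043A\u0430</p>";
        System.out.println(escape(s, true));   // every Cyrillic letter escaped
        System.out.println(escape(s, false));  // text left as it is
    }
}
```

With the flag set to false the paragraph comes out unchanged, which is the
result I would like KNode:toString to produce.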
If this escaping is desirable for some reason (though I can't think of any),
is it possible to add some variable like *xml-escape-string* to turn this
escaping off?
---
Yours sincerely,
Dmitry Kushnariov