This is the mail archive of the kawa@sourceware.org mailing list for the Kawa project.



Escaping of non-ASCII characters in XML


Hello!

  I find the current functions for creating XML in Kawa practically unusable
for languages with a non-Latin script.

  E.g. <p>Перевірка</p> is automatically escaped to
<p>&#x41F;&#x435;&#x440;&#x435;&#x432;&#x456;&#x440;&#x43A;&#x430;</p>.
All non-ASCII characters are escaped.

  Does anyone really need this kind of escaping? Kawa's internal HTTP server
escapes strings again afterwards anyway, so in this case it is mere duplication.
(The server's escaping is also not quite adequate for Ukrainian and Russian,
but that is a different issue.)

  Would it be possible to add "xp.escapeNonAscii = false;" somewhere in the
gnu.kawa.xml.KNode:toString function (gnu\kawa\xml\KNode.java, after line 32)?
[I believe this should turn the escaping off, but I don't have a JDK at hand
to check.] xp.escapeNonAscii shouldn't affect control characters (these are
encoded anyway), only characters outside ASCII.
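
  To illustrate what such a switch would do, here is a minimal standalone
sketch in Java. It is not Kawa's code: the class XmlEscapeSketch and the
method escapeText are hypothetical names made up for this example, and only
the flag name escapeNonAscii is borrowed from the field mentioned above.
Control-character handling is deliberately left out. The sample string is the
Ukrainian word from the example earlier in this message, written with \u
escapes so the source file stays pure ASCII.

```java
// Hypothetical sketch of the proposed behaviour; NOT Kawa's actual printer code.
final class XmlEscapeSketch {

    /**
     * Escapes XML text content. The markup characters are always escaped.
     * Code points above 0x7F become hexadecimal character references
     * only when escapeNonAscii is true; otherwise they pass through as-is.
     */
    static String escapeText(String text, boolean escapeNonAscii) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);      // handles surrogate pairs too
            i += Character.charCount(cp);
            switch (cp) {
                case '<': out.append("&lt;");  break;
                case '>': out.append("&gt;");  break;
                case '&': out.append("&amp;"); break;
                default:
                    if (cp > 0x7F && escapeNonAscii) {
                        out.append("&#x")
                           .append(Integer.toHexString(cp).toUpperCase())
                           .append(';');
                    } else {
                        out.appendCodePoint(cp);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String word = "\u041F\u0435\u0440\u0435\u0432\u0456\u0440\u043A\u0430";
        // With the flag on, every letter becomes a character reference.
        System.out.println("<p>" + escapeText(word, true) + "</p>");
        // With the flag off, the word is written out unchanged.
        System.out.println("<p>" + escapeText(word, false) + "</p>");
    }
}
```

  With the flag off, the result is only readable if the output is written in
an encoding that can represent these characters (UTF-8, for instance), which
may be the reason the escaping is on by default.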

  If this escaping is desirable for some reason (though I can't think of any),
is it possible to add some variable like *xml-escape-string* to turn this
escaping off?

---
Yours sincerely,
Dmitry Kushnariov

