[Bug localedata/22387] Replace unicode sequences <Uxxxx> for characters inside the ASCII printable range
egmont at gmail dot com
sourceware-bugzilla@sourceware.org
Thu Nov 2 20:40:00 GMT 2017
https://sourceware.org/bugzilla/show_bug.cgi?id=22387
Egmont Koblinger <egmont at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |egmont at gmail dot com
--- Comment #6 from Egmont Koblinger <egmont at gmail dot com> ---
(In reply to Andreas Schwab from comment #5)
> See
> <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07_03>
> for the full rules.
Bullet point 2 here says "Within a string, the double-quote character, the
escape character, and the right angle bracket character shall be escaped [...]"
Why not the left angle bracket too? Otherwise you can't tell for sure whether
"<U+0020>" stands for a space, or for literal
lessthan-you-plus-oh-oh-two-oh-greaterthan.
I think it doesn't hurt to stay on the safe side with special characters, e.g.
escape the comma, semicolon, less-than, greater-than, backslash, and whatever
the escape character is (typically overridden to slash in locale files),
everywhere.
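
To make the suggestion concrete, here is a rough Python sketch of such a
conservative replacement. The name simplify_ascii and the KEEP_ESCAPED set are
made up for illustration (the set is the list above plus the double quote from
the POSIX rule); none of this is taken from glibc or from any existing
conversion script, and only the four-digit <Uxxxx> form is handled.

```python
import re

# Assumption: characters that stay in <Uxxxx> form to be on the safe side.
# The list follows this comment, plus the double quote from the POSIX rule.
KEEP_ESCAPED = set(',;<>\\/"')

# Matches the four-digit <Uxxxx> form only.
UXXXX = re.compile(r"<U([0-9A-Fa-f]{4})>")


def simplify_ascii(line):
    """Replace <Uxxxx> with the literal character when it is printable ASCII
    and not in KEEP_ESCAPED; leave every other sequence untouched."""
    def repl(match):
        code = int(match.group(1), 16)
        char = chr(code)
        if 0x20 <= code <= 0x7E and char not in KEEP_ESCAPED:
            return char
        return match.group(0)

    return UXXXX.sub(repl, line)
```

With this set, "<U0041><U003C><U0042>" becomes "A<U003C>B": the letters are
written literally, while the less-than sign stays escaped and so cannot be
mistaken for the start of a symbolic name.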
---
On the other hand, what about non-ASCII characters? Are they allowed as raw
UTF-8, or do they still need to be escaped? Allowing raw UTF-8, such as a
weekday name of "hétfő" rather than "h<U00E9>tf<U0151>", would greatly improve
the readability of the file.
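
For comparison, the mapping between the two spellings is mechanical. The
following Python sketch (to_raw is a hypothetical helper name, and only the
four-digit <Uxxxx> form is handled) decodes the escaped form into the raw
string; whether localedef would accept the raw form is exactly the open
question here.

```python
import re

# Matches the four-digit <Uxxxx> form only; longer forms are left untouched.
UXXXX = re.compile(r"<U([0-9A-Fa-f]{4})>")


def to_raw(text):
    """Decode every <Uxxxx> sequence into the character it names."""
    return UXXXX.sub(lambda m: chr(int(m.group(1), 16)), text)


print(to_raw("h<U00E9>tf<U0151>"))  # prints: hétfő
```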
--
You are receiving this mail because:
You are on the CC list for the bug.