Bug 22387 - Replace unicode sequences <Uxxxx> for characters inside the ASCII printable range
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: localedata
Version: unspecified
Importance: P2 enhancement
Target Milestone: 2.27
Assignee: Mike FABIAN
Reported: 2017-11-02 13:59 IST by Claude Paroz
Modified: 2017-11-16 23:39 IST

Last reconfirmed: 2017-11-02 00:00:00
fweimer: security-


Attachments
Partial patch for opinions (5.63 KB, patch)
2017-11-02 14:15 IST, Claude Paroz
Complete patch (194.92 KB, patch)
2017-11-07 15:12 IST, Claude Paroz

Description Claude Paroz 2017-11-02 13:59:59 IST
Quoting Mike Fabian from #22382:

>> As a side note, I see fewer Unicode sequence codes like <U0063> in locale
>> files. Do you have a new policy in place?

>We agreed that it is OK to use ASCII directly, so one has to use <U....>
>only for stuff which is not ASCII.

>> Would you like patches for more
>> global replacements for all files?

>I think yes. When we started to use more ASCII a while ago, we did not
>do global replacements and changed it only in the files we touched anyway
>to see whether it would cause any problems. As far as I know we did not
>encounter any problems so far, so it seems OK to do it globally.
Comment 1 Claude Paroz 2017-11-02 14:15:41 IST
Created attachment 10570
Partial patch for opinions

Here is a first draft of what the patch could be. Is this the right direction? Do you want one big patch, or something else?
Comment 2 Mike FABIAN 2017-11-02 17:00:42 IST
(In reply to Claude Paroz from comment #1)
> Created attachment 10570
> Partial patch for opinions
> 
> Hereby a first draft of what the patch could be. Is this the right
> direction? 

Yes, but I should have been clearer that even some ASCII characters are
not allowed, for example % is usually the comment character, so it
cannot be used like this:

punct   !;";#;$;%;&;';(;);*;+;,;-;.;<U002F>;:;;;<;=;>;?;@;[;\;];^;_;`;{;|;};~

And / is usually the line continuation character.

> Do you want one big patch or anything else?

Whatever is easiest, I could also write a script to do it ...
Comment 3 Claude Paroz 2017-11-02 17:05:58 IST
OK, I'll special case '%' and '/'.

I'm also using a script for most of the changes. A quick manual review then lets me, for example, delete comments that merely duplicate the (now unobfuscated) value.
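
For illustration, the replacement rule discussed so far could be sketched as below. This is not the script used for the actual patch. The function name `unescape_ascii` and the `KEEP_ESCAPED` set are hypothetical; `%` and `/` are kept symbolic because of comment 2, while `"`, `<` and `>` are an extra conservative assumption. The sketch does not distinguish text inside strings from text outside them.

```python
import re

# Characters left in <Uxxxx> form. '%' and '/' come from the discussion
# above; '"', '<' and '>' are a conservative assumption of this sketch.
KEEP_ESCAPED = set('%/"<>')

# Matches a four-digit symbolic sequence such as <U0063>.
_SEQ = re.compile(r"<U([0-9A-Fa-f]{4})>")


def unescape_ascii(line, keep=KEEP_ESCAPED):
    """Replace <Uxxxx> sequences for printable ASCII with the character."""
    def repl(match):
        code = int(match.group(1), 16)
        ch = chr(code)
        # Only touch the printable ASCII range 0x20..0x7E.
        if 0x20 <= code <= 0x7E and ch not in keep:
            return ch
        return match.group(0)
    return _SEQ.sub(repl, line)


print(unescape_ascii('abday "<U0053><U0075><U006E>"'))  # abday "Sun"
```

Sequences outside the printable ASCII range, such as `<U00E9>`, pass through unchanged.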
Comment 4 Claude Paroz 2017-11-02 17:21:08 IST
But hopefully we can still use '%' and '/' when they are inside a string (see for example the d_fmt   "%d/%m/%y" line in the current an_ES locale).
Comment 5 Andreas Schwab 2017-11-02 17:38:15 IST
The escape character is also special inside strings.

See <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07_03> for the full rules.
Comment 6 Egmont Koblinger 2017-11-02 20:40:12 IST
(In reply to Andreas Schwab from comment #5)

> See <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07_03>
> for the full rules.

Bullet point 2 here says "Within a string, the double-quote character, the escape character, and the right angle bracket character shall be escaped [...]"

Why not the left angle bracket too? Otherwise you can't tell for sure whether "<U+0020>" stands for a space, or for literal lessthan-you-plus-oh-oh-two-oh-greaterthan.

I think it doesn't hurt to remain a bit safer with special characters, e.g. to escape the comma, semicolon, less-than, greater-than, backslash, and whatever the escape character is (typically overridden to slash in locale files), everywhere.

---

On the other hand, what about non-ASCII characters? Are they allowed as raw UTF-8, or do they still need to be escaped? Allowing raw UTF-8, such as a weekday name of "hétfő" rather than "h<U00E9>tf<U0151>" would highly improve readability of the file.
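
To make the readability difference concrete, here is a minimal sketch that converts raw text into the symbolic form. The helper name `to_symbolic` is hypothetical, and using eight hex digits for code points above U+FFFF is an assumption of this sketch.

```python
def to_symbolic(text):
    """Keep ASCII as-is and write every other character as <Uxxxx>."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append("<U%04X>" % cp)
        else:
            # Assumption: wider code points use eight hex digits.
            out.append("<U%08X>" % cp)
    return "".join(out)


# "hétfő" written with escapes so the example does not depend on the
# encoding of this page.
print(to_symbolic("h\u00e9tf\u0151"))  # h<U00E9>tf<U0151>
```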
Comment 7 keld@keldix.com 2017-11-02 23:34:46 IST
I think we should not do this, as it would make locales unusable
with ebcdic encodings. I am also unsure how it will work with utf-16.

I propose you use better mnemonics for the ascii range, such as <a> for a,
etc.  That is, use the mnemonics defined in the POSIX standard for the ascii range.

best regards
keld

Comment 8 Carlos O'Donell 2017-11-03 03:12:25 IST
(In reply to keld@keldix.com from comment #7)
> I think we should not do this, as it would make locales unusable
> with ebcdic encodings. I am also unsure how it will work with utf-16.

Please provide a justification for this requirement to support EBCDIC and UTF-16, including systems that would be impacted today by this change.

I spoke with Ulrich Drepper directly, and he did point out that the design idea behind using <Uxxxx> sequences was indeed to support the locales on systems that had other encodings like EBCDIC, but with the rise of UTF-8 as the de facto standard, no such systems have really materialized.
 
> I propose you use better mnemonics for the ascii range, such as <a> for a,
> etc.  That is, use the mnemonics defined in the POSIX standard for the ascii
> range.

I disagree strongly with this, why use '<a>' instead of 'a'? Please provide strong rationale for why we should keep using the <Uxxxx> format.
Comment 9 Andreas Schwab 2017-11-03 06:49:03 IST
(In reply to Egmont Koblinger from comment #6)
> Bullet point 2 here says "Within a string, the double-quote character, the
> escape character, and the right angle bracket character shall be escaped
> [...]"
> 
> Why not the left angle bracket too?

I think "right" is a typo here.  It doesn't really make sense otherwise.
Comment 10 Egmont Koblinger 2017-11-03 09:52:28 IST
(In reply to Andreas Schwab from comment #9)

> I think "right" is a typo here.  It doesn't really make sense otherwise.

Or they meant "right angle" (i.e. 90 degrees) bracket :-D
Comment 11 Egmont Koblinger 2017-11-03 09:56:16 IST
I don't understand the EBCDIC worries at all.

These locale definition files are in ASCII. If you interpret these same files as EBCDIC, section names and property names don't make any sense, and neither do encoded characters such as "<U0020>": the bytes no longer read as less-than, uppercase U, digits, and greater-than.

Then, if you iconv the file, the resulting <U0020> and friends still define Unicode codepoints and not EBCDIC ones.

So, in order to use these files in an EBCDIC environment, they need to be converted on two different levels.

This does not become any harder or any more complicated by allowing plain ASCII characters.
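
The point about the symbolic sequence itself being made of ASCII characters can be checked with a small sketch. It assumes Python's `cp037` codec as one representative EBCDIC variant; the helper name `as_ebcdic` is hypothetical. It only compares byte values and says nothing about how localedef applies a charmap.

```python
def as_ebcdic(text):
    """Encode text with Python's cp037 codec (one EBCDIC variant)."""
    return text.encode("cp037")


# Print the ASCII and EBCDIC byte values side by side, in hex.
for sample in ("a", "<U0020>"):
    print(sample, sample.encode("ascii").hex(), as_ebcdic(sample).hex())
```

Both the plain letter and the `<U0020>` sequence come out as different bytes, so a file using either notation has to be converted before an EBCDIC tool can read it.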
Comment 12 Mike FABIAN 2017-11-03 15:43:47 IST
(In reply to Andreas Schwab from comment #5)
> The escape character is also special inside strings.
> 
> See <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07_03>
> for the full rules.

To understand what that means in practice, here is an example.
The following source:

d_fmt	"%d/%m/%Y  %% // /% %/ \% \/ \\"

produces this:

[root@taka /]# LC_ALL=tpi_PG.UTF-8 locale -k d_fmt
d_fmt="%d%m%Y  %% / % % \% \ \\"

I.e. the current source for tpi_PG has an error in d_fmt,
instead of 

"%d/%m/%Y" 

it should be

"%d//%m//%Y"
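
The behaviour shown above can be reproduced with a simplified model. It assumes the escape character is `/` and that an escape character simply makes the next character literal. The function name `apply_escape` is hypothetical, and line continuation and comments are not modelled.

```python
def apply_escape(source, escape_char="/"):
    """Drop each escape character and keep the character after it."""
    out = []
    chars = iter(source)
    for ch in chars:
        if ch == escape_char:
            nxt = next(chars, None)
            if nxt is None:
                # A trailing escape character has nothing to protect.
                break
            out.append(nxt)
        else:
            out.append(ch)
    return "".join(out)


# The string from the example above, and the corrected d_fmt.
print(apply_escape(r'%d/%m/%Y  %% // /% %/ \% \/ \\'))
print(apply_escape("%d//%m//%Y"))  # %d/%m/%Y
```

Under this model the first call yields the same string that `locale -k d_fmt` printed above, and doubling the slash gives the intended date format.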
Comment 13 Egmont Koblinger 2017-11-03 15:51:24 IST
Can't a proposed patch be verified by comparing the compiled locale files byte-by-byte?
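
A minimal sketch of such a check, assuming the locales were compiled into two directory trees before and after the patch. The function name `differing_files` and the directory layout are hypothetical.

```python
import filecmp
import os


def differing_files(dir_a, dir_b):
    """Return relative paths that differ or exist on one side only."""
    rel_paths = set()
    for top in (dir_a, dir_b):
        for root, _dirs, files in os.walk(top):
            for name in files:
                full = os.path.join(root, name)
                rel_paths.add(os.path.relpath(full, top))

    diffs = []
    for rel in sorted(rel_paths):
        a = os.path.join(dir_a, rel)
        b = os.path.join(dir_b, rel)
        if not (os.path.isfile(a) and os.path.isfile(b)):
            diffs.append(rel)
        elif not filecmp.cmp(a, b, shallow=False):
            # shallow=False forces a byte-by-byte comparison.
            diffs.append(rel)
    return diffs
```

An empty result means the patch did not change any compiled output. Any path listed points at a locale whose source edit changed its meaning.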
Comment 14 joseph@codesourcery.com 2017-11-03 16:49:06 IST
Furthermore, glibc effectively requires locales to be ASCII-compatible, in 
that plenty of code dealing with strings in glibc directly generates or 
interprets ASCII characters based on string or character constants in the 
glibc code.  (There may be a few variations for a few characters in some 
locales, and it can't be assumed that toupper ('i') or tolower ('I') are 
ASCII-compatible because of Turkish locales.)  Thus we can assume that 
localedef is run in an ASCII-compatible locale.  EBCDIC variants are 
supported by iconv for character set conversions; they are *not* supported 
as locale character sets.
Comment 15 Claude Paroz 2017-11-06 13:23:55 IST
(In reply to Egmont Koblinger from comment #13)
> Can't a proposed patch be verified by comparing the compiled locale files
> byte-by-byte?

Thanks, that was an excellent suggestion that allowed me to spot some errors in my (almost-ready) forthcoming patch.
Comment 16 Claude Paroz 2017-11-06 13:26:03 IST
(In reply to Mike FABIAN from comment #12)
> I.e. the current source for tpi_PG has an error in d_fmt,
> instead of 
> 
> "%d/%m/%Y" 
> 
> it should be
> 
> "%d//%m//%Y"

I reported similar errors in #22403 for locales an_ES, kab_DZ and om_ET.
Comment 17 Claude Paroz 2017-11-07 15:12:13 IST
Created attachment 10577
Complete patch

Here's the complete patch implementing these sequence replacements.

I'm open to splitting the patch into smaller chunks if that makes it easier to review.
Comment 18 keld@keldix.com 2017-11-07 23:54:42 IST
Hi

I am not sure if I can write a very strong case against using ASCII in strings.
I have no practical experience with problems, but I see a number of possible
conflicts. You could also say that because of the design used until now,
where all our locales have been character coding independent, we have
not seen any problems!

I am the editor of ISO 14652 and ISO 30112, and of the 100-page annex in
the POSIX standard that originally introduced the codeset independent locales
for POSIX and thus Linux. 14652 and 30112 are the standards that define many
of the extensions from POSIX that we use in glibc for i18n and l10n.
From an architectural view I would really like us to keep glibc locales
character coding independent, so our locales can be used without change
on all systems that adhere to those standards.

But even if we restrict ourselves to only look at glibc implementations,
using coding dependent locales may cause problems. Not on everyday Linux
systems, where we mostly operate in UTF-8, and sometimes in other coded
character sets, but on other systems.

gcc and glibc are probably the most ported C compiler and C library in the world.
Some of the platforms they have been ported to run in non-ASCII-compatible
environments. I think this includes

    MS Windows, which uses UTF-16, and which now includes an Ubuntu system
    Mac OS/iOS, which use UTF-16, and where gcc/glibc ports exist
    EBCDIC machines, where gcc/glibc ports exist - they run many banking and aviation systems
    Embedded systems where many kinds of character sets are used.
    Older systems in Eastern Asia, still using older Eastern Asia 14-bit character sets.

I do see the need for better looking locales. They would be easier to write and debug.
Therefore I propose that we use the mnemonics defined in ISO 14652/ISO 30112 at least
for the ASCII characters. These were also used in the original locales that I wrote
and Ulrich Drepper used for his initial work for glibc. At some point Ulrich decided
to use Uxxxx mnemonics, which made locales more unreadable. I do agree that using Uxxxx
is a good solution for the characters that are not known to everybody, such as Chinese,
Korean and Japanese characters. This gives a chance to everybody in the world to
work on locales using these characters, which actually in our modern world means
all locales in the world, as we all may use full UTF-8 or the like.

best regards
Keld

Comment 19 keld@keldix.com 2017-11-08 20:00:13 IST
On Fri, Nov 03, 2017 at 03:12:25AM +0000, carlos at redhat dot com wrote:
> > I propose you use better mnemonics for the ascii range, such as <a> for a,
> > etc.  That is, use the mnemonics defined in the POSIX standard for the ascii
> > range.
> 
> I disagree strongly with this, why use '<a>' instead of 'a'? Please provide
> strong rationale for why we should keep using the <Uxxxx> format.

If you use <a> instead of 'a' then the compiled locale would have a well-defined
content when the source encoding and the target encoding differ. This is not
the case when just using 'a', which gives wrong results.

I listed a number of scenarios where you would use different source and
target character encodings in my previous mail. This involves major OSes
like OS X, Microsoft Windows, iOS and embedded systems.

Best regards
keld
Comment 20 keld@keldix.com 2017-11-09 10:19:17 IST
On Fri, Nov 03, 2017 at 09:56:16AM +0000, egmont at gmail dot com wrote:
> --- Comment #11 from Egmont Koblinger <egmont at gmail dot com> ---
> I don't understand the EBCDIC worries at all.
> 
> These locale definition files are in ASCII. If you interpret these same files
> in EBCDIC, section names and property names don't make any sense, and neither
> do encoded characters such as "<U0020>", I mean it's no longer
> less/greater-than, uppercase U and digits.

Yes, all source files should be converted from ASCII to the EBCDIC variant in question.
This is also the case on UTF-16 systems: the source files should be converted
from some sort of ASCII-compatible encoding to UTF-16. Or the other way round, if you
move sources from a non-ASCII-compatible system to an ASCII-compatible system.

This process can be done automatically using e.g. iconv.

> Then, if you iconv the file, the resulting <U0020> and friends still define
> Unicode codepoints and not EBCDIC ones.

No, they are not Unicode (or UCS) codepoints. When you compile the locale into a binary
format, you apply an EBCDIC charmap, and the symbolic <Uxxxx> character names get
encoded according to the EBCDIC charmap given with the localedef -f option.

> So, in order to use these files in an EBCDIC environment, they need to be
> converted on two different levels.

No, only one level of conversion is needed and that can be fully automated.

> This does not become any harder or any more complicated by allowing plain ASCII
> characters.

Well, not so, if you operate in an environment with a source encoding different
from the EBCDIC target encoding, and vice versa.

best regards
Keld
Comment 21 joseph@codesourcery.com 2017-11-09 16:31:17 IST
On Thu, 9 Nov 2017, keld at keldix dot com wrote:

> Yes all source files should be converted from Ascii to the ebcdic in question.
> This is also the case on UTF-16 systems, the source files should be converted
> from some sort of ascii compatible encoding to UTF-16. Or the other way - if you
> move sources from a non ascii-compatible system to an ascii-compatible system.
> 
> This process can be done automatically using eg iconv.

No, it can't be done automatically, without having information somewhere 
about which character set each source file is in (it's entirely possible 
some, e.g. those representing expected output of testcases, are in mixed 
character sets - and in any case represent particular sequences of octets 
that must be preserved because they are to be compared against test output 
in particular locales).

glibc does not make any attempt to support locales that are not more or 
less ASCII compatible, and does not make any attempt to support 16-bit 
bytes (which are not supported by POSIX either) which would be needed for 
UTF-16 to be a valid locale encoding.  We should not pretend that it does, 
any more than we should pretend it supports non-ELF object formats.
Comment 22 cvs-commit@gcc.gnu.org 2017-11-14 08:09:44 IST
The branch, master has been updated
       via  a259f5d388d6195da958b2d147d17c2e2d16b857 (commit)
      from  cae87e64dca14f50da7bbd99085c7f5e413ad0f8 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a259f5d388d6195da958b2d147d17c2e2d16b857

commit a259f5d388d6195da958b2d147d17c2e2d16b857
Author: Claude Paroz <claude@2xlibre.net>
Date:   Thu Nov 2 15:10:42 2017 +0100

    Replaced unicode sequences in the ASCII printable range
    
    	[BZ #22387]
    	* localedata/locales/aa_DJ: Improved readibility by replacing
    	<Uxxxx> sequences in the ASCII printable range by their ASCII
    	character equivalents.
    	* localedata/locales/aa_ER: Likewise.
    	* localedata/locales/aa_ER@saaho: Likewise.
    	* localedata/locales/aa_ET: Likewise.
    	* localedata/locales/af_ZA: Likewise.
    	* localedata/locales/agr_PE: Likewise.
    	* localedata/locales/ak_GH: Likewise.
    	* localedata/locales/am_ET: Likewise.
    	* localedata/locales/anp_IN: Likewise.
    	* localedata/locales/ar_AE: Likewise.
    	* localedata/locales/ar_BH: Likewise.
    	* localedata/locales/ar_DZ: Likewise.
    	* localedata/locales/ar_EG: Likewise.
    	* localedata/locales/ar_IN: Likewise.
    	* localedata/locales/ar_IQ: Likewise.
    	* localedata/locales/ar_JO: Likewise.
    	* localedata/locales/ar_KW: Likewise.
    	* localedata/locales/ar_LB: Likewise.
    	* localedata/locales/ar_LY: Likewise.
    	* localedata/locales/ar_MA: Likewise.
    	* localedata/locales/ar_OM: Likewise.
    	* localedata/locales/ar_QA: Likewise.
    	* localedata/locales/ar_SA: Likewise.
    	* localedata/locales/ar_SD: Likewise.
    	* localedata/locales/ar_SS: Likewise.
    	* localedata/locales/ar_SY: Likewise.
    	* localedata/locales/ar_TN: Likewise.
    	* localedata/locales/ar_YE: Likewise.
    	* localedata/locales/as_IN: Likewise.
    	* localedata/locales/ast_ES: Likewise.
    	* localedata/locales/ayc_PE: Likewise.
    	* localedata/locales/az_AZ: Likewise.
    	* localedata/locales/az_IR: Likewise.
    	* localedata/locales/be_BY: Likewise.
    	* localedata/locales/be_BY@latin: Likewise.
    	* localedata/locales/bem_ZM: Likewise.
    	* localedata/locales/ber_DZ: Likewise.
    	* localedata/locales/ber_MA: Likewise.
    	* localedata/locales/bg_BG: Likewise.
    	* localedata/locales/bhb_IN: Likewise.
    	* localedata/locales/bho_IN: Likewise.
    	* localedata/locales/bi_VU: Likewise.
    	* localedata/locales/bn_BD: Likewise.
    	* localedata/locales/bn_IN: Likewise.
    	* localedata/locales/bo_CN: Likewise.
    	* localedata/locales/bo_IN: Likewise.
    	* localedata/locales/br_FR: Likewise.
    	* localedata/locales/brx_IN: Likewise.
    	* localedata/locales/bs_BA: Likewise.
    	* localedata/locales/byn_ER: Likewise.
    	* localedata/locales/ca_AD: Likewise.
    	* localedata/locales/ca_ES: Likewise.
    	* localedata/locales/ca_FR: Likewise.
    	* localedata/locales/ca_IT: Likewise.
    	* localedata/locales/ce_RU: Likewise.
    	* localedata/locales/chr_US: Likewise.
    	* localedata/locales/cmn_TW: Likewise.
    	* localedata/locales/crh_UA: Likewise.
    	* localedata/locales/cs_CZ: Likewise.
    	* localedata/locales/csb_PL: Likewise.
    	* localedata/locales/cv_RU: Likewise.
    	* localedata/locales/cy_GB: Likewise.
    	* localedata/locales/da_DK: Likewise.
    	* localedata/locales/de_AT: Likewise.
    	* localedata/locales/de_BE: Likewise.
    	* localedata/locales/de_CH: Likewise.
    	* localedata/locales/de_DE: Likewise.
    	* localedata/locales/de_IT: Likewise.
    	* localedata/locales/de_LI: Likewise.
    	* localedata/locales/de_LU: Likewise.
    	* localedata/locales/doi_IN: Likewise.
    	* localedata/locales/dv_MV: Likewise.
    	* localedata/locales/dz_BT: Likewise.
    	* localedata/locales/el_CY: Likewise.
    	* localedata/locales/el_GR: Likewise.
    	* localedata/locales/en_AG: Likewise.
    	* localedata/locales/en_AU: Likewise.
    	* localedata/locales/en_BW: Likewise.
    	* localedata/locales/en_CA: Likewise.
    	* localedata/locales/en_DK: Likewise.
    	* localedata/locales/en_GB: Likewise.
    	* localedata/locales/en_HK: Likewise.
    	* localedata/locales/en_IE: Likewise.
    	* localedata/locales/en_IL: Likewise.
    	* localedata/locales/en_IN: Likewise.
    	* localedata/locales/en_NG: Likewise.
    	* localedata/locales/en_NZ: Likewise.
    	* localedata/locales/en_PH: Likewise.
    	* localedata/locales/en_SG: Likewise.
    	* localedata/locales/en_US: Likewise.
    	* localedata/locales/en_ZA: Likewise.
    	* localedata/locales/en_ZM: Likewise.
    	* localedata/locales/en_ZW: Likewise.
    	* localedata/locales/eo: Likewise.
    	* localedata/locales/es_AR: Likewise.
    	* localedata/locales/es_BO: Likewise.
    	* localedata/locales/es_CL: Likewise.
    	* localedata/locales/es_CO: Likewise.
    	* localedata/locales/es_CR: Likewise.
    	* localedata/locales/es_CU: Likewise.
    	* localedata/locales/es_DO: Likewise.
    	* localedata/locales/es_EC: Likewise.
    	* localedata/locales/es_ES: Likewise.
    	* localedata/locales/es_GT: Likewise.
    	* localedata/locales/es_HN: Likewise.
    	* localedata/locales/es_MX: Likewise.
    	* localedata/locales/es_NI: Likewise.
    	* localedata/locales/es_PA: Likewise.
    	* localedata/locales/es_PE: Likewise.
    	* localedata/locales/es_PR: Likewise.
    	* localedata/locales/es_PY: Likewise.
    	* localedata/locales/es_SV: Likewise.
    	* localedata/locales/es_US: Likewise.
    	* localedata/locales/es_UY: Likewise.
    	* localedata/locales/es_VE: Likewise.
    	* localedata/locales/et_EE: Likewise.
    	* localedata/locales/eu_ES: Likewise.
    	* localedata/locales/eu_ES@euro: Likewise.
    	* localedata/locales/fa_IR: Likewise.
    	* localedata/locales/ff_SN: Likewise.
    	* localedata/locales/fi_FI: Likewise.
    	* localedata/locales/fil_PH: Likewise.
    	* localedata/locales/fo_FO: Likewise.
    	* localedata/locales/fr_BE: Likewise.
    	* localedata/locales/fr_CA: Likewise.
    	* localedata/locales/fr_CH: Likewise.
    	* localedata/locales/fr_FR: Likewise.
    	* localedata/locales/fr_LU: Likewise.
    	* localedata/locales/fur_IT: Likewise.
    	* localedata/locales/fy_DE: Likewise.
    	* localedata/locales/fy_NL: Likewise.
    	* localedata/locales/ga_IE: Likewise.
    	* localedata/locales/gd_GB: Likewise.
    	* localedata/locales/gez_ER: Likewise.
    	* localedata/locales/gez_ET: Likewise.
    	* localedata/locales/gl_ES: Likewise.
    	* localedata/locales/gu_IN: Likewise.
    	* localedata/locales/gv_GB: Likewise.
    	* localedata/locales/ha_NG: Likewise.
    	* localedata/locales/hak_TW: Likewise.
    	* localedata/locales/he_IL: Likewise.
    	* localedata/locales/hi_IN: Likewise.
    	* localedata/locales/hif_FJ: Likewise.
    	* localedata/locales/hne_IN: Likewise.
    	* localedata/locales/hr_HR: Likewise.
    	* localedata/locales/hsb_DE: Likewise.
    	* localedata/locales/ht_HT: Likewise.
    	* localedata/locales/hu_HU: Likewise.
    	* localedata/locales/hy_AM: Likewise.
    	* localedata/locales/i18n: Likewise.
    	* localedata/locales/ia_FR: Likewise.
    	* localedata/locales/id_ID: Likewise.
    	* localedata/locales/ig_NG: Likewise.
    	* localedata/locales/ik_CA: Likewise.
    	* localedata/locales/is_IS: Likewise.
    	* localedata/locales/it_CH: Likewise.
    	* localedata/locales/it_IT: Likewise.
    	* localedata/locales/iu_CA: Likewise.
    	* localedata/locales/ja_JP: Likewise.
    	* localedata/locales/ka_GE: Likewise.
    	* localedata/locales/kk_KZ: Likewise.
    	* localedata/locales/kl_GL: Likewise.
    	* localedata/locales/kn_IN: Likewise.
    	* localedata/locales/ko_KR: Likewise.
    	* localedata/locales/kok_IN: Likewise.
    	* localedata/locales/ks_IN: Likewise.
    	* localedata/locales/ks_IN@devanagari: Likewise.
    	* localedata/locales/ku_TR: Likewise.
    	* localedata/locales/kw_GB: Likewise.
    	* localedata/locales/ky_KG: Likewise.
    	* localedata/locales/lb_LU: Likewise.
    	* localedata/locales/lg_UG: Likewise.
    	* localedata/locales/li_BE: Likewise.
    	* localedata/locales/li_NL: Likewise.
    	* localedata/locales/lij_IT: Likewise.
    	* localedata/locales/ln_CD: Likewise.
    	* localedata/locales/lo_LA: Likewise.
    	* localedata/locales/lt_LT: Likewise.
    	* localedata/locales/lv_LV: Likewise.
    	* localedata/locales/lzh_TW: Likewise.
    	* localedata/locales/mag_IN: Likewise.
    	* localedata/locales/mai_IN: Likewise.
    	* localedata/locales/mg_MG: Likewise.
    	* localedata/locales/mhr_RU: Likewise.
    	* localedata/locales/mi_NZ: Likewise.
    	* localedata/locales/mk_MK: Likewise.
    	* localedata/locales/ml_IN: Likewise.
    	* localedata/locales/mn_MN: Likewise.
    	* localedata/locales/mni_IN: Likewise.
    	* localedata/locales/mr_IN: Likewise.
    	* localedata/locales/ms_MY: Likewise.
    	* localedata/locales/mt_MT: Likewise.
    	* localedata/locales/my_MM: Likewise.
    	* localedata/locales/nan_TW: Likewise.
    	* localedata/locales/nan_TW@latin: Likewise.
    	* localedata/locales/nb_NO: Likewise.
    	* localedata/locales/nds_DE: Likewise.
    	* localedata/locales/nds_NL: Likewise.
    	* localedata/locales/ne_NP: Likewise.
    	* localedata/locales/nhn_MX: Likewise.
    	* localedata/locales/niu_NU: Likewise.
    	* localedata/locales/niu_NZ: Likewise.
    	* localedata/locales/nl_AW: Likewise.
    	* localedata/locales/nl_BE: Likewise.
    	* localedata/locales/nl_NL: Likewise.
    	* localedata/locales/nn_NO: Likewise.
    	* localedata/locales/nr_ZA: Likewise.
    	* localedata/locales/nso_ZA: Likewise.
    	* localedata/locales/oc_FR: Likewise.
    	* localedata/locales/om_ET: Likewise.
    	* localedata/locales/om_KE: Likewise.
    	* localedata/locales/or_IN: Likewise.
    	* localedata/locales/os_RU: Likewise.
    	* localedata/locales/pa_IN: Likewise.
    	* localedata/locales/pa_PK: Likewise.
    	* localedata/locales/pap_AW: Likewise.
    	* localedata/locales/pap_CW: Likewise.
    	* localedata/locales/pl_PL: Likewise.
    	* localedata/locales/ps_AF: Likewise.
    	* localedata/locales/pt_BR: Likewise.
    	* localedata/locales/pt_PT: Likewise.
    	* localedata/locales/quz_PE: Likewise.
    	* localedata/locales/raj_IN: Likewise.
    	* localedata/locales/ro_RO: Likewise.
    	* localedata/locales/ru_RU: Likewise.
    	* localedata/locales/ru_UA: Likewise.
    	* localedata/locales/rw_RW: Likewise.
    	* localedata/locales/sa_IN: Likewise.
    	* localedata/locales/sat_IN: Likewise.
    	* localedata/locales/sc_IT: Likewise.
    	* localedata/locales/sd_IN: Likewise.
    	* localedata/locales/sd_IN@devanagari: Likewise.
    	* localedata/locales/se_NO: Likewise.
    	* localedata/locales/sgs_LT: Likewise.
    	* localedata/locales/shs_CA: Likewise.
    	* localedata/locales/si_LK: Likewise.
    	* localedata/locales/sid_ET: Likewise.
    	* localedata/locales/sk_SK: Likewise.
    	* localedata/locales/sl_SI: Likewise.
    	* localedata/locales/sm_WS: Likewise.
    	* localedata/locales/so_DJ: Likewise.
    	* localedata/locales/so_ET: Likewise.
    	* localedata/locales/so_KE: Likewise.
    	* localedata/locales/so_SO: Likewise.
    	* localedata/locales/sq_AL: Likewise.
    	* localedata/locales/sq_MK: Likewise.
    	* localedata/locales/sr_ME: Likewise.
    	* localedata/locales/sr_RS: Likewise.
    	* localedata/locales/sr_RS@latin: Likewise.
    	* localedata/locales/ss_ZA: Likewise.
    	* localedata/locales/st_ZA: Likewise.
    	* localedata/locales/sv_FI: Likewise.
    	* localedata/locales/sv_SE: Likewise.
    	* localedata/locales/sw_KE: Likewise.
    	* localedata/locales/sw_TZ: Likewise.
    	* localedata/locales/szl_PL: Likewise.
    	* localedata/locales/ta_IN: Likewise.
    	* localedata/locales/ta_LK: Likewise.
    	* localedata/locales/tcy_IN: Likewise.
    	* localedata/locales/te_IN: Likewise.
    	* localedata/locales/tg_TJ: Likewise.
    	* localedata/locales/th_TH: Likewise.
    	* localedata/locales/the_NP: Likewise.
    	* localedata/locales/ti_ER: Likewise.
    	* localedata/locales/ti_ET: Likewise.
    	* localedata/locales/tig_ER: Likewise.
    	* localedata/locales/tk_TM: Likewise.
    	* localedata/locales/tl_PH: Likewise.
    	* localedata/locales/tn_ZA: Likewise.
    	* localedata/locales/to_TO: Likewise.
    	* localedata/locales/tpi_PG: Likewise.
    	* localedata/locales/tr_CY: Likewise.
    	* localedata/locales/tr_TR: Likewise.
    	* localedata/locales/ts_ZA: Likewise.
    	* localedata/locales/tt_RU: Likewise.
    	* localedata/locales/tt_RU@iqtelif: Likewise.
    	* localedata/locales/ug_CN: Likewise.
    	* localedata/locales/uk_UA: Likewise.
    	* localedata/locales/unm_US: Likewise.
    	* localedata/locales/ur_IN: Likewise.
    	* localedata/locales/ur_PK: Likewise.
    	* localedata/locales/uz_UZ: Likewise.
    	* localedata/locales/uz_UZ@cyrillic: Likewise.
    	* localedata/locales/ve_ZA: Likewise.
    	* localedata/locales/vi_VN: Likewise.
    	* localedata/locales/wa_BE: Likewise.
    	* localedata/locales/wae_CH: Likewise.
    	* localedata/locales/wal_ET: Likewise.
    	* localedata/locales/wo_SN: Likewise.
    	* localedata/locales/xh_ZA: Likewise.
    	* localedata/locales/yi_US: Likewise.
    	* localedata/locales/yo_NG: Likewise.
    	* localedata/locales/yue_HK: Likewise.
    	* localedata/locales/yuw_PG: Likewise.
    	* localedata/locales/zh_CN: Likewise.
    	* localedata/locales/zh_HK: Likewise.
    	* localedata/locales/zh_SG: Likewise.
    	* localedata/locales/zh_TW: Likewise.
    	* localedata/locales/zu_ZA: Likewise.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                           |  305 +++++++++++++++++++++++++++++++++++
 localedata/locales/aa_DJ            |  136 ++++++----------
 localedata/locales/aa_ER            |  129 ++++++---------
 localedata/locales/aa_ER@saaho      |  104 +++++-------
 localedata/locales/aa_ET            |  130 ++++++---------
 localedata/locales/af_ZA            |  113 +++++--------
 localedata/locales/agr_PE           |  113 ++++++--------
 localedata/locales/ak_GH            |  152 +++++++-----------
 localedata/locales/am_ET            |   50 ++----
 localedata/locales/anp_IN           |   35 ++---
 localedata/locales/ar_AE            |   71 ++++-----
 localedata/locales/ar_BH            |   68 +++-----
 localedata/locales/ar_DZ            |   68 +++-----
 localedata/locales/ar_EG            |   68 +++-----
 localedata/locales/ar_IN            |   39 ++---
 localedata/locales/ar_IQ            |   84 ++++------
 localedata/locales/ar_JO            |   84 ++++------
 localedata/locales/ar_KW            |   69 +++-----
 localedata/locales/ar_LB            |   84 ++++------
 localedata/locales/ar_LY            |   69 +++-----
 localedata/locales/ar_MA            |   68 +++-----
 localedata/locales/ar_OM            |   67 +++-----
 localedata/locales/ar_QA            |   69 +++-----
 localedata/locales/ar_SA            |   59 +++-----
 localedata/locales/ar_SD            |   71 +++-----
 localedata/locales/ar_SS            |   70 +++-----
 localedata/locales/ar_SY            |   85 ++++------
 localedata/locales/ar_TN            |   69 +++-----
 localedata/locales/ar_YE            |   68 +++-----
 localedata/locales/as_IN            |   36 ++---
 localedata/locales/ast_ES           |   85 +++++------
 localedata/locales/ayc_PE           |  113 ++++++-------
 localedata/locales/az_AZ            |  114 ++++++-------
 localedata/locales/az_IR            |   39 +----
 localedata/locales/be_BY            |   49 +++----
 localedata/locales/be_BY@latin      |  116 ++++++-------
 localedata/locales/bem_ZM           |  150 +++++++-----------
 localedata/locales/ber_DZ           |  129 +++++++--------
 localedata/locales/ber_MA           |  129 +++++++--------
 localedata/locales/bg_BG            |   58 +++----
 localedata/locales/bhb_IN           |  114 ++++++--------
 localedata/locales/bho_IN           |   34 ++---
 localedata/locales/bi_VU            |  158 ++++++++-----------
 localedata/locales/bn_BD            |   50 +++----
 localedata/locales/bn_IN            |   35 ++---
 localedata/locales/bo_CN            |   18 +--
 localedata/locales/bo_IN            |   19 +--
 localedata/locales/br_FR            |   93 +++++------
 localedata/locales/brx_IN           |   47 ++----
 localedata/locales/bs_BA            |  105 ++++++-------
 localedata/locales/byn_ER           |   52 +++----
 localedata/locales/ca_AD            |   34 ++---
 localedata/locales/ca_ES            |  101 +++++-------
 localedata/locales/ca_FR            |   26 +--
 localedata/locales/ca_IT            |   26 +--
 localedata/locales/ce_RU            |   61 +++-----
 localedata/locales/chr_US           |   28 ++--
 localedata/locales/cmn_TW           |   90 ++++------
 localedata/locales/crh_UA           |  129 +++++++--------
 localedata/locales/cs_CZ            |  190 +++++++++++-----------
 localedata/locales/csb_PL           |   60 ++++----
 localedata/locales/cv_RU            |  113 ++++++-------
 localedata/locales/cy_GB            |   91 +++++------
 localedata/locales/da_DK            |  119 ++++++--------
 localedata/locales/de_AT            |   98 +++++------
 localedata/locales/de_BE            |   94 +++++------
 localedata/locales/de_CH            |  104 ++++++-------
 localedata/locales/de_DE            |  140 +++++++---------
 localedata/locales/de_IT            |   82 +++++-----
 localedata/locales/de_LI            |   31 ++--
 localedata/locales/de_LU            |  106 ++++++-------
 localedata/locales/doi_IN           |   43 ++----
 localedata/locales/dv_MV            |   51 +++----
 localedata/locales/dz_BT            |   69 ++++-----
 localedata/locales/el_CY            |   46 ++----
 localedata/locales/el_GR            |   60 +++-----
 localedata/locales/en_AG            |  112 ++++++-------
 localedata/locales/en_AU            |  117 ++++++--------
 localedata/locales/en_BW            |   46 ++----
 localedata/locales/en_CA            |  116 ++++++--------
 localedata/locales/en_DK            |  105 ++++++-------
 localedata/locales/en_GB            |  117 ++++++--------
 localedata/locales/en_HK            |   99 +++++-------
 localedata/locales/en_IE            |  106 ++++++-------
 localedata/locales/en_IL            |   85 +++++------
 localedata/locales/en_IN            |   85 +++++------
 localedata/locales/en_NG            |  129 ++++++---------
 localedata/locales/en_NZ            |  117 ++++++--------
 localedata/locales/en_PH            |  101 +++++-------
 localedata/locales/en_SG            |  104 ++++++-------
 localedata/locales/en_US            |  133 +++++++---------
 localedata/locales/en_ZA            |  161 +++++++------------
 localedata/locales/en_ZM            |  101 +++++-------
 localedata/locales/en_ZW            |   47 ++----
 localedata/locales/eo               |  107 ++++++-------
 localedata/locales/es_AR            |  116 ++++++-------
 localedata/locales/es_BO            |  112 ++++++-------
 localedata/locales/es_CL            |  112 ++++++-------
 localedata/locales/es_CO            |  117 ++++++--------
 localedata/locales/es_CR            |  131 +++++++--------
 localedata/locales/es_CU            |  109 ++++++-------
 localedata/locales/es_DO            |  116 ++++++-------
 localedata/locales/es_EC            |  112 ++++++-------
 localedata/locales/es_ES            |  114 ++++++-------
 localedata/locales/es_GT            |  116 ++++++-------
 localedata/locales/es_HN            |  115 ++++++-------
 localedata/locales/es_MX            |  115 ++++++-------
 localedata/locales/es_NI            |  122 +++++++--------
 localedata/locales/es_PA            |  116 ++++++-------
 localedata/locales/es_PE            |  118 ++++++--------
 localedata/locales/es_PR            |  116 ++++++-------
 localedata/locales/es_PY            |  112 ++++++-------
 localedata/locales/es_SV            |  116 ++++++-------
 localedata/locales/es_US            |  113 ++++++-------
 localedata/locales/es_UY            |  112 ++++++-------
 localedata/locales/es_VE            |  118 ++++++--------
 localedata/locales/et_EE            |  107 ++++++-------
 localedata/locales/eu_ES            |  109 ++++++-------
 localedata/locales/eu_ES@euro       |    8 +-
 localedata/locales/fa_IR            |   67 +++------
 localedata/locales/ff_SN            |  178 ++++++++-------------
 localedata/locales/fi_FI            |  123 +++++++--------
 localedata/locales/fil_PH           |  114 ++++++--------
 localedata/locales/fo_FO            |   98 +++++------
 localedata/locales/fr_BE            |  110 ++++++-------
 localedata/locales/fr_CA            |   99 +++++-------
 localedata/locales/fr_CH            |   99 +++++-------
 localedata/locales/fr_FR            |  123 ++++++--------
 localedata/locales/fr_LU            |  107 ++++++-------
 localedata/locales/fur_IT           |   82 ++++------
 localedata/locales/fy_DE            |   90 +++++------
 localedata/locales/fy_NL            |  102 +++++-------
 localedata/locales/ga_IE            |  114 ++++++-------
 localedata/locales/gd_GB            |  109 +++++--------
 localedata/locales/gez_ER           |   37 ++---
 localedata/locales/gez_ET           |   36 ++---
 localedata/locales/gl_ES            |  112 ++++++-------
 localedata/locales/gu_IN            |   38 ++---
 localedata/locales/gv_GB            |  129 +++++++--------
 localedata/locales/ha_NG            |  103 +++++-------
 localedata/locales/hak_TW           |   90 ++++------
 localedata/locales/he_IL            |   60 +++----
 localedata/locales/hi_IN            |   48 ++----
 localedata/locales/hif_FJ           |  129 +++++++--------
 localedata/locales/hne_IN           |   32 ++---
 localedata/locales/hr_HR            |  105 ++++++-------
 localedata/locales/hsb_DE           |   97 +++++------
 localedata/locales/ht_HT            |  144 +++++++----------
 localedata/locales/hu_HU            |  116 ++++++--------
 localedata/locales/hy_AM            |   48 +++---
 localedata/locales/i18n             |   51 +++----
 localedata/locales/ia_FR            |   93 +++++------
 localedata/locales/id_ID            |  115 ++++++-------
 localedata/locales/ig_NG            |   98 +++++------
 localedata/locales/ik_CA            |   92 +++++------
 localedata/locales/is_IS            |  116 ++++++-------
 localedata/locales/it_CH            |  109 ++++++-------
 localedata/locales/it_IT            |  115 ++++++--------
 localedata/locales/iu_CA            |   34 ++---
 localedata/locales/ja_JP            |  108 ++++++-------
 localedata/locales/ka_GE            |   40 ++---
 localedata/locales/kk_KZ            |   53 +++----
 localedata/locales/kl_GL            |  100 +++++-------
 localedata/locales/kn_IN            |   46 ++----
 localedata/locales/ko_KR            |   91 +++++------
 localedata/locales/kok_IN           |   39 ++---
 localedata/locales/ks_IN            |   36 ++---
 localedata/locales/ks_IN@devanagari |   45 ++----
 localedata/locales/ku_TR            |  107 ++++++-------
 localedata/locales/kw_GB            |  113 ++++++-------
 localedata/locales/ky_KG            |   57 +++----
 localedata/locales/lb_LU            |  146 ++++++++---------
 localedata/locales/lg_UG            |  133 +++++++---------
 localedata/locales/li_BE            |   24 ++--
 localedata/locales/li_NL            |   88 +++++------
 localedata/locales/lij_IT           |   99 +++++-------
 localedata/locales/ln_CD            |  140 +++++++---------
 localedata/locales/lo_LA            |  105 ++++++-------
 localedata/locales/lt_LT            |  110 ++++++-------
 localedata/locales/lv_LV            |  108 ++++++-------
 localedata/locales/lzh_TW           |   89 ++++------
 localedata/locales/mag_IN           |   43 ++----
 localedata/locales/mai_IN           |   32 ++---
 localedata/locales/mg_MG            |  129 ++++++---------
 localedata/locales/mhr_RU           |   36 ++---
 localedata/locales/mi_NZ            |   89 +++++------
 localedata/locales/mk_MK            |   58 +++----
 localedata/locales/ml_IN            |   35 ++---
 localedata/locales/mn_MN            |  237 +++++++++++++--------------
 localedata/locales/mni_IN           |   38 ++---
 localedata/locales/mr_IN            |   50 ++----
 localedata/locales/ms_MY            |  103 ++++++-------
 localedata/locales/mt_MT            |  129 +++++++---------
 localedata/locales/my_MM            |   55 +++----
 localedata/locales/nan_TW           |   90 ++++------
 localedata/locales/nan_TW@latin     |  119 ++++++--------
 localedata/locales/nb_NO            |  119 ++++++--------
 localedata/locales/nds_DE           |   84 +++++-----
 localedata/locales/nds_NL           |   82 +++++-----
 localedata/locales/ne_NP            |   53 +++----
 localedata/locales/nhn_MX           |   90 +++++------
 localedata/locales/niu_NU           |  126 +++++++--------
 localedata/locales/niu_NZ           |   18 +--
 localedata/locales/nl_AW            |  104 ++++++-------
 localedata/locales/nl_BE            |   89 +++++------
 localedata/locales/nl_NL            |  107 ++++++-------
 localedata/locales/nn_NO            |  110 ++++++--------
 localedata/locales/nr_ZA            |  123 ++++++--------
 localedata/locales/nso_ZA           |  108 ++++++-------
 localedata/locales/oc_FR            |   82 +++++-----
 localedata/locales/om_ET            |   39 ++---
 localedata/locales/om_KE            |   81 +++++-----
 localedata/locales/or_IN            |   54 +++----
 localedata/locales/os_RU            |   29 ++---
 localedata/locales/pa_IN            |   46 ++----
 localedata/locales/pa_PK            |   30 ++---
 localedata/locales/pap_AW           |  103 ++++++-------
 localedata/locales/pap_CW           |  101 ++++++-------
 localedata/locales/pl_PL            |  109 ++++++-------
 localedata/locales/ps_AF            |   72 ++++-----
 localedata/locales/pt_BR            |  114 ++++++-------
 localedata/locales/pt_PT            |  119 +++++++--------
 localedata/locales/quz_PE           |  118 ++++++--------
 localedata/locales/raj_IN           |   19 +--
 localedata/locales/ro_RO            |  134 +++++++---------
 localedata/locales/ru_RU            |   43 ++---
 localedata/locales/ru_UA            |   49 +++----
 localedata/locales/rw_RW            |  111 ++++++-------
 localedata/locales/sa_IN            |   66 +++-----
 localedata/locales/sat_IN           |   39 ++---
 localedata/locales/sc_IT            |   93 +++++------
 localedata/locales/sd_IN            |   38 ++---
 localedata/locales/sd_IN@devanagari |   50 ++----
 localedata/locales/se_NO            |  121 +++++++--------
 localedata/locales/sgs_LT           |   87 +++++------
 localedata/locales/shs_CA           |  105 ++++++-------
 localedata/locales/si_LK            |   61 +++----
 localedata/locales/sid_ET           |  125 +++++++--------
 localedata/locales/sk_SK            |  132 +++++++--------
 localedata/locales/sl_SI            |  108 ++++++-------
 localedata/locales/sm_WS            |  151 +++++++----------
 localedata/locales/so_DJ            |   79 ++++-----
 localedata/locales/so_ET            |  118 ++++++--------
 localedata/locales/so_KE            |  118 ++++++--------
 localedata/locales/so_SO            |  157 +++++++++----------
 localedata/locales/sq_AL            |  131 +++++++---------
 localedata/locales/sq_MK            |   36 ++---
 localedata/locales/sr_ME            |   56 +++----
 localedata/locales/sr_RS            |   66 +++-----
 localedata/locales/sr_RS@latin      |  121 +++++++--------
 localedata/locales/ss_ZA            |  122 ++++++--------
 localedata/locales/st_ZA            |  123 ++++++--------
 localedata/locales/sv_FI            |   98 +++++------
 localedata/locales/sv_SE            |  110 ++++++-------
 localedata/locales/sw_KE            |  145 ++++++-----------
 localedata/locales/sw_TZ            |  142 +++++++----------
 localedata/locales/szl_PL           |   91 +++++------
 localedata/locales/ta_IN            |   47 +++---
 localedata/locales/ta_LK            |   26 ++--
 localedata/locales/tcy_IN           |   32 ++---
 localedata/locales/te_IN            |   49 ++----
 localedata/locales/tg_TJ            |   46 ++----
 localedata/locales/th_TH            |  108 ++++++-------
 localedata/locales/the_NP           |   45 ++----
 localedata/locales/ti_ER            |   70 ++++-----
 localedata/locales/ti_ET            |   75 ++++-----
 localedata/locales/tig_ER           |   48 +++----
 localedata/locales/tk_TM            |  123 +++++++--------
 localedata/locales/tl_PH            |   98 +++++------
 localedata/locales/tn_ZA            |  121 ++++++---------
 localedata/locales/to_TO            |  145 +++++++----------
 localedata/locales/tpi_PG           |    6 +-
 localedata/locales/tr_CY            |   29 ++---
 localedata/locales/tr_TR            |  133 +++++++---------
 localedata/locales/ts_ZA            |  118 ++++++--------
 localedata/locales/tt_RU            |   29 ++--
 localedata/locales/tt_RU@iqtelif    |  130 +++++++--------
 localedata/locales/ug_CN            |   31 ++---
 localedata/locales/uk_UA            |   92 ++++++------
 localedata/locales/unm_US           |  105 ++++++-------
 localedata/locales/ur_IN            |   39 ++---
 localedata/locales/ur_PK            |   50 +++----
 localedata/locales/uz_UZ            |  112 ++++++-------
 localedata/locales/uz_UZ@cyrillic   |   46 +++---
 localedata/locales/ve_ZA            |  108 ++++++-------
 localedata/locales/vi_VN            |  133 +++++++---------
 localedata/locales/wa_BE            |   96 +++++------
 localedata/locales/wae_CH           |  131 +++++++--------
 localedata/locales/wal_ET           |   44 ++----
 localedata/locales/wo_SN            |  121 ++++++--------
 localedata/locales/xh_ZA            |  123 ++++++--------
 localedata/locales/yi_US            |   50 +++----
 localedata/locales/yo_NG            |  113 ++++++-------
 localedata/locales/yue_HK           |   57 +++----
 localedata/locales/yuw_PG           |    2 +-
 localedata/locales/zh_CN            |   66 ++++-----
 localedata/locales/zh_HK            |   65 ++++----
 localedata/locales/zh_SG            |   50 +++---
 localedata/locales/zh_TW            |   88 ++++-------
 localedata/locales/zu_ZA            |  124 +++++++--------
 300 files changed, 11664 insertions(+), 15069 deletions(-)
Comment 23 Mike FABIAN 2017-11-14 08:13:02 IST
Fixed in glibc master.
Comment 24 Claude Paroz 2017-11-14 08:15:04 IST
Awesome, thanks Mike for the commit!
Comment 25 keld@keldix.com 2017-11-14 13:02:38 IST
This commit is highly problematic: it damages the portability of glibc locales.
I wish it would be reverted.

Best regards
keld

Comment 26 keld@keldix.com 2017-11-14 13:07:10 IST
Is there a script to convert these ASCII values to <Uxxxx> strings?

I would need it for the ISO 30112 standard, as I do not want to publish 
non-portable code in the ISO standard.

best regards
Keld

Comment 27 Egmont Koblinger 2017-11-14 13:19:49 IST
(In reply to keld@keldix.com from comment #25)

> This commit is highly problematic, damaging the portability of glibc locales.

If this kind of portability is really a concern, someone could come up with a script that converts from the new version to the old one. It could even be integrated with the build system to the level where these generated files are actually placed under BUILD and then further processed.

I wish the current change had pushed even further, towards raw UTF-8 at least for printable and "non-problematic" (by some vague, arbitrary definition) characters.

I have on a few occasions made some minor edits to affected parts of a locale file, and dealing with the <Uxxxx> notation was a nightmare. Working with a string like "h<U00E9>tf<U0151>" is already much better than "<U0068><U00E9><U0074><U0066><U0151>", but seeing "hétfő" would be ideal.

Source code is meant to be human-readable, which all these <Uxxxx>s are most certainly not.

There's a reason people write code like
  printf("Hello world!\n");
and not
  printf("\x48\x65\x6c\x6c\x6f\x20\x77\x6f\x72\x6c\x64\x21\x0a");

If for whatever reason the latter, hard-to-read (and hard-to-write) form is required, it should be auto-generated from the former, easy-to-read (and easy-to-write) one.
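
A minimal sketch of such a generator, in Python. It assumes that only double-quoted strings need rewriting and that no escape sequences or line continuations occur inside them. The function names are hypothetical and this is not a script from the glibc tree:

```python
import re

# An existing <Uxxxx> token (4 to 8 hex digits), or else any single character.
_TOKEN = re.compile(r"<U[0-9A-Fa-f]{4,8}>|.", re.DOTALL)

# A double-quoted string on one line (assumption: no escaped quotes inside).
_QUOTED = re.compile(r'"([^"\n]*)"')


def to_ucs_notation(text):
    """Spell every character of text as <Uxxxx>; existing tokens are kept."""
    def repl(match):
        token = match.group(0)
        if len(token) > 1:          # already a <Uxxxx> token
            return token
        return "<U%04X>" % ord(token)
    return _TOKEN.sub(repl, text)


def convert_line(line):
    """Rewrite only the double-quoted strings of one locale-source line."""
    return _QUOTED.sub(
        lambda m: '"' + to_ucs_notation(m.group(1)) + '"', line)
```

Applied to the string quoted above, `convert_line('"h<U00E9>tf<U0151>"')` gives back the fully encoded `"<U0068><U00E9><U0074><U0066><U0151>"` form, while keywords outside the quotes are left untouched.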
Comment 28 Carlos O'Donell 2017-11-14 23:32:30 IST
(In reply to keld@keldix.com from comment #25)
> This commit is highly problematic, damaging the portability of glibc locales.
> I wish they will be reverted.

The glibc community is a consensus-driven community. If you have an objection, please raise that objection on libc-alpha, and include the relevant parties to discuss the issue. Consensus discussions should not be held on the bug tracker.
Comment 29 Mike FABIAN 2017-11-15 10:15:31 IST
(In reply to Egmont Koblinger from comment #27)
> (In reply to keld@keldix.com from comment #25)
> 
> > This commit is highly problematic, damaging the portability of glibc locales.
> 
> If this kind of portability is really a concern, someone could come up with
> a script that converts from the new version to the old one. It could even be
> integrated with the build system to the level where these generated files
> are actually placed under BUILD and then further processed.

Yes, if that is really a concern, we could easily convert it to different formats.
I really doubt that this can cause problems, though. If the file contained
“<a>”, one would still have to be able to read the ASCII characters “<”, “a”, and “>”
to interpret the file, so I don’t see anything that is lost by just writing “a”
instead. If one cannot read an ASCII file, one would not be able to read the
keywords in the file either. So if something other than ASCII, such as EBCDIC,
is needed, one would need some conversion anyway. Using “a” instead of “<a>”
does not make such a conversion any harder.

> I wish the current change even pushed it further, towards raw UTF-8 at least
> for printable and "non-problematic" (to some vague, arbitrary definition)
> characters.

I agree. In the long run this would be even better. Readability of the
source is useful. Let’s see what our experience with using ASCII directly
is; if no problems occur, we can think about using UTF-8 for “non-problematic”
characters.

> I have on a few occasions made some minor edits to affected parts of a
> locale file, and dealing with the <Uxxxx> notation was a nightmare. Working with
> a string like "h<U00E9>tf<U0151>" is already much better than
> "<U0068><U00E9><U0074><U0066><U0151>", but seeing "hétfő" would be ideal.

Yes, I also found the <Uxxxx> annoying when browsing the files, it
makes it much harder to spot errors.
Comment 30 Andreas Schwab 2017-11-15 10:36:14 IST
> Yes, I also found the <Uxxxx> annoying when browsing the files, it
> makes it much harder to spot errors.

Try this:

  (font-lock-add-keywords nil
     '(("<U\\(....\\)>"
        (0 (progn (compose-region (match-beginning 0) (match-end 0)
                  (string-to-number (match-string 1) 16)))))))
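
For readers who do not use Emacs, the same idea can be sketched in Python as a viewing aid. This is only an illustration: the function name, the `ascii_only` switch and the set of characters left encoded are assumptions, not something taken from the commit.

```python
import re

_UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")

# Assumption: characters that may carry syntactic meaning in a locale
# source file are safer left in <Uxxxx> form.
KEEP_ENCODED = {'"', "<", ">", "/", "%"}


def decode_ucs(text, ascii_only=True):
    """Replace <Uxxxx> tokens by the characters they name.

    With ascii_only=True only printable ASCII (U+0020..U+007E) is decoded.
    With ascii_only=False every printable character is decoded.
    """
    def repl(match):
        code_point = int(match.group(1), 16)
        if code_point > 0x10FFFF:
            return match.group(0)       # not a valid code point
        char = chr(code_point)
        if char in KEEP_ENCODED or not char.isprintable():
            return match.group(0)
        if ascii_only and not 0x20 <= code_point <= 0x7E:
            return match.group(0)
        return char
    return _UCS.sub(repl, text)
```

With the default setting, `"<U0068><U00E9><U0074><U0066><U0151>"` becomes `"h<U00E9>tf<U0151>"`; with `ascii_only=False` it becomes `"hétfő"`.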
Comment 31 keld@keldix.com 2017-11-16 23:39:37 IST
On Wed, Nov 15, 2017 at 10:15:31AM +0000, maiku.fabian at gmail dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=22387
> 
> --- Comment #29 from Mike FABIAN <maiku.fabian at gmail dot com> ---
> (In reply to Egmont Koblinger from comment #27)
> > (In reply to keld@keldix.com from comment #25)
> > 
> > > This commit is highly problematic, damaging the portability of glibc locales.
> > 
> > If this kind of portability is really a concern, someone could come up with
> > a script that converts from the new version to the old one. It could even be
> > integrated with the build system to the level where these generated files
> > are actually placed under BUILD and then further processed.
> 
> Yes, if that is really a concern, we could easily convert it to different
> formats.
> I really doubt that this can cause problems though. If the file contained
> “<a>”, one still has to be able to read the ascii characters “<”, “a”, and “>”
> to interpret the file, I don’t see anything which is lost by just writing “a”
> instead. If one cannot read an ascii file, one would not be able to read the
> keywords in the file either. So if something else than ascii like EBCDIC
> is needed, one would need some conversion anyway. Using “a” instead of “<a>”
> does not make such conversion any harder.

I have explained earlier that not using symbolic character names will generate
wrong results in situations where the source and target coded character sets have
different encodings of ASCII characters.

The locales as they have come from my hand even preserve portability when some
characters in the ASCII character set have different encodings, which happens
on EBCDIC systems with different national EBCDIC character sets. These are still in use
on big banking and aviation systems AFAIK.
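
To illustrate the kind of difference meant here, a small Python sketch can list the printable ASCII characters whose byte value differs between two EBCDIC code pages. The codecs `cp037` and `cp500` are used only because Python ships them; they are an assumption for the example and are not named in this discussion.

```python
def differing_ascii(codec_a, codec_b):
    """Return the printable ASCII characters encoded differently by two codecs."""
    return [chr(cp) for cp in range(0x20, 0x7F)
            if chr(cp).encode(codec_a) != chr(cp).encode(codec_b)]


# Letters and digits agree between the two code pages, but some
# punctuation characters, such as "[", do not.
diff = differing_ascii("cp037", "cp500")
print(diff)
```

A character that appears in this list cannot be copied byte for byte between systems using the two code pages, which is the situation symbolic names are meant to guard against.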

As an editor of multiple ISO standards on POSIX/Linux locales I do strive for general specs
and portability. I can understand that this is not an issue for glibc people.
I just have been happy that glibc has been using the ISO specs, and that I as
ISO editor could use the glibc specs in return. This is not the case anymore with the recent
patch.

I do have a great concern for the readability of the locales. That is why I made
an elaborate set of symbolic character names that were much easier to proofread
than the <Uxxxx> names, such as the <a> and Greek <a*> names, Japanese kana, Arabic,
Hebrew etc. Thus the locales were both portable over almost all known platforms, and
readable to some extent.  I was quite happy when I saw that the Arabic name for the
10th month was something like "octobr" - it meant that I, as someone who could not
read Arabic at all, could write and maintain an Arabic locale with some confidence.

Also, I cannot edit Japanese or Arabic characters in UTF-8, as I don't know them, and
I think this is also the case for many maintainers of glibc locales. They may be fluent
in their own locale, but locales from other cultures may be beyond their capability
to edit in raw UTF-8.

I wish that we could have some arrangement so that we can have mutual exchange again
of locale specs.

Best regards
keld