This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 1/3] localedata: use same comment_char/escape_char in these files
- From: Chris Leonard <cjlhomeaddress at gmail dot com>
- To: Marko Myllynen <myllynen at redhat dot com>
- Cc: libc-alpha <libc-alpha at sourceware dot org>, "Carlos O'Donell" <carlos at redhat dot com>, Florian Weimer <fweimer at redhat dot com>
- Date: Sun, 13 Mar 2016 13:49:46 -0400
- Subject: Re: [PATCH 1/3] localedata: use same comment_char/escape_char in these files
- Authentication-results: sourceware.org; auth=none
- References: <1455954855-26431-1-git-send-email-vapier at gentoo dot org> <20160225201222 dot GK19841 at vapier dot lan> <20160309222439 dot GZ6588 at vapier dot lan> <56E29E8F dot 50209 at redhat dot com> <CAHdAatZoQ-G0Y-m1O0+vjysCT++Nek_kKdfMc5FQoH3MPZL9PA at mail dot gmail dot com> <56E40938 dot 3010708 at redhat dot com> <CAHdAataOaNnE1Sst9F3e-nFjtPxyNUUATx-+DsQUgQFzBye2EA at mail dot gmail dot com>
One fundamental way of looking at this issue is to distill it down to
just unique language codes (ignoring countries, scripts and currency
variants).
Between both projects combined, there are 257 unique language codes*.
115 codes are common between both projects.
79 languages are represented in CLDR only.
63 languages are represented in glibc only.
*Note: one exception is the qu/quz conflict over which language code
should represent the Quechua languages of the Andes. I counted this
as in common, although it will require some resolution going forward,
as will the Aymara ay/ayc choice (there is no existing CLDR locale for
Aymara at present).
There are quite possibly a few other such code selection issues, which
I will look into further. I have a nagging suspicion that something is
going on with the Sotho language codes of Africa, but I need to
confirm that. In any event, those wouldn't change the overall numbers
much.
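For anyone who wants to reproduce this kind of tally, here is a minimal
sketch of the comparison in Python. It is not the method used for the
spreadsheet; it assumes the language code is simply whatever precedes
the first "_", "@", "." or "-" in a locale name. The function names,
the sample locale lists and the quz-to-qu alias are all illustrative
assumptions, not data taken from either project.

```python
import re


def language_code(locale_name):
    """Return the leading language code of a locale name.

    Assumes names shaped like 'quz_PE', 'sr_RS@latin' or 'sr_Latn'.
    """
    return re.split(r"[_@.\-]", locale_name, maxsplit=1)[0].lower()


def compare_languages(glibc_locales, cldr_locales, aliases=None):
    """Compare two locale lists at the language-code level.

    `aliases` maps one code onto another so that a known conflict
    (for example quz vs. qu) can be counted as a shared language.
    """
    aliases = aliases or {}

    def codes(names):
        found = set()
        for name in names:
            code = language_code(name)
            found.add(aliases.get(code, code))
        return found

    glibc = codes(glibc_locales)
    cldr = codes(cldr_locales)
    return {
        "all": glibc | cldr,
        "common": glibc & cldr,
        "cldr_only": cldr - glibc,
        "glibc_only": glibc - cldr,
    }


# Hypothetical sample input, NOT the real directory listings.
sample_glibc = ["en_US", "fi_FI", "quz_PE", "ayc_PE", "sr_RS@latin"]
sample_cldr = ["en", "en_GB", "fi", "qu", "sr_Latn", "yo"]

result = compare_languages(sample_glibc, sample_cldr, aliases={"quz": "qu"})
for key in ("all", "common", "cldr_only", "glibc_only"):
    print(key, len(result[key]), sorted(result[key]))
```

The three partial counts should always add up to the total, which is
a quick sanity check on the figures above (115 + 79 + 63 = 257).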
Overall, I would not declare one project the winner over the other in
terms of best representing languages. Clearly, some cross-porting
should be done where possible for the sake of language communities
dependent on either locale type.
I'm here because I work with glibc-dependent language communities, so
that has been my focus. I have not tried to work with CLDR on
locales. Anyone here have experience with that? How
welcoming/responsive are they to people who are trying to act as
intermediaries for minority language communities?
cjl
On Sat, Mar 12, 2016 at 2:09 PM, Chris Leonard <cjlhomeaddress@gmail.com> wrote:
> I think there are approximately 83 glibc locales (lang-country pairs)
> that are not in CLDR. Some are trivial fixes because the lang is in
> CLDR; in most cases, the lang is not in CLDR at all.
>
> Data in the attached spreadsheet has directory listings of the
> relevant folders of the most recent releases for both products. Tab
> names should be self-explanatory.
>
> A few glibc locales destined for deletion (pap_AN, iw_IL) are ignored
> as they have relevant replacements.
>
> cjl
>
>
>
> On Sat, Mar 12, 2016 at 7:19 AM, Marko Myllynen <myllynen@redhat.com> wrote:
>> Hi,
>>
>> On 2016-03-11 16:24, Chris Leonard wrote:
>>> I know that there are a number of glibc locales that I have
>>> contributed that are not represented in CLDR. Working with OLPC and
>>> Sugar Labs as I do, we are often on the bleeding edge of supporting
>>> new languages on GNU/Linux-based systems.
>>
>> Yes, locales only present in glibc but not in CLDR can't be
>> automatically updated (and most definitely won't be tossed away).
>>
>> Do you have any rough estimate how many such locales there might be?
>>
>> Thanks,
>>
>> --
>> Marko Myllynen