This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH] Locales: Use CLDR matching thousands separator
- From: Carlos O'Donell <carlos at redhat dot com>
- To: Marko Myllynen <myllynen at redhat dot com>, Rafal Luzynski <digitalfreak at lingonborough dot com>, GNU C Library <libc-alpha at sourceware dot org>
- Date: Mon, 8 Oct 2018 14:13:34 -0400
- Subject: Re: [PATCH] Locales: Use CLDR matching thousands separator
On 10/8/18 7:16 AM, Marko Myllynen wrote:
> On 2018-08-22 00:38, Rafal Luzynski wrote:
>> 21.08.2018 04:18 Carlos O'Donell <firstname.lastname@example.org> wrote:
>>> On 08/17/2018 03:01 PM, Rafal Luzynski wrote:
>>>> 16.08.2018 11:28 Marko Myllynen <email@example.com> wrote:
>>>>> [...] But of course going back and forth on the glibc side is not ideal
>>>>> if CLDR does the change some time in the future.
>>>> That's my reason to oppose this change, but my opposition is weak.
>>>> That means, if other people want to introduce this change I will not
>>>> oppose anymore.
>>> This is called an objection; the strong form is called a sustained objection.
>>> Only a sustained objection would block consensus on accepting a patch.
>> Thank you for explaining. Yes, I have an objection.
>>> So it sounds like you don't have a sustained objection, you're just worried
>>> about the changes causing confusion to our users,
>> More or less, that's true.
>>> and I agree that could be
>>> a problem, but being out of sync with CLDR is also a problem.
>> I did not know it was a problem. We have talked many times about automatic
>> scripts to feed data from CLDR, but we don't have any. Therefore I thought it
>> is good to follow CLDR but not a big problem if we don't follow CLDR for some
>> good reasons.
> One perhaps related thing I noticed recently was that neither U+00A0 nor
> U+202F is classified as a whitespace character. locales/i18n_ctype has
> this definition (based on ISO/IEC 30112, see
> http://www.open-std.org/jtc1/sc35/wg5/docs/30112d10.pdf document page 30):
> space /
> Looking at pages about whitespace characters
> (https://en.wikipedia.org/wiki/Whitespace_character) and Unicode spaces
> (http://jkorpela.fi/chars/spaces.html) it seems that a couple of other
> Unicode space characters are also omitted from that list.
> Does anyone know whether there is a particular reason to omit U+00A0, U+202F,
> and a few others from the above classification?
The i18n_ctype file is autogenerated from the Unicode release data using
The space data is subsequently derived from unicode_utils.is_space, via
this call (line 232 of that script):

    output_charclass(i18n_file, 'space', unicode_utils.is_space)
Is there a mistake there?
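
For anyone who wants to look at the two characters independently of the
glibc scripts, here is a minimal sketch that asks Python's standard
unicodedata module for their general category and decomposition. It does
not use unicode_utils at all, and the helper name describe() is made up
for illustration.

```python
import unicodedata


def describe(code_point):
    """Return (name, general category, decomposition) for a code point.

    Hypothetical helper for illustration; not part of glibc's scripts.
    """
    ch = chr(code_point)
    return (unicodedata.name(ch, '<unnamed>'),
            unicodedata.category(ch),
            unicodedata.decomposition(ch))


# Compare the plain space with the two no-break spaces from the question.
for cp in (0x0020, 0x00A0, 0x202F):
    name, category, decomposition = describe(cp)
    print('U+%04X %-22s %s %r' % (cp, name, category, decomposition))
```

All three should report general category Zs, with U+00A0 and U+202F also
carrying a <noBreak> decomposition to U+0020. That may be the property
the classification keys off, but I have not confirmed it against
unicode_utils.is_space.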