This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [Patch 0/13] [BZ #14095] update collation data from Unicode / ISO 14651
On 01/26/2018 06:40 AM, Mike FABIAN wrote:
> Joseph Myers <joseph@codesourcery.com> wrote:
>
>> On Fri, 26 Jan 2018, Mike FABIAN wrote:
>>
>>> [BZ #14095] - Review / update collation data from Unicode / ISO 14651
>>>
>>> Updating this file alone is not enough: there are problems in the new
>>> file which need to be fixed and the collation rules for many locales
>>> need to be adapted. This is done by the following patches.
>>>
>>> This update also fixes the problem that many characters are treated as
>>> identical when sorting because they were not yet in the old
>>> iso14651_t1_common file, see:
>>
>> To be clear: do you mean it fixes it *for the characters in the Unicode
>> version supported by these updated collation data*? Or globally for all
>> characters including those not yet defined or too new for that collation
>> data?
>
> Yes, it fixes it only for the characters which are in this updated
> collation data, i.e. for all characters up to Unicode 8.0.0. All
> characters added after Unicode 8.0.0 or still undefined will still have
> that problem.
This is OK IMO. Once I finish C.UTF-8 for glibc 2.28, we may also be able
to have code-point sorting for all such undefined elements if the locale
uses UTF-8. One of the C.UTF-8 enhancements is to provide full UTF-8
coverage of all code points, so that code-point sorting is available.
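
As a rough illustration of the difference, here is a toy Python sketch. It
is not glibc code: the weight table and both function names are made up
for this example, and the two escaped characters are just stand-ins for
anything missing from the collation data. It only shows why elements
without weights compare as identical, and how a code-point fallback would
tell them apart.

```python
# Hypothetical weight table; NOT the real iso14651_t1_common data.
TOY_WEIGHTS = {"a": 1, "b": 2, "c": 3}


def key_without_fallback(s):
    # Characters missing from the table contribute nothing to the key,
    # so strings that differ only in such characters get equal keys.
    return [TOY_WEIGHTS[ch] for ch in s if ch in TOY_WEIGHTS]


def key_with_codepoint_fallback(s):
    # Characters in the table sort first, by weight. Characters missing
    # from the table sort after them, ordered by code point.
    return [
        (0, TOY_WEIGHTS[ch]) if ch in TOY_WEIGHTS else (1, ord(ch))
        for ch in s
    ]


# Two strings that differ only in a character the toy table lacks.
first = "a\U0001F99E"
second = "a\U0001F9A5"

same_without = key_without_fallback(first) == key_without_fallback(second)
ordered_with = (
    key_with_codepoint_fallback(first) < key_with_codepoint_fallback(second)
)

# Comparing the UTF-8 encoded bytes gives the same order as comparing
# code points, which is why a UTF-8 locale can sort this way cheaply.
words = [second, first, "b", "a"]
by_codepoint = sorted(words)
by_utf8_bytes = sorted(words, key=lambda w: w.encode("utf-8"))
```

Here same_without is true (the two strings collide), while ordered_with is
true as well (the fallback separates them).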
--
Cheers,
Carlos.