This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [Patch 0/13] [BZ #14095] update collation data from Unicode / ISO 14651


On 01/26/2018 06:40 AM, Mike FABIAN wrote:
> Joseph Myers <joseph@codesourcery.com> wrote:
> 
>> On Fri, 26 Jan 2018, Mike FABIAN wrote:
>>
>>> [BZ #14095] - Review / update collation data from Unicode / ISO 14651
>>>
>>> Updating this file alone is not enough, there are problems in the new
>>> file which need to be fixed and the collation rules for many locales
>>> need to be adapted. This is done by the following patches.
>>>
>>> This update also fixes the problem that many characters are treated as
>>> identical when sorting because they were not yet in the old
>>> iso14651_t1_common file, see:
>>
>> To be clear: do you mean it fixes it *for the characters in the Unicode 
>> version supported by these updated collation data*?  Or globally for all 
>> characters including those not yet defined or too new for that collation 
>> data?
> 
> Yes, it fixes it only for the characters which are in this updated
> collation data, i.e. for all characters up to Unicode 8.0.0. All
> characters added after Unicode 8.0.0 or still undefined will still have
> that problem.
This is OK IMO. As I finish C.UTF-8 in glibc 2.28, we may be able to
have code-point sorting for all such undefined elements if the locale uses
UTF-8 (one of the C.UTF-8 enhancements is full coverage of all code
points, so that they sort in code-point order).
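
To illustrate what code-point sorting means here, a minimal sketch in
Python (not glibc code; the sample strings and the helper names
`codepoint_key` and `utf8_key` are made up for illustration). It relies
on the property that comparing valid UTF-8 strings byte by byte gives the
same order as comparing their code points:

```python
def codepoint_key(s):
    """Sort key: the sequence of Unicode code points in s."""
    return [ord(ch) for ch in s]


def utf8_key(s):
    """Sort key: the raw UTF-8 bytes of s, compared byte by byte."""
    return s.encode("utf-8")


# Hypothetical sample: ASCII, Latin-1, Hiragana, and a character
# outside the Basic Multilingual Plane (U+1F600).
words = ["z", "\u00e9", "a", "\U0001F600", "\u3042", "Z"]

by_codepoint = sorted(words, key=codepoint_key)
by_utf8_bytes = sorted(words, key=utf8_key)

# Both keys produce the same order, with no locale collation data needed.
print(by_codepoint == by_utf8_bytes)
```

The point of the sketch is only that such an ordering is total: two
distinct characters never compare as identical, even if no collation
data knows about them.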

-- 
Cheers,
Carlos.

