This is the mail archive of the libc-alpha at sourceware dot org
mailing list for the glibc project.
Re: [COMMITTED] lt_LT locale: Base collation on copy "iso14651_t1" [BZ #22524]
- From: Mike FABIAN <mfabian at redhat dot com>
- To: GNU C Library <libc-alpha at sourceware dot org>
- Date: Thu, 14 Dec 2017 10:07:44 +0100
- Subject: Re: [COMMITTED] lt_LT locale: Base collation on copy "iso14651_t1" [BZ #22524]
- References: <firstname.lastname@example.org> <20171213203626.GA13829@aurel32.net>
Aurelien Jarno <email@example.com> wrote:
> On 2017-12-08 07:53, Mike FABIAN wrote:
>> [BZ #22524]
>> * localedata/Makefile: Add lt_LT.UTF-8 to test-input
>> and to the list of locales to be built for testing.
>> * localedata/lt_LT.UTF-8.in: New file for testing the collation.
>> * localedata/locales/lt_LT (LC_COLLATE): Use “copy "iso14651_t1"”
>> and build the collation rules upon that.
> The lt_LT locale and a few other ones (et_EE and tr_TR) used to sort
> upper case letters before lower case ones. Basing the collation on
> iso14651_t1 actually changes that. I don't know if the change is
> intentional or not.
Yes, I know. In some locales I kept it, for example in et_EE I kept
it by adding something like this:
% Uppercase first:
% (This is not in the CLDR rules, but the old et_EE locale before I based
% the collation on iso14651_t1 did uppercase first. I don’t know whether
% there is a good reason for this, but let’s keep it for the moment.
% This reimplementation of the Estonian sorting just reproduces the same
% order as before (except fixing some bugs,
% see: https://sourceware.org/bugzilla/show_bug.cgi?id=22517#c1)).
But actually CLDR sorts upper case first only for 3 languages:
$ grep 'caseFirst upper' *
da.xml: [caseFirst upper]
mt.xml:[caseFirst upper] # DMS MSA 200:2009
So I’ll certainly keep it for these, for example in our Danish locale. (I am
currently updating localedata/locales/iso14651_t1_common to the latest
version released by ISO (see: https://www.iso.org/standard/68309.html),
and to do that I have to adapt our collation rules in many locales.)
If we trust CLDR, I think we should do uppercase first only for the
above 3 languages.
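
To make the difference concrete, here is a toy sketch in Python of what
"uppercase first" versus "lowercase first" means for the sort order. It is
not the iso14651_t1 weighting and not how localedef works: the function
name collation_key and the two-level key are assumptions made up for this
illustration. Letters are compared ignoring case first, and case is only
used as a tie-breaker.

```python
def collation_key(word, upper_first):
    """Toy two-level sort key (hypothetical, not the iso14651_t1 weights).

    Level 1: the word with case folded away.
    Level 2: one case flag per character, used only to break ties.
    """
    primary = word.casefold()
    if upper_first:
        # Upper-case characters get the smaller flag and sort first.
        case_level = tuple(0 if ch.isupper() else 1 for ch in word)
    else:
        # Lower-case characters get the smaller flag and sort first.
        case_level = tuple(1 if ch.isupper() else 0 for ch in word)
    return (primary, case_level)


words = ["abc", "Abc", "ABC", "abd", "Abd"]
upper_first_order = sorted(words, key=lambda w: collation_key(w, True))
lower_first_order = sorted(words, key=lambda w: collation_key(w, False))

print(upper_first_order)  # ['ABC', 'Abc', 'abc', 'Abd', 'abd']
print(lower_first_order)  # ['abc', 'Abc', 'ABC', 'abd', 'Abd']
```

In both orders "abc" in any casing still sorts before "abd"; only the order
inside a group of words that differ in case alone changes.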
> Debian has a patch from more than 10 years ago that has been very
> loosely ported from version to version to add support for
> preprocessor-like directives. That way it's possible to add ifdef else
> endif directives to support both lower-case-first and upper-case-first
> versions of iso14651_t1.
I’ll try to do something like that when I am done with the
update of the localedata/locales/iso14651_t1_common file.
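
Just to illustrate the idea of such directives, here is a rough Python
sketch of a line filter that handles ifdef/else/endif. It is not the
Debian patch and not localedef code: the directive spelling, the function
name preprocess, the macro name UPPER_FIRST and the placeholder lines are
all assumptions for this example, and nesting is not supported.

```python
def preprocess(lines, defined):
    """Keep only the lines in the active branch of ifdef/else/endif blocks.

    Hypothetical directive syntax, one directive per line:
        ifdef NAME
        else
        endif
    """
    out = []
    in_block = False  # currently between an ifdef and its endif
    active = True     # whether the current lines should be kept
    for line in lines:
        parts = line.split()
        if parts and parts[0] == "ifdef":
            if in_block or len(parts) != 2:
                raise ValueError("bad or nested ifdef: %r" % line)
            in_block = True
            active = parts[1] in defined
        elif parts == ["else"]:
            if not in_block:
                raise ValueError("else without ifdef")
            active = not active
        elif parts == ["endif"]:
            if not in_block:
                raise ValueError("endif without ifdef")
            in_block = False
            active = True
        elif active:
            out.append(line)
    if in_block:
        raise ValueError("missing endif")
    return out


# Placeholder content only; these are not real collation rules.
SAMPLE = [
    "ifdef UPPER_FIRST",
    "placeholder: upper-case weights first",
    "else",
    "placeholder: lower-case weights first",
    "endif",
    "placeholder: rules shared by both variants",
]

upper_variant = preprocess(SAMPLE, {"UPPER_FIRST"})
lower_variant = preprocess(SAMPLE, set())
```

With UPPER_FIRST defined, the first placeholder line and the shared line
are kept; without it, the second placeholder line and the shared line are
kept. That way a single file could serve both orderings.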
Mike FABIAN <firstname.lastname@example.org>