This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [COMMITTED] lt_LT locale: Base collation on copy "iso14651_t1" [BZ #22524]


Aurelien Jarno <aurelien@aurel32.net> wrote:

> On 2017-12-08 07:53, Mike FABIAN wrote:
>> 
>>             [BZ #22524]
>>             * localedata/Makefile: Add lt_LT.UTF-8 to test-input
>>             and to the list of locales to be built for testing.
>>             * localedata/lt_LT.UTF-8.in: New file for testing the collation.
>>             * localedata/locales/lt_LT (LC_COLLATE): Use “copy "iso14651_t1"”
>>             and build the collation rules upon that.
>
> The lt_LT locale and a few other ones (et_EE and tr_TR) used to sort
> upper case letters before lower case ones. Basing the collation on
> iso14651_t1 actually changes that. I don't know if the change is
> intentional or not.

Yes, I know. In some locales I kept it, for example in et_EE I kept
it by adding something like this:

    % Uppercase first:
    % (This is not in the CLDR rules, but the old et_EE locale before I based
    % the collation on iso14651_t1 did uppercase first. I don’t know whether
    % there is a good reason for this, but let’s keep it for the moment.
    % This reimplementation of the Estonian sorting just reproduces the same
    % order as before (except fixing some bugs,
    % see: https://sourceware.org/bugzilla/show_bug.cgi?id=22517#c1)).
    reorder-after <RES-1>
    <CAP>
    <MIN>

But actually CLDR sorts upper case first only for 3 languages:

    mfabian@taka:/local/mfabian/src/cldr-svn/trunk/common/collation
    $ grep 'caseFirst upper' *
    cu.xml:[caseFirst upper]
    da.xml:                                 [caseFirst upper]
    mt.xml:[caseFirst upper]  # DMS MSA 200:2009

So I’ll certainly keep it for these, including our Danish locale. (I am
currently updating localedata/locales/iso14651_t1_common to the latest
version released by ISO, see: https://www.iso.org/standard/68309.html.
To do that I have to adapt our collation rules in many locales,
including da_DK.)

If we trust CLDR, I think we should do uppercase first only for the
above 3 languages.
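
To make the difference concrete, here is a toy sketch in Python of what
"uppercase first" versus "lowercase first" means as a tie-breaker. It is
not how glibc or CLDR implement collation: the function name
collation_key and the two-level key are assumptions made up purely for
illustration, and it only handles simple cased letters.

```python
def collation_key(word, upper_first):
    """Toy two-level sort key (hypothetical helper, not a glibc API)."""
    # First level: compare the letters while ignoring case.
    primary = word.lower()
    # Second level: one weight per character, so that the preferred
    # case gets the smaller weight and therefore sorts first on ties.
    if upper_first:
        case_weights = tuple(0 if ch.isupper() else 1 for ch in word)
    else:
        case_weights = tuple(1 if ch.isupper() else 0 for ch in word)
    return (primary, case_weights)


words = ["b", "A", "a", "B", "ab", "Ab"]

upper_first_order = sorted(words, key=lambda w: collation_key(w, True))
lower_first_order = sorted(words, key=lambda w: collation_key(w, False))

print(upper_first_order)  # ['A', 'a', 'Ab', 'ab', 'B', 'b']
print(lower_first_order)  # ['a', 'A', 'ab', 'Ab', 'b', 'B']
```

Case only decides between strings that are otherwise equal, so the two
orders differ only within each such group. That is the kind of change
users of lt_LT would see after the switch.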

> Debian has a patch from more than 10 years ago that has been very
> loosely ported from version to version to add support for
> preprocessor-like directives. That way it's possible to add ifdef else
> endif directives to support both lower-case-first and upper-case-first
> versions of iso14651_t1.

I’ll try to do something like that when I am done with the
update of the localedata/locales/iso14651_t1_common file.

-- 
Mike FABIAN <mfabian@redhat.com>
Lack of sleep is the enemy of good work.

