This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug localedata/23421] Strange collation rules for A and space with UTF-8 locale when other characters appended


https://sourceware.org/bugzilla/show_bug.cgi?id=23421

Benjamin Cama <b.cama at kerlink dot fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |---

--- Comment #5 from Benjamin Cama <b.cama at kerlink dot fr> ---
Thanks again for the clarification. I understand that this is a POSIX-defined
behavior, and I cannot do much about it. Thanks for the example describing a
situation where using the C locale is mandated.

I know I cannot convince anyone to change POSIX, but here is one last *real*
example of “weird” sorting:

ENCR_DES        DES_CBC
ENCR_DES_ECB    DES_ECB
ENCR_DES_IV32   DES_IV32
ENCR_DES_IV64   DES_IV64
ENCR_IDEA       IDEA_CBC
ENCR_NULL_AUTH_AES_GMAC NULL_AES_GMAC
ENCR_NULL       NULL

The “usual” rule of sorting shorter strings before longer ones does not hold
here: ENCR_NULL* looks sorted the opposite way from ENCR_DES*. By “usual” I mean
the historical Unix behavior, from the long period when the C/POSIX locale was
used everywhere; I am thinking of the 2000s rather than the 80s, but I am old
enough to have lived through the Unicode transition in Debian. The fields here
are separated by tabs rather than spaces (which seem to follow the same
ordering rule), so the effect stands out more.

It is even stranger in this made-up example:

% printf "A\tA\nAA\tA\nA\tD\n"|sort
A       A
AA      A
A       D
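
One possible way to read the output above, offered as an assumption and not as
a confirmed description of the locale's rules: if the tab is skipped during the
first comparison pass, the three lines compare as "AA", "AAA" and "AD", which
gives exactly this order. The sketch below emulates that idea with standard
tools; ignore_tabs_sort is a hypothetical helper name, not part of sort or
glibc.

```shell
#!/bin/sh
# Hypothetical helper: emulate a first comparison pass that skips tabs.
# ASSUMPTION: the UTF-8 locale ignores the tab at its first collation level.
ignore_tabs_sort() {
    # Decorate each line with a key (the line with its tabs removed),
    # sort the decorated lines bytewise, then strip the key again.
    # Ties on the key fall back to the bytes of the original line.
    awk '{ key = $0; gsub(/\t/, "", key); print key "\t" $0 }' |
        LC_ALL=C sort |
        cut -f2-
}

emulated=$(printf 'A\tA\nAA\tA\nA\tD\n' | ignore_tabs_sort)
printf '%s\n' "$emulated"
```

With this input the emulation prints the lines in the same order as the sort
output shown above (A/A, then AA/A, then A/D).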

From now on I will try to remember to set the right collation rule before
expecting the C sorting behavior. I hope not to be bitten again.
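
As a minimal sketch of what setting the collation for one command can look
like, the made-up example can be rerun with the C locale forced for sort only.
LC_ALL is used here because it takes precedence over LC_COLLATE and LANG; the
variable name c_sorted is only for illustration.

```shell
#!/bin/sh
# Force the C locale for this single sort invocation; the rest of the
# shell session keeps its own locale settings.
c_sorted=$(printf 'A\tA\nAA\tA\nA\tD\n' | LC_ALL=C sort)
printf '%s\n' "$c_sorted"
```

In the C locale the comparison is bytewise and the tab byte (0x09) is lower
than 'A' (0x41), so the line with A and D now comes second, ahead of the line
starting with AA.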

Sorry for the noise and thanks again.

