[Bug localedata/21547] New: Tibetan script collation broken (Dzongkha and Tibetan)
elie.roux@telecom-bretagne.eu
sourceware-bugzilla@sourceware.org
Mon Jun 5 10:37:00 GMT 2017
https://sourceware.org/bugzilla/show_bug.cgi?id=21547
Bug ID: 21547
Summary: Tibetan script collation broken (Dzongkha and Tibetan)
Product: glibc
Version: 2.24
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: localedata
Assignee: unassigned at sourceware dot org
Reporter: elie.roux@telecom-bretagne.eu
CC: libc-locales at sourceware dot org
Target Milestone: ---
Hello,
Tibetan and Dzongkha sorting does not work properly with the current locale data.
With the following test file:
$ cat tibt_order_test.txt
ལྔ
ང
ཅ
རྔ
སྔ
བརྔ
བསྔ
I get the following incorrect result:
$ LC_COLLATE="dz_BT.utf8" sort tibt_order_test.txt
ང
བརྔ
བསྔ
རྔ
ལྔ
སྔ
ཅ
The correct result would be:
ང
རྔ
ལྔ
སྔ
བརྔ
བསྔ
ཅ
The dz (Dzongkha) and bo (Tibetan) locales have the same collation data in CLDR.
See https://github.com/eroux/tibetan-collation for more on Tibetan collation.
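The comparison above can also be scripted. The following is a minimal sketch, not part of the original report, assuming Python 3 on a glibc system; the names EXPECTED and sort_with_locale are made up for illustration. It sorts the test strings with locale.strxfrm, which should follow the same LC_COLLATE rules that sort(1) uses, and reports whether the result matches the order given above.

```python
import locale

# Test strings from this report, in the order the reporter expects.
EXPECTED = ["ང", "རྔ", "ལྔ", "སྔ", "བརྔ", "བསྔ", "ཅ"]


def sort_with_locale(words, locale_name):
    """Sort words by the LC_COLLATE rules of locale_name.

    Returns None if the locale is not installed on this system.
    (Hypothetical helper, for illustration only.)
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strxfrm maps each string to a key that compares like strcoll
        return sorted(words, key=locale.strxfrm)
    finally:
        # always restore the previous collation locale
        locale.setlocale(locale.LC_COLLATE, saved)


if __name__ == "__main__":
    for name in ("dz_BT.utf8", "bo_CN.utf8", "bo_IN.utf8"):
        result = sort_with_locale(EXPECTED, name)
        if result is None:
            print(name, "is not installed")
        elif result == EXPECTED:
            print(name, "sorts the test strings as expected")
        else:
            print(name, "gives a different order:", " ".join(result))
```

The locale names are taken from the locale -a output in this report. A locale that is missing is reported rather than treated as a failure, so the script can run on systems without the Tibetan or Dzongkha locales.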
Result of locale -a:
bo_CN
bo_CN.utf8
bo_IN
bo_IN.utf8
C
C.UTF-8
dz_BT
dz_BT.utf8
en_GB.utf8
en_US.utf8
fr_FR.utf8
POSIX
Thank you,
--
Elie