This is the mail archive of the glibc-bugs@sources.redhat.com mailing list for the glibc project.



[Bug localedata/368] localedef fails with complex LC_COLLATE rules


------- Additional Comments From barbier at linuxfr dot org  2005-01-17 21:38 -------
As this patch only changes the multi-byte sequence, we can check
whether wide-char and multi-byte collations give the same results;
if they do, this patch is almost certainly right.
I created a file containing every sequence of 2 characters taken from
the Tibetan range 0x0F00-0x0FCF, one sequence per line:
  $ for i in `seq 0x0F00 0x0FCF`; do
      for j in `seq 0x0F00 0x0FCF`; do
        printf "0: %08x %08x 0000000a " $i $j | xxd -r -g4
      done
    done | iconv -f ucs4 -t utf8 > input_file
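
As a quick sanity check on the generated file: the range 0x0F00-0x0FCF
holds 208 code points, so the nested loop above should emit 208 * 208
lines.  A minimal sketch of that arithmetic, assuming bash (the function
name expected_pairs is only illustrative and is not part of the test
programs):

```shell
# Number of ordered pairs drawn from the inclusive range [$1, $2].
# Bash arithmetic accepts the 0x... hex notation directly.
expected_pairs() {
  local n=$(( $2 - $1 + 1 ))
  echo $(( n * n ))
}

expected_pairs 0x0F00 0x0FCF   # compare with: wc -l < input_file
```

If wc -l reports a different number, the printf/xxd/iconv pipeline
dropped or mangled some sequences, and the comparison is not covering
the whole range.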
Then I ran
  $ LC_ALL=en_US.UTF-8 ./tst-wcscoll < input_file > out.wc-en_US
  $ LC_ALL=en_US.UTF-8 ./tst-strcoll < input_file > out.mb-en_US
  $ cmp out.wc-en_US out.mb-en_US
  $

So the results are identical.  But to show that this patch allows
more than 256 collating elements, we need to check with a more complex
LC_COLLATE section.  I took Pablo's locale file, applied s/^%%%%</</ so
that it has more than 256 collating elements, and re-ran this test:
  $ export LOCPATH=`mktemp -d /tmp/test.XXXXXX`
  $ localedef.patched -i dz_BT -f UTF-8 $LOCPATH/dz_BT
  $ LC_ALL=dz_BT ./tst-wcscoll < input_file > out.wc-dz_BT
  $ LC_ALL=dz_BT ./tst-strcoll < input_file > out.mb-dz_BT
  $ cmp out.wc-dz_BT out.mb-dz_BT
  $
Looks good.

Note that tst-strcoll is much slower than tst-wcscoll.  This seems
quite logical: the primary key is the first UTF-8 byte, and that byte
(0xE0) is the same for every character in the range 0x0F00-0x0FCF.
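
The claim about the first UTF-8 byte can be checked without any locale
machinery.  The following is a sketch assuming bash; the helper names
utf8_lead_byte and distinct_lead_bytes are made up for illustration.
It derives the lead byte from the code point using the standard UTF-8
bit layout and lists the distinct values over the tested range:

```shell
# Print the first byte (in hex) of the UTF-8 encoding of code point $1.
utf8_lead_byte() {
  local cp=$(( $1 ))
  if   [ "$cp" -lt $((0x80)) ];    then printf '%02x\n' "$cp"
  elif [ "$cp" -lt $((0x800)) ];   then printf '%02x\n' $(( 0xC0 | (cp >> 6) ))
  elif [ "$cp" -lt $((0x10000)) ]; then printf '%02x\n' $(( 0xE0 | (cp >> 12) ))
  else                                  printf '%02x\n' $(( 0xF0 | (cp >> 18) ))
  fi
}

# Print each distinct lead byte found in the inclusive range [$1, $2].
distinct_lead_bytes() {
  local cp
  for (( cp = $1; cp <= $2; cp++ )); do
    utf8_lead_byte "$cp"
  done | sort -u
}

distinct_lead_bytes 0x0F00 0x0FCF   # prints a single line: e0
```

Only one value comes out, so a lookup keyed on the first byte cannot
tell any two of these characters apart, which is consistent with the
slowdown seen above.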


-- 


http://sources.redhat.com/bugzilla/show_bug.cgi?id=368


