Collation and sorting for French locales
Denis Barbier
barbier@linuxfr.org
Wed Jan 5 07:15:00 GMT 2005
I was quite surprised to find that sorting for French locales does
not work as described in ISO 14651 and implemented in iso14651_t1.
First, let's define an xx_XX locale containing:
LC_COLLATE
collating-symbol <a>
collating-symbol <b>
<a>
<b>
script <S1>
order_start <S1>;forward;backward;position
<U001A> IGNORE;IGNORE;<U001A>
<U0061> <a>;<a>;IGNORE
<U0062> <a>;<b>;IGNORE
order_end
END LC_COLLATE
$ cat test.xx_XX
aa
ab
ba
bb
$ LC_ALL=xx_XX sort test.xx_XX
aa
ba
ab
bb
Good, that works as expected, the backward directive does its job.
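To make the expected order concrete, here is a minimal Python sketch of how I understand the forward/backward directives. It is only a toy model under my own assumptions, not the glibc implementation: WEIGHTS and sort_key are names invented for illustration, and the position level is left out because it is IGNORE for both letters.

```python
# Weights copied from the xx_XX definition: (first level, second level).
# Both letters share <a> at the first level and differ only at the second.
WEIGHTS = {"a": ("a", "a"), "b": ("a", "b")}


def sort_key(word, directions=("forward", "backward")):
    """Build one tuple of weights per level; a backward level is read from the end."""
    key = []
    for level, direction in enumerate(directions):
        level_weights = [WEIGHTS[ch][level] for ch in word]
        if direction == "backward":
            level_weights.reverse()
        key.append(tuple(level_weights))
    return tuple(key)


words = ["aa", "ab", "ba", "bb"]

# forward;backward, as in the order_start line of the locale
print(sorted(words, key=sort_key))
# -> ['aa', 'ba', 'ab', 'bb']

# forward;forward, for comparison
print(sorted(words, key=lambda w: sort_key(w, ("forward", "forward"))))
# -> ['aa', 'ab', 'ba', 'bb']
```

All four words tie at the first level, so the second level decides. Read backward, the last letter is compared first, which is why ba sorts before ab.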
But now, let's add another script definition:
LC_COLLATE
collating-symbol <a>
collating-symbol <b>
collating-symbol <c>
collating-symbol <d>
<a>
<b>
<c>
<d>
script <S1>
script <S2>
order_start <S1>;forward;backward;position
<U001A> IGNORE;IGNORE;<U001A>
<U0061> <a>;<a>;IGNORE
<U0062> <a>;<b>;IGNORE
order_start <S2>;forward;forward;position
<U0063> <c>;<c>;IGNORE
<U0064> <c>;<d>;IGNORE
order_end
END LC_COLLATE
$ LC_ALL=xx_XX sort test.xx_XX
aa
ab
ba
bb
Now it no longer works: the output is in plain forward order. If I define an <S3> script and add
order_start <S3>;forward;backward;position
after <S2>, it works again. Thus it seems that the directives of the last
order_start rule win for every script. I tried to add
reorder-sections-after <S2>
<S1>
reorder-sections-end
at the end of the LC_COLLATE definition, but this does not change anything.
So how can we make sorting work as expected for French locales?
Denis