The LC_COLLATE definition of the es_ES locale defined collating rules for each and every character, which made it very long and hard to grasp, even though most characters are supposed to behave the same as in the default "i18n" LC_COLLATE model. I rewrote it to use the new glibc reordering facilities, so that LC_COLLATE simply includes "i18n" and only redefines the collation of the one character that behaves differently, the ntilde.

This doesn't change behaviour; it is purely cosmetic. However, by removing the common parts and including them automatically, the es_ES locale will benefit from corrections and additions made to the collating definitions in other areas (e.g. if a fix is made in the "i18n" file for Cyrillic collating, es_ES will pick it up automatically).

The LC_COLLATE definition is now as short as:

LC_COLLATE
copy "iso14651_t1"

collating-symbol <ntilde>

reorder-after <n>
<ntilde>

reorder-after <U006E>
<U00F1> <ntilde>;<TIL>;<MIN>;IGNORE
reorder-after <U004E>
<U00D1> <ntilde>;<TIL>;<CAP>;IGNORE

reorder-end

END LC_COLLATE

while previously it was more than 2,050 lines long.
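To illustrate what the reordering rule quoted above is meant to achieve, here is a minimal Python sketch of a two-level sort key in which ntilde gets its own first-level weight immediately after n. The alphabet string, the weight values and the function name `sort_key` are assumptions made for illustration; this is a toy model, not glibc's collation tables or algorithm.

```python
# Toy model: the n-tilde has its own first-level weight right after "n".
# BASE and the derived weights are illustrative assumptions, not glibc data.
BASE = "abcdefghijklmnñopqrstuvwxyz"
PRIMARY = {ch: i for i, ch in enumerate(BASE)}


def sort_key(word):
    """Return a (letter level, case level) key for a word made of BASE letters."""
    primary = [PRIMARY[ch] for ch in word.lower()]
    # Case level: lowercase sorts before uppercase, mirroring <MIN> before <CAP>.
    case = [0 if ch.islower() else 1 for ch in word]
    return (primary, case)


words = ["oso", "ñu", "nube", "ñame", "nada"]
spanish_order = sorted(words, key=sort_key)    # every n* word, then ñ*, then o*
codepoint_order = sorted(words)                # plain code point order puts ñ after o
```

With this key, words starting with ntilde sort between the n words and the o words, whereas plain code point ordering pushes them past "oso".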
Created attachment 357 Complete es_ES version 4.5 locale file
Created attachment 358 Patch against current glibc es_ES locale
Pablo, I was in the process of submitting a large patch to include iso14651_t1 in several locales; I have already reviewed ca_ES, da_DK, en_CA, es_ES, es_US, et_EE, fi_FI and nb_NO. I hope we are not duplicating a lot of work.
Well, yes, it seems to be duplicate work... But when I queried for "es_ES" the result was "zarro boogs" :)

The other locale fixes I committed are actual fixes, not rewrites of LC_COLLATE. (I have such a rewrite for Turkish, and also a fix for vi_VN, whose LC_COLLATE in glibc is completely wrong, in fact it doesn't define anything at all, but I haven't committed either of them yet.)

If your new es_ES does the same thing as mine, i.e. it has:

LC_COLLATE
copy "iso14651_t1"

collating-symbol <ntilde>

reorder-after <n>
<ntilde>

reorder-after <U006E>
<U00F1> <ntilde>;<TIL>;<MIN>;IGNORE
reorder-after <U004E>
<U00D1> <ntilde>;<TIL>;<CAP>;IGNORE

reorder-end

END LC_COLLATE

feel free to close this bug as a duplicate.
I do not include es_ES in my bug report, so there are no duplicates. My understanding is that n-tilde is a base letter in Spanish, so its second-level weight should be <BAS> and not <TIL>, shouldn't it?
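To make the question concrete, the following Python sketch compares the two candidate second-level weights in a simplified three-level model. The numeric values for `BAS` and `TIL`, the alphabet and the helper `make_key` are hypothetical, and the model assumes every character carries a first-level weight, which is not true of real collation tables (characters ignored at the first level are not modelled). It is not a statement about how glibc itself behaves.

```python
# Simplified three-level model: (letters, accents, case).
# All weights below are hypothetical stand-ins for <BAS>, <TIL>, <MIN>, <CAP>.
BASE = "abcdefghijklmnñopqrstuvwxyz"
PRIMARY = {ch: i for i, ch in enumerate(BASE)}
BAS, TIL = 0, 1


def make_key(ntilde_secondary):
    """Build a sort key that gives n-tilde the chosen second-level weight."""
    def key(word):
        lowered = word.lower()
        primary = [PRIMARY[ch] for ch in lowered]
        secondary = [ntilde_secondary if ch == "ñ" else BAS for ch in lowered]
        tertiary = [0 if ch.islower() else 1 for ch in word]
        return (primary, secondary, tertiary)
    return key


WORDS = ["ñu", "nube", "Ñame", "ñame", "nada", "oso", "año", "ano"]
order_with_bas = sorted(WORDS, key=make_key(BAS))
order_with_til = sorted(WORDS, key=make_key(TIL))
```

In this toy model both choices produce the same order: two words that tie at the first level must have ntilde in the same positions, so the second-level weight of ntilde never decides a comparison. Whether that also holds in every case under glibc is an open point that this sketch does not settle.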
What about comment #5?
No reply in 6+ months. Closing.