664 – rewrite of LC_COLLATE of Spanish (es_ES) locale to make it simpler


Status:	RESOLVED WONTFIX

Alias:	None

Product:	glibc
Classification:	Unclassified
Component:	localedata
Version:	2.3.4

Importance:	P2 enhancement
Target Milestone:	---
Assignee:	Petter Reinholdtsen

URL:
Keywords:

Depends on:
Blocks:

Reported:	2005-01-14 15:08 UTC by Pablo Saratxaga
Modified:	2019-04-10 09:20 UTC
CC List:	2 users

See Also:
Host:
Target:
Build:
Last reconfirmed:

Flags:	fweimer: security-

Attachments
Complete es_ES version 4.5 locale file (1.49 KB, text/plain) 2005-01-14 15:09 UTC, Pablo Saratxaga
Patch against current glibc es_ES locale (12.40 KB, patch) 2005-01-14 15:10 UTC, Pablo Saratxaga


Description Pablo Saratxaga 2005-01-14 15:08:34 UTC

The LC_COLLATE definition of the es_ES locale defines collating rules
for each and every character, which makes it very long and hard to grasp,
even though most characters are supposed to behave the same as in the
default "i18n" LC_COLLATE model.

I rewrote it to use the new glibc reordering facilities
so that the LC_COLLATE section simply includes "i18n" and
redefines the collation of the one character that behaves
differently, the ntilde.

It doesn't change behaviour; it's purely cosmetic.
However, by removing the common parts and including them
automatically, the es_ES locale will benefit from
corrections and additions made to the collating definitions
in other areas (e.g., if a fix is made in the "i18n" file
for Cyrillic collating, es_ES will pick it up
automatically).

The LC_COLLATE definition is now as short as:

LC_COLLATE
copy "iso14651_t1"
collating-symbol  <ntilde>
reorder-after <n>
<ntilde>
reorder-after <U006E>
<U00F1> <ntilde>;<TIL>;<MIN>;IGNORE
reorder-after <U004E>
<U00D1> <ntilde>;<TIL>;<CAP>;IGNORE
reorder-end
END LC_COLLATE

while previously it was more than 2,050 lines long.
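
To make the effect of the short definition concrete, here is a minimal Python sketch of multi-level collation keys. It is a toy model, not glibc's implementation: the integer weights, the `PRIMARY` table and the `sort_key` helper are all hypothetical names invented for illustration. It only mirrors the idea that `<ntilde>` gets its own first-level weight right after `<n>`, with `<TIL>` at the second level and `<MIN>`/`<CAP>` at the third.

```python
# Toy model of multi-level collation keys; NOT glibc's implementation.
# All weights below are hypothetical integers chosen for illustration.
import string

# Level 1: a..z get even weights, which leaves a free slot after each letter.
PRIMARY = {ch: 2 * (i + 1) for i, ch in enumerate(string.ascii_lowercase)}
# Mirrors "reorder-after <n>": <ntilde> sorts immediately after n.
PRIMARY["ñ"] = PRIMARY["n"] + 1

BAS, TIL = 0, 1  # level 2: plain base letter vs. tilde
MIN, CAP = 0, 1  # level 3: lowercase vs. uppercase


def sort_key(word):
    """Return the (level 1, level 2, level 3) weight tuples for word."""
    level1, level2, level3 = [], [], []
    for ch in word:
        lower = ch.lower()
        level1.append(PRIMARY[lower])
        level2.append(TIL if lower == "ñ" else BAS)
        level3.append(CAP if ch != lower else MIN)
    # Tuples compare level by level: level 2 only matters on a level-1 tie.
    return (tuple(level1), tuple(level2), tuple(level3))


words = ["oso", "ñu", "nube", "Ñu", "nota", "ñame"]
spanish_order = sorted(words, key=sort_key)
codepoint_order = sorted(words)

print(spanish_order)
print(codepoint_order)
```

In this model every word starting with ñ lands between the n words and the o words, whereas a plain code point sort pushes Ñ and ñ past "oso".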

Comment 1 Pablo Saratxaga 2005-01-14 15:09:53 UTC

Created attachment 357
Complete es_ES version 4.5 locale file

Comment 2 Pablo Saratxaga 2005-01-14 15:10:12 UTC

Created attachment 358
Patch against current glibc es_ES locale

Comment 3 Denis Barbier 2005-01-14 19:40:52 UTC

Pablo, I was in the process of submitting a large patch to include iso14651_t1
in several locales, I already reviewed ca_ES, da_DK, en_CA, es_ES, es_US,
et_EE, fi_FI and nb_NO.  I hope we are not duplicating a lot of work.

Comment 4 Pablo Saratxaga 2005-01-14 21:39:26 UTC

Well, yes, it seems to be duplicate work...
But when I queried for "es_ES" there were "zaroo bugs" :)

The other locale fixes I committed are actual fixes and not
a better rewrite of LC_COLLATE. (I have such a rewrite for
Turkish but haven't committed it yet; I also have a fix for vi_VN,
where the LC_COLLATE in glibc is completely wrong (in fact it doesn't
define anything at all), but haven't committed that yet either.)

If your new es_ES does the same thing as mine, i.e. it has:

LC_COLLATE
copy "iso14651_t1"
collating-symbol  <ntilde>
reorder-after <n>
<ntilde>
reorder-after <U006E>
<U00F1> <ntilde>;<TIL>;<MIN>;IGNORE
reorder-after <U004E>
<U00D1> <ntilde>;<TIL>;<CAP>;IGNORE
reorder-end
END LC_COLLATE

feel free to close this bug as a duplicate.

Comment 5 Denis Barbier 2005-01-16 08:29:37 UTC

I did not include es_ES in my bug report, so there is no duplication.
My understanding is that n-tilde is a base letter in Spanish, so its
second-level weight should be <BAS> and not <TIL>, shouldn't it?
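
The question above can be explored with a small Python sketch. It is a toy model under stated assumptions, not glibc's algorithm: the integer weights and the `make_key` helper are hypothetical, and the model ignores combining characters entirely. Within those limits, it checks whether giving n-tilde a second-level weight of `<BAS>` instead of `<TIL>` changes the order of a few sample words.

```python
# Toy comparison of <TIL> vs. <BAS> as the level-2 weight of n-tilde.
# Hypothetical weights; combining characters are not modelled at all.
import string

PRIMARY = {ch: 2 * (i + 1) for i, ch in enumerate(string.ascii_lowercase)}
PRIMARY["ñ"] = PRIMARY["n"] + 1  # distinct level-1 weight, right after n

BAS, TIL = 0, 1  # hypothetical level-2 weights


def make_key(ntilde_level2):
    """Build a sort key that uses the given level-2 weight for n-tilde."""
    def key(word):
        lowered = word.lower()
        level1 = tuple(PRIMARY[ch] for ch in lowered)
        level2 = tuple(ntilde_level2 if ch == "ñ" else BAS for ch in lowered)
        level3 = tuple(int(ch.isupper()) for ch in word)
        return (level1, level2, level3)
    return key


words = ["pino", "Piña", "pina", "piña", "pio", "Pina"]
with_til = sorted(words, key=make_key(TIL))
with_bas = sorted(words, key=make_key(BAS))

print(with_til)
print(with_bas)
```

Both runs give the same order here, because n-tilde already has a first-level weight of its own: two words that tie at level 1 have n-tilde in the same positions, so its level-2 weight never breaks a tie. Whether the choice matters in glibc once combining marks are involved is not something this sketch can answer.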

Comment 6 Ulrich Drepper 2005-09-24 18:47:19 UTC

What about comment #5?

Comment 7 Ulrich Drepper 2006-04-25 22:55:50 UTC

No reply in 6+ months.  Closing.