Bug 388 - localedata/locales/pl_PL has incorrect LC_COLLATE <space> handling
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: localedata
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Petter Reinholdtsen
Reported: 2004-09-17 12:10 UTC by Roman Barczynski
Modified: 2006-05-01 17:26 UTC

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Last reconfirmed:


Attachments
this patch fixes that issue (181 bytes, patch)
2004-09-17 12:12 UTC, Roman Barczynski
Description Roman Barczynski 2004-09-17 12:10:28 UTC
Line 54 of localedata/locales/pl_PL describes the ordering algorithm as it
should be:
%  1. Spaces and hyphen (but not soft
%     hyphen) before punctuation
%     characters, punctuation characters
... and so on

but line 290 reads:
"<U0020> IGNORE;IGNORE;IGNORE;<U0020>"
so <space> (U0020) is mapped to IGNORE at the first three levels and is
effectively ignored in the sort order, which it should NOT be!
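
One way to check how an installed locale treats <space> is to compare two
strings whose order flips depending on whether the space counts. The sketch
below is only an illustration: the helper name space_is_significant, the
locale name "pl_PL.UTF-8" and the test strings "a z" / "ab" are my own
assumptions, not taken from the locale file or the attached patch. It also
assumes that a significant space sorts before letters.

```python
import locale


def space_is_significant(locale_name="pl_PL.UTF-8"):
    """Return True if <space> affects collation, False if it looks ignored,
    or None if the locale is not installed (hypothetical helper)."""
    saved = locale.setlocale(locale.LC_COLLATE)  # remember current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None  # locale not available on this system
    try:
        # If the space counts (and sorts before letters), "a z" < "ab".
        # If the space is ignored, the comparison is "az" vs "ab", so "a z" > "ab".
        return locale.strcoll("a z", "ab") < 0
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore previous setting


if __name__ == "__main__":
    print(space_is_significant())
```

The result depends on the locale data installed on the machine, so no
particular output is claimed here.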
Comment 1 Roman Barczynski 2004-09-17 12:12:50 UTC
Created attachment 195
this patch fixes that issue
Comment 2 Denis Barbier 2005-01-29 14:20:42 UTC
This comment was copied from a template file, so it is not reliable.
If you look into a Polish dictionary, is the space character ignored
or taken into account when sorting entries?
Comment 3 Ulrich Drepper 2005-10-14 23:03:46 UTC
Reporter, reply, please, otherwise I'll close the bug.
Comment 4 Roman Barczynski 2005-10-14 23:54:51 UTC
The Polish collation rules defined in PN--80/N--01223 (a document
proposed by the Polish Ministry of Culture and Arts in 1980)
state:

"In sort order you should consider the alphabetic order of words
 (and so of characters inside words). Characters between words,
 such as punctuation characters, spaces and hyphens, should be
 considered as well."

Also, you should know that the Polish TeX community (as we
all know, TeX users are obsessive about compliance with norms
and regulations) gives this example:
#v+
correct               incorrect
- - - - -             - - - - - - -
katalog informatyczny katalogi
katalog przedmiotowy  katalog informatyczny
katalogi              katalog przedmiotowy
#v-
( http://www.ia.pw.edu.pl/~wujek/tex/idx/porzadek.html
  - unfortunately, the page is in Polish )
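
The two columns above can be reproduced with a small sketch. This is not
glibc's multi-level collation algorithm, only a simplified model: the
"significant" key compares raw code points (where U+0020 is lower than any
letter), and the "ignored" key strips spaces first to mimic an IGNORE weight.
The function and variable names are hypothetical.

```python
entries = ["katalogi", "katalog przedmiotowy", "katalog informatyczny"]


def key_space_significant(entry):
    # Raw code-point comparison: U+0020 is lower than every letter,
    # so "katalog ..." sorts before "katalogi".
    return entry


def key_space_ignored(entry):
    # Drop spaces before comparing, mimicking IGNORE for <U0020>.
    return entry.replace(" ", "")


with_space = sorted(entries, key=key_space_significant)
without_space = sorted(entries, key=key_space_ignored)

print(with_space)     # matches the "correct" column
print(without_space)  # matches the "incorrect" column
```

With the space ignored, "katalogi" is a prefix of "kataloginformatyczny" and
therefore moves to the top, which is exactly the ordering the example calls
incorrect.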
Comment 5 Ulrich Drepper 2006-05-01 17:26:14 UTC
I made the change.

Next time you reply, change the state back from WAITING.  Otherwise the bug
might not show up on lists.