This is the mail archive of the libc-locales@sources.redhat.com mailing list for the GNU libc locales project.


Re: Collating bug with period?


On Wed, Apr 06, 2005 at 10:50:43AM +0200, Danilo Segan wrote:
> Today at 10:06, Ole Laursen wrote:
> 
> > The Danish locale is apparently dictated by a standard. But why do you
> > think this is a locale bug? I would expect the locale to support
> > dictionary-like sorting, not filename sorting.
> 
> Of course.  So, what you want is da_DK@filenamesorting instead, where
> a dot would be a sortable character.  You can try playing with that
> yourself, by doing 'copy "da_DK"' inside the LC_COLLATE section, and then
> modifying the table to suit your needs.  Or perhaps we want a
> completely new locale based on iso14651_t1 suitable for filename
> sorting?  But, this again wouldn't work for all locales, so it's still
> doomed.
> 
> Of course, if there are any problems, they lie in the poor scalability
> of the POSIX locale system, not in any particular application making use
> of it.  Whatever you're thinking of is simply a special case of the more
> general problem, and it needs a more general solution (like having
> LC_FILENAME_COLLATE).  Though, I don't think this is necessary, since
> you either prefer the dictionary collation, or you don't.
> 
> What you need to think about is: where do I want dictionary collation
> and why, and where do I not want it and why?  I suspect that you'll
> end up wanting only one most of the time, though I'm of course only
> guessing.
> 
> >   Another common option is whether to treat punctuation (including
> >   spaces) as base characters or treat them as a level 4 difference.
> >
> > Doesn't this support the idea that an application may need a slightly
> > different sort order?
> 
> Yes, I thought of UCA only as a reference on where all of these issues
> are discussed.  *How* these customizations are done depends entirely on
> the system in use.
> 
> Unfortunately, POSIX (and by extension, GNU libc) doesn't support it
> that easily (it doesn't "scale" on that dimension): you need to have
> a separate locale for that, even though most of the data inside
> LC_COLLATE can be reused, and only weights on a few elements need to
> be reassigned.
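
The two punctuation treatments discussed in the quoted text can be sketched in a few lines of Python. This is only an illustration, not how glibc or the UCA implement collation: the helper names `dictionary_key` and `filename_key` are made up, and the "dictionary" variant simply drops punctuation at the first level and falls back to the raw string to break ties.

```python
import string

# Characters treated as punctuation for this sketch (an assumption,
# not the set used by any real locale).
PUNCT = set(string.punctuation) | {" "}

def dictionary_key(s):
    """Punctuation is ignored at the first level and only breaks ties."""
    letters = "".join(ch for ch in s.lower() if ch not in PUNCT)
    return (letters, s)

def filename_key(s):
    """Punctuation counts as an ordinary (base) character."""
    return s.lower()

names = ["a.txt", "ab.txt", "a-b.txt"]
dictionary_order = sorted(names, key=dictionary_key)
filename_order = sorted(names, key=filename_key)
print(dictionary_order)  # ['a-b.txt', 'ab.txt', 'a.txt']
print(filename_order)    # ['a-b.txt', 'a.txt', 'ab.txt']
```

The same three names come out in a different order depending on whether the period carries a first-level weight, which is the difference being argued about here.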

What do you mean by "not scaling"? Glibc has a "reorder-after" mechanism
that can build on an existing LC_COLLATE spec and then reorder just a
few characters, such as the PERIOD character. This functionality is also
included in ISO 14651 and TR 14652.
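
As a rough model of the idea (not the glibc locale source syntax, and not its implementation), the following Python sketch starts from an existing ordering and derives a new one by placing a few symbols after a chosen anchor. The names `reorder_after`, `make_key`, `BASE` and `FILENAME`, and the tiny base table itself, are all hypothetical.

```python
def reorder_after(order, anchor, moved):
    """Return a copy of `order` with the `moved` symbols placed right after `anchor`."""
    rest = [c for c in order if c not in moved]
    i = rest.index(anchor) + 1
    return rest[:i] + list(moved) + rest[i:]

def make_key(order):
    """Build a sort key from an ordering; symbols missing from it are ignored."""
    rank = {c: i for i, c in enumerate(order)}
    def key(s):
        return [rank[c] for c in s if c in rank]
    return key

# Hypothetical base table: the period has no weight, so it is ignored.
BASE = list(" -_0123456789abcdefghijklmnopqrstuvwxyz")

# Derived table: reuse everything, only give the period a position after SPACE.
FILENAME = reorder_after(BASE, " ", ["."])

names = ["ab.c", "a.bc", "abd"]
base_order = sorted(names, key=make_key(BASE))
filename_order = sorted(names, key=make_key(FILENAME))
print(base_order)      # ['ab.c', 'a.bc', 'abd']
print(filename_order)  # ['a.bc', 'ab.c', 'abd']
```

Only the one reassigned symbol differs between the two tables; the rest of the data is shared, which is the point of building on an existing specification.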

Best regards
keld