Talk about glibc locale format

Keld Jørn Simonsen keld@dkuug.dk
Tue Jul 5 00:02:00 GMT 2005


On Tue, Jul 05, 2005 at 12:06:16AM +0200, Denis Barbier wrote:
> On Mon, Jul 04, 2005 at 11:25:34PM +0200, Keld Jørn Simonsen wrote:
> > On Mon, Jul 04, 2005 at 01:15:09AM +0200, Denis Barbier wrote:
> > > Hi,
> > > 
> > > I will give a talk about glibc locale data format during Debian
> > > conference
> > >   http://www.debconf.org/debconf5/
> > > next week (10-17th July) at Helsinki.
> > > The aim of this talk is to give clues about data format so that
> > > more people are interested in contributing to locale files.
> > > Slides are available at
> > >   http://people.debian.org/~barbier/talks/debconf5/glibc-locale.pdf
> > > I am still polishing them, and will be glad to receive comments.
> > 
> > A few comments:
> 
> > on page 11, you say that 14652 is not always backwards compatible with
> > POSIX. Could you give examples? We did try hard to be backwards
> > compatible.
> 
> This is emphasized when describing LC_TIME around slide 26; day and
> abday are not backward compatible, which is really annoying.
> You will answer that LC_IDENTIFICATION is meant to disambiguate
> those situations, but I disagree: application developers (like cal
> writers) should not have to worry about this field.

As far as I can tell, LC_TIME *is* backwards compatible, in the sense
that conforming POSIX specs will be valid and work as intended with
14652 semantics.

But if there is added functionality, then you need added support.
This is also true for the extensions, e.g. in LC_COLLATE, which cannot be
handled with POSIX. In general one cannot expect that a 14652 locale
can be handled by POSIX. However, the other way around should work
without any problems: a valid POSIX locale should work unchanged, and as
intended in POSIX, with the 14652 semantics. That is my definition of
backwards compatibility, and what we tried hard to ensure in 14652.

Maybe, with some different definition of "backwards compatibility",
you can say that this does not hold. But as I understand such
definitions, they do not make sense. Maybe you can call it
"forward compatibility" to ask for conforming POSIX implementations
to be able to deal with 14652 locales, but we decided that this kind of
compatibility would not be guaranteed, and that is quite normal for
backwards compatible standards. If you introduce new functionality, and
associated new syntax, you cannot expect old programs to cope with
issues they were never designed for. The policy was then that they
had better barf over stuff they cannot handle than silently process it
wrongly.
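
To make the "barf rather than silently misprocess" policy concrete, here
is a minimal sketch of a strict reader that accepts only the LC_TIME
keywords defined by POSIX and rejects anything else, such as the 14652
"week" keyword. The function name, the one-keyword-per-line input and
the sample operand values are assumptions for illustration; this is not
how localedef or any real implementation is written, and comments and
line continuations are not handled.

```python
# Hypothetical strict reader for the body of an LC_TIME section.
# Only the keyword set comes from POSIX; everything else is illustrative.
POSIX_LC_TIME_KEYWORDS = {
    "abday", "day", "abmon", "mon", "d_t_fmt", "d_fmt", "t_fmt",
    "am_pm", "t_fmt_ampm", "era", "era_d_fmt", "era_t_fmt",
    "era_d_t_fmt", "alt_digits",
}


def read_lc_time_strict(lines):
    """Return {keyword: operand text}; raise ValueError on unknown keywords."""
    result = {}
    for line in lines:
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        keyword = parts[0]
        operands = parts[1].strip() if len(parts) > 1 else ""
        if keyword not in POSIX_LC_TIME_KEYWORDS:
            # Refuse loudly instead of ignoring syntax we do not understand.
            raise ValueError(f"unsupported LC_TIME keyword: {keyword}")
        result[keyword] = operands
    return result


posix_only = ['d_fmt "%m/%d/%y"', 't_fmt "%H:%M:%S"']
print(read_lc_time_strict(posix_only))

try:
    # The operand values for "week" are made up for this example.
    read_lc_time_strict(posix_only + ["week 7;19971130;7"])
except ValueError as err:
    print("rejected:", err)
```

A POSIX-only specification passes through unchanged, which is the
direction of compatibility described above; the extended one is refused
instead of being half-understood.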

Maybe that introduced some problems that need not have surfaced,
and maybe it could have been done in a better way. But I do believe we
are backwards compatible per the definition above.

> This "issue" is also mentioned in appendix D of this TR.

Yes, but annex D was just something where everybody had a free
microphone. The text there does not necessarily represent the truth.

> > Page 26: 14652 is designed to be backwards compatible with POSIX.
> > That is, if you just have a LC_TIME spec conformant to POSIX, 
> > it will work the same way in 14652. But you can with 14652 set the 
> > first day of the week (In the USA this is Sunday, in most of Europe
> > this is Monday) and you can set different week numbering behaviours in
> > 14652, which is not possible in POSIX.
> > 
> > Also the issue on the months, 14652 does allow you to specify that you
> > have 13 months in a year.
> 
> Indeed, mon and abmon accept 13 month names, but how is the mapping
> performed since the "week" keyword has no "month" counterpart?

Hmm, that is something that could be added. Anyway there are some Muslim
calendars that have 13 months. It should be investigated how to support
them. What is there now is probably not enough.

> > I think it is wrong to say that 14652 is not backwards compatible here.
> > Whether it is controversial could be true, many misunderstandings can
> > lead to heated discussions:-)
> > 
> > Page 32: Why is .* useless?
> 
> > People may write "no" or "yes", and the answer will still be recognized.
> > I know, you and I would never do that, but some newbies could.
> 
> Because in a regular expression,
>   ^[nN].*
> matches exactly the same strings as ^[nN]

Yes, I acknowledge that.
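
A quick way to check this is to compare the accept/reject decision of
the two patterns on a few answers. The sketch below uses Python's re
module rather than the POSIX regex functions that a libc would use, and
the sample strings are made up; for a pattern this simple the two
dialects should agree.

```python
import re

with_tail = re.compile(r"^[nN].*")
without_tail = re.compile(r"^[nN]")


def same_decision(text):
    """True if both patterns agree on whether text is a 'no' answer."""
    return bool(with_tail.search(text)) == bool(without_tail.search(text))


samples = ["n", "N", "no", "No", "nope", "yes", "", " no"]
for text in samples:
    print(repr(text), bool(without_tail.search(text)), same_decision(text))
```

The matched text differs (the first pattern also consumes the rest of
the line), but only the yes/no decision matters here, so "no" is still
recognized without the trailing .*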

> > Otherwise it is good to see such talks! If you get comments on it,
> > and probably on missing features, I would like you to post such
> > feedback to the list.
> 
> Thanks for your support, do not hesitate to harass me if I forget to
> send feedback ;)


