This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PATCH] Fix decimal_point and thousands_sep in es_MX locale


On Thu, Jun 07, 2012 at 03:45:46AM +0200, Petr Baudis wrote:
>   Hi!
> 
> On Wed, Jun 06, 2012 at 11:53:11PM +0200, Keld Simonsen wrote:
> > On Wed, Jun 06, 2012 at 01:14:25PM -0400, Carlos O'Donell wrote:
> > > (b) A new BZ filed to fix all the locales using <U0020> to use
> > >     thin space <U2009>. Set the target milestone to 2.17 please.
> ..snip..
> > Also, the use of Unicode-only characters breaks the universality of locales,
> > as they cannot be used with 8-bit character sets.
> > And it may break programs that try to parse numbers.
> 
>   Can you please elaborate on this point? Surely, if a locale is using
> UTF-8 charset, it should be permitted to include UTF-8 characters? Is
> there a point in making a difference between e.g. LC_TIME and LC_CTYPE?

Yes, of course it should be possible to use UTF-8 characters.

But we may then need to have more versions of each locale, e.g. one with
UTF-8 characters, and one with a more restricted character set.

>   (In case a locale is generated with more restrictive charset, e.g.
> ISO-8859-1, the thin space is automatically transliterated to normal
> space.)

Is it really transliterated to a normal space? What wonders
our locales can do :-)
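
To make the behaviour described above concrete, here is a minimal sketch
in Python, not glibc's actual machinery. The FALLBACKS table and the
encode_with_fallback helper are hypothetical names made up for
illustration; glibc keeps its real transliteration data in its locale
sources.

```python
# Hypothetical fallback table: THIN SPACE (U+2009) -> ordinary SPACE.
# This is an illustration only, not glibc's transliteration data.
FALLBACKS = {"\u2009": " "}


def encode_with_fallback(text, charset):
    """Encode text, replacing characters the charset cannot represent."""
    out = []
    for ch in text:
        try:
            ch.encode(charset)          # representable as-is?
            out.append(ch)
        except UnicodeEncodeError:
            out.append(FALLBACKS.get(ch, "?"))
    return "".join(out).encode(charset)


grouped = "1\u2009234\u2009567"
print(encode_with_fallback(grouped, "utf-8"))       # thin space survives
print(encode_with_fallback(grouped, "iso-8859-1"))  # b'1 234 567'
```

With a UTF-8 charset the thin space is kept; with ISO-8859-1 it
degrades to a normal space.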

A problem with the normal space is that it is difficult to parse:
a normal space is normally used as a field separator.
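
A small Python sketch of the ambiguity, assuming a naive parser that
splits fields on the ASCII space. The helper names split_fields and
parse_grouped are made up for this example.

```python
def split_fields(line):
    """Split on the ASCII space only, as a naive field parser would."""
    return [field for field in line.split(" ") if field]


def parse_grouped(field, sep):
    """Parse an integer whose digit groups are separated by sep."""
    return int(field.replace(sep, ""))


spaced = "1 234 5 678"   # meant as two numbers: 1234 and 5678
dotted = "1.234 5.678"   # same two numbers, period as separator

print(split_fields(spaced))                                   # ['1', '234', '5', '678']
print([parse_grouped(f, ".") for f in split_fields(dotted)])  # [1234, 5678]
```

With the space as thousands separator, the parser cannot tell where one
number ends and the next begins; with the period it can.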

There is also the question of what our locales are really aimed at.
E.g. dates: what we have is partly meant for listing files in long
format (ls -l), and this indicates that the format needs to be
constant width.

The same goes for number formats: in a number of countries there seem
to be several schools of thought on whether to use a space or a
period/comma as the thousands separator.
E.g. Norway: the linguists say space, but banks always use a period.
Spreadsheets use a period too, and likewise all financial software.
For programs where you need to process output numbers again, the
numbers had better be parsable. There is then a point in having the locales be
more computer-oriented than the linguists want them to be. This could lead
to having more specification possibilities in the locales.

BTW, why thin space? My understanding is that numbers often need to
be output in aligned columns, and thus all digits need to have the same width.
The space should then also have the same width as a digit.
(The same goes for the comma as decimal separator.)
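
As a rough illustration of the alignment concern, here is a Python
sketch that right-aligns grouped numbers by character count. The names
group_digits and right_align are hypothetical. The columns only line up
on screen if the separator is rendered as wide as a digit, which holds
in a fixed-width terminal but not necessarily for a thin space in a
proportional font.

```python
def group_digits(n, sep):
    """Format n with sep between groups of three digits."""
    return f"{n:,}".replace(",", sep)


def right_align(values, sep):
    """Right-align grouped numbers, padding to the widest one."""
    cells = [group_digits(v, sep) for v in values]
    width = max(len(c) for c in cells)
    return [c.rjust(width) for c in cells]


for line in right_align([1234567, 89012, 345], " "):
    print(line)
```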

Best regards
keld

