This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH] Use Unicode code points for country_isbn

From: Mike Frysinger <vapier at gentoo dot org>
To: keld at keldix dot com
Cc: Marko Myllynen <myllynen at redhat dot com>, GNU C Library <libc-alpha at sourceware dot org>, libc-locales at sourceware dot org
Date: Tue, 21 Jul 2015 05:22:17 -0400
Subject: Re: [PATCH] Use Unicode code points for country_isbn
Authentication-results: sourceware.org; auth=none
References: <5571B8C2 dot 8000108 at redhat dot com> <20150609071130 dot GA26925 at domone> <5576BC13 dot 5020001 at redhat dot com> <20150721081840 dot GE12267 at vapier> <20150721084006 dot GB29742 at www5 dot open-std dot org>

On 21 Jul 2015 10:40, keld@keldix.com wrote:
> On Tue, Jul 21, 2015 at 04:18:40AM -0400, Mike Frysinger wrote:
> > On 09 Jun 2015 13:12, Marko Myllynen wrote:
> > > > On 2015-06-09 10:11, Ondřej Bílka wrote:
> > > > On Fri, Jun 05, 2015 at 05:57:06PM +0300, Marko Myllynen wrote:
> > > >> make country_isbn definitions consistent across locales by using
> > > >> Unicode code points not numerals everywhere. The code in
> > > >> locale/categories.def and locale/programs/ld-address.c already
> > > >> handles strings.
> > > >>
> > > >> Please apply.
> > > >
> > > > Possible but why, when these are numbers which are easier to read than
> > > > strings?
> > > 
> > > that's true, and I don't feel too strongly about this, but currently
> > > some locales are using numbers and some are using Unicode code points so
> > > there's a bit of inconsistency, also it's not that hard to read these
> > > once one sees that e.g. 12 becomes "<U0031><U0032>" i.e. only the last
> > > digit matters.
> > 
> > i find many of the U markers pointlessly obscure, especially when they're used
> > for characters that are in the ASCII standard.  if we're standardizing on UTF8
> > encodings in general, why can't we convert these files as well ?  keep in mind
> > that i'm ignorant of the tooling around these files ;).
> 
> The use of Unicode code points helps make the locales portable, e.g.
> when cross-compiling for different architectures, including embedded systems, EBCDIC
> systems, UTF-16 systems and UTF-8 systems, when you are on a different host platform.

i'm referring to the tools we use -- either inside of the source repo
(i.e. ones we wrote/maintain), or external ones that operate on our
files directly (i.e. gcc).  what actual problems do you see here ?
vague references like "cross-compiling is magic" aren't really that
interesting.

keep in mind we already use (and agreed to standardize on) UTF8 in
things like *.c and *.h and ChangeLog and READMEs and info pages.
-mike
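
For readers unfamiliar with the notation under discussion, the mapping
Marko describes above (12 becomes "<U0031><U0032>") can be sketched as
below. This is only an illustration, not part of the glibc tooling: the
helper names to_ucs_notation and from_ucs_notation are hypothetical, and
the choice of four hex digits for BMP characters and eight for anything
above U+FFFF is an assumption about the symbolic-name form.

```python
import re


def to_ucs_notation(text: str) -> str:
    """Write each character as a <UXXXX> symbolic name (hypothetical helper).

    Assumption: 4 hex digits for code points up to U+FFFF, 8 otherwise.
    """
    parts = []
    for ch in text:
        cp = ord(ch)
        width = 4 if cp <= 0xFFFF else 8
        parts.append(f"<U{cp:0{width}X}>")
    return "".join(parts)


def from_ucs_notation(encoded: str) -> str:
    """Turn <UXXXX> symbolic names back into plain characters."""
    return re.sub(
        r"<U([0-9A-Fa-f]{4,8})>",
        lambda m: chr(int(m.group(1), 16)),
        encoded,
    )


if __name__ == "__main__":
    print(to_ucs_notation("12"))                 # <U0031><U0032>
    print(from_ucs_notation("<U0031><U0032>"))   # 12
```

As the quoted text notes, for ASCII digits only the last hex digit of
each symbol differs from the plain numeral, which is why the two forms
carry the same information.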

Follow-Ups:
- Re: [PATCH] Use Unicode code points for country_isbn
  - From: Keld Simonsen

References:
- Re: [PATCH] Use Unicode code points for country_isbn
  - From: Mike Frysinger
- Re: [PATCH] Use Unicode code points for country_isbn
  - From: keld
