This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Use Unicode code points for country_isbn
- From: Keld Simonsen <keld at keldix dot com>
- To: Joseph Myers <joseph at codesourcery dot com>
- Cc: Marko Myllynen <myllynen at redhat dot com>, GNU C Library <libc-alpha at sourceware dot org>, libc-locales at sourceware dot org
- Date: Wed, 22 Jul 2015 21:02:28 +0200
- Subject: Re: [PATCH] Use Unicode code points for country_isbn
- References: <5571B8C2 dot 8000108 at redhat dot com> <20150609071130 dot GA26925 at domone> <5576BC13 dot 5020001 at redhat dot com> <20150721081840 dot GE12267 at vapier> <20150721084006 dot GB29742 at www5 dot open-std dot org> <20150721092217 dot GG12267 at vapier> <20150721115852 dot GA24115 at rap dot rap dot dk> <alpine dot DEB dot 2 dot 10 dot 1507221719420 dot 21570 at digraph dot polyomino dot org dot uk>
On Wed, Jul 22, 2015 at 05:25:04PM +0000, Joseph Myers wrote:
> On Tue, 21 Jul 2015, Keld Simonsen wrote:
>
> > It would mean that you cannot use the locale sources for cross-compiling
> > when the host and target machines use different character sets, e.g. if
> > you are building embedded systems on iOS or Windows or other UTF-16
> > machines for a UTF-8 target, or making stuff for Android. Or the other
> > way round, if you are on a UTF-8 host and generate locales for a UTF-16
> > target such as a UTF-16 embedded system or an iPhone or iPad system.
>
> On the build system on which glibc is built, we can always assume that the
> glibc sources are the exact sequences of octets provided by the glibc
> project, not converted into another character set and without any
> conversions of line endings. Furthermore, on any system using glibc and
> executing tools such as localedef with the installed locale source files,
> it can be assumed that those source files are the files shipped with
> glibc, not those files after conversion into another character set. Use
> of glibc source files after conversion into another character set is
> outside the scope of the glibc project - glibc is not expected to build
> with such converted source files.
Sounds strange. glibc is the GNU C library. ISO C is independent of the
coded character set, as is POSIX. Why would the glibc project not follow the
ISO C and POSIX design goals? Why would glibc exclude itself from Apple and
Microsoft (UTF-16) systems and from non-UTF-8 Linux and UNIX systems?
Maybe we should clone glibc to make it available on platforms other than
those using UTF-8. Or maybe you are not correct. I have not been watching
the glibc project closely enough to tell.
> Now, it's true that the installed localedef utility should be usable in
> locale A to generate locale B, for any pair (A, B) of installed locales -
> rather than only being able to generate locales as part of the glibc build
> / install process. If localedef interprets locale sources in the
> character set of the locale in which it runs, that may mean the installed
> locale sources do need to be in ASCII. How does localedef determine the
> character set in which to interpret the textual locale source files?
Yes, that is why we use UCS symbolic code points. To be fully consistent, I
would rather use UCS symbolic code points all the way through a locale source.
It is a bit more cumbersome, but it would also facilitate the cross-compiling
I wrote about. I don't think there is a mix of locales on Linux boxes where it
matters. Oh well, some conceivable scenarios:
Apple or Windows users on a Linux box, or Linux users on Apple or Windows boxes.
A mix with EBCDIC is less likely, but still conceivable in a big mainframe
and number-cruncher environment, with an IBM mainframe running VM/CMS and a
Linux supercomputer as the number cruncher, e.g. in a financial institution.
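
To illustrate the point about UCS symbolic code points, here is a minimal
Python sketch. It assumes the <UXXXX> notation seen in glibc locale sources
(four hex digits, or eight for code points above U+FFFF). The helper names
to_ucs_symbolic and from_ucs_symbolic are made up for this example and are
not part of glibc or localedef.

```python
import re


def to_ucs_symbolic(text: str) -> str:
    """Write every character as a UCS symbolic code point such as <U0041>.

    Assumption: four hex digits for the BMP, eight digits above U+FFFF.
    """
    parts = []
    for ch in text:
        cp = ord(ch)
        parts.append(f"<U{cp:04X}>" if cp <= 0xFFFF else f"<U{cp:08X}>")
    return "".join(parts)


def from_ucs_symbolic(symbolic: str) -> str:
    """Turn <UXXXX> / <UXXXXXXXX> tokens back into the characters they name."""
    pattern = r"<U([0-9A-Fa-f]{8}|[0-9A-Fa-f]{4})>"
    return re.sub(pattern, lambda m: chr(int(m.group(1), 16)), symbolic)


if __name__ == "__main__":
    print(to_ucs_symbolic("ISBN"))  # <U0049><U0053><U0042><U004E>
```

The symbolic form names each character by its code point instead of relying
on how the file itself happens to be encoded, which is the consistency
argument made above. It does not by itself settle how localedef decides the
character set of the file.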
Keld