[PATCH] Use Unicode code points for country_isbn
Keld Simonsen
keld@keldix.com
Wed Jul 22 19:34:00 GMT 2015
On Wed, Jul 22, 2015 at 05:25:04PM +0000, Joseph Myers wrote:
> On Tue, 21 Jul 2015, Keld Simonsen wrote:
>
> > It would mean that you cannot use the locale sources for cross-compiling
> > when the host and the target machines use different character sets,
> > e.g. if you are making embedded systems on iOS or Windows or
> > other UTF-16 machines for a UTF-8 target, or making stuff for Android. Or
> > the other way round, if you are on a UTF-8 host and generate locales for
> > a UTF-16 target such as a UTF-16 embedded system or an iPhone or iPad
> > system.
>
> On the build system on which glibc is built, we can always assume that the
> glibc sources are the exact sequences of octets provided by the glibc
> project, not converted into another character set and without any
> conversions of line endings. Furthermore, on any system using glibc and
> executing tools such as localedef with the installed locale source files,
> it can be assumed that those source files are the files shipped with
> glibc, not those files after conversion into another character set. Use
> of glibc source files after conversion into another character set is
> outside the scope of the glibc project - glibc is not expected to build
> with such converted source files.
Sounds strange. glibc is the library for the GNU C language. Standard ISO C
is coded-character-set independent, as is POSIX. Why would the glibc project
not follow the ISO C and POSIX design goals? Why would glibc exclude itself
from Apple and Microsoft (UTF-16) and non-UTF-8 Linux and UNIX systems?
Maybe we should clone glibc to make it available on platforms other
than those using UTF-8. Or maybe you are not correct. I have not been watching
the glibc project closely enough to tell.
> Now, it's true that the installed localedef utility should be usable in
> locale A to generate locale B, for any pair (A, B) of installed locales -
> rather than only being able to generate locales as part of the glibc build
> / install process. If localedef interprets locale sources in the
> character set of the locale in which it runs, that may mean the installed
> locale sources do need to be in ASCII. How does localedef determine the
> character set in which to interpret the textual locale source files?
Yes, that is why we use UCS symbolic code points. To be fully consistent, I would
rather use UCS symbolic code points all the way through a locale source.
It is a bit more cumbersome, but it would also facilitate
the cross-compiling I wrote about. I don't think there is a mix of locales where it
matters on Linux boxes. Oh well, some conceivable scenarios:
Apple or Windows users on a Linux box, Linux users on Apple or Windows boxes,
or some mix with EBCDIC. The last is less likely, but still conceivable in a big
mainframe and number-cruncher environment, with an IBM mainframe
running VM/CMS and a Linux supercomputer as the number cruncher, e.g. in
a financial institution.
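
To illustrate what "UCS symbolic code points all the way through" would mean
for a string value in a locale source, here is a minimal Python sketch. The
helper names to_ucs_symbolic and from_ucs_symbolic are hypothetical, and the
choice of four hex digits for BMP characters and eight for the rest is my
assumption about the <Uxxxx> notation; it is not taken from localedef itself.

```python
import re


def to_ucs_symbolic(text: str) -> str:
    """Spell every character of text as a <Uxxxx> symbolic code point.

    Assumption: 4 hex digits inside the BMP, 8 hex digits above it.
    """
    parts = []
    for ch in text:
        cp = ord(ch)
        width = 4 if cp <= 0xFFFF else 8
        parts.append(f"<U{cp:0{width}X}>")
    return "".join(parts)


def from_ucs_symbolic(symbolic: str) -> str:
    """Turn <Uxxxx> symbols back into characters; other text passes through."""
    return re.sub(
        r"<U([0-9A-Fa-f]{4,8})>",
        lambda m: chr(int(m.group(1), 16)),
        symbolic,
    )


if __name__ == "__main__":
    encoded = to_ucs_symbolic("ISBN")
    print(encoded)  # <U0049><U0053><U0042><U004E>
    # The symbolic form contains only ASCII characters, so the line does not
    # depend on whether the file is read as UTF-8, Latin-1, or similar.
    encoded.encode("ascii")
```

The symbolic form uses only characters from the portable character set, so a
tool reading the source only has to recognise those characters in whatever
coded character set it runs under.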
Keld