This is the mail archive of the cygwin-developers mailing list for the Cygwin project.

Re: "C" UTF-8 trouble

From: Corinna Vinschen <corinna-cygwin at cygwin dot com>
To: cygwin-developers at cygwin dot com
Date: Wed, 7 Oct 2009 18:17:19 +0200
Subject: Re: "C" UTF-8 trouble
References: <416096c60910060814u108e2193r852796974e002ca7@mail.gmail.com> <20091006153724.GQ12789@calimero.vinschen.de> <20091006181649.GC18135@ednor.casa.cgf.cx> <416096c60910061146m1c4c9aa5ic6b1c55d50233fb5@mail.gmail.com> <4ACBEB43.3080508@byu.net> <416096c60910062307u6c81c82eh790542b72875d7dd@mail.gmail.com> <20091007090317.GV12789@calimero.vinschen.de> <416096c60910070308x387db45au9438462aced8d859@mail.gmail.com> <20091007125427.GW12789@calimero.vinschen.de> <4ACCA074.30301@towo.net>
Reply-to: cygwin-developers at cygwin dot com

On Oct  7 16:06, Thomas Wolff wrote:
> Corinna Vinschen wrote:
>> ...
>>
>> $ ./nll
>> ANSI_X3.4-1968
>>
>> $ LANG=C.UTF-8 ./nll
>> ANSI_X3.4-1968
>>
>> $ LANG=ja_JP ./nll
>> EUC-JP
>>
>> $ LANG=ru_RU ./nll
>> ISO-8859-5
>>
>> $ LANG=ru_UA ./nll
>> KOI8-U
>>
>> $ LANG=zh_CN ./nll
>> GB2312
>>
>> $ LANG=zh_TW ./nll
>> BIG5
>>
>> Sigh.  Do we really need a translation table?
>>   
> Yes (sigh). And yes, that's what I had suggested before. Actually, "locale 
> charmap" (on a system with a locale command) gives you the same information 
> as "nll".
> If you want a table, a fairly complete one is included in my package mined, 
> file src/locales.t (generated from src/locales.cfg).
> (Complete in the sense that all locales without explicit suffix not listed 
> here map to ISO-8859-1; maybe I should also include them to distinguish 
> unknown locales ...)
> And, as becomes clear here, the syntax of charmap/codeset names is 
> different between locale names and nl_langinfo,
> e.g. eucJP vs. EUC-JP.

I agree with the general picture.  However, as I mentioned in the mail
you're partially quoting, we have to draw the line at some point,
even if the solution might be a bit bumpy for the time being.
Therefore, I think we should go for the value returned by
__locale_charset () *for now*.
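
For reference, the "nll" source is not part of this mail.  A minimal
sketch of what such a test presumably does, assuming it just calls
setlocale() and nl_langinfo(CODESET), could look like this (the helper
name codeset_of is made up for illustration):

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper, not taken from newlib or Cygwin: switch the
   process to the named locale and return the codeset name reported by
   nl_langinfo(CODESET).  Returns NULL if the locale is not available.
   The returned string may be overwritten by later setlocale() or
   nl_langinfo() calls, so copy it if it has to survive them. */
const char *codeset_of(const char *locale_name)
{
    if (setlocale(LC_ALL, locale_name) == NULL)
        return NULL;
    return nl_langinfo(CODESET);
}
```

Passing "" as locale_name picks the locale up from the environment
(LANG, LC_ALL, ...), which is how the LANG=... invocations quoted above
would take effect.  The exact strings returned differ between systems.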

If you want to contribute your table and the necessary code to make it
work within Cygwin, please feel free.  I'm obviously very glad for any
code which eases the internationalization pain.  As for contributing,
newlib's not a problem, while for Cygwin... <insert obligatory
reference to cygwin copyright assignment here>(*).
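
To make the idea of a translation table concrete, here is a rough
sketch, not a proposal for the actual implementation.  The entries are
only the ones visible in the nll output quoted above, the ISO-8859-1
fallback follows Thomas' description of his table, and the names
default_codesets and default_codeset_for are made up for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Default codesets for locale names WITHOUT an explicit ".codeset"
   suffix.  Entries are copied from the quoted nll output; a real table
   would be much longer. */
struct locale_codeset {
    const char *locale;   /* "C" or language_TERRITORY */
    const char *codeset;  /* spelled as nl_langinfo(CODESET) reports it */
};

static const struct locale_codeset default_codesets[] = {
    { "C",     "ANSI_X3.4-1968" },
    { "ja_JP", "EUC-JP" },
    { "ru_RU", "ISO-8859-5" },
    { "ru_UA", "KOI8-U" },
    { "zh_CN", "GB2312" },
    { "zh_TW", "BIG5" },
};

/* Hypothetical helper: return the default codeset for a suffix-less
   locale name.  Returns NULL if the name is NULL or carries an explicit
   codeset (contains '.'), since that case is not the table's business.
   Names not listed fall back to ISO-8859-1. */
const char *default_codeset_for(const char *locale)
{
    size_t i, len;

    if (locale == NULL || strchr(locale, '.') != NULL)
        return NULL;

    /* Ignore a trailing "@modifier", if any. */
    len = strcspn(locale, "@");

    for (i = 0; i < sizeof default_codesets / sizeof default_codesets[0]; i++) {
        const char *key = default_codesets[i].locale;
        if (strlen(key) == len && strncmp(locale, key, len) == 0)
            return default_codesets[i].codeset;
    }
    return "ISO-8859-1";
}
```

With this sketch, default_codeset_for("ja_JP") yields "EUC-JP", an
unlisted name such as "de_DE" yields "ISO-8859-1", and "ja_JP.eucJP"
yields NULL.  Mapping explicit suffixes like eucJP to the EUC-JP
spelling would need a second table, which is left out here.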

Corinna

(*) http://cygwin.com/assign.txt

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
