default charset for implicit locale specification

Andy Koppe andy.koppe@gmail.com
Wed Jan 20 12:40:00 GMT 2010


2010/1/20 Corinna Vinschen:
> I implemented that locally.  However...
>
> On Jan 20 11:07, Corinna Vinschen wrote:
>>   874 ANSI/Thai               -> CP874 (== ISO-IR-166 used on Linux)
>>   932 SJIS                    -> SJIS
>
> This should probably better be
>
>    932 SJIS                    -> EUCJP

Yep.

>>   936 GB2312                  -> GBK
>>   949 ANSI/Korean             -> EUCKR
>>   950 Big-5                   -> Big-5
>>  1250 ANSI/Central European   -> ISO-8859-2
>>  1251 ANSI/Cyrillic           -> ISO-8859-5
>>  1252 ANSI/Latin 1            -> ISO-8859-1
>>  1253 ANSI/Greek              -> ISO-8859-7
>>  1254 ANSI/Turkish            -> ISO-8859-9
>>  1255 ANSI/Hebrew             -> ISO-8859-8
>>  1256 ANSI/Arabic             -> ISO-8859-6
>>  1257 ANSI/Baltic             -> ISO-8859-4

ISO-8859-13?


>>  1258 ANSI/Vietnamese         -> UTF-8
>> 65001 UTF-8                   -> UTF-8
>>
>> Is that a valid transition?

Yeah, good enough anyway. There will be special cases, but tough.
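
For illustration, the part of the table quoted above could be written as a simple lookup. This is only a sketch of the mapping as discussed in this thread, not the actual Cygwin code: the names cp_map and charset_for_codepage are made up, 932 uses the EUCJP correction, and 1257 uses ISO-8859-13, which is still an open question here.

```c
#include <stddef.h>

/* Hypothetical table: Windows ANSI codepage -> default charset name,
   taken from the entries quoted in this thread. */
struct cp_charset {
  unsigned codepage;
  const char *charset;
};

static const struct cp_charset cp_map[] = {
  {   874, "CP874"       },  /* ANSI/Thai */
  {   932, "EUCJP"       },  /* SJIS codepage, EUCJP as the default */
  {   936, "GBK"         },
  {   949, "EUCKR"       },
  {   950, "Big-5"       },
  {  1250, "ISO-8859-2"  },
  {  1251, "ISO-8859-5"  },
  {  1252, "ISO-8859-1"  },
  {  1253, "ISO-8859-7"  },
  {  1254, "ISO-8859-9"  },
  {  1255, "ISO-8859-8"  },
  {  1256, "ISO-8859-6"  },
  {  1257, "ISO-8859-13" },  /* assumption: the table above said ISO-8859-4 */
  {  1258, "UTF-8"       },  /* no matching ISO charset, fall back to UTF-8 */
  { 65001, "UTF-8"       },
};

/* Return the default charset for a codepage, or NULL if it is not listed. */
const char *charset_for_codepage(unsigned codepage)
{
  for (size_t i = 0; i < sizeof cp_map / sizeof cp_map[0]; i++)
    if (cp_map[i].codepage == codepage)
      return cp_map[i].charset;
  return NULL;
}
```

A caller would have to decide what to do with a NULL result for codepages that are not in the list; the sketch deliberately leaves that open.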


>> What's missing is a transition to ISO-8859-15 for languages with the
>> EUR currency letter.  I assume that's by adding the @euro modifier?
>
> I also noticed that on Linux two-letter settings like "de" or "ja" do not
> change the charset from ASCII to something else.

Such locales don't usually exist on Linux, so setlocale is probably
failing and leaving the program in "C".
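
A quick way to check this is to look at the return value of setlocale, which is NULL on failure, in which case the locale is left unchanged. A minimal sketch, assuming a POSIX system with nl_langinfo; the helper name codeset_after_setlocale is made up for this example:

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Try to select `name` for LC_CTYPE and report the codeset now in effect.
   If setlocale() fails it returns NULL and the previous locale (initially
   "C") stays active, so the reported codeset does not change.
   `ok`, if non-NULL, receives 1 on success and 0 on failure. */
const char *codeset_after_setlocale(const char *name, int *ok)
{
  const char *result = setlocale(LC_CTYPE, name);
  if (ok)
    *ok = (result != NULL);
  return nl_langinfo(CODESET);
}
```

Calling this with "de" or "ja" on a system that has no such locale installed should set ok to 0 and return the same codeset as before the call.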

Andy


