default charset for implicit locale specification

Wed Jan 20 12:40:00 GMT 2010

2010/1/20 Corinna Vinschen:
> I implemented that locally.  However...
>
> On Jan 20 11:07, Corinna Vinschen wrote:
>>   874 ANSI/Thai               -> CP874 (== ISO-IR-166 used on Linux)
>>   932 SJIS                    -> SJIS
>
> This should probably be
>
>    932 SJIS                    -> EUCJP

Yep.

>>   936 GB2312                  -> GBK
>>   949 ANSI/Korean             -> EUCKR
>>   950 Big-5                   -> Big-5
>>  1250 ANSI/Central European   -> ISO-8859-2
>>  1251 ANSI/Cyrillic           -> ISO-8859-5
>>  1252 ANSI/Latin 1            -> ISO-8859-1
>>  1253 ANSI/Greek              -> ISO-8859-7
>>  1254 ANSI/Turkish            -> ISO-8859-9
>>  1255 ANSI/Hebrew             -> ISO-8859-8
>>  1256 ANSI/Arabic             -> ISO-8859-6
>>  1257 ANSI/Baltic             -> ISO-8859-4

ISO-8859-13?

>>  1258 ANSI/Vietnamese         -> UTF-8
>> 65001 UTF-8                   -> UTF-8
>>
>> Is that a valid transition?

Yeah, good enough anyway. There will be special cases, but tough.
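
For reference, the proposed transition could be written down as a simple
lookup table. This is only a sketch in Python, not the actual Cygwin
code: the names DEFAULT_CHARSET and default_charset are made up for
illustration, and the "ASCII" fallback for unlisted codepages is my
assumption. It uses EUCJP for 932 as amended above and leaves 1257 as
originally proposed, since the ISO-8859-13 question is still open.

```python
# Sketch of the ANSI codepage -> default charset table from this thread.
# Hypothetical names; not taken from the Cygwin sources.
DEFAULT_CHARSET = {
    874: "CP874",         # ANSI/Thai
    932: "EUCJP",         # SJIS codepage, amended from SJIS to EUCJP
    936: "GBK",           # GB2312
    949: "EUCKR",         # ANSI/Korean
    950: "Big-5",         # Big-5
    1250: "ISO-8859-2",   # ANSI/Central European
    1251: "ISO-8859-5",   # ANSI/Cyrillic
    1252: "ISO-8859-1",   # ANSI/Latin 1
    1253: "ISO-8859-7",   # ANSI/Greek
    1254: "ISO-8859-9",   # ANSI/Turkish
    1255: "ISO-8859-8",   # ANSI/Hebrew
    1256: "ISO-8859-6",   # ANSI/Arabic
    1257: "ISO-8859-4",   # ANSI/Baltic; ISO-8859-13 was suggested instead
    1258: "UTF-8",        # ANSI/Vietnamese
    65001: "UTF-8",       # UTF-8
}


def default_charset(codepage, fallback="ASCII"):
    """Return the proposed default charset for a Windows ANSI codepage.

    The fallback for codepages not in the table is an assumption.
    """
    return DEFAULT_CHARSET.get(codepage, fallback)
```

With this, default_charset(1252) gives "ISO-8859-1", and a codepage that
is not in the list falls through to the fallback.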

>> What's missing is a transition to ISO-8859-15 for languages with the
>> EUR currency letter.  I assume that's by adding the @euro modifier?
>
> I also noticed that on Linux two-letter settings like "de" or "ja" do not
> change the charset from ASCII to something else.

Such locales don't usually exist on Linux, i.e. setlocale is probably
failing and leaving the program in "C".
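
One way to check this is to look at whether the setlocale call succeeds
before looking at the charset. A rough sketch in Python (Unix only, since
it relies on locale.nl_langinfo; probe_locale is a made-up helper name,
and which locale names exist depends entirely on what is installed on
the machine):

```python
import locale


def probe_locale(name):
    """Try to select `name`; return (succeeded, charset now in effect)."""
    try:
        locale.setlocale(locale.LC_ALL, name)
        ok = True
    except locale.Error:
        # The request was rejected, so the process locale is unchanged.
        ok = False
    return ok, locale.nl_langinfo(locale.CODESET)


# "C" is selected first so that a failing request visibly stays in "C".
for name in ("C", "de", "ja", "de_DE.UTF-8"):
    print(name, probe_locale(name))
```

If "de" or "ja" report a failure and the charset stays the same as for
"C", the unchanged charset is just the result of the failed call, not of
any charset choice for those locales.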

Andy