This is the mail archive of the cygwin mailing list for the Cygwin project.


Re: The C locale

On Sep 24 16:03, IWAMURO Motonori wrote:
> 2009/9/22 Andy Koppe <>:
> > Let's use the Windows "ANSI" codepage as the character set for the C
> > locale, for both the conversion functions and filenames. This means
> > CP1252 on Western systems, CP1251 on Cyrillic ones, CP932 on Japanese
> > ones, and so on.
> I oppose the approach (the ANSI codepage is used at C locale) because
> CP932 (the codepage for Japanese) is hostile to the UNIX-like tools.
> The reason is that the CP932 format contains a lot of meta characters
> as follows.
>   single character of CP932:
> /[\x00-\x7F\xA0-\xDF]|[\x81-\x9F\xE0-\xFC][\x40-\x7E\x80-\xFC]/

I don't understand.  Are you saying that a single character in CP932
consists of 12 bytes?  As far as I can see, CP932 is S-JIS, which
is just a simple double-byte character set.  What am I missing?

> This has a ruined influence to the tools that don't see locale.

Can you please try to explain the problem in a bit more detail for
those of us not fluent in East Asian languages?  What do you
mean by "hostile" and "ruined influence"?
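
For reference, the quoted pattern can be read as "one CP932 character is
either a single byte, or a lead byte followed by a trail byte".  The trail
range \x40-\x7E overlaps printable ASCII, so the second byte of a
double-byte character can look like an ASCII character such as a
backslash.  The Python sketch below is not part of the original thread; it
only illustrates that overlap.  The function name ascii_trail_bytes is
hypothetical, and whether this overlap is what was meant by "meta
characters" is an assumption.

```python
import re

# The pattern quoted in the thread: a CP932 character is either one byte,
# or a lead byte followed by a trail byte.
CP932_CHAR = re.compile(
    rb"[\x00-\x7F\xA0-\xDF]|[\x81-\x9F\xE0-\xFC][\x40-\x7E\x80-\xFC]"
)


def ascii_trail_bytes(data: bytes):
    """Return (offset, ascii_char) for each double-byte character whose
    trail byte falls in the ASCII range (hypothetical helper)."""
    hits = []
    for match in CP932_CHAR.finditer(data):
        char = match.group()
        if len(char) == 2 and char[1] < 0x80:
            hits.append((match.start() + 1, chr(char[1])))
    return hits


# U+8868 is encoded in CP932 as the two bytes 0x95 0x5C.
encoded = "\u8868".encode("cp932")
print(encoded.hex())               # hex dump of the two bytes
print(ascii_trail_bytes(encoded))  # trail bytes that look like ASCII
print(b"\\" in encoded)            # a naive byte search for a backslash
```

A tool that scans for a backslash byte by byte, without consulting the
locale, would find one inside this character even though the text contains
no backslash.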


Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
