CYGWIN=codepage? Or LC_CTYPE=foo?

Corinna Vinschen
Thu Apr 3 15:55:00 GMT 2008


I just spent some time inspecting the Win32 calls used throughout
Cygwin and (knock on wood) it looks like we got rid of practically
all Win32 calls which would have trouble with native character sets
or with UTF-8.

That means, in theory, there's no longer any reason to keep the
CYGWIN=codepage setting in the environment.  We could use the LC_CTYPE
setting, just as on other systems.  Right now, we need LC_CTYPE set
to "C-UTF-8" anyway when using the codepage:utf8 setting, otherwise
the wcstombs and mbstowcs conversions in newlib will be broken.
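
To illustrate the LC_CTYPE dependency, here is a minimal sketch in
plain ISO C.  The helper name convert_in_locale is made up for this
example and is not part of newlib or Cygwin.  It only shows that
wcstombs takes its target encoding from the current LC_CTYPE locale
and from nothing else.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not a newlib or Cygwin function): convert a
   wide string to multibyte under the given LC_CTYPE locale.  Returns
   the number of bytes written, not counting the terminating NUL, or
   (size_t) -1 if the locale is unavailable or a character cannot be
   represented in it.  */
size_t
convert_in_locale (const char *locale, const wchar_t *ws,
                   char *out, size_t outlen)
{
  assert (locale != NULL && ws != NULL && out != NULL && outlen > 0);

  /* Remember the current LC_CTYPE so it can be restored afterwards.
     The string returned by setlocale may be overwritten by the next
     call, so take a copy.  */
  const char *cur = setlocale (LC_CTYPE, NULL);
  char *saved = malloc (strlen (cur) + 1);
  if (saved == NULL)
    return (size_t) -1;
  strcpy (saved, cur);

  if (setlocale (LC_CTYPE, locale) == NULL)
    {
      free (saved);
      return (size_t) -1;
    }

  /* Zero the buffer and leave one byte spare, so the result is
     always NUL-terminated.  */
  memset (out, 0, outlen);
  size_t n = wcstombs (out, ws, outlen - 1);

  setlocale (LC_CTYPE, saved);
  free (saved);
  return n;
}
```

Calling convert_in_locale ("C", L"abc", buf, sizeof buf) yields 3.
A string with non-ASCII characters will typically fail under "C" and
needs a UTF-8 locale instead.  The name of that locale differs between
systems ("C-UTF-8" as above, "C.UTF-8" or "en_US.UTF-8" elsewhere), so
treat any specific name as an assumption.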

But there's a problem.  The newlib conversion functions don't know
anything about Windows codepages, and the Windows conversion functions
used in the Cygwin functions sys_wcstombs and sys_mbstowcs don't know
anything about LC_CTYPE. 

OTOH, I don't have the faintest idea whether we could just drop the
Windows conversion functions and use the newlib functions exclusively.
I'm not *that* fluent with NLS.

However, what we could do, even with my limited knowledge, is to get
rid of the LC_CTYPE-bound handling in newlib and to create matching
replacements in Cygwin which choose the conversion based on
the codepage setting.
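
A rough sketch of what such a codepage-driven replacement could look
like, in portable C.  Everything here is hypothetical: the enum, the
function names and the two encoders are invented for illustration and
are not existing Cygwin or newlib code.  ISO-8859-1 stands in for a
native Windows codepage, and the code assumes one wchar_t holds one
code point, so UTF-16 surrogate pairs are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for the CYGWIN=codepage:... setting.  */
enum codepage { CPG_LATIN1, CPG_UTF8 };

/* Each encoder writes one code point to OUT and returns the number
   of bytes used, or 0 if the code point cannot be represented.  */
static size_t
encode_latin1 (unsigned long cp, char *out)
{
  if (cp > 0xFF)
    return 0;
  out[0] = (char) cp;
  return 1;
}

static size_t
encode_utf8 (unsigned long cp, char *out)
{
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return 0;                   /* lone surrogate, not handled here */
  if (cp < 0x80)
    {
      out[0] = (char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (char) (0xC0 | (cp >> 6));
      out[1] = (char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp < 0x10000)
    {
      out[0] = (char) (0xE0 | (cp >> 12));
      out[1] = (char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (char) (0x80 | (cp & 0x3F));
      return 3;
    }
  if (cp < 0x110000)
    {
      out[0] = (char) (0xF0 | (cp >> 18));
      out[1] = (char) (0x80 | ((cp >> 12) & 0x3F));
      out[2] = (char) (0x80 | ((cp >> 6) & 0x3F));
      out[3] = (char) (0x80 | (cp & 0x3F));
      return 4;
    }
  return 0;
}

/* Convert SRC into DST (LEN bytes, including the NUL), picking the
   encoder from the codepage setting rather than from LC_CTYPE.
   Returns the bytes written without the NUL, or (size_t) -1 if a
   character is not representable in the chosen codepage.  */
size_t
sketch_wcstombs (enum codepage cp, char *dst, const wchar_t *src,
                 size_t len)
{
  size_t (*encode) (unsigned long, char *)
    = cp == CPG_UTF8 ? encode_utf8 : encode_latin1;
  size_t used = 0;

  assert (dst != NULL && src != NULL && len > 0);
  for (; *src != L'\0'; ++src)
    {
      char tmp[4];
      size_t n = encode ((unsigned long) *src, tmp);
      if (n == 0)
        return (size_t) -1;
      if (used + n >= len)      /* keep room for the NUL */
        break;
      memcpy (dst + used, tmp, n);
      used += n;
    }
  dst[used] = '\0';
  return used;
}
```

The point of the sketch is only the dispatch: the same wide string
comes out as one byte per character or as a UTF-8 sequence depending
on the setting, with no reference to the locale.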

So we can do it one way or the other, but I'm totally unsure which
is the better way...


Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
