This is the mail archive of the
mailing list for the Cygwin project.
Re: "C" UTF-8 trouble
2009/10/7 Corinna Vinschen:
> Urgh. So we have to change nl_langinfo in newlib as well. Do we have
> to return "US-ASCII" if charset is "ASCII", or is it sufficient to
> return __locale_charset() as you did, thus returning "ASCII" for "ASCII"?
I'd assume so, but WWLD?
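
One way to answer that empirically is to ask the C library on each system what it reports. A minimal sketch, assuming a POSIX system that provides <langinfo.h>; codeset_for() is a hypothetical helper name, not something in newlib or Cygwin:

```c
#define _XOPEN_SOURCE 700
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: switch LC_CTYPE to the named locale and return the
   charset name the C library reports for it via nl_langinfo(CODESET).
   Returns NULL if the locale is not available. The returned string may be
   overwritten by later setlocale()/nl_langinfo() calls. */
const char *codeset_for(const char *locale_name)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return NULL;
    return nl_langinfo(CODESET);
}
```

Calling codeset_for("C") on glibc returns "ANSI_X3.4-1968" rather than "ASCII" or "US-ASCII", so applications that compare the string literally already have to cope with more than one spelling.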
> And what about stuff like "eucJP" vs. "EUCJP"? The charset in newlib
> is always uppercase right now.
Hmm. There's also the KOI8s, which turn into CP2866.
> As for Emacs, I'm wondering if it shouldn't be changed to set its locale
> according to setlocale(LC_CTYPE,NULL) instead, given what POSIX says.
Well, yes, but good luck with that. When Ken Brown raised the ^? vs ^H
issue, they told him that sending ^H for backspace should be
considered a bug.
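
For reference, the two ways an application can determine its LC_CTYPE locale can be sketched as follows. This is only an illustration, assuming the usual POSIX precedence LC_ALL > LC_CTYPE > LANG; both function names are hypothetical and this is not how Emacs itself is written:

```c
#define _POSIX_C_SOURCE 200112L
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: what an application sees if it inspects the
   environment itself. Returns the first non-empty value among LC_ALL,
   LC_CTYPE and LANG, or NULL if none of them is set. */
const char *ctype_locale_from_env(void)
{
    static const char *const names[] = { "LC_ALL", "LC_CTYPE", "LANG" };
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        const char *v = getenv(names[i]);
        if (v != NULL && *v != '\0')
            return v;
    }
    return NULL;
}

/* Hypothetical helper: what the C library actually selected. It applies
   the environment with setlocale(LC_CTYPE, "") and then queries the
   result with setlocale(LC_CTYPE, NULL). Returns NULL if the requested
   locale is not available. */
const char *ctype_locale_from_libc(void)
{
    if (setlocale(LC_CTYPE, "") == NULL)
        return NULL;
    return setlocale(LC_CTYPE, NULL);
}
```

With no locale variables set, the first function returns NULL while the second returns whatever default the library chose, and that gap is where an application reading the environment can disagree with the C library.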
> I, too, think this is a good idea. __get_locale_env() should be changed
> to return "C.UTF-8".
> It would be nice to check /etc/defaults/locale in __get_locale_env() as
> well, but I'm a bit reluctant to do that. It means, every invocation of
> a Cygwin process has to open that file if the environment isn't set.
> Talking about performance...
> Alternatively, the first invocation of Cygwin in a process tree could
> try to read this file only.
Agreed with the last point, but I think setenv("LANG",...) at the
first invocation of Cygwin is a better and simpler solution than
changing __get_locale_env(), because:
- it solves the Emacs issue
- applications will get the same result from setlocale(,"") and
reading the environment variables themselves, so apps that do the
latter don't have to be changed
- it's more like Linux
- it doesn't require a newlib change
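
A rough sketch of what I mean, in plain POSIX C rather than actual Cygwin startup code. The function names are made up for illustration, and where the default value would come from (a hard-coded "C.UTF-8", or the file mentioned above) is left open:

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>

/* Hypothetical helper: returns 1 if the variable is set and non-empty. */
int env_is_set(const char *name)
{
    const char *v = getenv(name);
    return v != NULL && *v != '\0';
}

/* Hypothetical startup hook: if none of LC_ALL, LC_CTYPE or LANG is set,
   export LANG=fallback so that child processes inherit it.
   Returns 1 if LANG was set, 0 if the environment already selected a
   locale, and -1 if setenv() failed. */
int ensure_default_lang(const char *fallback)
{
    if (env_is_set("LC_ALL") || env_is_set("LC_CTYPE") || env_is_set("LANG"))
        return 0;
    return setenv("LANG", fallback, 1) == 0 ? 1 : -1;
}
```

Called once as ensure_default_lang("C.UTF-8") in the first process of a tree, this leaves any user setting alone, and every child simply inherits LANG, so setlocale(,"") and a direct getenv("LANG") agree.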