Console codepage setting via chcp?

Andy Koppe andy.koppe@gmail.com
Wed Sep 23 19:57:00 GMT 2009


2009/9/23 Corinna Vinschen:
> ssh simply doesn't need to set a locale.  It just sends byte streams
> back and forth.  It doesn't give a dime for the content and it doesn't
> care for the terminal it's running in.
>
> This shows in a way that the current automatic console mode is not such
> a good idea.  ssh doesn't care for that because in all other existing
> terminals and terminal emulators it's not the application which decides
> what charset the terminal uses, but the user who started the terminal.

I see.


> After so many months of looking into the charset stuff it occurred to me
> just a few minutes ago, that there was *always* a way to switch the
> codepage of the console in a fixed manner: chcp.

Hmm, I note it even allows codepage 65001, aka UTF-8, including on XP.

(Which btw means that fhandler_console could go back to simply using
"ANSI" console I/O functions and let Windows take care of the
conversions. Except that CP20932 and eucJP aren't quite the same
thing.)


> If Cygwin uses the
> codepage returned by GetConsoleOutputCP(), then it uses what the user
> chose by running chcp, or the default OEM codepage.  The alternate
> charset, typically only used for the graphical characters anyway,
> could be either CP 437, or what GetOEMCP() returns.
>
> This way the charset used to print characters in the Windows console
> is a nicely encapsulated user setting, just like with mintty, xterm,
> and other terminal emulators.
>
> I tested this on XP and W7 and it works fine.  The documentation
> would just have to be extended to explain to the user how to switch
> the console output codepage using the native chcp tool.
>
> My question is, what do you all think?  Isn't that a much better
> controllable setting than how it's done now?

I agree in principle.


> The only downside from my point of view is that the user has to know the
> codepage numbers.

Yes, there's that, and something else: LC_CTYPE (or LANG or LC_ALL)
has to be in sync with the console/terminal's setting, so that
applications that do care about the locale (i.e., most) work correctly.

Mintty 0.5 does that by setting LANG according to the locale/charset
fields in its options dialog, or by using the charset specified in the
environment if nothing is set in the options.

How could that be done for the console though? A couple of lines like
this in cygwin.bat?

  chcp 65001
  set LC_CTYPE=C.UTF-8
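
Rather than hard-coding the pair, the LC_CTYPE value could perhaps be
derived from whatever codepage the console reports. What follows is only
a rough sketch, not a proposal for how Cygwin should do it. The helper
name codepage_to_lc_ctype is made up for illustration, and it assumes
that Cygwin accepts a "C.CPnnn" locale name for the codepage in
question, which will not hold for every codepage. CP20932 is
deliberately not mapped to eucJP, given the mismatch mentioned above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Cygwin): build an LC_CTYPE value for a
   Windows codepage number.  Assumes the "C.CPnnn" form is accepted for
   non-UTF-8 codepages.  Returns 0 on success, -1 if buf is too small. */
int codepage_to_lc_ctype(unsigned cp, char *buf, size_t len)
{
    int n;

    if (cp == 65001)                 /* 65001 is the UTF-8 codepage */
        n = snprintf(buf, len, "C.UTF-8");
    else
        n = snprintf(buf, len, "C.CP%u", cp);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

#if defined(_WIN32) || defined(__CYGWIN__)
#include <windows.h>

/* Print a suggested setting for the console this program runs in. */
int main(void)
{
    char value[32];
    UINT cp = GetConsoleOutputCP();  /* returns 0 on failure */

    if (cp == 0 || codepage_to_lc_ctype(cp, value, sizeof value) != 0)
        return 1;
    printf("LC_CTYPE=%s\n", value);
    return 0;
}
#endif
```

The output of such a tool could be picked up by cygwin.bat after the
chcp call, so the user only has to choose the codepage in one place.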

Andy


