This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Re: Console codepage setting via chcp?


2009/9/25 Corinna Vinschen:
>> - System objects will always be translated using UTF-8. This includes
>> file names, user names, and initial environment variables (and
>> probably more I'm not aware of).
>
> More than 10 minutes later I'm still thinking that this is the best
> solution in the long run. There will be no situation in which any
> process running on the system has a different idea of a system object
> than any other process. That could also help to avoid interoperability
> issues in client/server applications.

Yes, there's a lot to be said for keeping such complications to a
minimum. Here are some further deliberations on the topic:

http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html#utf8

The downside, of course, is that non-ASCII filenames created in a
non-UTF8 locale won't show up correctly in Windows, and vice versa.
But that's the same on Linux if the global setting is UTF-8 while the
terminal is set to something else. And the stock answer to any
complaints will be: Use UTF-8!
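
To make the mismatch concrete, here is a minimal Python sketch. It is only an illustration, not Cygwin code: the filename and the helper name are made up, and ISO-8859-1 stands in for "some non-UTF-8 locale". It shows what a name looks like when its bytes are written under one charset and read under the other.

```python
# Illustration only: the same filename bytes decoded under the wrong charset.

def reinterpret(name: str, written_as: str, read_as: str) -> str:
    """Encode `name` in one charset and decode the resulting bytes in another."""
    raw = name.encode(written_as)
    # errors="replace" substitutes U+FFFD for bytes that are invalid in `read_as`.
    return raw.decode(read_as, errors="replace")

filename = "caf\u00e9.txt"  # hypothetical example name ("café.txt")

# Written in an ISO-8859-1 locale, then read by a consumer expecting UTF-8:
latin1_seen_as_utf8 = reinterpret(filename, "iso-8859-1", "utf-8")

# Written as UTF-8, then read by a consumer expecting ISO-8859-1:
utf8_seen_as_latin1 = reinterpret(filename, "utf-8", "iso-8859-1")

# ascii() keeps the output readable whatever the terminal charset is.
print(ascii(latin1_seen_as_utf8))
print(ascii(utf8_seen_as_latin1))
```

In the first case the lone 0xE9 byte is not valid UTF-8 and becomes a replacement character; in the second the two UTF-8 bytes of the accented letter show up as two separate Latin-1 characters.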

In any case, the DCxx scheme will ensure that things work correctly
within any particular locale.

And I guess the ^N scheme can go (or be disabled)?


>> - The "C" locale's charset will be UTF-8.
>Yes.
>> - There'll be language-neutral "C.<charset>" locales.
>Yes.
>> - The user's ANSI codepage will remain the default charset for
>> "language_TERRITORY" locales.
>Yes.
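
As I read the three points just agreed, the mapping from locale name to charset could be sketched roughly like this in Python. The function name and the `ansi_codepage` parameter are hypothetical, the real logic would live in cygwin1.dll, and the handling of an explicit "language_TERRITORY.charset" name is my assumption based on the usual POSIX naming convention. Modifiers such as "@euro" are simply ignored here.

```python
# Sketch of the agreed locale-to-charset rules; names are hypothetical.

def charset_for_locale(locale_name: str, ansi_codepage: str) -> str:
    """Return the charset implied by a locale name under the rules above."""
    # Drop any "@modifier" suffix; this sketch does not interpret modifiers.
    base = locale_name.split("@", 1)[0]
    # An explicit ".<charset>" part wins, covering "C.<charset>" and
    # (by assumption) "language_TERRITORY.<charset>".
    if "." in base:
        return base.split(".", 1)[1]
    # The plain "C" locale uses UTF-8.
    if base == "C":
        return "UTF-8"
    # "language_TERRITORY" without a charset defaults to the user's ANSI codepage.
    return ansi_codepage

# Example with a hypothetical ANSI codepage of CP1252:
for name in ("C", "C.ISO-8859-1", "de_DE", "de_DE.UTF-8"):
    print(name, "->", charset_for_locale(name, "CP1252"))
```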

Thanks, this gives me something to work with for mintty. Luckily, due
to the everything-is-UTF-8 approach, no mingw wrapper is actually
needed after all, as it wouldn't make a difference to anything anyway.


>> - The console charset will be set according to LC_ALL/LC_CTYPE/LANG
>> when cygwin1.dll is initialised. (Or will 'setcons' be needed for
>> that?)
>
> Hmm. Unsure. I know that Thomas dislikes the idea and you are not
> overly convinced either. One of Thomas arguments is the non-standard
> tool necessary to switch the terminal charset. I think that's not a
> valid argument. There is no standard how to switch the charset used by
> a terminal.

As far as I know, xterm, rxvt, gnome-terminal and konsole all respect
the locale variables unless a program-specific option is used.
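
For reference, the lookup order for the character-type category is the usual POSIX one: LC_ALL overrides LC_CTYPE, which overrides LANG, and an empty value counts as unset. A minimal Python sketch of that precedence follows. The function name is made up, and falling back to "C" when nothing is set is an assumption about the default.

```python
# Sketch of the POSIX precedence for the LC_CTYPE category.
import os
from typing import Mapping

def effective_ctype_locale(env: Mapping[str, str]) -> str:
    """Return the locale name that governs LC_CTYPE for the given environment."""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var, "")
        if value:  # unset and empty are treated alike
            return value
    return "C"  # assumed default when none of the variables is set

# Hypothetical environment: LC_CTYPE wins over LANG, empty LC_ALL is skipped.
example_env = {"LC_ALL": "", "LC_CTYPE": "ja_JP.SJIS", "LANG": "en_GB.UTF-8"}
print(effective_ctype_locale(example_env))

# The same lookup against the real process environment:
print(effective_ctype_locale(os.environ))
```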


> So, utilizing the initial setting of LC_ALL/ff. is as good
> as defaulting to UTF-8 and allowing to switch via a setcons tool.

'setcons' requires a wrapper script, whereas the variables don't
necessarily, as they can be set in the Windows environment. This would
allow programs to be invoked directly from a shortcut and still
pick up the user's setting.

Also, one of the locale variables needs to be set anyway if one wants
to use something other than the default locale.


> I have
> found an easy way to allow a setcons tool which only switches the charset
> used by Cygwin. It doesn't affect the setting in cmd, or made by chcp.

That's a good idea. I've come round to thinking that 'setcons' is
worth having in addition to the initial setting from the environment.


>> - setlocale() will have no effects beyond what's expected in Linux.
>
> Well... probably. I'm not saying yes without asking a lawyer first.

:)  I put that a bit too probingly, didn't I?

Andy

