Cygwin zh_CN.GB18030 locale?
Corinna Vinschen
corinna-cygwin@cygwin.com
Thu Nov 21 11:25:39 GMT 2024
On Nov 21 05:57, Dan Shelton via Cygwin wrote:
> On Thu, 21 Nov 2024 at 05:06, Takashi Yano <takashi.yano@nifty.ne.jp> wrote:
> >
> > On Thu, 21 Nov 2024 00:16:41 +0100
> > Dan Shelton wrote:
> > > Does Cygwin have a zh_CN.GB18030 locale?
> >
> > I think so.
> >
> > $ locale
> > LANG=zh_CN.GB18030@cjknarrow
> > LC_CTYPE="zh_CN.GB18030@cjknarrow"
> > LC_NUMERIC="zh_CN.GB18030@cjknarrow"
> > LC_TIME="zh_CN.GB18030@cjknarrow"
> > LC_COLLATE="zh_CN.GB18030@cjknarrow"
> > LC_MONETARY="zh_CN.GB18030@cjknarrow"
> > LC_MESSAGES="zh_CN.GB18030@cjknarrow"
> > LC_ALL=
> > $ cp
> > cp: 缺少了文件操作数
> > 请尝试执行 "cp --help" 来获取更多信息。
> >
> > (maybe garbled due to my mailer)
>
> Looks good, except that on Win10 Enterprise Cygwin 3.5.4 locale -a says:
> locale -a | grep 18030
You're confusing locale with codeset (or charset).
$ locale -a | grep zh_CN
zh_CN
zh_CN.utf8
zh_CN.utf8@cjknarrow
zh_CN@cjknarrow
$ locale -m | grep 18030
GB18030
Note that for compatibility reasons, the default codeset of zh_CN is
gb2312, so you have to specify a different codeset explicitly.
export LC_MESSAGES=zh_CN.GB18030
But keep in mind that this isn't a safe bet for messages. An
application also has to support this combination! Also, Cygwin does not
support localized error strings for POSIX errno values (yet?).
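To make the locale vs. codeset distinction concrete, here is a minimal
sketch that splits a POSIX-style locale name into its parts (language
and territory before the dot, codeset after the dot, modifier after the
"@"). split_locale is a hypothetical helper written for this mail, not
a Cygwin tool, and it only does string handling; it does not check
whether the combination is actually supported.

```shell
# split_locale: hypothetical helper, not part of Cygwin.
# Splits language[_TERRITORY][.codeset][@modifier] into its parts.
split_locale() {
    name=$1
    codeset=
    modifier=
    # The modifier is everything after the first "@", if present.
    case $name in
        *@*) modifier=${name#*@}; name=${name%%@*} ;;
    esac
    # The codeset is everything after the first ".", if present.
    case $name in
        *.*) codeset=${name#*.}; name=${name%%.*} ;;
    esac
    printf 'language=%s codeset=%s modifier=%s\n' "$name" "$codeset" "$modifier"
}

split_locale zh_CN.GB18030@cjknarrow
split_locale zh_CN
```

For zh_CN.GB18030@cjknarrow this prints language=zh_CN, codeset=GB18030
and modifier=cjknarrow. For plain zh_CN the codeset field is empty,
which is the case where the default codeset applies.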
On Nov 21 09:36, Thomas Wolff via Cygwin wrote:
> As cygwin (unlike the restrictive Linux locale system) allows flexible
> combination of all language indication (before the dot) with all codeset
> indications (after the dot), listing them all would be very very long.
Pretty much like this, yes.
> I don't know the criteria by which cygwin lists locales.
It lists the default combinations in conjunction with various modifiers.
The supported modifiers are only attached to languages where they make
sense.
The Windows locales use a slightly different notation than POSIX. The
function format_proc_locale_proc() in winsup/cygwin/fhandler/proc.cc
converts the strings to POSIX convention and translates them into
Linux-compatible strings, if the Windows and Linux language specifiers
disagree.
For instance, the Windows "zgh-Tfng-MA" locale translates to "ber_MA" on
Linux/Cygwin and Windows "tzm-Latn-DZ" translates to "ber_DZ"; all other
combinations with "zgh" and "tzm" are suppressed.
Windows also uses different default settings in some languages. For
instance, for the Serbian language, Windows defaults to Latin, while
Linux defaults to Cyrillic. Cygwin supports the language specification
as on Linux: "sr_RS" defaults to Cyrillic, while "sr_RS@latin" uses the
Latin script.
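The translation described above can be sketched roughly as follows.
This is only an illustration built from the examples in this mail, not
the actual code of format_proc_locale_proc(). win_to_posix is a
hypothetical name, and the Windows names "sr-Cyrl-RS" and "sr-Latn-RS"
are assumed here as the Windows-side spellings of the Serbian locales.

```shell
# win_to_posix: hypothetical sketch, NOT the real conversion in proc.cc.
# Prints the POSIX-style name, or returns non-zero if there is none.
win_to_posix() {
    case $1 in
        zgh-Tfng-MA)         echo "ber_MA" ;;
        tzm-Latn-DZ)         echo "ber_DZ" ;;
        zgh|zgh-*|tzm|tzm-*) return 1 ;;   # all other zgh/tzm combinations are suppressed
        sr-Cyrl-RS)          echo "sr_RS" ;;        # assumed Windows name; Cyrillic is the default
        sr-Latn-RS)          echo "sr_RS@latin" ;;  # assumed Windows name; Latin needs the modifier
        *-*-*)               return 2 ;;   # other script tags are not handled by this sketch
        *-*)                 echo "${1%%-*}_${1#*-}" ;;  # plain language-TERRITORY
        *)                   echo "$1" ;;
    esac
}

win_to_posix zgh-Tfng-MA
win_to_posix sr-Latn-RS
win_to_posix zh-CN
```

The order of the case patterns matters: the two explicit "ber" mappings
have to come before the catch-all that suppresses the remaining "zgh"
and "tzm" entries.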
Basically Cygwin allows most locales supported by Windows, but the role
model is Linux. That means that some combinations are suppressed, but
these are very few.
There's also locale info which is not supported on Windows at all.
These are the LC_MESSAGES strings and the LC_TIME Era strings. Those
are taken from Linux, so only the matching info available on Linux is
supported, otherwise we fall back to the "C" locale.
It's quite a bit of tweaking.
Corinna