cygwin started speaking German today

Bruno Haible bruno@clisp.org
Thu Sep 8 20:57:00 GMT 2011


Hello Corinna,

Corinna Vinschen wrote:
> > After Cygwin 1.7 added working locales and defined LANG=C.UTF-8 for all users,
> > libintl could be extended to respect the choices the user has made in the
> > system control panels.
> 
> That's the wrong approach.

Before discussing the technical details, let me recall the goal.

The goal of GNU gettext is to enable localization of programs on all
systems that support that. The benefit for the user is that programs feel
more "friendly". The benefit for the Free Software community is that more
users can use our programs and can contribute to their development, even if
they don't speak English.

Cygwin 1.7 has made major steps towards this goal,
  - by adding working locales,
  - by implementing various internationalization-related APIs in cygwin.dll,
  - by choosing UTF-8 as the default encoding for locales, with automatic
    charset conversion happening in the connection to the console window.

The last missing cornerstone is that a user should not have to do anything
to enable the localization of messages into his/her language. It should be
automatic. Gettext 0.18.1.1 fills this gap.

> Cygwin is not Windows but a POSIX system in the first place.

POSIX explicitly allows and foresees such localization without a specific
action of the user. Quoting POSIX:2008
<http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html>:

     "All implementations shall define a locale as the default locale, to be
      invoked when no environment variables are set, or set to the empty
      string.  This default locale can be the POSIX locale or any other
      implementation-defined locale.  Some implementations may provide
      facilities for local installation administrators to set the default
      locale, customizing it for each location.  POSIX:2008 does not require
      such a facility."

This "facility" for "local installation administrators" is the
"Regional Settings" control panel in Windows (and likewise in Mac OS X).

So, what we need is that when the user has set his/her regional settings
to "German", and has not set any environment variables, then a program
that does

      setlocale (LC_ALL, "");

will pick up the German locale. In Cygwin 1.7 it is called "de_DE.UTF-8".

There are three ways to achieve this behaviour:

  a) The system can set environment variables that reflect the regional
     settings. For example, if the user has chosen German, Cygwin's
     login process could set LANG to de_DE.UTF-8.

     This approach is used in Linux desktops like KDE.

  b) The system's setlocale() function can, when the second argument is
     the empty string and the respective environment variables don't
     specify anything, fetch the value from the "Regional Settings"
     panel.

     Cygwin could do that.

  c) Programs can call libintl_setlocale(), and libintl_setlocale can,
     when the second argument is the empty string and the respective
     environment variables don't specify anything, fetch the value
     from the "Regional Settings" panel.

     This is what's implemented in gettext 0.18.1.1.
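
The decision that options b) and c) have in common can be sketched in C
as follows. This is only an illustration under stated assumptions, not the
actual code of gettext or Cygwin: system_default_locale_name() is a
hypothetical stand-in for the query of the "Regional Settings" panel and
returns a fixed value here, the other function names are invented as well,
and the check of the environment is coarser than what a real setlocale()
does per category.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical stand-in for the query of the "Regional Settings" panel.
   A real implementation would ask the OS; this sketch returns a fixed name. */
const char *system_default_locale_name(void)
{
    return "de_DE.UTF-8";
}

/* Returns 1 if any of the POSIX locale environment variables is set to a
   non-empty value, 0 otherwise.  Simplification: a real setlocale() looks
   only at LC_ALL, the variable of the requested category, and LANG. */
int environment_selects_locale(void)
{
    static const char *const vars[] = {
        "LC_ALL", "LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
        "LC_MONETARY", "LC_NUMERIC", "LC_TIME", "LANG"
    };
    for (size_t i = 0; i < sizeof vars / sizeof vars[0]; i++) {
        const char *value = getenv(vars[i]);
        if (value != NULL && value[0] != '\0')
            return 1;
    }
    return 0;
}

/* Maps the second argument of setlocale() to the name that should really
   be used: "" with an empty environment becomes the system default;
   everything else (including NULL, which only queries) passes through. */
const char *effective_locale_argument(const char *locale)
{
    if (locale != NULL && locale[0] == '\0' && !environment_selects_locale()) {
        const char *fallback = system_default_locale_name();
        assert(fallback != NULL && fallback[0] != '\0');
        return fallback;
    }
    return locale;
}

/* Wrapper in the spirit of options b) and c). */
char *sketch_setlocale(int category, const char *locale)
{
    return setlocale(category, effective_locale_argument(locale));
}
```

With such a wrapper, sketch_setlocale (LC_ALL, "") in an empty environment
would request the system default, while any explicit setting of LANG or
LC_* by the user keeps precedence.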

> Do NOT call Windows functions in Cygwin libraries, unless
> the lib is doing something very special which isn't provided by POSIX
> functions.  Only call POSIX functions.  Don't mix the Cygwin and the
> Windows environment.  Please leave the interfacing to the underlying OS
> the sole job of Cygwin.

OK, then the following four facilities are needed in Cygwin.

1) We need the name of the locale which is in effect when the user has
   not specified environment variables.

   Either through option a) above. Programs can then do getenv ("LANG").
   Cygwin documentation <http://www.cygwin.com/cygwin-ug-net/setup-locale.html>
   currently says "The default locale in the absence of the aforementioned
   locale environment variables is "C.UTF-8"." This would have to change.

   Or through option b) above. Programs can then peek at the return
   value of  setlocale (LC_ALL, "").

   Or through an API function that calls GetUserDefaultLCID() and
   converts that to a glibc style locale name (e.g. "zh_CN.UTF-8")
   or to an RFC 3066 style locale name (e.g. "zh-Hans").

2) We need the name of the locale of the current thread.

   Either through the functions newlocale() and uselocale(), as in POSIX.

   Or through an API function that calls GetThreadLocale() and
   converts that to a glibc style locale name (e.g. "zh_CN.UTF-8")
   or to an RFC 3066 style locale name (e.g. "zh-Hans").

   Locale per thread is mainly needed for web application servers,
   not for GUI programs.

3) Gettext needs the priority list of languages, if the "Regional Settings"
   panel can specify it. In Mac OS X this setting is customizable; I don't know
   whether newer Windows versions have it as well.

4) Programs that do number or date/time formatting will need to access the
   values that the user has specified. E.g. those set in
   <http://www.sisulizer.de/_img/codepage-problems/codepage-regional.jpg>
   <http://pc-error-free.com/blog/wp-content/uploads/2008/12/regional-settings.gif>
   <http://www.sisulizer.de/_img/codepage-problems/w7-regions-and-languages-formats.jpg>
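
For facility 4, the standard way for a program to read such values is
localeconv() (and nl_langinfo() for date/time formats). The following
minimal sketch shows the program side only. It assumes that the user's
settings are visible through the current locale, which is exactly what is
being asked for here. The struct and the function names are made up for
the example.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>

/* Illustrative container for the two numeric conventions used below. */
struct numeric_conventions {
    const char *decimal_point;
    const char *thousands_sep;
};

/* Reads the numeric conventions of the current locale.  The returned
   strings belong to the C library and may be overwritten by later
   localeconv() or setlocale() calls. */
struct numeric_conventions current_numeric_conventions(void)
{
    const struct lconv *lc = localeconv();
    struct numeric_conventions nc = { lc->decimal_point, lc->thousands_sep };
    return nc;
}

/* Formats an amount given in hundredths with the decimal point of the
   current locale.  Returns the snprintf() result. */
int format_hundredths(unsigned long hundredths, char *buf, size_t size)
{
    struct numeric_conventions nc = current_numeric_conventions();

    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%lu%s%02lu",
                    hundredths / 100, nc.decimal_point, hundredths % 100);
}
```

In the "C" locale, format_hundredths (123456, ...) yields "1234.56". In a
German locale the same call would be expected to use a comma instead.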

I believe all of these are available through Win32 or MUI API calls,
and decently internationalized programs will need them.
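
As an illustration of the conversion mentioned in 1) and 2), here is a
minimal sketch of the mapping from a Windows LCID to a glibc style locale
name. It deliberately does not call GetUserDefaultLCID() or
GetThreadLocale(), so that it compiles anywhere; the caller is assumed to
pass the value obtained from one of them. The function name and the
four-entry table are invented for the example; a real table would have to
cover all language identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct langid_name {
    unsigned int langid;   /* Windows language identifier */
    const char *name;      /* language_TERRITORY part of the locale name */
};

/* Tiny excerpt; only meant to show the shape of the table. */
static const struct langid_name langid_names[] = {
    { 0x0407, "de_DE" },
    { 0x0409, "en_US" },
    { 0x040C, "fr_FR" },
    { 0x0804, "zh_CN" },
};

/* Writes e.g. "zh_CN.UTF-8" into buf.  Returns 0 on success, -1 if the
   language is not in the table or buf is too small. */
int locale_name_from_lcid(unsigned long lcid, char *buf, size_t size)
{
    static const char suffix[] = ".UTF-8";
    /* The low 16 bits of an LCID are the language identifier. */
    unsigned int langid = (unsigned int)(lcid & 0xFFFF);

    assert(buf != NULL);
    for (size_t i = 0; i < sizeof langid_names / sizeof langid_names[0]; i++) {
        if (langid_names[i].langid == langid) {
            size_t len = strlen(langid_names[i].name);
            if (len + sizeof suffix > size)
                return -1;
            memcpy(buf, langid_names[i].name, len);
            memcpy(buf + len, suffix, sizeof suffix); /* copies the NUL too */
            return 0;
        }
    }
    return -1;
}
```

The ".UTF-8" suffix is hard-coded because UTF-8 is the default encoding of
the Cygwin 1.7 locales discussed above.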

Bruno
-- 
In memoriam Elisabeth von Thadden <http://en.wikipedia.org/wiki/Elisabeth_von_Thadden>
