Bug 16621 - C.UTF-8 locales should be regarded like C w.r.t. $LANGUAGE precedence
Alias: None
Product: glibc
Classification: Unclassified
Component: locale
Version: 2.18
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Depends on: 17318
Reported: 2014-02-21 12:47 UTC by Vincent Lefèvre
Modified: 2015-08-30 06:19 UTC
CC List: 3 users

fweimer: security-

Description Vincent Lefèvre 2014-02-21 12:47:41 UTC
Scripts tend to use LC_ALL=C.UTF-8 instead of LC_ALL=C to get UTF-8 support while still behaving in a locale-independent manner. However, $LANGUAGE is still taken into account by glibc:

xvii% LANGUAGE=fr_FR LC_ALL=C.UTF-8 cp
cp: opérande de fichier manquant
Saisissez « cp --help » pour plus d'informations.
xvii% LANGUAGE=fr_FR LC_ALL=C cp
cp: missing file operand
Try 'cp --help' for more information.

Both commands should produce output in English.

glibc should apply the same rules to C.UTF-8 as to the C locale.
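
The behaviour shown above can be modelled roughly as follows. This is an illustrative sketch in Python, not glibc's code: the function name `message_languages` and the dictionary standing in for the environment are assumptions made for the example. It assumes the usual POSIX precedence (LC_ALL, then LC_MESSAGES, then LANG) and that only the exact locale name "C" switches off $LANGUAGE.

```python
def message_languages(env):
    """Return the message languages to try, in order (hypothetical model)."""
    # Effective message locale: the first non-empty variable among
    # LC_ALL, LC_MESSAGES and LANG; "C" if none of them is set.
    locale = "C"
    for name in ("LC_ALL", "LC_MESSAGES", "LANG"):
        if env.get(name):
            locale = env[name]
            break

    # Assumption of this sketch: only the exact name "C" disables LANGUAGE,
    # which is why "C.UTF-8" falls through to the LANGUAGE lookup below.
    if locale == "C":
        return ["C"]

    # LANGUAGE is a colon-separated priority list.
    language = env.get("LANGUAGE", "")
    if language:
        return [part for part in language.split(":") if part]
    return [locale]


print(message_languages({"LANGUAGE": "fr_FR", "LC_ALL": "C.UTF-8"}))  # ['fr_FR']
print(message_languages({"LANGUAGE": "fr_FR", "LC_ALL": "C"}))        # ['C']
```

Under this model the first call mirrors the French output of the first `cp` invocation, and the second mirrors the English output.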

Also reported in Debian:
Comment 1 Andreas Schwab 2014-02-21 12:58:11 UTC
There is no C.UTF-8 locale in glibc.
Comment 2 Vincent Lefèvre 2014-02-21 13:41:23 UTC
(In reply to Andreas Schwab from comment #1)
> There is no C.UTF-8 locale in glibc.

That's strange, because on the Subversion mailing list it was regarded as standard. Subversion works well only in UTF-8 locales, and the suggested solution was to use C.UTF-8: http://mail-archives.apache.org/mod_mbox/subversion-users/201307.mbox/%3C51DC54AD.7010601@wandisco.com%3E
Comment 3 Nick Coghlan 2014-08-27 12:59:06 UTC
I have filed bug #17318 requesting the inclusion of a C.UTF-8 locale in upstream glibc (actually prompted by https://bugzilla.redhat.com/show_bug.cgi?id=902094, but I found this bug while checking whether anyone else had already made the request).
Comment 4 Mike Frysinger 2015-08-29 20:41:37 UTC
glibc doesn't provide a C.UTF-8, so any bug report about it makes no sense
Comment 5 Nick Coghlan 2015-08-30 04:38:16 UTC
While it's true that glibc itself doesn't provide a C.UTF-8 locale, does that really make this bug report invalid?

The Debian-derived family of distros defaults to adding a C.UTF-8 locale at the distro level, but it doesn't quite work as expected, as it's missing some of the special casing afforded to the default C locale. The specific one covered by this BZ is the fact that LC_ALL=C will make glibc ignore the LANGUAGE setting, but LC_ALL=C.UTF-8 doesn't.

Another possible way of phrasing the request would be for all "C.*" locales to ignore the LANGUAGE setting the same way the unmodified "C" locale does, rather than special casing "C.UTF-8". I'm not *personally* aware of any such locales in widespread use other than "C.UTF-8", but that doesn't mean there aren't any.
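
As a rough illustration of that alternative phrasing, the check could look something like the sketch below. It is written in Python purely for readability and is not glibc code; the helper name `language_is_ignored` is hypothetical, and the prefix test on "C." is an assumption about how "all C.* locales" would be matched.

```python
def language_is_ignored(locale_name):
    """Hypothetical rule: should LANGUAGE be skipped for this locale name?"""
    # The exact name "C" is what gets special-cased today; the proposal
    # sketched here additionally covers every "C.<codeset>" name.
    return locale_name == "C" or locale_name.startswith("C.")


for name in ("C", "C.UTF-8", "fr_FR.UTF-8"):
    print(name, language_is_ignored(name))
```

A prefix match keeps the rule independent of any particular codeset, so no separate entry for "C.UTF-8" is needed.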
Comment 6 Mike Frysinger 2015-08-30 05:59:32 UTC
(In reply to Nick Coghlan from comment #5)

bugs in distros aren't really the domain of glibc upstream.  if you think the proposal in bug 17318 has limitations or you have concerns, you should post it there or the mailing list thread on the topic.
Comment 7 Nick Coghlan 2015-08-30 06:19:55 UTC
I filed #17318 because Fedora doesn't want to add C.UTF-8 independently of upstream glibc (at least in part to avoid inconsistencies like the one reported here).

However, I also interpret the current bug closure as categorically rejecting the notion of treating C.UTF-8 the same as the C locale when it comes to the LANGUAGE variable, which doesn't seem like the correct outcome.

If I've misunderstood what "CLOSED INVALID" means and the intent is for bug #17318 to include the behaviour requested here, then yes, I would consider that a reasonable way to resolve this issue.