This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug localedata/23857] Esperanto has no country


https://sourceware.org/bugzilla/show_bug.cgi?id=23857

--- Comment #7 from Rafal Luzynski <digitalfreak at lingonborough dot com> ---
Hi,

I'm sorry for the delayed reply.

(In reply to Carmen Bianca Bakker from comment #6)
> [...]
> https://gitlab.gnome.org/GNOME/gnome-control-center/issues/260 - Appears
> glibc-related, because the languages and locales/formats map directly to
> glibc options.  I wish I was more competent with C, and I'd try to fix it up
> myself.

Thank you.  I have not looked at the source code yet, but my guess is that the
list of territories comes from the list of locales with the language part
stripped.  This makes some sense to me: formats, units, etc. depend on the
territory rather than the language.  For example, an English locale may have
different units, currency, country name, etc. for the USA, UK, Australia,
India, Ireland, and so on.  On the other hand, people living in one country
probably use the same formats, units, and currency even if they speak
different languages.  Therefore, if you want to select "Esperanto" as the
locale for formats then... actually what would you expect?  Currency, country
name, address format, car plate - "as used in (where?)"  Why would
"Netherlands" not work better for you, for example?
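
To illustrate the guess above (this is not the GNOME code, which I have not
read), here is a minimal sketch of deriving a territory list by stripping the
language part from locale names.  The helper names split_locale and
territories and the sample list are hypothetical; only the usual
language[_TERRITORY][.codeset][@modifier] name layout is assumed.

```python
import re

# Assumed layout of a locale name: language[_TERRITORY][.codeset][@modifier]
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def split_locale(name):
    """Split a locale name into its parts; absent parts are None."""
    match = _LOCALE_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognizable locale name: {name!r}")
    return match.groupdict()


def territories(names):
    """Collect the distinct territory codes found in a list of locale names."""
    found = set()
    for name in names:
        territory = split_locale(name)["territory"]
        if territory is not None:  # "eo" has no territory part, so it adds nothing
            found.add(territory)
    return sorted(found)


# Hypothetical sample of installed locales
sample = ["en_US.UTF-8", "en_GB.UTF-8", "nl_NL.UTF-8", "ia_FR", "eo"]
print(territories(sample))
```

With a list built this way, "eo" contributes no entry at all, which would
explain why Esperanto cannot be picked as a "formats" region.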

I understand you may have some good reasons to select Esperanto formats
but I'm trying to reflect the reasoning of the GNOME designers.

> https://bugs.python.org/issue35163 - Some weird obsolete configuration.

My first suggestion is that Python should not map ambiguous locale names to
more detailed ones that the current operating system does not support.

Would adding "eo.ISO8859-3" help to fix this issue?  I think the reason is that
historically the locales without a specified encoding used an 8-bit encoding
like ISO 8859-1 or ISO 8859-3.  Therefore the locales often map to 8-bit
encodings unless you specify "utf8" explicitly.  Later, when Unicode became
popular and widely used, newly added locales in glibc used UTF-8 as their only
encoding.  This is the case for Esperanto: "eo" is an alias of "eo.UTF-8".
Somehow Python treats it as an alias of "eo_XX.ISO8859-3".
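
A small sketch for checking this on a given system, using only the standard
Python locale module.  The helper name describe is hypothetical.  The
normalized name depends on the Python version, and whether the C library
accepts the plain name depends on which locales are installed, so no expected
output is given.

```python
import locale


def describe(name):
    """Return (name as normalized by Python, whether libc accepts the plain name)."""
    # locale.normalize() consults Python's own alias table, not the C library.
    normalized = locale.normalize(name)

    # Remember the current LC_CTYPE setting so it can be restored afterwards.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        accepted = True
    except locale.Error:
        # Raised when the C library does not know this locale name.
        accepted = False
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)
    return normalized, accepted


normalized, accepted = describe("eo")
print("Python normalizes 'eo' to:", normalized)
print("C library accepts plain 'eo':", accepted)
```

If the first line prints a name that the C library does not provide, that is
the mismatch described above.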

On the other hand I am not sure if adding the old encodings makes sense
nowadays.  Old encodings are preserved only in order not to break existing
systems.  Does any existing Linux system use "eo.ISO8859-3" and rely on it?  Is
it likely to be true if this locale has never existed?

> (In reply to Rafal Luzynski from comment #5)
> > I was not aware of this case with Interlingua.  I would rather go for
> > renaming "ia_FR" to "ia" so that "eo" would not be alone anymore :-) but my
> > knowledge about Interlingua is too little to enforce it now.
> 
> Is it okay to add the author of the original Interlingua bug report to this
> bug report?  Perhaps they can add an original insight, and perhaps their
> motivation for choosing "ia_FR" over "ia".

The bug report is https://sourceware.org/bugzilla/show_bug.cgi?id=14879 but I
wouldn't like to bother the authors of the Interlingua patch with the issues
of Esperanto.

By the way, CLDR has recently treated the assignment of Interlingua to France
as a bug:

http://unicode.org/cldr/trac/ticket/11164

This raises my motivation to rename "ia_FR" to "ia" but not to the level
sufficient to actually do it.

> [...]
> CLDR has "Unknown Region" listed under ZZ, which would work sufficiently
> well for country-less languages.  i.e., proposed solution 2, or solution 3
> with "Unknown Region" as country (and "XXX" as currency).
> 
> https://unicode.org/cldr/charts/34/summary/root.html

It is possible as a workaround, but I still believe we are able to handle "eo"
without a country name.  Even more: we (the glibc project) are able to handle
it, and since there are projects which do not (yet) handle it correctly, I
think we should rather approach them and tell them how to fix it.  So far I
don't think we have found any project where the issue exists and cannot be
fixed.

> It could also work for Yiddish, where "yi_US" is for the Yiddish population
> inside the US, and "yi_ZZ" could be used by non-US Yiddish populations who
> are spread across many other countries.  Though in the case of Yiddish
> specifically, it might probably make sense to add an Israel entry, but that
> will likely depend on a qualified volunteer doing the work.

Definitely not.  Yiddish is not an artificial language and is definitely tied
to some territories where it is actually spoken.  It seems to me that Israel
could make sense and I don't mind adding it if needed; the USA probably also
makes sense.  I don't think that calling Yiddish "worldwide" or "non-US" or
"unknown" (in terms of territory) makes sense because we could say the same
about any random language.

And please, if possible let's focus on Esperanto here rather than discussing
possible changes in other languages.

