This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]


On 07.01.19 21:37, Marko Myllynen wrote:
Hi,

On 05/01/2019 23.12, Egor Kobylkin wrote:
On 05.01.19 15:35, Rafal Luzynski wrote:
2.01.2019 19:38 Egor Kobylkin <egor@kobylkin.com> wrote:

Changelog v12:
[...]

Changelog v11:
* Re-targeted the patch against locale/C-translit.h.in as the proper
file for the ASCII translit table.
* Correspondingly the patch now only contains the additional
Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
The 'include "translit_cyrillic";""' directives are not necessary in the
locale files and they are now all left intact.
* Also, the file translit_cyrillic is no longer needed and is omitted.
* Edited below email, commit message.
[...]

I have tested this and, unfortunately, now this transliteration
works *only* in C locale, that is, only when no locale is set or when
it is explicitly set to C (C.UTF8, POSIX).  It does not work when locale
is set to anything different, including en_US, ru_RU, etc.

Good catch! Should we maybe split this into two patches, one for C and
the other for "country" locales? They have different codes and
functionality so it looks like it would be easier to keep focus.

That would probably make sense, the standard C/POSIX locale won't
support System A so it also narrows down solution alternatives with it.


[SNIP]

"Country" locales in localedata/locales/ can then have the exact same
translit table included or they can have any other flavor - I don't see
a problem here.

Indeed, and since those files are not limited to ASCII, perhaps we could
now reconsider the v9 approach for them, i.e., prefer System A if
possible, otherwise use System B / ASCII (just need to make sure that
the ASCII fall-back for them will match the built-in C ASCII rule)?
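
[Editor's illustration] The "prefer System A if possible, otherwise use System B / ASCII" idea above can be sketched roughly as follows. This is only a minimal Python sketch of the fall-back order, not glibc's implementation and not the table from the patch: the two dictionaries are a tiny hand-picked excerpt, and the names SYSTEM_A, SYSTEM_B, can_encode and transliterate are hypothetical.

```python
# Illustrative excerpt only; NOT the table from the patch.
# System A uses diacritics, System B uses plain ASCII digraphs.
SYSTEM_A = {"ж": "ž", "ч": "č", "ш": "š", "а": "a", "к": "k"}
SYSTEM_B = {"ж": "zh", "ч": "ch", "ш": "sh", "а": "a", "к": "k"}


def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, charset):
    """Keep characters the target charset can represent; otherwise try
    the System A candidate first and fall back to System B (ASCII)."""
    out = []
    for ch in text:
        if can_encode(ch, charset):
            out.append(ch)
            continue
        for table in (SYSTEM_A, SYSTEM_B):
            candidate = table.get(ch)
            if candidate is not None and can_encode(candidate, charset):
                out.append(candidate)
                break
        else:
            # No usable candidate in either table (assumed placeholder).
            out.append("?")
    return "".join(out)


print(transliterate("каша", "iso8859_2"))  # System A candidate fits Latin-2
print(transliterate("каша", "ascii"))      # falls back to System B
```

With an ASCII target only the System B candidates survive, which is why the ASCII fall-back in the "country" locales would need to match the built-in C rule.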


Happy to hear the split seems to be a clear cut one.
How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]... C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report (number) and title for clarity in communication?

The bug report for [PATCH v9] ("Countries" locales) should then ideally list your (and others') explicit requirements regarding the GOST System A/B fall-back, which countries to include, etc. Again, I myself have no other requirement here than to have _any_ translit in place.

This way it would probably be easier to wrap up the decision-making process for each patch separately. We may then want to get the v12 POSIX patch out of the door in 2.30 and take all the time we need to set up the rules for the "Countries" locales the way you need them to be.

Bests,
Egor

