[PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]

Egor Kobylkin egor@kobylkin.com
Wed Jan 9 00:46:00 GMT 2019


On 07.01.19 21:37, Marko Myllynen wrote:
> Hi,
> 
> On 05/01/2019 23.12, Egor Kobylkin wrote:
>> On 05.01.19 15:35, Rafal Luzynski wrote:
>>> 2.01.2019 19:38 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>
>>>> Changelog v12:
>>>> [...]
>>>>
>>>> Changelog v11:
>>>> * Re-targeted the patch against locale/C-translit.h.in as the proper
>>>> file for the ASCII translit table.
>>>> * Correspondingly the patch now only contains the additional
>>>> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
>>>> The 'include "translit_cyrillic";""' directives are not necessary in the
>>>> locale files and they are now all left intact.
>>>> * Also the file translit_cyrillic is no longer needed and is omitted.
>>>> * Edited below email, commit message.
>>>> [...]
>>>
>>> I have tested this and, unfortunately, now this transliteration
>>> works *only* in C locale, that is, only when no locale is set or when
>>> it is explicitly set to C (C.UTF8, POSIX).  It does not work when locale
>>> is set to anything different, including en_US, ru_RU, etc.
>>
>> Good catch! Should we maybe split this into two patches, one for C and
>> the other for "country" locales? They have different codes and
>> functionality so it looks like it would be easier to keep focus.
> 
> That would probably make sense, the standard C/POSIX locale won't
> support System A so it also narrows down solution alternatives with it.
> 

[SNIP]

>> "Country" locales in localedata/locales/ can then have the exact same
>> translit table included or they can have any other flavor - I don't see
>> a problem here.
> 
> Indeed, and since those files are not limited to ASCII, perhaps we could
> now reconsider the v9 approach for them, i.e., prefer System A if
> possible, otherwise use System B / ASCII (just need to make sure that
> the ASCII fall-back for them will match the built-in C ASCII rule)?
> 

Happy to hear the split seems to be a clear-cut one.
How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]... 
C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report 
(number) and title for clarity in communication?

The bug report for [PATCH v9] ("Countries" locales) should then ideally
state your (and others') explicit requirements as to the GOST System A/B
fall-back, which countries to include, etc. Again, I myself have no other
requirement here than to have _any_ translit in place.
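
To make the fall-back requirement concrete, here is a minimal sketch of
the "prefer System A if possible, otherwise System B / ASCII" rule quoted
above. It is not how glibc implements transliteration; it only
illustrates the selection order. The names TRANSLIT, encodable and
translit, the "?" placeholder for unmapped characters, and the small
mapping subset are assumptions made for illustration.

```python
# Illustrative subset only; the real tables would live in the glibc
# locale sources. Each entry maps a Cyrillic letter to a pair:
# (System A form with diacritics, System B / ASCII form).
TRANSLIT = {
    "ж": ("ž", "zh"),
    "ч": ("č", "ch"),
    "ш": ("š", "sh"),
    "а": ("a", "a"),
    "к": ("k", "k"),
    "о": ("o", "o"),
    "т": ("t", "t"),
}


def encodable(text, charset):
    """Return True if text can be represented in the target charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, charset):
    """Transliterate text for charset, trying System A before System B."""
    out = []
    for ch in text:
        if encodable(ch, charset):
            # The target charset has the character; keep it unchanged.
            out.append(ch)
            continue
        # Candidates are ordered: System A first, ASCII fall-back second.
        for candidate in TRANSLIT.get(ch, ()):
            if encodable(candidate, charset):
                out.append(candidate)
                break
        else:
            # No usable mapping (hypothetical placeholder choice).
            out.append("?")
    return "".join(out)
```

With this ordering an ASCII target always ends up with the second
column, so the ASCII result of a "Countries" locale can be kept
identical to the built-in C rule by sharing that column.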

This way it would probably be easier to wrap up the decision-making
process for each patch separately. We could then get the v12 C/POSIX
patch out of the door in 2.30 and take all the time we need to set up
the rules for "Countries" locales as you need them to be.

Bests,
Egor


