This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]


Hi,

On 05/01/2019 23.12, Egor Kobylkin wrote:
> On 05.01.19 15:35, Rafal Luzynski wrote:
>> 2.01.2019 19:38 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>
>>> Changelog v12:
>>> [...]
>>>
>>> Changelog v11:
>>> * Re-targeted the patch against locale/C-translit.h.in as the proper
>>> file for the ASCII translit table.
>>> * Correspondingly the patch now only contains the additional
>>> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
>>> The 'include "translit_cyrillic";""' directives are not necessary in the
>>> locale files and they are now all left intact.
>>> * Also the file translit_cyrillic is not longer needed and is omitted.
>>> * Edited below email, commit message.
>>> [...]
>>
>> I have tested this and, unfortunately, now this transliteration
>> works *only* in C locale, that is, only when no locale is set or when
>> it is explicitly set to C (C.UTF8, POSIX).  It does not work when locale
>> is set to anything different, including en_US, ru_RU, etc.
> 
> Good catch! Should we maybe split this into two patches, one for C and
> the other for "country" locales? They have different codes and
> functionality so it looks like it would be easier to keep focus.

That would probably make sense, the standard C/POSIX locale won't
support System A so it also narrows down solution alternatives with it.

(If the C.UTF-8 locale (see
https://sourceware.org/bugzilla/show_bug.cgi?id=17318) materializes one
day I'm not sure would transliteration be applicable in that context.)

> My understanding is that locale/C-translit.h.in is still the proper
> locale for the sole ASCII translit table. It is also the only solution
> for many use cases where there is no locale available (not compiled or
> not set).

Correct, as Siddhesh mentioned those rules will end up to the built-in
C/POSIX locale which is ASCII and will be used if no other locales are
available or set properly. The translit_* files won't affect to it.

> "Country" locales in localedata/locales/ can then have the exact same
> translit table included or they can have any other flavor - I don't see
> a problem here.

Indeed, and since those files are not limited to ASCII, perhaps we could
now reconsider the v9 approach for them, i.e., prefer System A if
possible, otherwise use System B / ASCII (just need to make sure that
the ASCII fall-back for them will match the built-in C ASCII rule)?

Thanks,

-- 
Marko Myllynen


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]