This is the mail archive of the mailing list for the glibc project.

Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29

On 10.10.2018 00:17, Rafal Luzynski wrote:
> 9.10.2018 20:34 Egor Kobylkin wrote:
>> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
>> "<U0443><U0301>" (<U00FA>).
>> It works now with
>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>> [...]
> I wonder why you need Cyrillic U with acute, and why you comment it
> as "undefined" at all.  I know that any Cyrillic vowel may appear with
> an acute accent but "the diacritic is used only in dictionaries, children's
> books, resources for foreign-language learners (...)". [1]  So maybe
> all vowels with an acute accent should be handled (which I think is fine)
> rather than just U.

I just took the ISO9 table and implemented it at Marko's suggestion.
Personally, I have no opinion on which letters should be included or
under what name. These funny Us just happened to be in the ISO9 table.

There is no single codepoint, and hence no name, for <U0423><U0301> or
<U0443><U0301> in Unicode. That’s why they come through as "undefined"
from my worksheet: it does a reverse lookup of the names based on the
Unicode codepoints.
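
To illustrate the reverse lookup, here is a rough sketch in Python. It
is not the actual worksheet; lookup_name is a hypothetical helper and
the "undefined" fallback is only assumed to mirror what the worksheet
emits. It shows that the base letter has a name while the two-codepoint
sequence does not, and that NFC normalization does not compose the
sequence into a single character.

```python
import unicodedata


def lookup_name(sequence: str) -> str:
    """Return the Unicode name of a single codepoint.

    Hypothetical helper: sequences of more than one codepoint have no
    character name of their own, so they fall back to "undefined".
    """
    if len(sequence) != 1:
        return "undefined"
    return unicodedata.name(sequence, "undefined")


capital_u = "\u0423"               # <U0423>
capital_u_acute = "\u0423\u0301"   # <U0423><U0301>

print(lookup_name(capital_u))
print(lookup_name(capital_u_acute))

# NFC composes a base letter and a combining mark only when Unicode
# defines a precomposed character for the pair; the length stays 2 here.
print(len(unicodedata.normalize("NFC", capital_u_acute)))
```

The same holds for the lowercase pair <U0443><U0301>, so whatever
comment ends up in translit_cyrillic has to be written by hand.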

We can manually change it in translit_cyrillic to whatever you’d
suggest. I just don’t know the right name.

On my side, I think I have completed all outstanding tasks for the
patch. So please let me know explicitly if you'd like anything changed
there.

I was planning to rewrite just the commit message according to your
earlier feedback and resubmit sometime soon.

