This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
9.10.2018 18:10 Marko Myllynen <firstname.lastname@example.org> wrote:
> On 2018-10-09 01:04, Rafal Luzynski wrote:
> > If you refer to languages other than Russian which also use the Cyrillic
> > alphabet but need different transliteration rules than Russian for
> > the same characters, then it is OK for me now. I am afraid that the iconv
> > algorithm does not handle such a case. Of course, we should add this missing
> > feature eventually, but I do not volunteer to do it now.
> Yes, this would be needed for correct transliteration of different
> languages, and this might be quite a bit of work. There's also the case
> of transliteration and character sets, consider the transliteration
> examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:
> Russian: Борис Николаевич Ельцин
> Int'l: Boris Nikolaevič Elʹcin
> Finnish: Boris Nikolajevitš Jeltsin
> French: Boris Nikolaïevitch Ieltsine
> Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]
No, I did not mean transcription according to the rules of the destination
locale using the Latin script, but that the transliteration rules may differ
depending on the language of the source text. For example, consider
this Cyrillic string: "нъг" (I'm not saying that it is actually used
in any existing word, but it still must be handled). By our transliteration
rules it will be transliterated as "n``g". This is fine for Russian, but
if we knew that the source string was Ukrainian it would be transliterated
as "n``h"; if it was Bulgarian it would be transliterated as "năg".
Similarly, if you had to transliterate the Latin letters "sch" to Cyrillic,
you would first have to ask what the source language was.
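
To make the "нъг" example above concrete, here is a minimal sketch in
Python. It is not how iconv or the glibc locale files work: the table
names, the language codes and the `transliterate` function are all
hypothetical, and only the three letters from the example are covered.
It only shows the idea of per-language overrides on top of a
Russian-style default.

```python
# Hypothetical default table (Russian-style), limited to the three
# letters used in the example string.
DEFAULT_RULES = {"н": "n", "ъ": "``", "г": "g"}

# Hypothetical per-source-language overrides of the default table.
OVERRIDES = {
    "ru": {},
    "uk": {"г": "h"},   # Ukrainian: г is transliterated as h
    "bg": {"ъ": "ă"},   # Bulgarian: ъ is transliterated as ă
}


def transliterate(text, source_lang=None):
    """Transliterate text, falling back to the default rules when the
    source language is unknown or has no overrides."""
    rules = {**DEFAULT_RULES, **OVERRIDES.get(source_lang, {})}
    # Characters without a rule are passed through unchanged.
    return "".join(rules.get(ch, ch) for ch in text)
```

Calling `transliterate("нъг")` without a language uses the default
table, while passing "uk" or "bg" applies the corresponding override.
The missing piece in practice is exactly the second argument: the
caller has no way to say which language the source text is in.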
Unfortunately, I think that distinguishing the source language is impossible
at the moment, so let's assume that we fall back to Russian if there is