This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug localedata/2872] Transliteration Cyrillic -> ASCII fails
- From: "ekobylkin at paypal dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Tue, 08 Sep 2015 10:20:52 +0000
- Subject: [Bug localedata/2872] Transliteration Cyrillic -> ASCII fails
- Auto-submitted: auto-generated
- References: <bug-2872-131 at http dot sourceware dot org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=2872
--- Comment #18 from Egor Kobylkin <ekobylkin at paypal dot com> ---
(In reply to Ulrich Drepper from comment #2)
> Transliteration is locale dependend, there is no way around it:
>
> Russian/Cyrillic: Горбачёв
>
> German transliteration: Gorbaschow
>
> English transliteration: Gorbatsov or Gorbatsev
>
> If you want cyrillic transliteration for the locale you use, provide the
> data.
I want to comment on this to clarify my starting point and to ask for
suggestions in case somebody decides to take on further development. For now I
believe the issue is solved well, albeit in a very basic way.
From a Russian speaker's point of view, various transliterations of Cyrillic
are possible, depending on the purpose. A good list of such transliterations is
at http://transliteration.ru/. Although they use different Latin characters to
represent the Cyrillic letters, they all carry the same phonetic meaning for a
Russian speaker, so any of them could be used for all Latin-script locales.
This is what I propose as a first approximation to solve this issue. My
submission above takes this approach, with the GOST 7.79-2000 transliteration
chosen as the basis.
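
A minimal Python sketch of the "one table for all Latin locales" idea follows.
It is an illustration only, not the table from the submission: the mapping is a
partial subset covering just the example name, its values are assumed to follow
GOST 7.79-2000 System B, and the names GOST_B_SUBSET and transliterate are
hypothetical.

```python
import unicodedata

# Partial, illustrative table: only the letters needed for the example name.
# Values are assumed to follow GOST 7.79-2000 System B; this is NOT the
# table that was submitted to glibc.
GOST_B_SUBSET = {
    "а": "a", "б": "b", "в": "v", "г": "g", "о": "o",
    "р": "r", "ч": "ch", "\u0451": "yo",  # U+0451 is CYRILLIC SMALL LETTER IO
}


def transliterate(text, table=GOST_B_SUBSET):
    """Replace each mapped Cyrillic letter; pass everything else through."""
    out = []
    # Normalize so that a decomposed "е" + diaeresis matches the U+0451 key.
    for ch in unicodedata.normalize("NFC", text):
        lower = ch.lower()
        if lower in table:
            latin = table[lower]
            # Preserve capitalization of the source letter.
            out.append(latin.capitalize() if ch != lower else latin)
        else:
            out.append(ch)
    return "".join(out)


print(transliterate("Горбачёв"))
```

Because the table does not depend on the locale, the same call gives the same
result whatever LANG or LC_ALL is set to.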
For a person who does not speak Russian, a different transliteration may make
more sense, one that reflects the phonetic rules of their own language. This is
what Ulrich is referring to in his comment above.
One could take my table as a basis and create separate transliteration tables
for specific locales. The one I have proposed could then still serve as the
ASCII//TRANSLIT target, or be replaced by a more suitable one.
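
The layering described here (locale-specific tables on top of a shared
fallback) could look roughly like the Python sketch below. Everything in it is
hypothetical: the names BASE, OVERRIDES and table_for, the lower-case-only
subset of letters, and in particular the "de_DE" values, which are placeholders
and not real locale data.

```python
from collections import ChainMap

# Shared fallback table (hypothetical, lower-case subset only).
BASE = {
    "г": "g", "о": "o", "р": "r", "б": "b", "а": "a",
    "ч": "ch", "\u0451": "yo", "в": "v",
}

# Hypothetical per-locale overrides; the values are illustrative only.
OVERRIDES = {
    "de_DE": {"ч": "tsch", "\u0451": "o", "в": "w"},
}


def table_for(locale_name):
    """Build a str.translate table: locale overrides first, then the base."""
    layered = ChainMap(OVERRIDES.get(locale_name, {}), BASE)
    return str.maketrans(dict(layered))


word = "горбач\u0451в"
print(word.translate(table_for("de_DE")))  # uses the overrides
print(word.translate(table_for("en_US")))  # no overrides, falls back to BASE
```

A locale without its own entry simply gets the base table, which matches the
idea of keeping the proposed table as the default target.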