This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29

From: Rafal Luzynski <digitalfreak at lingonborough dot com>
To: Marko Myllynen <myllynen at redhat dot com>, Egor Kobylkin <egor at kobylkin dot com>, Keld Simonsen <keld at keldix dot com>
Cc: libc-alpha at sourceware dot org, libc-locales at sourceware dot org, "Dmitry V. Levin" <ldv at altlinux dot org>, Volodymyr Lisivka <vlisivka at gmail dot com>, Carlos O'Donell <carlos at redhat dot com>, Max Kutny <mkutny at gmail dot com>, danilo at gnome dot org
Date: Wed, 10 Oct 2018 00:08:56 +0200 (CEST)
Subject: Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
References: <41532e13-a63d-5df1-ab37-05eb4d6c8d0a@kobylkin.com> <20180412224352.GB2911@altlinux.org> <16e785f3-2e9f-ceb2-698f-dc33c91a5d5e@kobylkin.com> <ac4c9b3e-aeae-30de-23ef-24d8f53d7bc4@kobylkin.com> <20181003091949.GA21486@rap.rap.dk> <21d872b2-613e-d1f5-26c0-baa4b5721df9@kobylkin.com> <1485772360.805333.1538731225156@poczta.nazwa.pl> <19e29568-e710-535f-4f90-98dbcec930ed@kobylkin.com> <1028447684.826961.1539036295224@poczta.nazwa.pl> <63fb4fae-a93b-7aff-13df-4452cbc8853f@redhat.com>
Reply-to: Rafal Luzynski <digitalfreak at lingonborough dot com>

9.10.2018 18:10 Marko Myllynen <myllynen@redhat.com> wrote:
> On 2018-10-09 01:04, Rafal Luzynski wrote:
> > If you refer to other languages than Russian which also use the Cyrillic
> > alphabet but need a different transliteration rules than Russian for
> > the same characters then it is OK for me now. I am afraid that the iconv
> > algorithm does not handle such case. Of course, we should add this missing
> > feature eventually but I do not volunteer to do it now.
>
> Yes, this would be needed for correct transliteration of different
> languages, and this might be quite a bit of work. There's also the case
> of transliteration and character sets, consider the transliteration
> examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:
>
> Russian: Борис Николаевич Ельцин
> Int'l: Boris Nikolaevič Elʹcin
> Finnish: Boris Nikolajevitš Jeltsin
> French: Boris Nikolaïevitch Ieltsine
> Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]

No, I did not mean the transcription using the rules of the destination
locale using Latin but that the rules of transliteration may be different
depending on the language of the source text.  For example, consider
this Cyrillic string: "нъг" (I'm not telling that it is actually used
in any existing word but still must be handled).  By our transliteration
rules it will be transliterated as "n``g".  But this is fine for Russian;
if we knew that the source string is Ukrainian it would be transliterated
as "n``h"; if it was Bulgarian it would be transliterated as "năg".
Similarly, if you had to transliterate the Latin letters "sch" to Cyrillic
first you would have to ask what was be the source language.

Unfortunately, I think that distinction of the source language is impossible
at the moment so let's assume that we fall back to Russian if there is
any ambiguity.

Regards,

Rafal

Follow-Ups:
- Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  - From: Marko Myllynen

References:
- Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  - From: Egor Kobylkin
- Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  - From: Keld Simonsen
- Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  - From: Egor Kobylkin
- Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  - From: Rafal Luzynski
- [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  - From: Egor Kobylkin
- Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  - From: Rafal Luzynski
- Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
  - From: Marko Myllynen

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]