This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29


8.10.2018 14:40 Marko Myllynen <myllynen@redhat.com> wrote:
> Hi,
>
> Thanks for the update. I have a few mostly cosmetic comments below;
> hopefully we'll hear from others whether they agree with this direction.
>
> - Please add the standard glibc locale header (see the existing
> translit_* files for reference)
> - Consider wrapping the header lines at or around column 70-72
> - Consider describing which characters, character ranges, or blocks are
> supported (perhaps also describe why some of those are not included, see
> e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
> - Please remove trailing whitespace and spaces after ;

Thanks for this, Marko.  While we are at it, entries like this one in the
ChangeLog and in the commit message need two fixes:

	* locales/aa_DJ: likewise

1. The path should be relative to the root directory of the glibc source
   tree, that is: "* localedata/locales/aa_DJ".
2. "likewise" should be "Likewise." (starting with an uppercase letter and
   ending with a period).

> - No duplicates:
>
> % CYRILLIC SMALL LETTER IE
> <U0435> <U0065>; <U0065>
>
> should become:
>
> % CYRILLIC SMALL LETTER IE
> <U0435> <U0065>
>
> - There are a few issues with the definitions:
>
> % CYRILLIC CAPITAL LETTER U
> <U0423> <U0055>; <U0055>
> % CYRILLIC UNDEFINED
> <U0423><U0423> <U00DA>; "<U0055><U0060>"
>
> % CYRILLIC SMALL LETTER U
> <U0443> <U0075>; <U0075>
> % CYRILLIC UNDEFINED
> <U0443><U0443> <U00FA>; "<U0075><U0060>"

Are the duplicates here because some Cyrillic letters have several Latin
transliterations depending on the context?  For example, Cyrillic IE is
sometimes transliterated as "e" and sometimes as "ie", "ye" or "je".
Could we provide rules for groups of characters instead?

> I wonder whether it would be possible to automate generation of this
> file so that issues like the above could be avoided. But perhaps that
> could be the next step once this initial patch lands.

I agree with this.
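
Short of full generation, even a small check could catch the duplicate
alternatives Marko pointed out.  What follows is only a rough sketch, not
an existing glibc tool: the function name dedupe_line is hypothetical, and
it assumes one entry per line, "%" as the comment character, and no ";"
inside quoted strings.

```python
import re

# Assumed entry shape: one or more <Uxxxx> source code points, whitespace,
# then the alternatives separated by ";".
ENTRY = re.compile(r'^((?:<U[0-9A-Fa-f]{4,8}>)+)\s+(.*)$')


def dedupe_line(line):
    """Return (new_line, removed) with repeated alternatives dropped.

    Comment lines and lines that do not look like entries are returned
    with trailing whitespace stripped and an empty `removed` list.
    """
    line = line.rstrip()
    match = ENTRY.match(line)
    if line.startswith('%') or match is None:
        return line, []
    source, rest = match.groups()
    kept, removed = [], []
    for alt in (part.strip() for part in rest.split(';')):
        if not alt:
            continue  # ignore empty pieces such as a trailing ";"
        if alt in kept:
            removed.append(alt)  # same alternative seen earlier on the line
        else:
            kept.append(alt)
    # Join without a space after ";", as requested in the review.
    return source + ' ' + ';'.join(kept), removed


if __name__ == '__main__':
    print(dedupe_line('<U0435> <U0065>; <U0065>'))
```

Run over the whole file, any line with a non-empty "removed" list would
be one to fix by hand before committing.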

Regards,

Rafal

