This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]


Freeze ping.

I'd like to ping the list about this patch and have some discussion on
moving the ASCII transliteration to locale/C-translit.h.in before the freeze.

The wiki page for 2.29 [12] is set as "immutable" for newly registered
users; I am not sure whether that is intended. I could not add this patch
there as "desired".
I have added the 2.29 keyword to the bug entry.

Bests,
Egor Kobylkin


[12] https://sourceware.org/glibc/wiki/Release/2.29

On 08.12.18 23:28, Egor Kobylkin wrote:
> Changelog v11:
> * Re-targeted the patch against locale/C-translit.h.in as the proper
> file for the ASCII translit table.
> * Correspondingly the patch now only contains the additional
> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
> The 'include "translit_cyrillic";""' directives are not necessary in the
> locale files and they are now all left intact.
> * Also the file translit_cyrillic is no longer needed and is omitted.
> * Edited below email, commit message.
> 
> Changelog v10:
> * Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
> with diacritics) as conflicting with System B within glibc mechanics and
> not solving BZ #2872
> * Edited below email, commit message, comment in translit_cyrillic to
> reflect System A removal
> * Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
> using composition) as composing is not covered by current glibc
> conversion mechanics
> 
> Changelog v9:
> * Fixed formatting (trailing spaces etc.)
> * Put commit summary in the patch file, now it is generated completely
> by git format-patch
> 
> Changelog v8:
> * Re-added missing translit_cyrillic in patch v7 (due to missing "git
> add" in the script).
> 
> Changelog v7:
> * Generated against git://sourceware.org/git/glibc.git master with git
> format-patch.
> * The 'include "translit_cyrillic";""' now immediately follows last
> 'include "translit_XXX";""' string (was inserted just before
> translit_end previously.)
> * Only the locales already having 'include .*translit.*;""' are patched
> (see the list for manual exclusions below, full list of included locales
> at the end of the email in the commit section.)
> * Excluded az_AZ completely to avoid circular reference from tr_TR via
> “copy "tr_TR"”.
> 
> Changelog v6:
> * Locales removed from the patch: C and sd_PK.
> * Added locales: az_AZ and ky_KG.
> * Consistently transliterate single uppercase Cyrillic letters
>   to sequences of all uppercase Latin letters in all languages (whenever
>   a Cyrillic letter is transliterated to more than one Latin letter),
>   for example "Ї" is now transliterated as "YI" rather than "Yi".
> 
> Dear locale maintainers,
> 
> this patch fixes glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
> 
> by adding the Cyrillic transliteration rows to locale/C-translit.h.in.
> 
> The patch is attached.
> 
> 
> Current bug effect:
> 
> The glibc wiki explicitly lists this use case as the test example and
> currently it fails on Cyrillic texts [1] [8] [9]:
> 
> iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
> 
> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
> 
> That is, it produces only question marks and spaces.
> 
> This is what it should produce, and does produce once the patch is applied:
> 
> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
> chayu.
> 
> 
> The root problem and the fix:
> 
> The root problem is the missing transliteration table that I am
> supplying here.
> 
> 
> COMMIT MESSAGE:
> This transliteration table enables conversion (e.g. with iconv) of
> UTF-8 encoded Cyrillic text to ASCII text via ASCII//TRANSLIT.
> 
> Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
> compatible transcription.
> 
> UTF-8 encoded Cyrillic text needs Cyrillic fonts to display, whereas the
> result of transliteration/transcription contains only Latin/ASCII
> characters yet can still be read by a native speaker. Among other things,
> this is useful for processing Cyrillic texts and filenames with programs
> or on systems that are not prepared to work with Cyrillic, lack the
> corresponding fonts, or cannot handle UTF-8.
> 
> The patch content (mapping) is based on ISO 9.1995 standard [10] and its
> derivative GOST 7.79-2000 System B official source (Federal Agency on
> Technical Regulating and Metrology Of Russian Federation [2]).
> Technically an independent but mostly identical source [3] was used and
> prepared in a spreadsheet [6].
> 
> The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
> System B represents what is actually called transcription (preserving
> phonemes), while System A is the transliteration (preserving graphemes).
> There is no meaningful way to preserve graphemes when converting Cyrillic
> to ASCII, and thus System B is chosen [11]. To be clear, System A has
> nothing to do with this bug, even though it is a transliteration.
> 
> Those interested in implementing System A (transliteration of Cyrillic
> to Latin with diacritics) as a new feature are welcome to use the
> spreadsheet in [6] as a starting point.
> 
> Links:
> 
> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> [2] GOST 7.79-2000 official source
> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
> available in low quality gif format)
> [3] http://transliteration.ru/gost-7-79-2000/ and
> http://www.yfermer.ru/specifications/285821.html
> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
> [5] http://man7.org/linux/man-pages/man5/locale.5.html
> [6] Spreadsheet for generating translit_cyrillic
> https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
> [9] translit-test-input.txt
> https://sourceware.org/bugzilla/attachment.cgi?id=11304
> [10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
> [11]
> https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3
> 
> Best regards,
> Egor Kobylkin
> 
> 
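The command-line test in the quoted message can also be driven from C through iconv(3). The following is only a minimal sketch: the function name utf8_to_ascii_translit and its buffer handling are assumptions made for illustration and are not part of the patch. Whether Cyrillic input comes out as Latin text or as question marks depends on whether the installed glibc carries the table from this patch.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert NUL-terminated UTF-8 text in `in` to ASCII
   using the "ASCII//TRANSLIT" target, writing at most cap - 1 bytes plus a
   terminating NUL into `out`.  Returns 0 on success and -1 on any failure
   (converter unavailable, output buffer too small, invalid input).  */
int utf8_to_ascii_translit(const char *in, char *out, size_t cap)
{
    assert(in != NULL && out != NULL && cap > 0);

    iconv_t cd = iconv_open("ASCII//TRANSLIT", "UTF-8");
    if (cd == (iconv_t)-1)
        return -1;

    /* iconv() takes non-const pointers but does not modify the input.  */
    char *src = (char *)in;
    size_t src_left = strlen(in);
    char *dst = out;
    size_t dst_left = cap - 1;      /* reserve room for the NUL */

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &dst, &dst_left);   /* flush state */
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;
    *dst = '\0';
    return 0;
}
```

A caller would pass one line of translit-test-input.txt at a time and print the buffer on success. Plain ASCII input is copied through unchanged, which makes it a convenient sanity check before trying Cyrillic text.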

From b9cd550028ecf7c875c9d7250c8598433b1fc474 Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Sat, 8 Dec 2018 22:08:59 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]

	[BZ #2872]
	* locale/C-translit.h.in: Add Cyrillic transliteration.
---
 locale/C-translit.h.in | 170 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 170 insertions(+)

diff --git a/locale/C-translit.h.in b/locale/C-translit.h.in
index e27f39e8fe..bd64edc609 100644
--- a/locale/C-translit.h.in
+++ b/locale/C-translit.h.in
@@ -2,6 +2,7 @@
    Copyright (C) 2000-2018 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Ulrich Drepper <drepper@redhat.com>, 2000.
+   0401-04f9 contributed by Egor Kobylkin <Egor@Kobylkin.com>, 2018.
 
    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -56,6 +57,175 @@
 "\x02cd"	"_"	/* <U02CD> MODIFIER LETTER LOW MACRON */
 "\x02d0"	":"	/* <U02D0> MODIFIER LETTER TRIANGULAR COLON */
 "\x02dc"	"~"	/* <U02DC> SMALL TILDE */
+"\x0401"	"YO"	/* <U0401> CYRILLIC CAPITAL LETTER IO */
+"\x0402"	"DJ"	/* <U0402> CYRILLIC CAPITAL LETTER DJE */
+"\x0403"	"G`"	/* <U0403> CYRILLIC CAPITAL LETTER GJE */
+"\x0404"	"YE"	/* <U0404> CYRILLIC CAPITAL LETTER UKRAINIAN IE */
+"\x0405"	"Z`"	/* <U0405> CYRILLIC CAPITAL LETTER DZE */
+"\x0406"	"I"	/* <U0406> CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0407"	"YI"	/* <U0407> CYRILLIC CAPITAL LETTER YI */
+"\x0408"	"J"	/* <U0408> CYRILLIC CAPITAL LETTER JE */
+"\x0409"	"L`"	/* <U0409> CYRILLIC CAPITAL LETTER LJE */
+"\x040a"	"N`"	/* <U040A> CYRILLIC CAPITAL LETTER NJE */
+"\x040b"	"TSH"	/* <U040B> CYRILLIC CAPITAL LETTER TSHE */
+"\x040c"	"K`"	/* <U040C> CYRILLIC CAPITAL LETTER KJE */
+"\x040e"	"U`"	/* <U040E> CYRILLIC CAPITAL LETTER SHORT U */
+"\x040f"	"DH"	/* <U040F> CYRILLIC CAPITAL LETTER DZHE */
+"\x0410"	"A"	/* <U0410> CYRILLIC CAPITAL LETTER A */
+"\x0411"	"B"	/* <U0411> CYRILLIC CAPITAL LETTER BE */
+"\x0412"	"V"	/* <U0412> CYRILLIC CAPITAL LETTER VE */
+"\x0413"	"G"	/* <U0413> CYRILLIC CAPITAL LETTER GHE */
+"\x0414"	"D"	/* <U0414> CYRILLIC CAPITAL LETTER DE */
+"\x0415"	"E"	/* <U0415> CYRILLIC CAPITAL LETTER IE */
+"\x0416"	"ZH"	/* <U0416> CYRILLIC CAPITAL LETTER ZHE */
+"\x0417"	"Z"	/* <U0417> CYRILLIC CAPITAL LETTER ZE */
+"\x0418"	"I"	/* <U0418> CYRILLIC CAPITAL LETTER I */
+"\x0419"	"J"	/* <U0419> CYRILLIC CAPITAL LETTER SHORT I */
+"\x041a"	"K"	/* <U041A> CYRILLIC CAPITAL LETTER KA */
+"\x041b"	"L"	/* <U041B> CYRILLIC CAPITAL LETTER EL */
+"\x041c"	"M"	/* <U041C> CYRILLIC CAPITAL LETTER EM */
+"\x041d"	"N"	/* <U041D> CYRILLIC CAPITAL LETTER EN */
+"\x041e"	"O"	/* <U041E> CYRILLIC CAPITAL LETTER O */
+"\x041f"	"P"	/* <U041F> CYRILLIC CAPITAL LETTER PE */
+"\x0420"	"R"	/* <U0420> CYRILLIC CAPITAL LETTER ER */
+"\x0421"	"S"	/* <U0421> CYRILLIC CAPITAL LETTER ES */
+"\x0422"	"T"	/* <U0422> CYRILLIC CAPITAL LETTER TE */
+"\x0423"	"U"	/* <U0423> CYRILLIC CAPITAL LETTER U */
+"\x0424"	"F"	/* <U0424> CYRILLIC CAPITAL LETTER EF */
+"\x0425"	"X"	/* <U0425> CYRILLIC CAPITAL LETTER HA */
+"\x0426"	"CZ"	/* <U0426> CYRILLIC CAPITAL LETTER TSE */
+"\x0427"	"CH"	/* <U0427> CYRILLIC CAPITAL LETTER CHE */
+"\x0428"	"SH"	/* <U0428> CYRILLIC CAPITAL LETTER SHA */
+"\x0429"	"SHH"	/* <U0429> CYRILLIC CAPITAL LETTER SHCHA */
+"\x042a"	"``"	/* <U042A> CYRILLIC CAPITAL LETTER HARD SIGN */
+"\x042b"	"Y`"	/* <U042B> CYRILLIC CAPITAL LETTER YERU */
+"\x042c"	"`"	/* <U042C> CYRILLIC CAPITAL LETTER SOFT SIGN */
+"\x042d"	"E`"	/* <U042D> CYRILLIC CAPITAL LETTER E */
+"\x042e"	"YU"	/* <U042E> CYRILLIC CAPITAL LETTER YU */
+"\x042f"	"YA"	/* <U042F> CYRILLIC CAPITAL LETTER YA */
+"\x0430"	"a"	/* <U0430> CYRILLIC SMALL LETTER A */
+"\x0431"	"b"	/* <U0431> CYRILLIC SMALL LETTER BE */
+"\x0432"	"v"	/* <U0432> CYRILLIC SMALL LETTER VE */
+"\x0433"	"g"	/* <U0433> CYRILLIC SMALL LETTER GHE */
+"\x0434"	"d"	/* <U0434> CYRILLIC SMALL LETTER DE */
+"\x0435"	"e"	/* <U0435> CYRILLIC SMALL LETTER IE */
+"\x0436"	"zh"	/* <U0436> CYRILLIC SMALL LETTER ZHE */
+"\x0437"	"z"	/* <U0437> CYRILLIC SMALL LETTER ZE */
+"\x0438"	"i"	/* <U0438> CYRILLIC SMALL LETTER I */
+"\x0439"	"j"	/* <U0439> CYRILLIC SMALL LETTER SHORT I */
+"\x043a"	"k"	/* <U043A> CYRILLIC SMALL LETTER KA */
+"\x043b"	"l"	/* <U043B> CYRILLIC SMALL LETTER EL */
+"\x043c"	"m"	/* <U043C> CYRILLIC SMALL LETTER EM */
+"\x043d"	"n"	/* <U043D> CYRILLIC SMALL LETTER EN */
+"\x043e"	"o"	/* <U043E> CYRILLIC SMALL LETTER O */
+"\x043f"	"p"	/* <U043F> CYRILLIC SMALL LETTER PE */
+"\x0440"	"r"	/* <U0440> CYRILLIC SMALL LETTER ER */
+"\x0441"	"s"	/* <U0441> CYRILLIC SMALL LETTER ES */
+"\x0442"	"t"	/* <U0442> CYRILLIC SMALL LETTER TE */
+"\x0443"	"u"	/* <U0443> CYRILLIC SMALL LETTER U */
+"\x0444"	"f"	/* <U0444> CYRILLIC SMALL LETTER EF */
+"\x0445"	"x"	/* <U0445> CYRILLIC SMALL LETTER HA */
+"\x0446"	"cz"	/* <U0446> CYRILLIC SMALL LETTER TSE */
+"\x0447"	"ch"	/* <U0447> CYRILLIC SMALL LETTER CHE */
+"\x0448"	"sh"	/* <U0448> CYRILLIC SMALL LETTER SHA */
+"\x0449"	"shh"	/* <U0449> CYRILLIC SMALL LETTER SHCHA */
+"\x044a"	"``"	/* <U044A> CYRILLIC SMALL LETTER HARD SIGN */
+"\x044b"	"y`"	/* <U044B> CYRILLIC SMALL LETTER YERU */
+"\x044c"	"`"	/* <U044C> CYRILLIC SMALL LETTER SOFT SIGN */
+"\x044d"	"e`"	/* <U044D> CYRILLIC SMALL LETTER E */
+"\x044e"	"yu"	/* <U044E> CYRILLIC SMALL LETTER YU */
+"\x044f"	"ya"	/* <U044F> CYRILLIC SMALL LETTER YA */
+"\x0451"	"yo"	/* <U0451> CYRILLIC SMALL LETTER IO */
+"\x0452"	"dj"	/* <U0452> CYRILLIC SMALL LETTER DJE */
+"\x0453"	"g`"	/* <U0453> CYRILLIC SMALL LETTER GJE */
+"\x0454"	"ye"	/* <U0454> CYRILLIC SMALL LETTER UKRAINIAN IE */
+"\x0455"	"z`"	/* <U0455> CYRILLIC SMALL LETTER DZE */
+"\x0456"	"i"	/* <U0456> CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0457"	"yi"	/* <U0457> CYRILLIC SMALL LETTER YI */
+"\x0458"	"j"	/* <U0458> CYRILLIC SMALL LETTER JE */
+"\x0459"	"l`"	/* <U0459> CYRILLIC SMALL LETTER LJE */
+"\x045a"	"n`"	/* <U045A> CYRILLIC SMALL LETTER NJE */
+"\x045b"	"tsh"	/* <U045B> CYRILLIC SMALL LETTER TSHE */
+"\x045c"	"k`"	/* <U045C> CYRILLIC SMALL LETTER KJE */
+"\x045e"	"u`"	/* <U045E> CYRILLIC SMALL LETTER SHORT U */
+"\x045f"	"dh"	/* <U045F> CYRILLIC SMALL LETTER DZHE */
+"\x046a"	"O`"	/* <U046A> CYRILLIC CAPITAL LETTER BIG YUS */
+"\x046b"	"o`"	/* <U046B> CYRILLIC SMALL LETTER BIG YUS */
+"\x0472"	"FH"	/* <U0472> CYRILLIC CAPITAL LETTER FITA */
+"\x0473"	"fh"	/* <U0473> CYRILLIC SMALL LETTER FITA */
+"\x0474"	"YH"	/* <U0474> CYRILLIC CAPITAL LETTER IZHITSA */
+"\x0475"	"yh"	/* <U0475> CYRILLIC SMALL LETTER IZHITSA */
+"\x048c"	"E`"	/* <U048C> CYRILLIC CAPITAL LETTER SEMISOFT SIGN */
+"\x048d"	"e`"	/* <U048D> CYRILLIC SMALL LETTER SEMISOFT SIGN */
+"\x0490"	"G`"	/* <U0490> CYRILLIC CAPITAL LETTER GHE WITH UPTURN */
+"\x0491"	"g`"	/* <U0491> CYRILLIC SMALL LETTER GHE WITH UPTURN */
+"\x0492"	"GH"	/* <U0492> CYRILLIC CAPITAL LETTER GHE WITH STROKE */
+"\x0493"	"gh"	/* <U0493> CYRILLIC SMALL LETTER GHE WITH STROKE */
+"\x0494"	"GH"	/* <U0494> CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK */
+"\x0495"	"gh"	/* <U0495> CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK */
+"\x0496"	"ZH`"	/* <U0496> CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER */
+"\x0497"	"zh`"	/* <U0497> CYRILLIC SMALL LETTER ZHE WITH DESCENDER */
+"\x049a"	"K`"	/* <U049A> CYRILLIC CAPITAL LETTER KA WITH DESCENDER */
+"\x049b"	"k`"	/* <U049B> CYRILLIC SMALL LETTER KA WITH DESCENDER */
+"\x049e"	"K`"	/* <U049E> CYRILLIC CAPITAL LETTER KA WITH STROKE */
+"\x049f"	"k`"	/* <U049F> CYRILLIC SMALL LETTER KA WITH STROKE */
+"\x04a2"	"N`"	/* <U04A2> CYRILLIC CAPITAL LETTER EN WITH DESCENDER */
+"\x04a3"	"n`"	/* <U04A3> CYRILLIC SMALL LETTER EN WITH DESCENDER */
+"\x04a4"	"NG"	/* <U04A4> CYRILLIC CAPITAL LIGATURE EN GHE */
+"\x04a5"	"ng"	/* <U04A5> CYRILLIC SMALL LIGATURE EN GHE */
+"\x04a6"	"P`"	/* <U04A6> CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK */
+"\x04a7"	"p`"	/* <U04A7> CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK */
+"\x04a8"	"O`"	/* <U04A8> CYRILLIC CAPITAL LETTER ABKHASIAN HA */
+"\x04a9"	"o`"	/* <U04A9> CYRILLIC SMALL LETTER ABKHASIAN HA */
+"\x04aa"	"C`"	/* <U04AA> CYRILLIC CAPITAL LETTER ES WITH DESCENDER */
+"\x04ab"	"c`"	/* <U04AB> CYRILLIC SMALL LETTER ES WITH DESCENDER */
+"\x04ac"	"T`"	/* <U04AC> CYRILLIC CAPITAL LETTER TE WITH DESCENDER */
+"\x04ad"	"t`"	/* <U04AD> CYRILLIC SMALL LETTER TE WITH DESCENDER */
+"\x04ae"	"U"	/* <U04AE> CYRILLIC CAPITAL LETTER STRAIGHT U */
+"\x04af"	"u"	/* <U04AF> CYRILLIC SMALL LETTER STRAIGHT U */
+"\x04b2"	"H`"	/* <U04B2> CYRILLIC CAPITAL LETTER HA WITH DESCENDER */
+"\x04b3"	"h`"	/* <U04B3> CYRILLIC SMALL LETTER HA WITH DESCENDER */
+"\x04b4"	"TCZ"	/* <U04B4> CYRILLIC CAPITAL LIGATURE TE TSE */
+"\x04b5"	"tcz"	/* <U04B5> CYRILLIC SMALL LIGATURE TE TSE */
+"\x04ba"	"SH`"	/* <U04BA> CYRILLIC CAPITAL LETTER SHHA */
+"\x04bb"	"sh`"	/* <U04BB> CYRILLIC SMALL LETTER SHHA */
+"\x04bc"	"CH`"	/* <U04BC> CYRILLIC CAPITAL LETTER ABKHASIAN CHE */
+"\x04bd"	"ch`"	/* <U04BD> CYRILLIC SMALL LETTER ABKHASIAN CHE */
+"\x04be"	"CH`"	/* <U04BE> CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04bf"	"ch`"	/* <U04BF> CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04c0"	"i"	/* <U04C0> CYRILLIC LETTER PALOCHKA */
+"\x04c1"	"ZH`"	/* <U04C1> CYRILLIC CAPITAL LETTER ZHE WITH BREVE */
+"\x04c2"	"zh`"	/* <U04C2> CYRILLIC SMALL LETTER ZHE WITH BREVE */
+"\x04cb"	"CH`"	/* <U04CB> CYRILLIC CAPITAL LETTER KHAKASSIAN CHE */
+"\x04cc"	"ch`"	/* <U04CC> CYRILLIC SMALL LETTER KHAKASSIAN CHE */
+"\x04d0"	"A`"	/* <U04D0> CYRILLIC CAPITAL LETTER A WITH BREVE */
+"\x04d1"	"a`"	/* <U04D1> CYRILLIC SMALL LETTER A WITH BREVE */
+"\x04d2"	"A`"	/* <U04D2> CYRILLIC CAPITAL LETTER A WITH DIAERESIS */
+"\x04d3"	"a`"	/* <U04D3> CYRILLIC SMALL LETTER A WITH DIAERESIS */
+"\x04d6"	"E`"	/* <U04D6> CYRILLIC CAPITAL LETTER IE WITH BREVE */
+"\x04d7"	"e`"	/* <U04D7> CYRILLIC SMALL LETTER IE WITH BREVE */
+"\x04d8"	"A`"	/* <U04D8> CYRILLIC CAPITAL LETTER SCHWA */
+"\x04d9"	"a`"	/* <U04D9> CYRILLIC SMALL LETTER SCHWA */
+"\x04dc"	"ZH`"	/* <U04DC> CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS */
+"\x04dd"	"zh`"	/* <U04DD> CYRILLIC SMALL LETTER ZHE WITH DIAERESIS */
+"\x04de"	"Z`"	/* <U04DE> CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS */
+"\x04df"	"z`"	/* <U04DF> CYRILLIC SMALL LETTER ZE WITH DIAERESIS */
+"\x04e0"	"Z`"	/* <U04E0> CYRILLIC CAPITAL LETTER ABKHASIAN DZE */
+"\x04e1"	"z`"	/* <U04E1> CYRILLIC SMALL LETTER ABKHASIAN DZE */
+"\x04e4"	"I`"	/* <U04E4> CYRILLIC CAPITAL LETTER I WITH DIAERESIS */
+"\x04e5"	"i`"	/* <U04E5> CYRILLIC SMALL LETTER I WITH DIAERESIS */
+"\x04e6"	"O`"	/* <U04E6> CYRILLIC CAPITAL LETTER O WITH DIAERESIS */
+"\x04e7"	"o`"	/* <U04E7> CYRILLIC SMALL LETTER O WITH DIAERESIS */
+"\x04e8"	"O`"	/* <U04E8> CYRILLIC CAPITAL LETTER BARRED O */
+"\x04e9"	"o`"	/* <U04E9> CYRILLIC SMALL LETTER BARRED O */
+"\x04f0"	"U`"	/* <U04F0> CYRILLIC CAPITAL LETTER U WITH DIAERESIS */
+"\x04f1"	"u`"	/* <U04F1> CYRILLIC SMALL LETTER U WITH DIAERESIS */
+"\x04f2"	"U`"	/* <U04F2> CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE */
+"\x04f3"	"u`"	/* <U04F3> CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE */
+"\x04f4"	"CH`"	/* <U04F4> CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS */
+"\x04f5"	"ch`"	/* <U04F5> CYRILLIC SMALL LETTER CHE WITH DIAERESIS */
+"\x04f8"	"Y`"	/* <U04F8> CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS */
+"\x04f9"	"y`"	/* <U04F9> CYRILLIC SMALL LETTER YERU WITH DIAERESIS */
 "\x2002"	" "	/* <U2002> EN SPACE */
 "\x2003"	" "	/* <U2003> EM SPACE */
 "\x2004"	" "	/* <U2004> THREE-PER-EM SPACE */
-- 
2.17.1
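
To illustrate how rows of the form used in the patch above map one code point to an ASCII string, here is a small stand-alone sketch. It is not glibc's implementation: the struct translit_row type, the seven-row subset, and the function translit_sketch are all assumptions made for this example, and only 1- and 2-byte UTF-8 sequences are decoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical row type: one code point and its ASCII replacement.  */
struct translit_row {
    uint32_t from;
    const char *to;
};

/* A few rows copied from the patch; kept sorted by code point so that
   bsearch() can be used.  */
static const struct translit_row rows[] = {
    { 0x0416, "ZH" }, { 0x0427, "CH" }, { 0x0430, "a" }, { 0x0439, "j" },
    { 0x0447, "ch" }, { 0x044e, "yu" }, { 0x0451, "yo" },
};

static int row_cmp(const void *key, const void *elem)
{
    uint32_t cp = *(const uint32_t *)key;
    const struct translit_row *row = elem;
    return (cp > row->from) - (cp < row->from);
}

static const char *lookup(uint32_t cp)
{
    const struct translit_row *row =
        bsearch(&cp, rows, sizeof rows / sizeof rows[0], sizeof rows[0],
                row_cmp);
    return row ? row->to : NULL;
}

/* Transliterate UTF-8 text `in` into `out` (capacity `cap`, including the
   terminating NUL).  Unmapped or unsupported characters become '?'.
   Returns 0 on success, -1 if `out` is too small.  */
int translit_sketch(const char *in, char *out, size_t cap)
{
    assert(in != NULL && out != NULL && cap > 0);
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    while (*p != '\0') {
        char ascii[2] = { 0, 0 };
        const char *rep;

        if (*p < 0x80) {                       /* ASCII passes through */
            ascii[0] = (char)*p++;
            rep = ascii;
        } else if ((*p & 0xE0) == 0xC0 && (p[1] & 0xC0) == 0x80) {
            uint32_t cp = ((uint32_t)(*p & 0x1F) << 6) | (p[1] & 0x3F);
            p += 2;
            rep = lookup(cp);
            if (rep == NULL)
                rep = "?";
        } else {                               /* longer or invalid sequence */
            p++;
            while ((*p & 0xC0) == 0x80)        /* skip continuation bytes */
                p++;
            rep = "?";
        }

        size_t len = strlen(rep);
        if (n + len + 1 > cap)
            return -1;
        memcpy(out + n, rep, len);
        n += len;
    }
    out[n] = '\0';
    return 0;
}
```

With this subset, the UTF-8 bytes for U+0447 U+0430 U+044E (the last word of the test sentence) come out as "chayu", matching the expected output quoted in the message. Keeping the rows sorted mirrors the ascending code-point order of the table in the patch.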

