This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
[PATCH COMMITTED] locale/C-translit.h.in: Cyrillic -> ASCII transliteration [BZ #2872]
- From: Rafal Luzynski <digitalfreak at lingonborough dot com>
- To: libc-alpha at sourceware dot org
- Date: Sat, 20 Jul 2019 22:01:47 +0200 (CEST)
- Subject: [PATCH COMMITTED] locale/C-translit.h.in: Cyrillic -> ASCII transliteration [BZ #2872]
For the record, this is the patch I have just pushed to master.
The content is exactly the same as Egor's v12 patch; the only changes
are the reworded commit message and the added ChangeLog entry.
I am not closing the bug in Bugzilla yet because there may be a few
minor updates (e.g., should we add a NEWS entry? For now I lean
toward saying no).
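
For anyone who wants to try the new table, here is a minimal sketch in C.
It assumes a glibc build that already contains this patch and a process
left in the default C locale (no setlocale call). The helper name
utf8_to_ascii_translit is hypothetical and not part of glibc; only the
standard iconv_open/iconv/iconv_close calls are used, together with the
glibc "//TRANSLIT" suffix on the target charset.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): convert the UTF-8 string IN
   to ASCII, asking iconv to transliterate characters that ASCII cannot
   represent.  Returns 0 on success and -1 on failure.  OUT is always
   NUL-terminated.  */
int
utf8_to_ascii_translit (const char *in, char *out, size_t out_size)
{
  assert (in != NULL && out != NULL && out_size > 0);
  out[0] = '\0';

  /* Assumption: the process is still in the C locale, so the built-in
     C transliteration table is the one consulted.  */
  iconv_t cd = iconv_open ("ASCII//TRANSLIT", "UTF-8");
  if (cd == (iconv_t) -1)
    return -1;

  char *in_ptr = (char *) in;      /* iconv takes char **, input is only read.  */
  size_t in_left = strlen (in);
  char *out_ptr = out;
  size_t out_left = out_size - 1;  /* Reserve one byte for the NUL.  */

  size_t rc = iconv (cd, &in_ptr, &in_left, &out_ptr, &out_left);
  if (rc != (size_t) -1)
    /* Flush any pending shift state (a no-op for ASCII, but conventional).  */
    rc = iconv (cd, NULL, NULL, &out_ptr, &out_left);

  *out_ptr = '\0';
  iconv_close (cd);
  return rc == (size_t) -1 ? -1 : 0;
}
```

With the table from the patch below in place, passing the UTF-8 string
"Привет" should yield "Privet" according to its entries. On a glibc
without the patch, the Cyrillic letters are expected to be replaced or
rejected instead.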
--- 8< ---
From: Egor Kobylkin <egor@kobylkin.com>
Date: Wed, 2 Jan 2019 05:50:13 +0100
Subject: [PATCH] locale/C-translit.h.in: Cyrillic -> ASCII transliteration
[BZ #2872]
This patch adds a Cyrillic to plain ASCII transliteration table to the
C locale, following the GOST 7.79-2000 System B standard.
[BZ #2872]
* locale/C-translit.h.in: Add Cyrillic transliteration.
---
ChangeLog | 5 ++
locale/C-translit.h.in | 169 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 174 insertions(+)
diff --git a/ChangeLog b/ChangeLog
index a606c5fd60..a1fdef9cff 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2019-07-20 Egor Kobylkin <egor@kobylkin.com>
+
+ [BZ #2872]
+ * locale/C-translit.h.in: Add Cyrillic transliteration.
+
2019-07-19 Florian Weimer <fweimer@redhat.com>
* sysdeps/unix/sysv/linux/syscall-names.list: Add system calls
diff --git a/locale/C-translit.h.in b/locale/C-translit.h.in
index d5f00df0f3..758171c394 100644
--- a/locale/C-translit.h.in
+++ b/locale/C-translit.h.in
@@ -56,6 +56,175 @@
"\x02cd" "_" # <U02CD> MODIFIER LETTER LOW MACRON
"\x02d0" ":" # <U02D0> MODIFIER LETTER TRIANGULAR COLON
"\x02dc" "~" # <U02DC> SMALL TILDE
+"\x0401" "YO" # <U0401> CYRILLIC CAPITAL LETTER IO
+"\x0402" "DJ" # <U0402> CYRILLIC CAPITAL LETTER DJE
+"\x0403" "G`" # <U0403> CYRILLIC CAPITAL LETTER GJE
+"\x0404" "YE" # <U0404> CYRILLIC CAPITAL LETTER UKRAINIAN IE
+"\x0405" "Z`" # <U0405> CYRILLIC CAPITAL LETTER DZE
+"\x0406" "I" # <U0406> CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+"\x0407" "YI" # <U0407> CYRILLIC CAPITAL LETTER YI
+"\x0408" "J" # <U0408> CYRILLIC CAPITAL LETTER JE
+"\x0409" "L`" # <U0409> CYRILLIC CAPITAL LETTER LJE
+"\x040a" "N`" # <U040A> CYRILLIC CAPITAL LETTER NJE
+"\x040b" "TSH" # <U040B> CYRILLIC CAPITAL LETTER TSHE
+"\x040c" "K`" # <U040C> CYRILLIC CAPITAL LETTER KJE
+"\x040e" "U`" # <U040E> CYRILLIC CAPITAL LETTER SHORT U
+"\x040f" "DH" # <U040F> CYRILLIC CAPITAL LETTER DZHE
+"\x0410" "A" # <U0410> CYRILLIC CAPITAL LETTER A
+"\x0411" "B" # <U0411> CYRILLIC CAPITAL LETTER BE
+"\x0412" "V" # <U0412> CYRILLIC CAPITAL LETTER VE
+"\x0413" "G" # <U0413> CYRILLIC CAPITAL LETTER GHE
+"\x0414" "D" # <U0414> CYRILLIC CAPITAL LETTER DE
+"\x0415" "E" # <U0415> CYRILLIC CAPITAL LETTER IE
+"\x0416" "ZH" # <U0416> CYRILLIC CAPITAL LETTER ZHE
+"\x0417" "Z" # <U0417> CYRILLIC CAPITAL LETTER ZE
+"\x0418" "I" # <U0418> CYRILLIC CAPITAL LETTER I
+"\x0419" "J" # <U0419> CYRILLIC CAPITAL LETTER SHORT I
+"\x041a" "K" # <U041A> CYRILLIC CAPITAL LETTER KA
+"\x041b" "L" # <U041B> CYRILLIC CAPITAL LETTER EL
+"\x041c" "M" # <U041C> CYRILLIC CAPITAL LETTER EM
+"\x041d" "N" # <U041D> CYRILLIC CAPITAL LETTER EN
+"\x041e" "O" # <U041E> CYRILLIC CAPITAL LETTER O
+"\x041f" "P" # <U041F> CYRILLIC CAPITAL LETTER PE
+"\x0420" "R" # <U0420> CYRILLIC CAPITAL LETTER ER
+"\x0421" "S" # <U0421> CYRILLIC CAPITAL LETTER ES
+"\x0422" "T" # <U0422> CYRILLIC CAPITAL LETTER TE
+"\x0423" "U" # <U0423> CYRILLIC CAPITAL LETTER U
+"\x0424" "F" # <U0424> CYRILLIC CAPITAL LETTER EF
+"\x0425" "X" # <U0425> CYRILLIC CAPITAL LETTER HA
+"\x0426" "CZ" # <U0426> CYRILLIC CAPITAL LETTER TSE
+"\x0427" "CH" # <U0427> CYRILLIC CAPITAL LETTER CHE
+"\x0428" "SH" # <U0428> CYRILLIC CAPITAL LETTER SHA
+"\x0429" "SHH" # <U0429> CYRILLIC CAPITAL LETTER SHCHA
+"\x042a" "``" # <U042A> CYRILLIC CAPITAL LETTER HARD SIGN
+"\x042b" "Y`" # <U042B> CYRILLIC CAPITAL LETTER YERU
+"\x042c" "`" # <U042C> CYRILLIC CAPITAL LETTER SOFT SIGN
+"\x042d" "E`" # <U042D> CYRILLIC CAPITAL LETTER E
+"\x042e" "YU" # <U042E> CYRILLIC CAPITAL LETTER YU
+"\x042f" "YA" # <U042F> CYRILLIC CAPITAL LETTER YA
+"\x0430" "a" # <U0430> CYRILLIC SMALL LETTER A
+"\x0431" "b" # <U0431> CYRILLIC SMALL LETTER BE
+"\x0432" "v" # <U0432> CYRILLIC SMALL LETTER VE
+"\x0433" "g" # <U0433> CYRILLIC SMALL LETTER GHE
+"\x0434" "d" # <U0434> CYRILLIC SMALL LETTER DE
+"\x0435" "e" # <U0435> CYRILLIC SMALL LETTER IE
+"\x0436" "zh" # <U0436> CYRILLIC SMALL LETTER ZHE
+"\x0437" "z" # <U0437> CYRILLIC SMALL LETTER ZE
+"\x0438" "i" # <U0438> CYRILLIC SMALL LETTER I
+"\x0439" "j" # <U0439> CYRILLIC SMALL LETTER SHORT I
+"\x043a" "k" # <U043A> CYRILLIC SMALL LETTER KA
+"\x043b" "l" # <U043B> CYRILLIC SMALL LETTER EL
+"\x043c" "m" # <U043C> CYRILLIC SMALL LETTER EM
+"\x043d" "n" # <U043D> CYRILLIC SMALL LETTER EN
+"\x043e" "o" # <U043E> CYRILLIC SMALL LETTER O
+"\x043f" "p" # <U043F> CYRILLIC SMALL LETTER PE
+"\x0440" "r" # <U0440> CYRILLIC SMALL LETTER ER
+"\x0441" "s" # <U0441> CYRILLIC SMALL LETTER ES
+"\x0442" "t" # <U0442> CYRILLIC SMALL LETTER TE
+"\x0443" "u" # <U0443> CYRILLIC SMALL LETTER U
+"\x0444" "f" # <U0444> CYRILLIC SMALL LETTER EF
+"\x0445" "x" # <U0445> CYRILLIC SMALL LETTER HA
+"\x0446" "cz" # <U0446> CYRILLIC SMALL LETTER TSE
+"\x0447" "ch" # <U0447> CYRILLIC SMALL LETTER CHE
+"\x0448" "sh" # <U0448> CYRILLIC SMALL LETTER SHA
+"\x0449" "shh" # <U0449> CYRILLIC SMALL LETTER SHCHA
+"\x044a" "``" # <U044A> CYRILLIC SMALL LETTER HARD SIGN
+"\x044b" "y`" # <U044B> CYRILLIC SMALL LETTER YERU
+"\x044c" "`" # <U044C> CYRILLIC SMALL LETTER SOFT SIGN
+"\x044d" "e`" # <U044D> CYRILLIC SMALL LETTER E
+"\x044e" "yu" # <U044E> CYRILLIC SMALL LETTER YU
+"\x044f" "ya" # <U044F> CYRILLIC SMALL LETTER YA
+"\x0451" "yo" # <U0451> CYRILLIC SMALL LETTER IO
+"\x0452" "dj" # <U0452> CYRILLIC SMALL LETTER DJE
+"\x0453" "g`" # <U0453> CYRILLIC SMALL LETTER GJE
+"\x0454" "ye" # <U0454> CYRILLIC SMALL LETTER UKRAINIAN IE
+"\x0455" "z`" # <U0455> CYRILLIC SMALL LETTER DZE
+"\x0456" "i" # <U0456> CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+"\x0457" "yi" # <U0457> CYRILLIC SMALL LETTER YI
+"\x0458" "j" # <U0458> CYRILLIC SMALL LETTER JE
+"\x0459" "l`" # <U0459> CYRILLIC SMALL LETTER LJE
+"\x045a" "n`" # <U045A> CYRILLIC SMALL LETTER NJE
+"\x045b" "tsh" # <U045B> CYRILLIC SMALL LETTER TSHE
+"\x045c" "k`" # <U045C> CYRILLIC SMALL LETTER KJE
+"\x045e" "u`" # <U045E> CYRILLIC SMALL LETTER SHORT U
+"\x045f" "dh" # <U045F> CYRILLIC SMALL LETTER DZHE
+"\x046a" "O`" # <U046A> CYRILLIC CAPITAL LETTER BIG YUS
+"\x046b" "o`" # <U046B> CYRILLIC SMALL LETTER BIG YUS
+"\x0472" "FH" # <U0472> CYRILLIC CAPITAL LETTER FITA
+"\x0473" "fh" # <U0473> CYRILLIC SMALL LETTER FITA
+"\x0474" "YH" # <U0474> CYRILLIC CAPITAL LETTER IZHITSA
+"\x0475" "yh" # <U0475> CYRILLIC SMALL LETTER IZHITSA
+"\x048c" "E`" # <U048C> CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+"\x048d" "e`" # <U048D> CYRILLIC SMALL LETTER SEMISOFT SIGN
+"\x0490" "G`" # <U0490> CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+"\x0491" "g`" # <U0491> CYRILLIC SMALL LETTER GHE WITH UPTURN
+"\x0492" "GH" # <U0492> CYRILLIC CAPITAL LETTER GHE WITH STROKE
+"\x0493" "gh" # <U0493> CYRILLIC SMALL LETTER GHE WITH STROKE
+"\x0494" "GH" # <U0494> CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+"\x0495" "gh" # <U0495> CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+"\x0496" "ZH`" # <U0496> CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+"\x0497" "zh`" # <U0497> CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+"\x049a" "K`" # <U049A> CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+"\x049b" "k`" # <U049B> CYRILLIC SMALL LETTER KA WITH DESCENDER
+"\x049e" "K`" # <U049E> CYRILLIC CAPITAL LETTER KA WITH STROKE
+"\x049f" "k`" # <U049F> CYRILLIC SMALL LETTER KA WITH STROKE
+"\x04a2" "N`" # <U04A2> CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+"\x04a3" "n`" # <U04A3> CYRILLIC SMALL LETTER EN WITH DESCENDER
+"\x04a4" "NG" # <U04A4> CYRILLIC CAPITAL LIGATURE EN GHE
+"\x04a5" "ng" # <U04A5> CYRILLIC SMALL LIGATURE EN GHE
+"\x04a6" "P`" # <U04A6> CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+"\x04a7" "p`" # <U04A7> CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+"\x04a8" "O`" # <U04A8> CYRILLIC CAPITAL LETTER ABKHASIAN HA
+"\x04a9" "o`" # <U04A9> CYRILLIC SMALL LETTER ABKHASIAN HA
+"\x04aa" "C`" # <U04AA> CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+"\x04ab" "c`" # <U04AB> CYRILLIC SMALL LETTER ES WITH DESCENDER
+"\x04ac" "T`" # <U04AC> CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+"\x04ad" "t`" # <U04AD> CYRILLIC SMALL LETTER TE WITH DESCENDER
+"\x04ae" "U" # <U04AE> CYRILLIC CAPITAL LETTER STRAIGHT U
+"\x04af" "u" # <U04AF> CYRILLIC SMALL LETTER STRAIGHT U
+"\x04b2" "H`" # <U04B2> CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+"\x04b3" "h`" # <U04B3> CYRILLIC SMALL LETTER HA WITH DESCENDER
+"\x04b4" "TCZ" # <U04B4> CYRILLIC CAPITAL LIGATURE TE TSE
+"\x04b5" "tcz" # <U04B5> CYRILLIC SMALL LIGATURE TE TSE
+"\x04ba" "SH`" # <U04BA> CYRILLIC CAPITAL LETTER SHHA
+"\x04bb" "sh`" # <U04BB> CYRILLIC SMALL LETTER SHHA
+"\x04bc" "CH`" # <U04BC> CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+"\x04bd" "ch`" # <U04BD> CYRILLIC SMALL LETTER ABKHASIAN CHE
+"\x04be" "CH`" # <U04BE> CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+"\x04bf" "ch`" # <U04BF> CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+"\x04c0" "i" # <U04C0> CYRILLIC LETTER PALOCHKA
+"\x04c1" "ZH`" # <U04C1> CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+"\x04c2" "zh`" # <U04C2> CYRILLIC SMALL LETTER ZHE WITH BREVE
+"\x04cb" "CH`" # <U04CB> CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+"\x04cc" "ch`" # <U04CC> CYRILLIC SMALL LETTER KHAKASSIAN CHE
+"\x04d0" "A`" # <U04D0> CYRILLIC CAPITAL LETTER A WITH BREVE
+"\x04d1" "a`" # <U04D1> CYRILLIC SMALL LETTER A WITH BREVE
+"\x04d2" "A`" # <U04D2> CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+"\x04d3" "a`" # <U04D3> CYRILLIC SMALL LETTER A WITH DIAERESIS
+"\x04d6" "E`" # <U04D6> CYRILLIC CAPITAL LETTER IE WITH BREVE
+"\x04d7" "e`" # <U04D7> CYRILLIC SMALL LETTER IE WITH BREVE
+"\x04d8" "A`" # <U04D8> CYRILLIC CAPITAL LETTER SCHWA
+"\x04d9" "a`" # <U04D9> CYRILLIC SMALL LETTER SCHWA
+"\x04dc" "ZH`" # <U04DC> CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+"\x04dd" "zh`" # <U04DD> CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+"\x04de" "Z`" # <U04DE> CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+"\x04df" "z`" # <U04DF> CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+"\x04e0" "Z`" # <U04E0> CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+"\x04e1" "z`" # <U04E1> CYRILLIC SMALL LETTER ABKHASIAN DZE
+"\x04e4" "I`" # <U04E4> CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+"\x04e5" "i`" # <U04E5> CYRILLIC SMALL LETTER I WITH DIAERESIS
+"\x04e6" "O`" # <U04E6> CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+"\x04e7" "o`" # <U04E7> CYRILLIC SMALL LETTER O WITH DIAERESIS
+"\x04e8" "O`" # <U04E8> CYRILLIC CAPITAL LETTER BARRED O
+"\x04e9" "o`" # <U04E9> CYRILLIC SMALL LETTER BARRED O
+"\x04f0" "U`" # <U04F0> CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+"\x04f1" "u`" # <U04F1> CYRILLIC SMALL LETTER U WITH DIAERESIS
+"\x04f2" "U`" # <U04F2> CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+"\x04f3" "u`" # <U04F3> CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+"\x04f4" "CH`" # <U04F4> CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+"\x04f5" "ch`" # <U04F5> CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+"\x04f8" "Y`" # <U04F8> CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+"\x04f9" "y`" # <U04F9> CYRILLIC SMALL LETTER YERU WITH DIAERESIS
"\x2002" " " # <U2002> EN SPACE
"\x2003" " " # <U2003> EM SPACE
"\x2004" " " # <U2004> THREE-PER-EM SPACE
--
2.21.0
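
The table above is sorted by code point and maps each one to a short
ASCII string. As an illustration of how such a table can be consulted,
here is a small self-contained sketch that binary-searches an excerpt of
the entries. This is not glibc's implementation; the names
translit_entry, cyrillic_excerpt, translit_lookup and
translit_codepoints are hypothetical, and the '?' fallback for unknown
code points is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One table row: a Unicode code point and its ASCII replacement.  */
struct translit_entry
{
  uint32_t from;
  const char *to;
};

/* A few rows copied from the patch, kept in ascending code point order.  */
static const struct translit_entry cyrillic_excerpt[] = {
  { 0x0401, "YO" },
  { 0x0416, "ZH" },
  { 0x0429, "SHH" },
  { 0x0436, "zh" },
  { 0x0449, "shh" },
  { 0x044f, "ya" },
  { 0x0451, "yo" },
};

/* Binary search for CP; returns its replacement or NULL if absent.  */
const char *
translit_lookup (uint32_t cp)
{
  size_t lo = 0;
  size_t hi = sizeof cyrillic_excerpt / sizeof cyrillic_excerpt[0];
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (cyrillic_excerpt[mid].from == cp)
        return cyrillic_excerpt[mid].to;
      if (cyrillic_excerpt[mid].from < cp)
        lo = mid + 1;
      else
        hi = mid;
    }
  return NULL;
}

/* Write the replacement of each of the N code points in CPS to OUT.
   Unknown code points become '?'.  Returns the number of bytes written,
   excluding the terminating NUL.  */
size_t
translit_codepoints (const uint32_t *cps, size_t n, char *out, size_t out_size)
{
  assert (out != NULL && out_size > 0);
  size_t used = 0;
  for (size_t i = 0; i < n; i++)
    {
      const char *to = translit_lookup (cps[i]);
      if (to == NULL)
        to = "?";
      size_t len = strlen (to);
      if (used + len >= out_size)
        break;                  /* Stop rather than overflow OUT.  */
      memcpy (out + used, to, len);
      used += len;
    }
  out[used] = '\0';
  return used;
}
```

Because one Cyrillic letter may expand to up to three ASCII characters
(for example "SHH"), the output buffer has to be sized for the expanded
text, not for the number of input code points.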