[PATCH] locale/C-translit.h.in: Greek -> ASCII transliteration table [BZ #12031]
Diego (Egor) Kobylkin
egor@kobylkin.com
Wed Sep 4 07:31:00 GMT 2019
Dear locale maintainers,
This patch fixes glibc bug 12031, "iconv -t ascii//translit with Greek characters" [1],
by adding Greek transliteration rows to locale/C-translit.h.in.
It follows the recently committed patch for a virtually identical bug,
[BZ #2872], which concerned Cyrillic characters. [2]
AFAIK there are many Greek-to-ASCII transcription tables. Because the current
iconv logic can only transliterate one character to many, not many to many,
we take the "Standard" part of
the Romanization_of_Greek#Modern_Greek table [3]
and keep only the single-letter Greek graphemes. That "standard" seems to be
close to ELOT 743, but not identical to it.
So we omit rules such as M and Μπ being transliterated as M and B respectively.
Instead, Μπ is treated as two separate graphemes and transliterated as Mp.
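
To illustrate the one-to-many constraint described above, here is a minimal
Python sketch of a per-character lookup. It is not the glibc implementation:
the TRANSLIT dictionary, the translit() helper, and the "?" fallback are
hypothetical, and the few rows shown are illustrative rather than copied from
the attached patch.

```python
# Hypothetical excerpt of a one-to-many table: each key is exactly one
# Greek code point, each value is one or more ASCII characters.
# These rows are illustrative only, not the contents of the patch.
TRANSLIT = {
    "\u039c": "M",   # GREEK CAPITAL LETTER MU
    "\u03c0": "p",   # GREEK SMALL LETTER PI
    "\u03b1": "a",   # GREEK SMALL LETTER ALPHA
    "\u03b8": "th",  # GREEK SMALL LETTER THETA (one code point, two letters)
    "\u03c8": "ps",  # GREEK SMALL LETTER PSI
}


def translit(text, table=TRANSLIT, fallback="?"):
    """Transliterate code point by code point, with no look-ahead.

    Because each input character is looked up on its own, a digraph such
    as Mu + pi can never be mapped as a unit (for example to "B").
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)              # ASCII passes through unchanged
        else:
            out.append(table.get(ch, fallback))
    return "".join(out)


print(translit("\u039c\u03c0"))  # Mu followed by pi -> "Mp"
```

A many-to-many scheme would need to inspect more than one input character at a
time, which is exactly what a table of this shape cannot express.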
Here is the list of standards I have collected so far. There doesn't seem
to be a way to harmonize them all into one, but if anyone wants to propose
a solution, please do.
* ΕΛΟΤ 743 https://www.teicrete.gr/users/kutrulis/Ergalia/ELOT743.htm Passports.
* ISO 843 https://en.wikipedia.org/wiki/ISO_843
* ALA-LC https://www.loc.gov/catdir/cpso/romanization/greek.pdf Book titles.
* BGN/PCGN http://libraries.ucsd.edu/bib/fed/USBGN_romanization.pdf
* http://geonames.nga.mil/gns/html/Romanization/Romanization_Greek.pdf Geographical names.
Furthermore, to cover the whole U0370-U03FF Greek/Coptic Unicode range, I have
asked around and made a best-effort transliteration for the remaining characters
not covered by the above standards.
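
As a rough way to check coverage of that range, the following Python sketch
lists the assigned code points in U0370-U03FF and reports which ones a given
table lacks. The helper names and the one-row sample_table are hypothetical;
the set of assigned code points depends on the Unicode version bundled with
the Python interpreter, not on glibc.

```python
import unicodedata

FIRST, LAST = 0x0370, 0x03FF  # Greek and Coptic block


def assigned_code_points(first=FIRST, last=LAST):
    """Return the code points in the range that have a Unicode name."""
    return [cp for cp in range(first, last + 1)
            if unicodedata.name(chr(cp), "")]


def missing_rows(table):
    """Return assigned code points that have no entry in `table`."""
    return [cp for cp in assigned_code_points() if chr(cp) not in table]


# Hypothetical one-row table, only to demonstrate the check.
sample_table = {"\u03b1": "a"}

for cp in missing_rows(sample_table)[:5]:
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")
```

Running missing_rows() against the full table would show which entries
still need a best-effort transliteration.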
Should you have better sources for the actual translit entries, please
send your feedback!
The patch is attached.
Best regards,
Egor Kobylkin
https://sourceware.org/bugzilla/show_bug.cgi?id=12031 [1]
https://sourceware.org/ml/libc-alpha/2019-07/msg00477.html [2]
https://en.wikipedia.org/wiki/Romanization_of_Greek#Modern_Greek [3]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Locales-Greek-ASCII-transliteration-table-BZ-12031.patch
Type: text/x-patch
Size: 7981 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/libc-locales/attachments/20190904/effb2e79/attachment.bin>