This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.
Re: How transliteration works?
- From: Jonathan Nieder <jrnieder at gmail dot com>
- To: Urmas <davian818 at gmail dot com>
- Cc: libc-help at sourceware dot org
- Date: Fri, 1 Apr 2011 03:03:28 -0500
- Subject: Re: How transliteration works?
- References: <5050991AFC074DA19DECDC4944D1F17D@sandy>
Hi,
Urmas wrote:
> What defines which characters are substituted when one uses iconv
> with '//translit' option? Where does it located in source?
That's a good question. So let's see:
| $ git grep -F -e TRANSLIT
[...]
| iconv/gconv_open.c: if (__strcasecmp_l (tok, "TRANSLIT", _nl_C_locobj_ptr) == 0)
[...]
Looking near that line, it seems that "struct trans_struct",
__gconv_translit_find, and __gconv_transliterate should be relevant.
Let's look at __gconv_transliterate in iconv/gconv_trans.c. It says
that the locale should contain a transliteration table:
| /* If there is no transliteration information in the locale don't do
|    anything and return the error.  */
| size = _NL_CURRENT_WORD (LC_CTYPE, _NL_CTYPE_TRANSLIT_TAB_SIZE);
| if (size == 0)
|   goto no_rules;
So at this point, the obvious thing is to skip ahead to the locale
definitions under localedata/.
| $ git grep -e transliteration -- localedata
[...]
| localedata/locales/bg_BG:% their transliteration with Bulgarian Cyrillic letters.
| localedata/locales/de_DE:% The following strange first-level transliteration derive from the use
[...]
"The following"? Ah! It seems that transliteration tables come in
the locale definition files between translit_start and translit_end.
The details of how these tables are used are left as an exercise to
the reader.
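If you want to poke at the behavior from user code, here is a minimal
sketch of driving the '//TRANSLIT' path through the public iconv API.
The helper name translit_to_ascii and the choice of UTF-8 to ASCII are
my own assumptions for illustration, not anything taken from the glibc
sources. Since the table is read from LC_CTYPE, I would expect the
result for non-ASCII input to depend on which locale the program has
selected.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the UTF-8 string IN to ASCII with
   transliteration requested.  Returns 0 on success and -1 on failure.
   OUT is always NUL-terminated when OUTLEN > 0.  */
static int
translit_to_ascii (const char *in, char *out, size_t outlen)
{
  if (outlen == 0)
    return -1;
  out[0] = '\0';

  /* "TRANSLIT" after the "//" is the token that iconv/gconv_open.c
     compares against.  */
  iconv_t cd = iconv_open ("ASCII//TRANSLIT", "UTF-8");
  if (cd == (iconv_t) -1)
    return -1;

  char *inp = (char *) in;      /* iconv takes a non-const char **  */
  size_t inleft = strlen (in);
  char *outp = out;
  size_t outleft = outlen - 1;  /* keep one byte for the NUL  */

  size_t rc = iconv (cd, &inp, &inleft, &outp, &outleft);
  if (rc != (size_t) -1)
    /* Flush any pending shift state into the output buffer.  */
    rc = iconv (cd, NULL, NULL, &outp, &outleft);

  *outp = '\0';
  iconv_close (cd);
  return rc == (size_t) -1 ? -1 : 0;
}
```

To try different tables, call setlocale (LC_CTYPE, "") from <locale.h>
before the helper and run the program under different LANG settings.
Which substitutions you get, if any, depends on the locales installed
on your system, so I am not going to guess at the output here.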
Hope that helps,
Jonathan