This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



Re: How transliteration works?


Hi,

Urmas wrote:

> What defines which characters are substituted when one uses iconv
> with the '//translit' option? Where is it located in the source?

That's a good question.  So let's see:

| $ git grep -F -e TRANSLIT
[...]
| iconv/gconv_open.c:           if (__strcasecmp_l (tok, "TRANSLIT", _nl_C_locobj_ptr) == 0)
[...]

Looking near that line, it seems that "struct trans_struct",
__gconv_translit_find, and __gconv_transliterate should be relevant.

Let's look at __gconv_transliterate in iconv/gconv_trans.c.  It says
that the locale should contain a transliteration table:

|   /* If there is no transliteration information in the locale don't do
|      anything and return the error.  */
|   size = _NL_CURRENT_WORD (LC_CTYPE, _NL_CTYPE_TRANSLIT_TAB_SIZE);
|   if (size == 0)
|     goto no_rules;

So at this point, the obvious next step is to look at the locale
definitions.

| $ git grep -e transliteration -- localedata
[...]
| localedata/locales/bg_BG:%      their transliteration with Bulgarian Cyrillic letters.
| localedata/locales/de_DE:% The following strange first-level transliteration derive from the use
[...]

"The following"?  Ah!  It seems that transliteration tables come in
the locale definition files between translit_start and translit_end.
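
Since the table is read from the current locale's LC_CTYPE data, the
result of a //TRANSLIT conversion can depend on which locale the calling
program has selected. Here is a minimal sketch for experimenting with
that. The helper name translit_to_ascii and the choice of UTF-8 input
and ASCII output are my own assumptions for illustration; only
iconv_open, iconv, and iconv_close are real library calls.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): convert the UTF-8 string SRC
   to ASCII using the //TRANSLIT suffix, storing a NUL-terminated result
   in DST.  Returns 0 on success, -1 on failure with errno set by iconv
   (for example E2BIG when DST is too small).  */
int translit_to_ascii(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL && dst_size > 0);

    /* Target charset first, source charset second.  */
    iconv_t cd = iconv_open("ASCII//TRANSLIT", "UTF-8");
    if (cd == (iconv_t) -1)
        return -1;

    /* iconv takes char ** for the input but does not modify the bytes.  */
    char *in = (char *) src;
    size_t in_left = strlen(src);
    char *out = dst;
    size_t out_left = dst_size - 1;   /* keep one byte for the NUL */

    size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    if (rc != (size_t) -1)
        /* Flush any pending shift state (a no-op for ASCII).  */
        rc = iconv(cd, NULL, NULL, &out, &out_left);

    int saved_errno = errno;
    iconv_close(cd);

    if (rc == (size_t) -1) {
        errno = saved_errno;
        return -1;
    }

    *out = '\0';
    return 0;
}
```

To try it, call setlocale (LC_CTYPE, "") first, then run the program
under different LC_ALL settings (say, de_DE.UTF-8 and C) with a string
containing umlauts, and compare the results. What you get depends on the
translit tables of the locales installed on your system.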

The details of how these tables are used are left as an exercise to
the reader.

Hope that helps,
Jonathan

