KOI8 character sets

Andy Koppe andy.koppe@gmail.com
Mon Aug 24 17:00:00 GMT 2009


The attached patch adds support for the KOI8-R and KOI8-U character
sets. These are the de facto standard character sets on Unix machines
and the Net in Russia, Ukraine, and other former Soviet states.
(ISO-8859-5, designed for all Cyrillic scripts, apparently never found
much acceptance.)

Under Windows they are known as codepages 20866 and 21866, respectively.
Since they are single-byte encodings with printable characters in the C1
range from 0x80 to 0x9F, it seems best to handle them like DOS/Windows
codepages. The conversion tables were adapted from the iconv ones.
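
To illustrate the idea, here is a minimal sketch of the name-to-codepage
mapping and a table-driven lookup for part of KOI8-R. It is not the code
from the patch: the names koi8_codepage, koi8r_lower and koi8r_to_ucs are
made up for this example, and only the Cyrillic letter rows 0xC0-0xFF are
covered. The 0x80-0xBF range, which is where KOI8-R and KOI8-U differ as
far as I know, is left out and reported as unmapped.

```c
#include <string.h>

/* Hypothetical helper: map a charset name to its Windows codepage
   number, or -1 if the name is not one of the two KOI8 sets. */
static int koi8_codepage(const char *name)
{
    if (strcmp(name, "KOI8-R") == 0)
        return 20866;
    if (strcmp(name, "KOI8-U") == 0)
        return 21866;
    return -1;
}

/* Unicode code points for KOI8-R bytes 0xC0..0xDF (lowercase letters).
   The order follows the Latin transliteration, not the Cyrillic alphabet. */
static const unsigned short koi8r_lower[32] = {
    0x044E, 0x0430, 0x0431, 0x0446, 0x0434, 0x0435, 0x0444, 0x0433,
    0x0445, 0x0438, 0x0439, 0x043A, 0x043B, 0x043C, 0x043D, 0x043E,
    0x043F, 0x044F, 0x0440, 0x0441, 0x0442, 0x0443, 0x0436, 0x0432,
    0x044C, 0x044B, 0x0437, 0x0448, 0x044D, 0x0449, 0x0447, 0x044A
};

/* Hypothetical partial converter: returns the Unicode code point for a
   KOI8-R byte, or -1 for the 0x80..0xBF range this sketch leaves out. */
static long koi8r_to_ucs(unsigned char c)
{
    if (c < 0x80)
        return c;                              /* ASCII maps to itself */
    if (c >= 0xE0)
        return koi8r_lower[c - 0xE0] - 0x20;   /* uppercase row */
    if (c >= 0xC0)
        return koi8r_lower[c - 0xC0];          /* lowercase row */
    return -1;                                 /* not covered here */
}
```

For example, koi8r_to_ucs(0xC1) gives 0x0430 (CYRILLIC SMALL LETTER A) and
koi8r_to_ucs(0xE1) gives 0x0410 (the capital). The uppercase row reuses the
lowercase table because the capitals in U+0410..U+042F sit exactly 0x20
below their lowercase counterparts. A real table would cover all 128 upper
byte values.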

Tested on Cygwin 1.7.

ChangeLog:

2009-08-22  Corinna Vinschen  <corinna@vinschen.de>
        * libc/locale/locale.c (loadlocale): Map "KOI8-R" and "KOI8-U" to
        CP20866 and CP21866.

2009-08-22  Andy Koppe  <andy.koppe@gmail.com>
        * libc/stdlib/sb_charsets.c (__cp_conv): Add KOI8-R (Russian, CP20866)
        and KOI8-U (Ukrainian, CP21866) to Windows codepage conversion tables.
        * libc/ctype/ctype_cp.h (__ctype_cp): Likewise for ctype tables.

Andy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: koi8.patch
Type: application/octet-stream
Size: 6943 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/newlib/attachments/20090824/09c5d06d/attachment.obj>

