[PATCH/RFA] Extended wctomb/mbtowc conversion and more stuff

Jeff Johnston jjohnstn@redhat.com
Mon Mar 23 17:52:00 GMT 2009


Corinna Vinschen wrote:
> Ok,
>
> this is the new patch about the extended wctomb_r/mbtowc_r stuff.
>
> It got more complicated because of various requirements in Cygwin.
> One of them is the requirement to be able to call mbtowc for a charset
> other than the current locale charset.
>
> I guess the best I can do is to start to explain what this patch is
> doing and explain the details while going along with the flow.
>
> - Set the default charset to "ASCII", rather than ISO-8859-1.
>
>   This change has two reasons.  First of all, POSIX requires that
>   the default setting for all applications which don't explicitly
>   call setlocale is the "POSIX" or "C" locale.  In this locale,
>   only ASCII characters are supported.  This is also (correctly) the case
>   in the ctype functions in newlib.  Only the charset is wrongly
>   set to "ISO-8859-1".  Wrong in POSIX terms, and wrong because it's
>   not really supported by default.
>
> - Add support for correct ISO-8859-x multibyte<->wide char conversion.
>
> - If the input to setlocale is "C" or "POSIX", set the charset to
>   "ASCII" now.
>
> - Add support for all default ANSI and OEM codepages used on Windows,
>   CP437, CP720, CP737, CP775, CP850, CP852, CP855, CP857, CP858, CP862,
>   CP866, CP874, CP1125, CP1250, CP1251, CP1252, CP1253, CP1254, CP1255,
>   CP1256, CP1257, CP1258.
>
>   This new charset support requires a couple of new character conversion
>   tables which I put into a new file called libc/stdlib/sb_charsets.c,
>   and which are only built on _MB_CAPABLE systems.  The tables are now
>   guarded by the defines we talked about, _MB_EXTENDED_CHARSETS_ISO and
>   _MB_EXTENDED_CHARSETS_DOS.  Maybe the latter would be better renamed
>   to _MB_EXTENDED_CHARSETS_WINDOWS, though.
>
> - On Cygwin, add support for the charsets GBK, CP949 (Korean unified Hangul),
>   and BIG5.  My current implementation of these charset conversions requires
>   OS support, so Cygwin needs to be able to set them in setlocale(), but
>   I have no implementation for newlib so far.
>
> - On Cygwin, if no explicit charset is defined as input to setlocale,
>   search for the current ANSI codepage and set it as current charset,
>   if it's one of the supported charsets, otherwise default to ISO-8859-1.
>
>   The change from the former patch is that the function
>   __set_charset_from_codepage is now defined in Cygwin, not in newlib.
>
> - Also on Cygwin, call a function __set_ctype, also defined in Cygwin only
>   for now.  This allows switching the ctype tables for the various charsets.
>
>   The idea is that this function can also be defined in newlib at some
>   point.  We just have to discuss the implementation.  In Cygwin the
>   ctype data is copied over into the standard ctype array.  This is the
>   only way to do it which allows backward compatible behaviour with
>   existing applications due to the nature of the isXXX functions being
>   mostly used as macros defined in ctype.h.
>
> - Allow "eucJP" additionally to "EUCJP", and "Big5" additionally to "BIG5",
>   to support typical settings of these charsets on other systems.
>
> - The functions _wctomb_r and _mbtowc_r are now split into multiple
>   functions for each supported charset, rather than having to call
>   strcmp multiple times to determine which charset is used.
>
>   To do that, the setlocale() function sets function pointers
>   __wctomb/__mbtowc according to the current charset.  On systems that
>   are not _MB_CAPABLE, only two such functions exist, __ascii_wctomb and
>   __ascii_mbtowc.
>
>   The change in contrast to the former implementation is that the charset
>   is one of the parameters to these functions.  That's necessary to
>   allow Cygwin to call the __iso_mbtowc and __cp_mbtowc functions with
>   an alternate charset.
>
> - On Cygwin, don't use the newlib implementation of SJIS, JIS, and EUCJP
>   mbtowc/wctomb.  The reason is that newlib's implementations don't
>   convert the input multibyte chars to UTF wchars; rather, they convert
>   them to a simple self-made form of wchars.  This doesn't work well
>   on Cygwin, because the underlying OS always requires wchars to be UTF-16.
>   Therefore Cygwin has its own implementations of __sjis_mbtowc, etc.
>
> - Along the same lines, the function __jp2uc now does not convert the
>   incoming character at all on Cygwin, because the incoming char is
>   already UTF on Cygwin.
>
> - All iswXXX and towXXX functions have been changed so that on
>   _MB_CAPABLE systems all wchar_t input is either SJIS/JIS/EUCJP, which
>   requires converting the character to unicode first, or the input is
>   already unicode.  This is the wchar_t representation for all other
>   charsets anyway, and the only wchar_t representation on Cygwin as
>   outlined above.
>
> - The _MB_EXTENDED_CHARSETS_ISO and _MB_EXTENDED_CHARSETS_DOS macros are
>   defined in libc/include/sys/config.h.  I also added a define
>   _MB_EXTENDED_CHARSETS_ALL which is right now only set on Cygwin.
>   It enables the other two, and I expect it to enable the still
>   missing _MB_EXTENDED_CHARSETS_GBK, _MB_EXTENDED_CHARSETS_KOR,
>   and _MB_EXTENDED_CHARSETS_BIG5, as soon as they are available.
>
> - In libc/include/sys/reent.h, I marked the struct _reent members
>   _current_category and _current_locale as unused.  They are, because
>   they were only (incorrectly) used by the old setlocale implementation.
>   I don't want to remove them to keep the size of struct _reent the
>   same for backward compatibility with existing code.
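
The point about keeping the unused members can be checked with a small
layout test.  This is only a sketch: the demo_ structures below are
hypothetical stand-ins with four members, not the real struct _reent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the old layout. */
struct demo_reent_old
{
  int sdidinit;
  int current_category;		/* used by setlocale */
  const char *current_locale;
  void *mp;
};

/* Same members, merely documented as unused: layout is untouched. */
struct demo_reent_new
{
  int sdidinit;
  int current_category;		/* unused */
  const char *current_locale;	/* unused */
  void *mp;
};

/* What removing the members would look like instead. */
struct demo_reent_trimmed
{
  int sdidinit;
  void *mp;
};

/* Nonzero if code compiled against the old layout still finds mp. */
static int
demo_layout_preserved (void)
{
  return sizeof (struct demo_reent_old) == sizeof (struct demo_reent_new)
	 && offsetof (struct demo_reent_old, mp)
	    == offsetof (struct demo_reent_new, mp);
}

/* Nonzero if dropping the members moves mp to a lower offset. */
static int
demo_layout_shifts_if_removed (void)
{
  return offsetof (struct demo_reent_trimmed, mp)
	 < offsetof (struct demo_reent_old, mp);
}
```

Old binaries access members by fixed offset, so only the first variant
is safe for them.
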
>
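
The single-byte conversion in the patch is a table lookup: for the ISO
charsets, bytes from 0xa0 up index a table of 0x60 wide characters, and
a zero entry marks an invalid byte.  Below is a minimal sketch of that
idea for one charset.  The demo_ names are hypothetical, and the table
is built at run time from the known differences between ISO-8859-15 and
ISO-8859-1 instead of being a static table as in sb_charsets.c.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Code points for bytes 0xa0..0xff; a 0 entry would mean "unassigned". */
static wchar_t demo_conv[0x60];

/* Fill the table for ISO-8859-15: Latin-1 with eight replaced slots. */
static void
demo_init_iso_8859_15 (void)
{
  int i;

  for (i = 0; i < 0x60; ++i)
    demo_conv[i] = (wchar_t) (0xa0 + i);
  demo_conv[0xa4 - 0xa0] = 0x20ac;	/* EURO SIGN */
  demo_conv[0xa6 - 0xa0] = 0x0160;	/* S with caron */
  demo_conv[0xa8 - 0xa0] = 0x0161;	/* s with caron */
  demo_conv[0xb4 - 0xa0] = 0x017d;	/* Z with caron */
  demo_conv[0xb8 - 0xa0] = 0x017e;	/* z with caron */
  demo_conv[0xbc - 0xa0] = 0x0152;	/* ligature OE */
  demo_conv[0xbd - 0xa0] = 0x0153;	/* ligature oe */
  demo_conv[0xbe - 0xa0] = 0x0178;	/* Y with diaeresis */
}

/* Convert one byte.  Returns 1 on success, 0 for the NUL byte and -1
   for a byte without a mapping. */
static int
demo_sb_mbtowc (wchar_t *pwc, unsigned char byte)
{
  assert (pwc != NULL);
  if (byte >= 0xa0)
    {
      wchar_t wc = demo_conv[byte - 0xa0];
      if (wc == 0)
	return -1;		/* invalid character */
      *pwc = wc;
      return 1;
    }
  *pwc = (wchar_t) byte;	/* below 0xa0 the byte is the code point */
  return byte != 0;
}
```
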
> Again, the patch is split in two.  The first one contains all changes
> except those in ctype, the second one contains the ctype changes.
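
Why the ctype data has to be copied into the existing array, rather
than replaced by a pointer to another table, can be shown with a small
sketch.  All demo_ names and the flag values are hypothetical, and only
the upper/lower flags for ASCII and ISO-8859-1 are modelled.

```c
#include <assert.h>
#include <string.h>

#define DEMO_UPPER 0x01
#define DEMO_LOWER 0x02

/* The one array that already-compiled applications index directly. */
static unsigned char demo_ctype[256];

/* A macro like this is expanded into the application at compile time,
   so the array it names cannot be exchanged for another one later. */
#define demo_isupper(c) ((demo_ctype[(unsigned char) (c)] & DEMO_UPPER) != 0)

/* Build a classification table for ASCII, optionally with Latin-1. */
static void
demo_build_table (unsigned char *tab, int latin1)
{
  int c;

  memset (tab, 0, 256);
  for (c = 'A'; c <= 'Z'; ++c)
    tab[c] = DEMO_UPPER;
  for (c = 'a'; c <= 'z'; ++c)
    tab[c] = DEMO_LOWER;
  if (latin1)
    {
      for (c = 0xc0; c <= 0xde; ++c)
	if (c != 0xd7)		/* 0xd7 is the multiplication sign */
	  tab[c] = DEMO_UPPER;
      for (c = 0xdf; c <= 0xff; ++c)
	if (c != 0xf7)		/* 0xf7 is the division sign */
	  tab[c] = DEMO_LOWER;
    }
}

/* Switch charsets by overwriting the contents of the fixed array. */
static int
demo_set_ctype (const char *charset)
{
  unsigned char tab[256];

  assert (charset != NULL);
  if (!strcmp (charset, "ASCII"))
    demo_build_table (tab, 0);
  else if (!strcmp (charset, "ISO-8859-1"))
    demo_build_table (tab, 1);
  else
    return -1;
  memcpy (demo_ctype, tab, sizeof demo_ctype);
  return 0;
}
```
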
>
> I have a rather big patch to Cygwin which requires this functionality
> to go in first.  I hope the patch is basically ok to apply.
>
> I have split up the long ChangeLog entry for better readability.
>
>   
Please put the _mbtowc_r and _wctomb_r functions, plus the default ASCII
versions, at the top of the files so people don't have to wade through to
the bottom.  I don't think the change of the default charset name is going
to affect anybody.  I am ok with you checking in the patch.
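
For readers following along, the requested layout (wrapper and default
ASCII converter first, charset-specific converters after) might look
like the sketch below.  It is only an illustration: every demo_ name is
hypothetical, the struct _reent and mbstate_t parameters of the real
prototypes are left out, and the strict 7-bit check in the ASCII
converter is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical converter type: as in the patch's prototypes, the
   charset name is an explicit parameter. */
typedef int (*demo_mbtowc_fn) (wchar_t *, const char *, size_t, const char *);

static int demo_ascii_mbtowc (wchar_t *, const char *, size_t, const char *);
static int demo_latin1_mbtowc (wchar_t *, const char *, size_t, const char *);

/* Current converter and charset; ASCII is the default. */
static demo_mbtowc_fn demo_mbtowc = demo_ascii_mbtowc;
static char demo_charset[32] = "ASCII";

/* Public entry point: no strcmp on the charset, just an indirect call. */
static int
demo_mbtowc_r (wchar_t *pwc, const char *s, size_t n)
{
  return demo_mbtowc (pwc, s, n, demo_charset);
}

/* Default converter: accepts 7-bit bytes only. */
static int
demo_ascii_mbtowc (wchar_t *pwc, const char *s, size_t n, const char *charset)
{
  unsigned char c;

  (void) charset;
  if (s == NULL)
    return 0;			/* not state-dependent */
  if (n == 0)
    return -2;			/* nothing to convert yet */
  c = (unsigned char) *s;
  if (c > 0x7f)
    return -1;
  if (pwc != NULL)
    *pwc = (wchar_t) c;
  return c != '\0';
}

/* ISO-8859-1 bytes are equal to their Unicode code points. */
static int
demo_latin1_mbtowc (wchar_t *pwc, const char *s, size_t n, const char *charset)
{
  unsigned char c;

  (void) charset;
  if (s == NULL)
    return 0;
  if (n == 0)
    return -2;
  c = (unsigned char) *s;
  if (pwc != NULL)
    *pwc = (wchar_t) c;
  return c != '\0';
}

/* Stand-in for the setlocale() step: pick the converter once. */
static int
demo_set_charset (const char *charset)
{
  assert (charset != NULL);
  if (!strcmp (charset, "ASCII"))
    demo_mbtowc = demo_ascii_mbtowc;
  else if (!strcmp (charset, "ISO-8859-1"))
    demo_mbtowc = demo_latin1_mbtowc;
  else
    return -1;
  strcpy (demo_charset, charset);	/* both accepted names fit */
  return 0;
}
```

Passing the charset as an argument is what lets a caller invoke a
converter directly with a charset other than the current one.
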

-- Jeff J.

> Corinna
>
>
> 	* libc/ctype/iswalpha.c: Handle all wchar_t as unicode on
> 	_MB_CAPABLE systems.
> 	* libc/ctype/iswblank.c: Ditto.
> 	* libc/ctype/iswcntrl.c: Ditto.
> 	* libc/ctype/iswprint.c: Ditto.
> 	* libc/ctype/iswpunct.c: Ditto.
> 	* libc/ctype/iswspace.c: Ditto.
> 	* libc/ctype/jp2uc.c (__jp2uc): On Cygwin, just return c.
> 	Explain why.
> 	* libc/ctype/towlower.c: Handle all wchar_t as unicode on
> 	_MB_CAPABLE systems.
> 	* libc/ctype/towupper.c: Ditto.
>
> 	* libc/include/sys/config.h: Define _MB_EXTENDED_CHARSETS_ISO
> 	and _MB_EXTENDED_CHARSETS_DOS if _MB_EXTENDED_CHARSETS_ALL is
> 	defined.  Define _MB_EXTENDED_CHARSETS_ALL on Cygwin only for now.
> 	* libc/include/sys/reent.h (struct _reent): Mark _current_category
> 	and _current_locale as unused.
>
> 	* libc/locale/locale.c: Add new charset support to documentation.
> 	Include ../stdlib/local.h from here.
> 	(lc_ctype_charset): Set to "ASCII" by default.
> 	(lc_message_charset): Ditto.
> 	(_setlocale_r): Don't set _current_category and _current_locale.
> 	(loadlocale): Add Cygwin codepage support.  On _MB_CAPABLE
> 	systems, set __mbtowc and __wctomb function pointers to function
> 	corresponding with current charset.  Don't allow nonexistent
> 	ISO-8859-12 charset.  Add support for Windows singlebyte codepages.
> 	On Cygwin, add support for GBK, CP949, and BIG5.  On Cygwin,
> 	call __set_ctype() in case the category is LC_CTYPE.  Don't set
> 	_current_category and _current_locale.
>
> 	* libc/stdlib/Makefile.am (GENERAL_SOURCES): Add sb_charsets.c.
> 	* libc/stdlib/Makefile.in: Regenerate.
> 	* libc/stdlib/local.h: Add prototype for __locale_charset.
> 	Add prototypes for __mbtowc and __wctomb pointers.
> 	Add prototypes for charset-specific _wctomb_r and _mbtowc_r
> 	functions.
> 	Declare tables and functions from sb_charsets.c.
> 	* libc/stdlib/mbtowc_r.c (__mbtowc): Define.  Set to __ascii_mbtowc
> 	by default.
> 	(__iso_mbtowc): New function.
> 	(__cp_mbtowc): New function.
> 	(__utf8_mbtowc): New function.
> 	(__sjis_mbtowc): New function.  Disable on Cygwin.
> 	(__eucjp_mbtowc): New function.  Disable on Cygwin.
> 	(__jis_mbtowc): New function.  Disable on Cygwin.
> 	(__ascii_mbtowc): New function.
> 	(_mbtowc_r): Just call __mbtowc from here.
> 	* libc/stdlib/sb_charsets.c: New file, adding singlebyte to UTF
> 	conversion tables for all ISO and CP charsets.
> 	(__iso_8859_index): New function.
> 	(__cp_index): New function.
> 	* libc/stdlib/wctomb_r.c (__wctomb): Define.  Set to __ascii_wctomb
> 	by default.
> 	(__utf8_wctomb): New function.
> 	(__sjis_wctomb): New function.  Disable on Cygwin.
> 	(__eucjp_wctomb): New function.  Disable on Cygwin.
> 	(__jis_wctomb): New function.  Disable on Cygwin.
> 	(__iso_wctomb): New function.
> 	(__cp_wctomb): New function.
> 	(__ascii_wctomb): New function.
> 	(_wctomb_r): Just call __wctomb from here.
>
>
> Index: libc/include/sys/config.h
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/include/sys/config.h,v
> retrieving revision 1.50
> diff -u -p -r1.50 config.h
> --- libc/include/sys/config.h	20 Mar 2009 20:44:14 -0000	1.50
> +++ libc/include/sys/config.h	22 Mar 2009 16:25:07 -0000
> @@ -179,6 +179,7 @@
>  #if defined(__CYGWIN__)
>  #include <cygwin/config.h>
>  #define __LINUX_ERRNO_EXTENSIONS__ 1
> +#define _MB_EXTENDED_CHARSETS_ALL 1
>  #endif
>  
>  #if defined(__rtems__)
> @@ -211,4 +212,12 @@
>  #endif
>  #endif
>  
> +/* If _MB_EXTENDED_CHARSETS_ALL is set, we want all of the extended
> +   charsets.  The extended charsets add a few functions and a couple
> +   of tables of a few K each. */
> +#ifdef _MB_EXTENDED_CHARSETS_ALL
> +#define _MB_EXTENDED_CHARSETS_ISO 1
> +#define _MB_EXTENDED_CHARSETS_DOS 1
> +#endif
> +
>  #endif /* __SYS_CONFIG_H__ */
> Index: libc/include/sys/reent.h
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/include/sys/reent.h,v
> retrieving revision 1.45
> diff -u -p -r1.45 reent.h
> --- libc/include/sys/reent.h	10 Dec 2008 23:43:12 -0000	1.45
> +++ libc/include/sys/reent.h	22 Mar 2009 16:25:07 -0000
> @@ -371,8 +371,8 @@ struct _reent
>  
>    int __sdidinit;		/* 1 means stdio has been init'd */
>  
> -  int _current_category;	/* used by setlocale */
> -  _CONST char *_current_locale;
> +  int _current_category;	/* unused */
> +  _CONST char *_current_locale;	/* unused */
>  
>    struct _mprec *_mp;
>  
> Index: libc/locale/locale.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/locale/locale.c,v
> retrieving revision 1.9
> diff -u -p -r1.9 locale.c
> --- libc/locale/locale.c	3 Mar 2009 09:28:45 -0000	1.9
> +++ libc/locale/locale.c	22 Mar 2009 16:25:07 -0000
> @@ -47,11 +47,18 @@ and <<"C">> values for <[locale]>; strin
>  honored unless _MB_CAPABLE is defined in which case POSIX locale strings
>  are allowed, plus five extensions supported for backward compatibility with
>  older implementations using newlib: <<"C-UTF-8">>, <<"C-JIS">>, <<"C-EUCJP">>,
> -<<"C-SJIS">>, or <<"C-ISO-8859-x">> with 1 <= x <= 15.  Even when using
> -POSIX locale strings, the only charsets allowed are <<"UTF-8">>, <<"JIS">>,
> -<<"EUCJP">>, <<"SJIS">>, or <<"ISO-8859-x">> with 1 <= x <= 15.  (<<"">> is 
> -also accepted; if given, the settings are read from the corresponding
> -LC_* environment variables and $LANG according to POSIX rules.
> +<<"C-SJIS">>, <<"C-ISO-8859-x">> with 1 <= x <= 15, or <<"C-CPxxx">> with
> +xxx in [437, 720, 737, 775, 850, 852, 855, 857, 858, 862, 866, 874, 1125, 1250,
> +1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258].  Even when using POSIX
> +locale strings, the only charsets allowed are <<"UTF-8">>, <<"JIS">>,
> +<<"EUCJP">>, <<"SJIS">>, <<"ISO-8859-x">> with 1 <= x <= 15, or
> +<<"CPxxx">> with xxx in [437, 720, 737, 775, 850, 852, 855, 857, 858, 862, 866,
> +874, 1125, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258]. 
> +(<<"">> is also accepted; if given, the settings are read from the
> +corresponding LC_* environment variables and $LANG according to POSIX rules.
> +
> +Under Cygwin, this implementation additionally supports the charsets <<"GBK">>,
> +<<"CP949">>, and <<"BIG5">>.
>  
>  If you use <<NULL>> as the <[locale]> argument, <<setlocale>> returns
>  a pointer to the string representing the current locale (always
> @@ -85,6 +92,9 @@ PORTABILITY
>  ANSI C requires <<setlocale>>, but the only locale required across all
>  implementations is the C locale.
>  
> +NOTES
> +There is no ISO-8859-12 codepage.  It's also refused by this implementation.
> +
>  No supporting OS subroutines are required.
>  */
>  
> @@ -129,6 +139,11 @@ No supporting OS subroutines are require
>  #include <limits.h>
>  #include <reent.h>
>  #include <stdlib.h>
> +#include <wchar.h>
> +#include "../stdlib/local.h"
> +#ifdef __CYGWIN__
> +#include <windows.h>
> +#endif
>  
>  #define _LC_LAST      7
>  #define ENCODING_LEN 31
> @@ -190,8 +205,8 @@ static const char *__get_locale_env(stru
>  
>  #endif
>  
> -static char lc_ctype_charset[ENCODING_LEN + 1] = "ISO-8859-1";
> -static char lc_message_charset[ENCODING_LEN + 1] = "ISO-8859-1";
> +static char lc_ctype_charset[ENCODING_LEN + 1] = "ASCII";
> +static char lc_message_charset[ENCODING_LEN + 1] = "ASCII";
>  
>  char *
>  _DEFUN(_setlocale_r, (p, category, locale),
> @@ -205,8 +220,6 @@ _DEFUN(_setlocale_r, (p, category, local
>        if (strcmp (locale, "POSIX") && strcmp (locale, "C")
>  	  && strcmp (locale, ""))
>          return NULL;
> -      p->_current_category = category;  
> -      p->_current_locale = locale;
>      }
>    return "C";
>  #else
> @@ -361,6 +374,11 @@ currentlocale()
>  #endif
>  
>  #ifdef _MB_CAPABLE
> +#ifdef __CYGWIN__
> +extern void *__set_charset_from_codepage (unsigned int, char *charset);
> +extern void __set_ctype (const char *charset);
> +#endif /* __CYGWIN__ */
> +
>  static char *
>  loadlocale(struct _reent *p, int category)
>  {
> @@ -382,7 +400,7 @@ loadlocale(struct _reent *p, int categor
>    if (!strcmp (locale, "POSIX"))
>      strcpy (locale, "C");
>    if (!strcmp (locale, "C"))				/* Default "C" locale */
> -    strcpy (charset, "ISO-8859-1");
> +    strcpy (charset, "ASCII");
>    else if (locale[0] == 'C' && locale[1] == '-')	/* Old newlib style */
>  	strcpy (charset, locale + 2);
>    else							/* POSIX style */
> @@ -414,7 +432,11 @@ loadlocale(struct _reent *p, int categor
>  	}
>        else if (c[0] == '\0' || c[0] == '@')
>  	/* End of string or just a modifier */
> +#ifdef __CYGWIN__
> +	__set_charset_from_codepage (GetACP (), charset);
> +#else
>  	strcpy (charset, "ISO-8859-1");
> +#endif
>        else
>  	/* Invalid string */
>        	return NULL;
> @@ -426,42 +448,155 @@ loadlocale(struct _reent *p, int categor
>        if (strcmp (charset, "UTF-8"))
>  	return NULL;
>        mbc_max = 6;
> +#ifdef _MB_CAPABLE
> +      __wctomb = __utf8_wctomb;
> +      __mbtowc = __utf8_mbtowc;
> +#endif
>      break;
>      case 'J':
>        if (strcmp (charset, "JIS"))
>  	return NULL;
>        mbc_max = 8;
> +#ifdef _MB_CAPABLE
> +      __wctomb = __jis_wctomb;
> +      __mbtowc = __jis_mbtowc;
> +#endif
>      break;
>      case 'E':
> -      if (strcmp (charset, "EUCJP"))
> +      if (strcmp (charset, "EUCJP") && strcmp (charset, "eucJP"))
>  	return NULL;
> +      strcpy (charset, "EUCJP");
>        mbc_max = 2;
> +#ifdef _MB_CAPABLE
> +      __wctomb = __eucjp_wctomb;
> +      __mbtowc = __eucjp_mbtowc;
> +#endif
>      break;
>      case 'S':
>        if (strcmp (charset, "SJIS"))
>  	return NULL;
>        mbc_max = 2;
> +#ifdef _MB_CAPABLE
> +      __wctomb = __sjis_wctomb;
> +      __mbtowc = __sjis_mbtowc;
> +#endif
>      break;
>      case 'I':
> -    default:
> -      /* Must be exactly one of ISO-8859-1, [...] ISO-8859-15. */
> +      /* Must be exactly one of ISO-8859-1, [...] ISO-8859-16, except for
> +         ISO-8859-12. */
>        if (strncmp (charset, "ISO-8859-", 9))
>  	return NULL;
> -      val = strtol (charset + 9, &end, 10);
> -      if (val < 1 || val > 15 || *end)
> +      val = _strtol_r (p, charset + 9, &end, 10);
> +      if (val < 1 || val > 16 || val == 12 || *end)
> +	return NULL;
> +      mbc_max = 1;
> +#ifdef _MB_CAPABLE
> +#ifdef _MB_EXTENDED_CHARSETS_ISO
> +      __wctomb = __iso_wctomb;
> +      __mbtowc = __iso_mbtowc;
> +#else /* !_MB_EXTENDED_CHARSETS_ISO */
> +      __wctomb = __ascii_wctomb;
> +      __mbtowc = __ascii_mbtowc;
> +#endif /* _MB_EXTENDED_CHARSETS_ISO */
> +#endif
> +    break;
> +    case 'C':
> +      if (charset[1] != 'P')
> +	return NULL;
> +      val = _strtol_r (p, charset + 2, &end, 10);
> +      if (*end)
> +	return NULL;
> +      switch (val)
> +	{
> +	case 437:
> +	case 720:
> +	case 737:
> +	case 775:
> +	case 850:
> +	case 852:
> +	case 855:
> +	case 857:
> +	case 858:
> +	case 862:
> +	case 866:
> +	case 874:
> +	case 1125:
> +	case 1250:
> +	case 1251:
> +	case 1252:
> +	case 1253:
> +	case 1254:
> +	case 1255:
> +	case 1256:
> +	case 1257:
> +	case 1258:
> +	  mbc_max = 1;
> +#ifdef _MB_CAPABLE
> +#ifdef _MB_EXTENDED_CHARSETS_DOS
> +	  __wctomb = __cp_wctomb;
> +	  __mbtowc = __cp_mbtowc;
> +#else /* !_MB_EXTENDED_CHARSETS_DOS */
> +	  __wctomb = __ascii_wctomb;
> +	  __mbtowc = __ascii_mbtowc;
> +#endif /* _MB_EXTENDED_CHARSETS_DOS */
> +#endif
> +	  break;
> +#ifdef __CYGWIN__
> +	case 949:
> +	  mbc_max = 2;
> +#ifdef _MB_CAPABLE
> +	  __wctomb = __kr_wctomb;
> +	  __mbtowc = __kr_mbtowc;
> +#endif
> +	  break;
> +#endif
> +	default:
> +	  return NULL;
> +	}
> +    break;
> +    case 'A':
> +      if (strcmp (charset, "ASCII"))
>  	return NULL;
>        mbc_max = 1;
> +#ifdef _MB_CAPABLE
> +      __wctomb = __ascii_wctomb;
> +      __mbtowc = __ascii_mbtowc;
> +#endif
> +      break;
> +#ifdef __CYGWIN__
> +    case 'G':
> +      if (strcmp (charset, "GBK"))
> +      	return NULL;
> +      mbc_max = 2;
> +#ifdef _MB_CAPABLE
> +      __wctomb = __gbk_wctomb;
> +      __mbtowc = __gbk_mbtowc;
> +#endif
>        break;
> +    case 'B':
> +      if (strcmp (charset, "BIG5") && strcmp (charset, "Big5"))
> +      	return NULL;
> +      strcpy (charset, "BIG5");
> +      mbc_max = 2;
> +#ifdef _MB_CAPABLE
> +      __wctomb = __big5_wctomb;
> +      __mbtowc = __big5_mbtowc;
> +#endif
> +      break;
> +#endif /* __CYGWIN__ */
> +    default:
> +      return NULL;
>      }
>    if (category == LC_CTYPE)
>      {
>        strcpy (lc_ctype_charset, charset);
>        __mb_cur_max = mbc_max;
> +#ifdef __CYGWIN__
> +      __set_ctype (charset);
> +#endif
>      }
>    else if (category == LC_MESSAGES)
>      strcpy (lc_message_charset, charset);
> -  p->_current_category = category;  
> -  p->_current_locale = locale;
>    return strcpy(current_categories[category], new_categories[category]);
>  }
>  
> Index: libc/stdlib/Makefile.am
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/Makefile.am,v
> retrieving revision 1.28
> diff -u -p -r1.28 Makefile.am
> --- libc/stdlib/Makefile.am	25 Feb 2009 21:33:17 -0000	1.28
> +++ libc/stdlib/Makefile.am	22 Mar 2009 16:25:07 -0000
> @@ -48,6 +48,7 @@ GENERAL_SOURCES = \
>  	rand_r.c	\
>  	realloc.c	\
>  	reallocf.c	\
> +	sb_charsets.c	\
>  	strtod.c	\
>  	strtol.c	\
>  	strtoul.c	\
> Index: libc/stdlib/local.h
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/local.h,v
> retrieving revision 1.1.1.1
> diff -u -p -r1.1.1.1 local.h
> --- libc/stdlib/local.h	17 Feb 2000 19:39:47 -0000	1.1.1.1
> +++ libc/stdlib/local.h	22 Mar 2009 16:25:07 -0000
> @@ -5,4 +5,61 @@
>  
>  char *	_EXFUN(_gcvt,(struct _reent *, double , int , char *, char, int));
>  
> +char *__locale_charset ();
> +
> +#ifndef __mbstate_t_defined
> +#include <wchar.h>
> +#endif
> +
> +int (*__wctomb) (struct _reent *, char *, wchar_t, const char *, mbstate_t *);
> +int __ascii_wctomb (struct _reent *, char *, wchar_t, const char *,
> +		    mbstate_t *);
> +#ifdef _MB_CAPABLE
> +int __utf8_wctomb (struct _reent *, char *, wchar_t, const char *, mbstate_t *);
> +int __sjis_wctomb (struct _reent *, char *, wchar_t, const char *, mbstate_t *);
> +int __eucjp_wctomb (struct _reent *, char *, wchar_t, const char *,
> +		    mbstate_t *);
> +int __jis_wctomb (struct _reent *, char *, wchar_t, const char *, mbstate_t *);
> +int __iso_wctomb (struct _reent *, char *, wchar_t, const char *, mbstate_t *);
> +int __cp_wctomb (struct _reent *, char *, wchar_t, const char *, mbstate_t *);
> +#ifdef __CYGWIN__
> +int __gbk_wctomb (struct _reent *, char *, wchar_t, const char *, mbstate_t *);
> +int __kr_wctomb (struct _reent *, char *, wchar_t, const char *, mbstate_t *);
> +int __big5_wctomb (struct _reent *, char *, wchar_t, const char *, mbstate_t *);
> +#endif
> +#endif
> +
> +int (*__mbtowc) (struct _reent *, wchar_t *, const char *, size_t,
> +                 const char *, mbstate_t *);
> +int __ascii_mbtowc (struct _reent *, wchar_t *, const char *, size_t,
> +		    const char *, mbstate_t *);
> +#ifdef _MB_CAPABLE
> +int __utf8_mbtowc (struct _reent *, wchar_t *, const char *, size_t,
> +		   const char *, mbstate_t *);
> +int __sjis_mbtowc (struct _reent *, wchar_t *, const char *, size_t,
> +		   const char *, mbstate_t *);
> +int __eucjp_mbtowc (struct _reent *, wchar_t *, const char *, size_t,
> +		    const char *, mbstate_t *);
> +int __jis_mbtowc (struct _reent *, wchar_t *, const char *, size_t,
> +		  const char *, mbstate_t *);
> +int __iso_mbtowc (struct _reent *, wchar_t *, const char *, size_t,
> +		  const char *, mbstate_t *);
> +int __cp_mbtowc (struct _reent *, wchar_t *, const char *, size_t,
> +		 const char *, mbstate_t *);
> +#ifdef __CYGWIN__
> +int __gbk_mbtowc (struct _reent *, wchar_t *, const char *, size_t,
> +		  const char *, mbstate_t *);
> +int __kr_mbtowc (struct _reent *, wchar_t *, const char *, size_t,
> +		  const char *, mbstate_t *);
> +int __big5_mbtowc (struct _reent *, wchar_t *, const char *, size_t,
> +		 const char *, mbstate_t *);
> +#endif
> +#endif
> +
> +wchar_t __iso_8859_conv[14][0x60];
> +int __iso_8859_index (const char *);
> +
> +wchar_t __cp_conv[12][0x80];
> +int __cp_index (const char *);
> +
>  #endif
> Index: libc/stdlib/mbtowc_r.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/mbtowc_r.c,v
> retrieving revision 1.11
> diff -u -p -r1.11 mbtowc_r.c
> --- libc/stdlib/mbtowc_r.c	19 Mar 2009 19:47:52 -0000	1.11
> +++ libc/stdlib/mbtowc_r.c	22 Mar 2009 16:25:07 -0000
> @@ -5,10 +5,13 @@
>  #include <wchar.h>
>  #include <string.h>
>  #include <errno.h>
> +#include "local.h"
>  
> -#ifdef _MB_CAPABLE
> -extern char *__locale_charset ();
> +int (*__mbtowc) (struct _reent *, wchar_t *, const char *, size_t,
> +		 const char *, mbstate_t *)
> +   = __ascii_mbtowc;
>  
> +#ifdef _MB_CAPABLE
>  typedef enum { ESCAPE, DOLLAR, BRACKET, AT, B, J, 
>                 NUL, JIS_CHAR, OTHER, JIS_C_NUM } JIS_CHAR_TYPE;
>  typedef enum { ASCII, JIS, A_ESC, A_ESC_DL, JIS_1, J_ESC, J_ESC_BR,
> @@ -43,17 +46,18 @@ static JIS_ACTION JIS_action_table[JIS_S
>  /* J_ESC */   { ERROR,   ERROR,    NOOP,     ERROR,   ERROR,   ERROR,   ERROR,   ERROR,   ERROR },
>  /* J_ESC_BR */{ ERROR,   ERROR,    ERROR,    ERROR,   MAKE_A,  MAKE_A,  ERROR,   ERROR,   ERROR },
>  };
> -#endif /* _MB_CAPABLE */
>  
>  /* we override the mbstate_t __count field for more complex encodings and use it store a state value */
>  #define __state __count
>  
> +#ifdef _MB_EXTENDED_CHARSETS_ISO
>  int
> -_DEFUN (_mbtowc_r, (r, pwc, s, n, state),
> -        struct _reent *r   _AND
> -        wchar_t       *pwc _AND 
> -        const char    *s   _AND        
> -        size_t         n   _AND
> +_DEFUN (__iso_mbtowc, (r, pwc, s, n, charset, state),
> +        struct _reent *r       _AND
> +        wchar_t       *pwc     _AND 
> +        const char    *s       _AND        
> +        size_t         n       _AND
> +	const char    *charset _AND
>          mbstate_t      *state)
>  {
>    wchar_t dummy;
> @@ -62,190 +66,384 @@ _DEFUN (_mbtowc_r, (r, pwc, s, n, state)
>    if (pwc == NULL)
>      pwc = &dummy;
>  
> -  if (s != NULL && n == 0)
> +  if (s == NULL)
> +    return 0;
> +
> +  if (n == 0)
>      return -2;
>  
> -#ifdef _MB_CAPABLE
> -  if (strlen (__locale_charset ()) <= 1)
> -    { /* fall-through */ }
> -  else if (!strcmp (__locale_charset (), "UTF-8"))
> -    {
> -      int ch;
> -      int i = 0;
> -
> -      if (s == NULL)
> -        return 0; /* UTF-8 character encodings are not state-dependent */
> -
> -      if (state->__count == 4)
> -	{
> -	  /* Create the second half of the surrogate pair.  For a description
> -	     see the comment below. */
> -	  wint_t tmp = (wchar_t)((state->__value.__wchb[0] & 0x07) << 18)
> -	    |   (wchar_t)((state->__value.__wchb[1] & 0x3f) << 12)
> -	    |   (wchar_t)((state->__value.__wchb[2] & 0x3f) << 6)
> -	    |   (wchar_t)(state->__value.__wchb[3] & 0x3f);
> -	  state->__count = 0;
> -	  *pwc = 0xdc00 | ((tmp - 0x10000) & 0x3ff);
> -	  return 2;
> -	}
> -      if (state->__count == 0)
> -	ch = t[i++];
> -      else
> +  if (*t >= 0xa0)
> +    {
> +      int iso_idx = __iso_8859_index (charset + 9);
> +      if (iso_idx >= 0)
>  	{
> -	  if (n < (size_t)-1)
> -	    ++n;
> -	  ch = state->__value.__wchb[0];
> +	  *pwc = __iso_8859_conv[iso_idx][*t - 0xa0];
> +	  if (*pwc == 0) /* Invalid character */
> +	    {
> +	      r->_errno = EILSEQ;
> +	      return -1;
> +	    }
> +	  return 1;
>  	}
> +    }
> +
> +  *pwc = (wchar_t) *t;
> +  
> +  if (*t == '\0')
> +    return 0;
> +
> +  return 1;
> +}
> +#endif /* _MB_EXTENDED_CHARSETS_ISO */
> +
> +#ifdef _MB_EXTENDED_CHARSETS_DOS
> +int
> +_DEFUN (__cp_mbtowc, (r, pwc, s, n, charset, state),
> +        struct _reent *r       _AND
> +        wchar_t       *pwc     _AND 
> +        const char    *s       _AND        
> +        size_t         n       _AND
> +	const char    *charset _AND
> +        mbstate_t      *state)
> +{
> +  wchar_t dummy;
> +  unsigned char *t = (unsigned char *)s;
> +
> +  if (pwc == NULL)
> +    pwc = &dummy;
> +
> +  if (s == NULL)
> +    return 0;
> +
> +  if (n == 0)
> +    return -2;
>  
> -      if (ch == '\0')
> +  if (*t >= 0x80)
> +    {
> +      int cp_idx = __cp_index (charset + 2);
> +      if (cp_idx >= 0)
>  	{
> -	  *pwc = 0;
> -	  state->__count = 0;
> -	  return 0; /* s points to the null character */
> +	  *pwc = __cp_conv[cp_idx][*t - 0x80];
> +	  if (*pwc == 0) /* Invalid character */
> +	    {
> +	      r->_errno = EILSEQ;
> +	      return -1;
> +	    }
> +	  return 1;
>  	}
> +    }
> +
> +  *pwc = (wchar_t)*t;
> +  
> +  if (*t == '\0')
> +    return 0;
> +
> +  return 1;
> +}
> +#endif /* _MB_EXTENDED_CHARSETS_DOS */
> +
> +int
> +_DEFUN (__utf8_mbtowc, (r, pwc, s, n, charset, state),
> +        struct _reent *r       _AND
> +        wchar_t       *pwc     _AND 
> +        const char    *s       _AND        
> +        size_t         n       _AND
> +	const char    *charset _AND
> +        mbstate_t      *state)
> +{
> +  wchar_t dummy;
> +  unsigned char *t = (unsigned char *)s;
> +  int ch;
> +  int i = 0;
> +
> +  if (pwc == NULL)
> +    pwc = &dummy;
> +
> +  if (s == NULL)
> +    return 0;
> +
> +  if (n == 0)
> +    return -2;
> +
> +  if (state->__count == 4)
> +    {
> +      /* Create the second half of the surrogate pair.  For a description
> +	 see the comment below. */
> +      wint_t tmp = (wchar_t)((state->__value.__wchb[0] & 0x07) << 18)
> +	|   (wchar_t)((state->__value.__wchb[1] & 0x3f) << 12)
> +	|   (wchar_t)((state->__value.__wchb[2] & 0x3f) << 6)
> +	|   (wchar_t)(state->__value.__wchb[3] & 0x3f);
> +      state->__count = 0;
> +      *pwc = 0xdc00 | ((tmp - 0x10000) & 0x3ff);
> +      return 2;
> +    }
> +  if (state->__count == 0)
> +    ch = t[i++];
> +  else
> +    {
> +      if (n < (size_t)-1)
> +	++n;
> +      ch = state->__value.__wchb[0];
> +    }
> +
> +  if (ch == '\0')
> +    {
> +      *pwc = 0;
> +      state->__count = 0;
> +      return 0; /* s points to the null character */
> +    }
>  
> -      if (ch >= 0x0 && ch <= 0x7f)
> +  if (ch >= 0x0 && ch <= 0x7f)
> +    {
> +      /* single-byte sequence */
> +      state->__count = 0;
> +      *pwc = ch;
> +      return 1;
> +    }
> +  if (ch >= 0xc0 && ch <= 0xdf)
> +    {
> +      /* two-byte sequence */
> +      state->__value.__wchb[0] = ch;
> +      state->__count = 1;
> +      if (n < 2)
> +	return -2;
> +      ch = t[i++];
> +      if (ch < 0x80 || ch > 0xbf)
>  	{
> -	  /* single-byte sequence */
> -	  state->__count = 0;
> -	  *pwc = ch;
> -	  return 1;
> +	  r->_errno = EILSEQ;
> +	  return -1;
> +	}
> +      if (state->__value.__wchb[0] < 0xc2)
> +	{
> +	  /* overlong UTF-8 sequence */
> +	  r->_errno = EILSEQ;
> +	  return -1;
> +	}
> +      state->__count = 0;
> +      *pwc = (wchar_t)((state->__value.__wchb[0] & 0x1f) << 6)
> +	|    (wchar_t)(ch & 0x3f);
> +      return i;
> +    }
> +  if (ch >= 0xe0 && ch <= 0xef)
> +    {
> +      /* three-byte sequence */
> +      wchar_t tmp;
> +      state->__value.__wchb[0] = ch;
> +      if (state->__count == 0)
> +	state->__count = 1;
> +      else if (n < (size_t)-1)
> +	++n;
> +      if (n < 2)
> +	return -2;
> +      ch = (state->__count == 1) ? t[i++] : state->__value.__wchb[1];
> +      if (state->__value.__wchb[0] == 0xe0 && ch < 0xa0)
> +	{
> +	  /* overlong UTF-8 sequence */
> +	  r->_errno = EILSEQ;
> +	  return -1;
> +	}
> +      if (ch < 0x80 || ch > 0xbf)
> +	{
> +	  r->_errno = EILSEQ;
> +	  return -1;
> +	}
> +      state->__value.__wchb[1] = ch;
> +      state->__count = 2;
> +      if (n < 3)
> +	return -2;
> +      ch = t[i++];
> +      if (ch < 0x80 || ch > 0xbf)
> +	{
> +	  r->_errno = EILSEQ;
> +	  return -1;
> +	}
> +      state->__count = 0;
> +      tmp = (wchar_t)((state->__value.__wchb[0] & 0x0f) << 12)
> +	|    (wchar_t)((state->__value.__wchb[1] & 0x3f) << 6)
> +	|     (wchar_t)(ch & 0x3f);
> +    
> +      if (tmp >= 0xd800 && tmp <= 0xdfff)
> +	{
> +	  r->_errno = EILSEQ;
> +	  return -1;
> +	}
> +      *pwc = tmp;
> +      return i;
> +    }
> +  if (ch >= 0xf0 && ch <= 0xf7)
> +    {
> +      /* four-byte sequence */
> +      wint_t tmp;
> +      state->__value.__wchb[0] = ch;
> +      if (state->__count == 0)
> +	state->__count = 1;
> +      else if (n < (size_t)-1)
> +	++n;
> +      if (n < 2)
> +	return -2;
> +      ch = (state->__count == 1) ? t[i++] : state->__value.__wchb[1];
> +      if (state->__value.__wchb[0] == 0xf0 && ch < 0x90)
> +	{
> +	  /* overlong UTF-8 sequence */
> +	  r->_errno = EILSEQ;
> +	  return -1;
> +	}
> +      if (ch < 0x80 || ch > 0xbf)
> +	{
> +	  r->_errno = EILSEQ;
> +	  return -1;
>  	}
> -      else if (ch >= 0xc0 && ch <= 0xdf)
> +      state->__value.__wchb[1] = ch;
> +      if (state->__count == 1)
> +	state->__count = 2;
> +      else if (n < (size_t)-1)
> +	++n;
> +      if (n < 3)
> +	return -2;
> +      ch = (state->__count == 2) ? t[i++] : state->__value.__wchb[2];
> +      if (ch < 0x80 || ch > 0xbf)
> +	{
> +	  r->_errno = EILSEQ;
> +	  return -1;
> +	}
> +      state->__value.__wchb[2] = ch;
> +      state->__count = 3;
> +      if (n < 4)
> +	return -2;
> +      ch = t[i++];
> +      if (ch < 0x80 || ch > 0xbf)
> +	{
> +	  r->_errno = EILSEQ;
> +	  return -1;
> +	}
> +      tmp = (wint_t)((state->__value.__wchb[0] & 0x07) << 18)
> +	|   (wint_t)((state->__value.__wchb[1] & 0x3f) << 12)
> +	|   (wint_t)((state->__value.__wchb[2] & 0x3f) << 6)
> +	|   (wint_t)(ch & 0x3f);
> +      if (tmp > 0xffff && sizeof(wchar_t) == 2)
> +	{
> +	  /* On systems where wchar_t holds UTF-16 values, this code point
> +	     doesn't fit into a single wchar_t.  So we store the state
> +	     with a special value of __count and return the first half of
> +	     a surrogate pair.  As the return value we report half of the
> +	     bytes of the actual UTF-8 char (2 of 4).  The second half is
> +	     returned when the special __count value is recognized by the
> +	     check above. */
> +	  state->__value.__wchb[3] = ch;
> +	  state->__count = 4;
> +	  *pwc = 0xd800 | (((tmp - 0x10000) >> 10) & 0x3ff);
> +	  return 2;
> +	}
> +      *pwc = tmp;
> +      state->__count = 0;
> +      return i;
> +    }
> +
> +  r->_errno = EILSEQ;
> +  return -1;
> +}
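
The surrogate arithmetic in the sizeof(wchar_t) == 2 branch can be looked at in isolation. The following is a minimal sketch, not part of the patch: the helper names utf16_high_surrogate and utf16_low_surrogate are made up for illustration. The high half uses the same expression as the quoted code; the low half is the standard UTF-16 formula and is only assumed to match what the second call returns, since that code is not in this hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only, not part of the patch. */

/* First (high) half of the surrogate pair for a code point above 0xffff.
   Same expression as the quoted "*pwc = 0xd800 | ..." line. */
uint16_t
utf16_high_surrogate (uint32_t cp)
{
  assert (cp >= 0x10000 && cp <= 0x10ffff);
  return (uint16_t) (0xd800 | (((cp - 0x10000) >> 10) & 0x3ff));
}

/* Second (low) half: standard UTF-16, assumed to be what the follow-up
   call produces from the saved state. */
uint16_t
utf16_low_surrogate (uint32_t cp)
{
  assert (cp >= 0x10000 && cp <= 0x10ffff);
  return (uint16_t) (0xdc00 | ((cp - 0x10000) & 0x3ff));
}
```

For U+1F600 (UTF-8 bytes f0 9f 98 80) this gives the pair 0xd83d, 0xde00, so a caller on a UTF-16 system would see two successful conversions for one four-byte sequence.
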
> +
> +/* Cygwin defines its own doublebyte charset conversion functions 
> +   because the underlying OS requires wchar_t == UTF-16. */
> +#ifndef  __CYGWIN__
> +int
> +_DEFUN (__sjis_mbtowc, (r, pwc, s, n, charset, state),
> +        struct _reent *r       _AND
> +        wchar_t       *pwc     _AND 
> +        const char    *s       _AND        
> +        size_t         n       _AND
> +	const char    *charset _AND
> +        mbstate_t      *state)
> +{
> +  wchar_t dummy;
> +  unsigned char *t = (unsigned char *)s;
> +  int ch;
> +  int i = 0;
> +
> +  if (pwc == NULL)
> +    pwc = &dummy;
> +
> +  if (s == NULL)
> +    return 0;  /* not state-dependent */
> +
> +  if (n == 0)
> +    return -2;
> +
> +  ch = t[i++];
> +  if (state->__count == 0)
> +    {
> +      if (_issjis1 (ch))
>  	{
> -	  /* two-byte sequence */
>  	  state->__value.__wchb[0] = ch;
>  	  state->__count = 1;
> -	  if (n < 2)
> +	  if (n <= 1)
>  	    return -2;
>  	  ch = t[i++];
> -	  if (ch < 0x80 || ch > 0xbf)
> -	    {
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> -	  if (state->__value.__wchb[0] < 0xc2)
> -	    {
> -	      /* overlong UTF-8 sequence */
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> -	  state->__count = 0;
> -	  *pwc = (wchar_t)((state->__value.__wchb[0] & 0x1f) << 6)
> -	    |    (wchar_t)(ch & 0x3f);
> -	  return i;
>  	}
> -      else if (ch >= 0xe0 && ch <= 0xef)
> +    }
> +  if (state->__count == 1)
> +    {
> +      if (_issjis2 (ch))
>  	{
> -	  /* three-byte sequence */
> -	  wchar_t tmp;
> -	  state->__value.__wchb[0] = ch;
> -	  if (state->__count == 0)
> -	    state->__count = 1;
> -	  else if (n < (size_t)-1)
> -	    ++n;
> -	  if (n < 2)
> -	    return -2;
> -	  ch = (state->__count == 1) ? t[i++] : state->__value.__wchb[1];
> -	  if (state->__value.__wchb[0] == 0xe0 && ch < 0xa0)
> -	    {
> -	      /* overlong UTF-8 sequence */
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> -	  if (ch < 0x80 || ch > 0xbf)
> -	    {
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> -	  state->__value.__wchb[1] = ch;
> -	  state->__count = 2;
> -	  if (n < 3)
> -	    return -2;
> -	  ch = t[i++];
> -	  if (ch < 0x80 || ch > 0xbf)
> -	    {
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> +	  *pwc = (((wchar_t)state->__value.__wchb[0]) << 8) + (wchar_t)ch;
>  	  state->__count = 0;
> -	  tmp = (wchar_t)((state->__value.__wchb[0] & 0x0f) << 12)
> -	    |    (wchar_t)((state->__value.__wchb[1] & 0x3f) << 6)
> -	    |     (wchar_t)(ch & 0x3f);
> -	
> -	  if (tmp >= 0xd800 && tmp <= 0xdfff)
> -	    {
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> -	  *pwc = tmp;
>  	  return i;
>  	}
> -      else if (ch >= 0xf0 && ch <= 0xf7)
> +      else  
> +	{
> +	  r->_errno = EILSEQ;
> +	  return -1;
> +	}
> +    }
> +
> +  *pwc = (wchar_t)*t;
> +  
> +  if (*t == '\0')
> +    return 0;
> +
> +  return 1;
> +}
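
The restart behaviour of the double-byte converters (save the lead byte, return -2, finish on the next call) is easier to see in a stripped-down form. This is a sketch under stated assumptions, not the patch's code: struct db_state, db_decode, is_lead and is_trail are hypothetical names, and the byte ranges are the usual Shift-JIS ones rather than the actual _issjis1/_issjis2 definitions, which are not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for _issjis1/_issjis2.  The ranges are the usual
   Shift-JIS lead/trail ranges and are an assumption, not taken from the
   patch. */
static int
is_lead (int c)
{
  return (c >= 0x81 && c <= 0x9f) || (c >= 0xe0 && c <= 0xef);
}

static int
is_trail (int c)
{
  return (c >= 0x40 && c <= 0x7e) || (c >= 0x80 && c <= 0xfc);
}

/* Hypothetical, simplified replacement for the mbstate_t fields used. */
struct db_state { int count; unsigned char lead; };

/* Returns the number of bytes consumed, 0 for NUL, -2 if more input is
   needed, -1 on an invalid trail byte. */
int
db_decode (struct db_state *st, const unsigned char *s, size_t n,
           unsigned *out)
{
  size_t i = 0;
  int ch;

  assert (st != NULL && s != NULL && out != NULL);
  if (n == 0)
    return -2;
  ch = s[i++];
  if (st->count == 0 && is_lead (ch))
    {
      st->lead = (unsigned char) ch;
      st->count = 1;
      if (n <= 1)
        return -2;              /* lead byte saved, wait for the trail byte */
      ch = s[i++];
    }
  if (st->count == 1)
    {
      if (!is_trail (ch))
        return -1;
      /* Same packing as the quoted code: lead byte in the high 8 bits. */
      *out = ((unsigned) st->lead << 8) + (unsigned) ch;
      st->count = 0;
      return (int) i;
    }
  *out = (unsigned) s[0];       /* plain single-byte character */
  return s[0] == '\0' ? 0 : 1;
}
```

Feeding the bytes 0x82 0xa0 one at a time yields -2 and then 1, with the packed value 0x82a0. Note that this value is the raw double-byte code, not a Unicode code point.
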
> +
> +int
> +_DEFUN (__eucjp_mbtowc, (r, pwc, s, n, charset, state),
> +        struct _reent *r       _AND
> +        wchar_t       *pwc     _AND 
> +        const char    *s       _AND        
> +        size_t         n       _AND
> +	const char    *charset _AND
> +        mbstate_t      *state)
> +{
> +  wchar_t dummy;
> +  unsigned char *t = (unsigned char *)s;
> +  int ch;
> +  int i = 0;
> +
> +  if (pwc == NULL)
> +    pwc = &dummy;
> +
> +  if (s == NULL)
> +    return 0;
> +
> +  if (n == 0)
> +    return -2;
> +
> +  ch = t[i++];
> +  if (state->__count == 0)
> +    {
> +      if (_iseucjp (ch))
>  	{
> -	  /* four-byte sequence */
> -	  wint_t tmp;
>  	  state->__value.__wchb[0] = ch;
> -	  if (state->__count == 0)
> -	    state->__count = 1;
> -	  else if (n < (size_t)-1)
> -	    ++n;
> -	  if (n < 2)
> -	    return -2;
> -	  ch = (state->__count == 1) ? t[i++] : state->__value.__wchb[1];
> -	  if (state->__value.__wchb[0] == 0xf0 && ch < 0x90)
> -	    {
> -	      /* overlong UTF-8 sequence */
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> -	  if (ch < 0x80 || ch > 0xbf)
> -	    {
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> -	  state->__value.__wchb[1] = ch;
> -	  if (state->__count == 1)
> -	    state->__count = 2;
> -	  else if (n < (size_t)-1)
> -	    ++n;
> -	  if (n < 3)
> -	    return -2;
> -	  ch = (state->__count == 2) ? t[i++] : state->__value.__wchb[2];
> -	  if (ch < 0x80 || ch > 0xbf)
> -	    {
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> -	  state->__value.__wchb[2] = ch;
> -	  state->__count = 3;
> -	  if (n < 4)
> +	  state->__count = 1;
> +	  if (n <= 1)
>  	    return -2;
>  	  ch = t[i++];
> -	  if (ch < 0x80 || ch > 0xbf)
> -	    {
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> -	  tmp = (wint_t)((state->__value.__wchb[0] & 0x07) << 18)
> -	    |   (wint_t)((state->__value.__wchb[1] & 0x3f) << 12)
> -	    |   (wint_t)((state->__value.__wchb[2] & 0x3f) << 6)
> -	    |   (wint_t)(ch & 0x3f);
> -	  if (tmp > 0xffff && sizeof(wchar_t) == 2)
> -	    {
> -	      /* On systems which have wchar_t being UTF-16 values, the value
> -		 doesn't fit into a single wchar_t in this case.  So what we
> -		 do here is to store the state with a special value of __count
> -		 and return the first half of a surrogate pair.  As return
> -		 value we choose to return the half of the actual UTF-8 char.
> -		 The second half is returned in case we recognize the special
> -		 __count value above. */
> -	      state->__value.__wchb[3] = ch;
> -	      state->__count = 4;
> -	      *pwc = 0xd800 | (((tmp - 0x10000) >> 10) & 0x3ff);
> -	      return 2;
> -	    }
> -	  *pwc = tmp;
> +	}
> +    }
> +  if (state->__count == 1)
> +    {
> +      if (_iseucjp (ch))
> +	{
> +	  *pwc = (((wchar_t)state->__value.__wchb[0]) << 8) + (wchar_t)ch;
>  	  state->__count = 0;
>  	  return i;
>  	}
> @@ -254,165 +452,141 @@ _DEFUN (_mbtowc_r, (r, pwc, s, n, state)
>  	  r->_errno = EILSEQ;
>  	  return -1;
>  	}
> -    }      
> -  else if (!strcmp (__locale_charset (), "SJIS"))
> +    }
> +
> +  *pwc = (wchar_t)*t;
> +  
> +  if (*t == '\0')
> +    return 0;
> +
> +  return 1;
> +}
> +
> +int
> +_DEFUN (__jis_mbtowc, (r, pwc, s, n, charset, state),
> +        struct _reent *r       _AND
> +        wchar_t       *pwc     _AND 
> +        const char    *s       _AND        
> +        size_t         n       _AND
> +	const char    *charset _AND
> +        mbstate_t      *state)
> +{
> +  wchar_t dummy;
> +  unsigned char *t = (unsigned char *)s;
> +  JIS_STATE curr_state;
> +  JIS_ACTION action;
> +  JIS_CHAR_TYPE ch;
> +  unsigned char *ptr;
> +  unsigned int i;
> +  int curr_ch;
> +
> +  if (pwc == NULL)
> +    pwc = &dummy;
> +
> +  if (s == NULL)
>      {
> -      int ch;
> -      int i = 0;
> -      if (s == NULL)
> -        return 0;  /* not state-dependent */
> -      ch = t[i++];
> -      if (state->__count == 0)
> -	{
> -	  if (_issjis1 (ch))
> -	    {
> -	      state->__value.__wchb[0] = ch;
> -	      state->__count = 1;
> -	      if (n <= 1)
> -		return -2;
> -	      ch = t[i++];
> -	    }
> -	}
> -      if (state->__count == 1)
> -	{
> -	  if (_issjis2 (ch))
> -	    {
> -	      *pwc = (((wchar_t)state->__value.__wchb[0]) << 8) + (wchar_t)ch;
> -	      state->__count = 0;
> -	      return i;
> -	    }
> -	  else  
> -	    {
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> -	}
> +      state->__state = ASCII;
> +      return 1;  /* state-dependent */
>      }
> -  else if (!strcmp (__locale_charset (), "EUCJP"))
> +
> +  if (n == 0)
> +    return -2;
> +
> +  curr_state = state->__state;
> +  ptr = t;
> +
> +  for (i = 0; i < n; ++i)
>      {
> -      int ch;
> -      int i = 0;
> -      if (s == NULL)
> -        return 0;  /* not state-dependent */
> -      ch = t[i++];
> -      if (state->__count == 0)
> +      curr_ch = t[i];
> +      switch (curr_ch)
>  	{
> -	  if (_iseucjp (ch))
> -	    {
> -	      state->__value.__wchb[0] = ch;
> -	      state->__count = 1;
> -	      if (n <= 1)
> -		return -2;
> -	      ch = t[i++];
> -	    }
> -	}
> -      if (state->__count == 1)
> -	{
> -	  if (_iseucjp (ch))
> -	    {
> -	      *pwc = (((wchar_t)state->__value.__wchb[0]) << 8) + (wchar_t)ch;
> -	      state->__count = 0;
> -	      return i;
> -	    }
> +	case ESC_CHAR:
> +	  ch = ESCAPE;
> +	  break;
> +	case '$':
> +	  ch = DOLLAR;
> +	  break;
> +	case '@':
> +	  ch = AT;
> +	  break;
> +	case '(':
> +	  ch = BRACKET;
> +	  break;
> +	case 'B':
> +	  ch = B;
> +	  break;
> +	case 'J':
> +	  ch = J;
> +	  break;
> +	case '\0':
> +	  ch = NUL;
> +	  break;
> +	default:
> +	  if (_isjis (curr_ch))
> +	    ch = JIS_CHAR;
>  	  else
> -	    {
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> +	    ch = OTHER;
> +	}
> +
> +      action = JIS_action_table[curr_state][ch];
> +      curr_state = JIS_state_table[curr_state][ch];
> +    
> +      switch (action)
> +	{
> +	case NOOP:
> +	  break;
> +	case EMPTY:
> +	  state->__state = ASCII;
> +	  *pwc = (wchar_t)0;
> +	  return 0;
> +	case COPY_A:
> +	  state->__state = ASCII;
> +	  *pwc = (wchar_t)*ptr;
> +	  return (i + 1);
> +	case COPY_J1:
> +	  state->__value.__wchb[0] = t[i];
> +	  break;
> +	case COPY_J2:
> +	  state->__state = JIS;
> +	  *pwc = (((wchar_t)state->__value.__wchb[0]) << 8) + (wchar_t)(t[i]);
> +	  return (i + 1);
> +	case MAKE_A:
> +	  ptr = (unsigned char *)(t + i + 1);
> +	  break;
> +	case ERROR:
> +	default:
> +	  r->_errno = EILSEQ;
> +	  return -1;
>  	}
> +
>      }
> -  else if (!strcmp (__locale_charset (), "JIS"))
> -    {
> -      JIS_STATE curr_state;
> -      JIS_ACTION action;
> -      JIS_CHAR_TYPE ch;
> -      unsigned char *ptr;
> -      unsigned int i;
> -      int curr_ch;
> - 
> -      if (s == NULL)
> -        {
> -          state->__state = ASCII;
> -          return 1;  /* state-dependent */
> -        }
> -
> -      curr_state = state->__state;
> -      ptr = t;
> -
> -      for (i = 0; i < n; ++i)
> -        {
> -          curr_ch = t[i];
> -          switch (curr_ch)
> -            {
> -	    case ESC_CHAR:
> -              ch = ESCAPE;
> -              break;
> -	    case '$':
> -              ch = DOLLAR;
> -              break;
> -            case '@':
> -              ch = AT;
> -              break;
> -            case '(':
> -	      ch = BRACKET;
> -              break;
> -            case 'B':
> -              ch = B;
> -              break;
> -            case 'J':
> -              ch = J;
> -              break;
> -            case '\0':
> -              ch = NUL;
> -              break;
> -            default:
> -              if (_isjis (curr_ch))
> -                ch = JIS_CHAR;
> -              else
> -                ch = OTHER;
> -	    }
>  
> -          action = JIS_action_table[curr_state][ch];
> -          curr_state = JIS_state_table[curr_state][ch];
> -        
> -          switch (action)
> -            {
> -            case NOOP:
> -              break;
> -            case EMPTY:
> -              state->__state = ASCII;
> -              *pwc = (wchar_t)0;
> -              return 0;
> -            case COPY_A:
> -	      state->__state = ASCII;
> -              *pwc = (wchar_t)*ptr;
> -              return (i + 1);
> -            case COPY_J1:
> -              state->__value.__wchb[0] = t[i];
> -	      break;
> -            case COPY_J2:
> -              state->__state = JIS;
> -              *pwc = (((wchar_t)state->__value.__wchb[0]) << 8) + (wchar_t)(t[i]);
> -              return (i + 1);
> -            case MAKE_A:
> -              ptr = (unsigned char *)(t + i + 1);
> -              break;
> -            case ERROR:
> -            default:
> -	      r->_errno = EILSEQ;
> -              return -1;
> -            }
> +  state->__state = curr_state;
> +  return -2;  /* n < bytes needed */
> +}
> +#endif /* !__CYGWIN__ */
> +#endif /* _MB_CAPABLE */
>  
> -        }
> +int
> +_DEFUN (__ascii_mbtowc, (r, pwc, s, n, charset, state),
> +        struct _reent *r       _AND
> +        wchar_t       *pwc     _AND 
> +        const char    *s       _AND        
> +        size_t         n       _AND
> +	const char    *charset _AND
> +        mbstate_t      *state)
> +{
> +  wchar_t dummy;
> +  unsigned char *t = (unsigned char *)s;
>  
> -      state->__state = curr_state;
> -      return -2;  /* n < bytes needed */
> -    }
> -#endif /* _MB_CAPABLE */               
> +  if (pwc == NULL)
> +    pwc = &dummy;
>  
> -  /* otherwise this must be the "C" locale or unknown locale */
>    if (s == NULL)
> -    return 0;  /* not state-dependent */
> +    return 0;
> +
> +  if (n == 0)
> +    return -2;
>  
>    *pwc = (wchar_t)*t;
>    
> @@ -421,3 +595,14 @@ _DEFUN (_mbtowc_r, (r, pwc, s, n, state)
>  
>    return 1;
>  }
> +
> +int
> +_DEFUN (_mbtowc_r, (r, pwc, s, n, state),
> +        struct _reent *r   _AND
> +        wchar_t       *pwc _AND 
> +        const char    *s   _AND        
> +        size_t         n   _AND
> +        mbstate_t      *state)
> +{
> +  return __mbtowc (r, pwc, s, n, __locale_charset (), state);
> +}
> Index: libc/stdlib/sb_charsets.c
> ===================================================================
> RCS file: libc/stdlib/sb_charsets.c
> diff -N libc/stdlib/sb_charsets.c
> --- /dev/null	1 Jan 1970 00:00:00 -0000
> +++ libc/stdlib/sb_charsets.c	22 Mar 2009 16:25:07 -0000
> @@ -0,0 +1,697 @@
> +#include <newlib.h>
> +#include <wchar.h>
> +
> +#ifdef _MB_CAPABLE
> +extern char *__locale_charset ();
> +
> +#ifdef _MB_EXTENDED_CHARSETS_ISO
> +/* Tables for the ISO-8859-x to UTF conversion.  The first index into the
> +   table is a value computed from the value x (function __iso_8859_index),
> +   the second index is the value of the incoming character - 0xa0.
> +   Values < 0xa0 don't have to be converted anyway. */
> +wchar_t __iso_8859_conv[14][0x60] = {
> +  /* ISO-8859-2 */
> +  { 0xa0, 0x104, 0x2d8, 0x141, 0xa4, 0x13d, 0x15a, 0xa7,
> +    0xa8, 0x160, 0x15e, 0x164, 0x179, 0xad, 0x17d, 0x17b,
> +    0xb0, 0x105, 0x2db, 0x142, 0xb4, 0x13e, 0x15b, 0x2c7,
> +    0xb8, 0x161, 0x15f, 0x165, 0x17a, 0x2dd, 0x17e, 0x17c,
> +    0x154, 0xc1, 0xc2, 0x102, 0xc4, 0x139, 0x106, 0xc7,
> +    0x10c, 0xc9, 0x118, 0xcb, 0x11a, 0xcd, 0xce, 0x10e,
> +    0x110, 0x143, 0x147, 0xd3, 0xd4, 0x150, 0xd6, 0xd7,
> +    0x158, 0x16e, 0xda, 0x170, 0xdc, 0xdd, 0x162, 0xdf,
> +    0x155, 0xe1, 0xe2, 0x103, 0xe4, 0x13a, 0x107, 0xe7,
> +    0x10d, 0xe9, 0x119, 0xeb, 0x11b, 0xed, 0xee, 0x10f,
> +    0x111, 0x144, 0x148, 0xf3, 0xf4, 0x151, 0xf6, 0xf7,
> +    0x159, 0x16f, 0xfa, 0x171, 0xfc, 0xfd, 0x163, 0x2d9 },
> +  /* ISO-8859-3 */
> +  { 0xa0, 0x126, 0x2d8, 0xa3, 0xa4, 0x0, 0x124, 0xa7,
> +    0xa8, 0x130, 0x15e, 0x11e, 0x134, 0xad, 0x0, 0x17b,
> +    0xb0, 0x127, 0xb2, 0xb3, 0xb4, 0xb5, 0x125, 0xb7,
> +    0xb8, 0x131, 0x15f, 0x11f, 0x135, 0xbd, 0x0, 0x17c,
> +    0xc0, 0xc1, 0xc2, 0x0, 0xc4, 0x10a, 0x108, 0xc7,
> +    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
> +    0x0, 0xd1, 0xd2, 0xd3, 0xd4, 0x120, 0xd6, 0xd7,
> +    0x11c, 0xd9, 0xda, 0xdb, 0xdc, 0x16c, 0x15c, 0xdf,
> +    0xe0, 0xe1, 0xe2, 0x0, 0xe4, 0x10b, 0x109, 0xe7,
> +    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
> +    0x0, 0xf1, 0xf2, 0xf3, 0xf4, 0x121, 0xf6, 0xf7,
> +    0x11d, 0xf9, 0xfa, 0xfb, 0xfc, 0x16d, 0x15d, 0x2d9 },
> +  /* ISO-8859-4 */
> +  { 0xa0, 0x104, 0x138, 0x156, 0xa4, 0x128, 0x13b, 0xa7,
> +    0xa8, 0x160, 0x112, 0x122, 0x166, 0xad, 0x17d, 0xaf,
> +    0xb0, 0x105, 0x2db, 0x157, 0xb4, 0x129, 0x13c, 0x2c7,
> +    0xb8, 0x161, 0x113, 0x123, 0x167, 0x14a, 0x17e, 0x14b,
> +    0x100, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0x12e,
> +    0x10c, 0xc9, 0x118, 0xcb, 0x116, 0xcd, 0xce, 0x12a,
> +    0x110, 0x145, 0x14c, 0x136, 0xd4, 0xd5, 0xd6, 0xd7,
> +    0xd8, 0x172, 0xda, 0xdb, 0xdc, 0x168, 0x16a, 0xdf,
> +    0x101, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0x12f,
> +    0x10d, 0xe9, 0x119, 0xeb, 0x117, 0xed, 0xee, 0x12b,
> +    0x111, 0x146, 0x14d, 0x137, 0xf4, 0xf5, 0xf6, 0xf7,
> +    0xf8, 0x173, 0xfa, 0xfb, 0xfc, 0x169, 0x16b, 0x2d9 },
> +  /* ISO-8859-5 */
> +  { 0xa0, 0x401, 0x402, 0x403, 0x404, 0x405, 0x406, 0x407,
> +    0x408, 0x409, 0x40a, 0x40b, 0x40c, 0xad, 0x40e, 0x40f,
> +    0x410, 0x411, 0x412, 0x413, 0x414, 0x415, 0x416, 0x417,
> +    0x418, 0x419, 0x41a, 0x41b, 0x41c, 0x41d, 0x41e, 0x41f,
> +    0x420, 0x421, 0x422, 0x423, 0x424, 0x425, 0x426, 0x427,
> +    0x428, 0x429, 0x42a, 0x42b, 0x42c, 0x42d, 0x42e, 0x42f,
> +    0x430, 0x431, 0x432, 0x433, 0x434, 0x435, 0x436, 0x437,
> +    0x438, 0x439, 0x43a, 0x43b, 0x43c, 0x43d, 0x43e, 0x43f,
> +    0x440, 0x441, 0x442, 0x443, 0x444, 0x445, 0x446, 0x447,
> +    0x448, 0x449, 0x44a, 0x44b, 0x44c, 0x44d, 0x44e, 0x44f,
> +    0x2116, 0x451, 0x452, 0x453, 0x454, 0x455, 0x456, 0x457,
> +    0x458, 0x459, 0x45a, 0x45b, 0x45c, 0xa7, 0x45e, 0x45f },
> +  /* ISO-8859-6 */
> +  { 0xa0, 0x0, 0x0, 0x0, 0xa4, 0x0, 0x0, 0x0,
> +    0x0, 0x0, 0x0, 0x0, 0x60c, 0xad, 0x0, 0x0,
> +    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> +    0x0, 0x0, 0x0, 0x61b, 0x0, 0x0, 0x0, 0x61f,
> +    0x0, 0x621, 0x622, 0x623, 0x624, 0x625, 0x626, 0x627,
> +    0x628, 0x629, 0x62a, 0x62b, 0x62c, 0x62d, 0x62e, 0x62f,
> +    0x630, 0x631, 0x632, 0x633, 0x634, 0x635, 0x636, 0x637,
> +    0x638, 0x639, 0x63a, 0x0, 0x0, 0x0, 0x0, 0x0,
> +    0x640, 0x641, 0x642, 0x643, 0x644, 0x645, 0x646, 0x647,
> +    0x648, 0x649, 0x64a, 0x64b, 0x64c, 0x64d, 0x64e, 0x64f,
> +    0x650, 0x651, 0x652, 0x64b, 0xf4, 0xf5, 0xf6, 0xf7,
> +    0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff },
> +  /* ISO-8859-7 */
> +  { 0xa0, 0x2018, 0x2019, 0xa3, 0x20ac, 0x20af, 0xa6, 0xa7,
> +    0xa8, 0xa9, 0x37a, 0xab, 0xac, 0xad, 0x0, 0x2015,
> +    0xb0, 0xb1, 0xb2, 0xb3, 0x384, 0x385, 0x386, 0xb7,
> +    0x388, 0x389, 0x38a, 0xbb, 0x38c, 0xbd, 0x38e, 0x38f,
> +    0x390, 0x391, 0x392, 0x393, 0x394, 0x395, 0x396, 0x397,
> +    0x398, 0x399, 0x39a, 0x39b, 0x39c, 0x39d, 0x39e, 0x39f,
> +    0x3a0, 0x3a1, 0x0, 0x3a3, 0x3a4, 0x3a5, 0x3a6, 0x3a7,
> +    0x3a8, 0x3a9, 0x3aa, 0x3ab, 0x3ac, 0x3ad, 0x3ae, 0x3af,
> +    0x3b0, 0x3b1, 0x3b2, 0x3b3, 0x3b4, 0x3b5, 0x3b6, 0x3b7,
> +    0x3b8, 0x3b9, 0x3ba, 0x3bb, 0x3bc, 0x3bd, 0x3be, 0x3bf,
> +    0x3c0, 0x3c1, 0x3c2, 0x3c3, 0x3c4, 0x3c5, 0x3c6, 0x3c7,
> +    0x3c8, 0x3c9, 0x3ca, 0x3cb, 0x3cc, 0x3cd, 0x3ce, 0xff },
> +  /* ISO-8859-8 */
> +  { 0xa0, 0x0, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
> +    0xa8, 0xa9, 0xd7, 0xab, 0xac, 0xad, 0xae, 0xaf,
> +    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
> +    0xb8, 0xb9, 0xf7, 0xbb, 0xbc, 0xbd, 0xbe, 0x0,
> +    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> +    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> +    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> +    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2017,
> +    0x5d0, 0x5d1, 0x5d2, 0x5d3, 0x5d4, 0x5d5, 0x5d6, 0x5d7,
> +    0x5d8, 0x5d9, 0x5da, 0x5db, 0x5dc, 0x5dd, 0x5de, 0x5df,
> +    0x5e0, 0x5e1, 0x5e2, 0x5e3, 0x5e4, 0x5e5, 0x5e6, 0x5e7,
> +    0x5e8, 0x5e9, 0x5ea, 0x0, 0x0, 0x200e, 0x200f, 0x200e },
> +  /* ISO-8859-9 */
> +  { 0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
> +    0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
> +    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
> +    0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf,
> +    0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
> +    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
> +    0x11e, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7,
> +    0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0x130, 0x15e, 0xdf,
> +    0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
> +    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
> +    0x11f, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
> +    0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0x131, 0x15f, 0xff },
> +  /* ISO-8859-10 */
> +  { 0xa0, 0x104, 0x112, 0x122, 0x12a, 0x128, 0x136, 0xa7,
> +    0x13b, 0x110, 0x160, 0x166, 0x17d, 0xad, 0x16a, 0x14a,
> +    0xb0, 0x105, 0x113, 0x123, 0x12b, 0x129, 0x137, 0xb7,
> +    0x13c, 0x111, 0x161, 0x167, 0x17e, 0x2015, 0x16b, 0x14b,
> +    0x100, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0x12e,
> +    0x10c, 0xc9, 0x118, 0xcb, 0x116, 0xcd, 0xce, 0xcf,
> +    0xd0, 0x145, 0x14c, 0xd3, 0xd4, 0xd5, 0xd6, 0x168,
> +    0xd8, 0x172, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf,
> +    0x101, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0x12f,
> +    0x10d, 0xe9, 0x119, 0xeb, 0x117, 0xed, 0xee, 0xef,
> +    0xf0, 0x146, 0x14d, 0xf3, 0xf4, 0xf5, 0xf6, 0x169,
> +    0xf8, 0x173, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0x138 },
> +  /* ISO-8859-11 */
> +  { 0xa0, 0xe01, 0xe02, 0xe03, 0xe04, 0xe05, 0xe06, 0xe07,
> +    0xe08, 0xe09, 0xe0a, 0xe0b, 0xe0c, 0xe0d, 0xe0e, 0xe0f,
> +    0xe10, 0xe11, 0xe12, 0xe13, 0xe14, 0xe15, 0xe16, 0xe17,
> +    0xe18, 0xe19, 0xe1a, 0xe1b, 0xe1c, 0xe1d, 0xe1e, 0xe1f,
> +    0xe20, 0xe21, 0xe22, 0xe23, 0xe24, 0xe25, 0xe26, 0xe27,
> +    0xe28, 0xe29, 0xe2a, 0xe2b, 0xe2c, 0xe2d, 0xe2e, 0xe2f,
> +    0xe30, 0xe31, 0xe32, 0xe33, 0xe34, 0xe35, 0xe36, 0xe37,
> +    0xe38, 0xe39, 0xe3a, 0x0, 0x0, 0x0, 0x0, 0xe3f,
> +    0xe40, 0xe41, 0xe42, 0xe43, 0xe44, 0xe45, 0xe46, 0xe47,
> +    0xe48, 0xe49, 0xe4a, 0xe4b, 0xe4c, 0xe4d, 0xe4e, 0xe4f,
> +    0xe50, 0xe51, 0xe52, 0xe53, 0xe54, 0xe55, 0xe56, 0xe57,
> +    0xe58, 0xe59, 0xe5a, 0xe5b, 0xe31, 0xe34, 0xe47, 0xff },
> +  /* ISO-8859-12 doesn't exist.  The below code decrements the index
> +     into the table by one for ISO numbers > 12. */
> +  /* ISO-8859-13 */
> +  { 0xa0, 0x201d, 0xa2, 0xa3, 0xa4, 0x201e, 0xa6, 0xa7,
> +    0xd8, 0xa9, 0x156, 0xab, 0xac, 0xad, 0xae, 0xc6,
> +    0xb0, 0xb1, 0xb2, 0xb3, 0x201c, 0xb5, 0xb6, 0xb7,
> +    0xf8, 0xb9, 0x157, 0xbb, 0xbc, 0xbd, 0xbe, 0xe6,
> +    0x104, 0x12e, 0x100, 0x106, 0xc4, 0xc5, 0x118, 0x112,
> +    0x10c, 0xc9, 0x179, 0x116, 0x122, 0x136, 0x12a, 0x13b,
> +    0x160, 0x143, 0x145, 0xd3, 0x14c, 0xd5, 0xd6, 0xd7,
> +    0x172, 0x141, 0x15a, 0x16a, 0xdc, 0x17b, 0x17d, 0xdf,
> +    0x105, 0x12f, 0x101, 0x107, 0xe4, 0xe5, 0x119, 0x113,
> +    0x10d, 0xe9, 0x17a, 0x117, 0x123, 0x137, 0x12b, 0x13c,
> +    0x161, 0x144, 0x146, 0xf3, 0x14d, 0xf5, 0xf6, 0xf7,
> +    0x173, 0x142, 0x15b, 0x16b, 0xfc, 0x17c, 0x17e, 0x2019 },
> +  /* ISO-8859-14 */
> +  { 0xa0, 0x1e02, 0x1e03, 0xa3, 0x10a, 0x10b, 0x1e0a, 0xa7,
> +    0x1e80, 0xa9, 0x1e82, 0x1e0b, 0x1ef2, 0xad, 0xae, 0x178,
> +    0x1e1e, 0x1e1f, 0x120, 0x121, 0x1e40, 0x1e41, 0xb6, 0x1e56,
> +    0x1e81, 0x1e57, 0x1e83, 0x1e60, 0x1ef3, 0x1e84, 0x1e85, 0x1e61,
> +    0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
> +    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
> +    0x174, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0x1e6a,
> +    0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0x176, 0xdf,
> +    0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
> +    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
> +    0x175, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0x1e6b,
> +    0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0x177, 0xff },
> +  /* ISO-8859-15 */
> +  { 0xa0, 0xa1, 0xa2, 0xa3, 0x20ac, 0xa5, 0x160, 0xa7,
> +    0x161, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
> +    0xb0, 0xb1, 0xb2, 0xb3, 0x17d, 0xb5, 0xb6, 0xb7,
> +    0x17e, 0xb9, 0xba, 0xbb, 0x152, 0x153, 0x178, 0xbf,
> +    0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
> +    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
> +    0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7,
> +    0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf,
> +    0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
> +    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
> +    0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
> +    0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff },
> +  /* ISO-8859-16 */
> +  { 0xa0, 0x104, 0x105, 0x141, 0x20ac, 0x201e, 0x160, 0xa7,
> +    0x161, 0xa9, 0x218, 0xab, 0x179, 0xad, 0x17a, 0x17b,
> +    0xb0, 0xb1, 0x10c, 0x142, 0x17d, 0x201d, 0xb6, 0xb7,
> +    0x17e, 0x10d, 0x219, 0xbb, 0x152, 0x153, 0x178, 0x17c,
> +    0xc0, 0xc1, 0xc2, 0x102, 0xc4, 0x106, 0xc6, 0xc7,
> +    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
> +    0x110, 0x143, 0xd2, 0xd3, 0xd4, 0x150, 0xd6, 0x15a,
> +    0x170, 0xd9, 0xda, 0xdb, 0xdc, 0x118, 0x21a, 0xdf,
> +    0xe0, 0xe1, 0xe2, 0x103, 0xe4, 0x107, 0xe6, 0xe7,
> +    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
> +    0x111, 0x144, 0xf2, 0xf3, 0xf4, 0x151, 0xf6, 0x15b,
> +    0x171, 0xf9, 0xfa, 0xfb, 0xfc, 0x119, 0x21b, 0xff }
> +};
> +#endif /* _MB_EXTENDED_CHARSETS_ISO */
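
The two-level indexing described in the table comment can be sketched as follows. This is an illustration only: iso_8859_row and iso_8859_col are hypothetical names, not the patch's __iso_8859_index, and the row mapping is inferred from the comments (14 rows starting at ISO-8859-2, with the non-existent ISO-8859-12 skipped).

```c
#include <assert.h>

/* Hypothetical helper, not the patch's __iso_8859_index: map the x in
   "ISO-8859-x" to a row of a 14-row table that starts at ISO-8859-2 and
   has no row for ISO-8859-12.  Returns -1 if x has no row (ISO-8859-1
   needs no table because it maps 1:1 onto Unicode). */
int
iso_8859_row (int x)
{
  if (x < 2 || x > 16 || x == 12)
    return -1;
  return x > 12 ? x - 3 : x - 2;
}

/* Second index as described in the table comment: byte value minus 0xa0.
   Bytes below 0xa0 are never looked up. */
int
iso_8859_col (unsigned char byte)
{
  assert (byte >= 0xa0);
  return byte - 0xa0;
}
```

With this mapping, byte 0xa4 in ISO-8859-15 selects row 12, column 4, which in the quoted table holds 0x20ac (the euro sign).
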
> +
> +#ifdef _MB_EXTENDED_CHARSETS_DOS
> +/* Tables for the Windows default singlebyte ANSI and OEM codepage conversion.
> +   The first index into the table is a value computed from the codepage
> +   value (function __cp_index), the second index is the value of the
> +   incoming character - 0x80.
> +   Values < 0x80 don't have to be converted anyway. */
> +wchar_t __cp_conv[22][0x80] = {
> +  /* CP437 */
> +  { 0xc7, 0xfc, 0xe9, 0xe2, 0xe4, 0xe0, 0xe5, 0xe7,
> +    0xea, 0xeb, 0xe8, 0xef, 0xee, 0xec, 0xc4, 0xc5,
> +    0xc9, 0xe6, 0xc6, 0xf4, 0xf6, 0xf2, 0xfb, 0xf9,
> +    0xff, 0xd6, 0xdc, 0xa2, 0xa3, 0xa5, 0x20a7, 0x192,
> +    0xe1, 0xed, 0xf3, 0xfa, 0xf1, 0xd1, 0xaa, 0xba,
> +    0xbf, 0x2310, 0xac, 0xbd, 0xbc, 0xa1, 0xab, 0xbb,
> +    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x2561, 0x2562, 0x2556,
> +    0x2555, 0x2563, 0x2551, 0x2557, 0x255d, 0x255c, 0x255b, 0x2510,
> +    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0x255e, 0x255f,
> +    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0x2567,
> +    0x2568, 0x2564, 0x2565, 0x2559, 0x2558, 0x2552, 0x2553, 0x256b,
> +    0x256a, 0x2518, 0x250c, 0x2588, 0x2584, 0x258c, 0x2590, 0x2580,
> +    0x3b1, 0xdf, 0x393, 0x3c0, 0x3a3, 0x3c3, 0xb5, 0x3c4,
> +    0x3a6, 0x398, 0x3a9, 0x3b4, 0x221e, 0x3c6, 0x3b5, 0x2229,
> +    0x2261, 0xb1, 0x2265, 0x2264, 0x2320, 0x2321, 0xf7, 0x2248,
> +    0xb0, 0x2219, 0xb7, 0x221a, 0x207f, 0xb2, 0x25a0, 0xa0 },
> +  /* CP720 */
> +  { 0x0, 0x0, 0xe9, 0xe2, 0x0, 0xe0, 0x0, 0xe7,
> +    0xea, 0xeb, 0xe8, 0xef, 0xee, 0x0, 0x0, 0x0,
> +    0x0, 0x651, 0x652, 0xf4, 0xa4, 0x640, 0xfb, 0xf9,
> +    0x621, 0x622, 0x623, 0x624, 0xa3, 0x625, 0x626, 0x627,
> +    0x628, 0x629, 0x62a, 0x62b, 0x62c, 0x62d, 0x62e, 0x62f,
> +    0x630, 0x631, 0x632, 0x633, 0x634, 0x635, 0xab, 0xbb,
> +    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x2561, 0x2562, 0x2556,
> +    0x2555, 0x2563, 0x2551, 0x2557, 0x255d, 0x255c, 0x255b, 0x2510,
> +    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0x255e, 0x255f,
> +    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0x2567,
> +    0x2568, 0x2564, 0x2565, 0x2559, 0x2558, 0x2552, 0x2553, 0x256b,
> +    0x256a, 0x2518, 0x250c, 0x2588, 0x2584, 0x258c, 0x2590, 0x2580,
> +    0x636, 0x637, 0x638, 0x639, 0x63a, 0x641, 0xb5, 0x642,
> +    0x643, 0x644, 0x645, 0x646, 0x647, 0x648, 0x649, 0x64a,
> +    0x2261, 0x64b, 0x64c, 0x64d, 0x64e, 0x64f, 0x650, 0x2248,
> +    0xb0, 0x2219, 0xb7, 0x221a, 0x207f, 0xb2, 0x25a0, 0xa0 },
> +  /* CP737 */
> +  { 0x391, 0x392, 0x393, 0x394, 0x395, 0x396, 0x397, 0x398,
> +    0x399, 0x39a, 0x39b, 0x39c, 0x39d, 0x39e, 0x39f, 0x3a0,
> +    0x3a1, 0x3a3, 0x3a4, 0x3a5, 0x3a6, 0x3a7, 0x3a8, 0x3a9,
> +    0x3b1, 0x3b2, 0x3b3, 0x3b4, 0x3b5, 0x3b6, 0x3b7, 0x3b8,
> +    0x3b9, 0x3ba, 0x3bb, 0x3bc, 0x3bd, 0x3be, 0x3bf, 0x3c0,
> +    0x3c1, 0x3c3, 0x3c2, 0x3c4, 0x3c5, 0x3c6, 0x3c7, 0x3c8,
> +    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x2561, 0x2562, 0x2556,
> +    0x2555, 0x2563, 0x2551, 0x2557, 0x255d, 0x255c, 0x255b, 0x2510,
> +    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0x255e, 0x255f,
> +    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0x2567,
> +    0x2568, 0x2564, 0x2565, 0x2559, 0x2558, 0x2552, 0x2553, 0x256b,
> +    0x256a, 0x2518, 0x250c, 0x2588, 0x2584, 0x258c, 0x2590, 0x2580,
> +    0x3c9, 0x3ac, 0x3ad, 0x3ae, 0x3ca, 0x3af, 0x3cc, 0x3cd,
> +    0x3cb, 0x3ce, 0x386, 0x388, 0x389, 0x38a, 0x38c, 0x38e,
> +    0x38f, 0xb1, 0x2265, 0x2264, 0x3aa, 0x3ab, 0xf7, 0x2248,
> +    0xb0, 0x2219, 0xb7, 0x221a, 0x207f, 0xb2, 0x25a0, 0xa0 },
> +  /* CP775 */
> +  { 0x106, 0xfc, 0xe9, 0x101, 0xe4, 0x123, 0xe5, 0x107,
> +    0x142, 0x113, 0x156, 0x157, 0x12b, 0x179, 0xc4, 0xc5,
> +    0xc9, 0xe6, 0xc6, 0x14d, 0xf6, 0x122, 0xa2, 0x15a,
> +    0x15b, 0xd6, 0xdc, 0xf8, 0xa3, 0xd8, 0xd7, 0xa4,
> +    0x100, 0x12a, 0xf3, 0x17b, 0x17c, 0x17a, 0x201d, 0xa6,
> +    0xa9, 0xae, 0xac, 0xbd, 0xbc, 0x141, 0xab, 0xbb,
> +    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x104, 0x10c, 0x118,
> +    0x116, 0x2563, 0x2551, 0x2557, 0x255d, 0x12e, 0x160, 0x2510,
> +    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0x172, 0x16a,
> +    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0x17d,
> +    0x105, 0x10d, 0x119, 0x117, 0x12f, 0x161, 0x173, 0x16b,
> +    0x17e, 0x2518, 0x250c, 0x2588, 0x2584, 0x258c, 0x2590, 0x2580,
> +    0xd3, 0xdf, 0x14c, 0x143, 0xf5, 0xd5, 0xb5, 0x144,
> +    0x136, 0x137, 0x13b, 0x13c, 0x146, 0x112, 0x145, 0x2019,
> +    0xad, 0xb1, 0x201c, 0xbe, 0xb6, 0xa7, 0xf7, 0x201e,
> +    0xb0, 0x2219, 0xb7, 0xb9, 0xb3, 0xb2, 0x25a0, 0xa0 },
> +  /* CP850 */
> +  { 0xc7, 0xfc, 0xe9, 0xe2, 0xe4, 0xe0, 0xe5, 0xe7,
> +    0xea, 0xeb, 0xe8, 0xef, 0xee, 0xec, 0xc4, 0xc5,
> +    0xc9, 0xe6, 0xc6, 0xf4, 0xf6, 0xf2, 0xfb, 0xf9,
> +    0xff, 0xd6, 0xdc, 0xf8, 0xa3, 0xd8, 0xd7, 0x192,
> +    0xe1, 0xed, 0xf3, 0xfa, 0xf1, 0xd1, 0xaa, 0xba,
> +    0xbf, 0xae, 0xac, 0xbd, 0xbc, 0xa1, 0xab, 0xbb,
> +    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0xc1, 0xc2, 0xc0,
> +    0xa9, 0x2563, 0x2551, 0x2557, 0x255d, 0xa2, 0xa5, 0x2510,
> +    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0xe3, 0xc3,
> +    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0xa4,
> +    0xf0, 0xd0, 0xca, 0xcb, 0xc8, 0x131, 0xcd, 0xce,
> +    0xcf, 0x2518, 0x250c, 0x2588, 0x2584, 0xa6, 0xcc, 0x2580,
> +    0xd3, 0xdf, 0xd4, 0xd2, 0xf5, 0xd5, 0xb5, 0xfe,
> +    0xde, 0xda, 0xdb, 0xd9, 0xfd, 0xdd, 0xaf, 0xb4,
> +    0xad, 0xb1, 0x2017, 0xbe, 0xb6, 0xa7, 0xf7, 0xb8,
> +    0xb0, 0xa8, 0xb7, 0xb9, 0xb3, 0xb2, 0x25a0, 0xa0 },
> +  /* CP852 */
> +  { 0xc7, 0xfc, 0xe9, 0xe2, 0xe4, 0x16f, 0x107, 0xe7,
> +    0x142, 0xeb, 0x150, 0x151, 0xee, 0x179, 0xc4, 0x106,
> +    0xc9, 0x139, 0x13a, 0xf4, 0xf6, 0x13d, 0x13e, 0x15a,
> +    0x15b, 0xd6, 0xdc, 0x164, 0x165, 0x141, 0xd7, 0x10d,
> +    0xe1, 0xed, 0xf3, 0xfa, 0x104, 0x105, 0x17d, 0x17e,
> +    0x118, 0x119, 0xac, 0x17a, 0x10c, 0x15f, 0xab, 0xbb,
> +    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0xc1, 0xc2, 0x11a,
> +    0x15e, 0x2563, 0x2551, 0x2557, 0x255d, 0x17b, 0x17c, 0x2510,
> +    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0x102, 0x103,
> +    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0xa4,
> +    0x111, 0x110, 0x10e, 0xcb, 0x10f, 0x147, 0xcd, 0xce,
> +    0x11b, 0x2518, 0x250c, 0x2588, 0x2584, 0x162, 0x16e, 0x2580,
> +    0xd3, 0xdf, 0xd4, 0x143, 0x144, 0x148, 0x160, 0x161,
> +    0x154, 0xda, 0x155, 0x170, 0xfd, 0xdd, 0x163, 0xb4,
> +    0xad, 0x2dd, 0x2db, 0x2c7, 0x2d8, 0xa7, 0xf7, 0xb8,
> +    0xb0, 0xa8, 0x2d9, 0x171, 0x158, 0x159, 0x25a0, 0xa0 },
> +  /* CP855 */
> +  { 0x452, 0x402, 0x453, 0x403, 0x451, 0x401, 0x454, 0x404,
> +    0x455, 0x405, 0x456, 0x406, 0x457, 0x407, 0x458, 0x408,
> +    0x459, 0x409, 0x45a, 0x40a, 0x45b, 0x40b, 0x45c, 0x40c,
> +    0x45e, 0x40e, 0x45f, 0x40f, 0x44e, 0x42e, 0x44a, 0x42a,
> +    0x430, 0x410, 0x431, 0x411, 0x446, 0x426, 0x434, 0x414,
> +    0x435, 0x415, 0x444, 0x424, 0x433, 0x413, 0xab, 0xbb,
> +    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x445, 0x425, 0x438,
> +    0x418, 0x2563, 0x2551, 0x2557, 0x255d, 0x439, 0x419, 0x2510,
> +    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0x43a, 0x41a,
> +    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0xa4,
> +    0x43b, 0x41b, 0x43c, 0x41c, 0x43d, 0x41d, 0x43e, 0x41e,
> +    0x43f, 0x2518, 0x250c, 0x2588, 0x2584, 0x41f, 0x44f, 0x2580,
> +    0x42f, 0x440, 0x420, 0x441, 0x421, 0x442, 0x422, 0x443,
> +    0x423, 0x436, 0x416, 0x432, 0x412, 0x44c, 0x42c, 0x2116,
> +    0xad, 0x44b, 0x42b, 0x437, 0x417, 0x448, 0x428, 0x44d,
> +    0x42d, 0x449, 0x429, 0x447, 0x427, 0xa7, 0x25a0, 0xa0 },
> +  /* CP857 */
> +  { 0xc7, 0xfc, 0xe9, 0xe2, 0xe4, 0xe0, 0xe5, 0xe7,
> +    0xea, 0xeb, 0xe8, 0xef, 0xee, 0x131, 0xc4, 0xc5,
> +    0xc9, 0xe6, 0xc6, 0xf4, 0xf6, 0xf2, 0xfb, 0xf9,
> +    0x130, 0xd6, 0xdc, 0xf8, 0xa3, 0xd8, 0x15e, 0x15f,
> +    0xe1, 0xed, 0xf3, 0xfa, 0xf1, 0xd1, 0x11e, 0x11f,
> +    0xbf, 0xae, 0xac, 0xbd, 0xbc, 0xa1, 0xab, 0xbb,
> +    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0xc1, 0xc2, 0xc0,
> +    0xa9, 0x2563, 0x2551, 0x2557, 0x255d, 0xa2, 0xa5, 0x2510,
> +    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0xe3, 0xc3,
> +    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0xa4,
> +    0xba, 0xaa, 0xca, 0xcb, 0xc8, 0x0, 0xcd, 0xce,
> +    0xcf, 0x2518, 0x250c, 0x2588, 0x2584, 0xa6, 0xcc, 0x2580,
> +    0xd3, 0xdf, 0xd4, 0xd2, 0xf5, 0xd5, 0xb5, 0x0,
> +    0xd7, 0xda, 0xdb, 0xd9, 0xec, 0xff, 0xaf, 0xb4,
> +    0xad, 0xb1, 0x0, 0xbe, 0xb6, 0xa7, 0xf7, 0xb8,
> +    0xb0, 0xa8, 0xb7, 0xb9, 0xb3, 0xb2, 0x25a0, 0xa0 },
> +  /* CP858 */
> +  { 0xc7, 0xfc, 0xe9, 0xe2, 0xe4, 0xe0, 0xe5, 0xe7,
> +    0xea, 0xeb, 0xe8, 0xef, 0xee, 0xec, 0xc4, 0xc5,
> +    0xc9, 0xe6, 0xc6, 0xf4, 0xf6, 0xf2, 0xfb, 0xf9,
> +    0xff, 0xd6, 0xdc, 0xf8, 0xa3, 0xd8, 0xd7, 0x192,
> +    0xe1, 0xed, 0xf3, 0xfa, 0xf1, 0xd1, 0xaa, 0xba,
> +    0xbf, 0xae, 0xac, 0xbd, 0xbc, 0xa1, 0xab, 0xbb,
> +    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0xc1, 0xc2, 0xc0,
> +    0xa9, 0x2563, 0x2551, 0x2557, 0x255d, 0xa2, 0xa5, 0x2510,
> +    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0xe3, 0xc3,
> +    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0xa4,
> +    0xf0, 0xd0, 0xca, 0xcb, 0xc8, 0x20ac, 0xcd, 0xce,
> +    0xcf, 0x2518, 0x250c, 0x2588, 0x2584, 0xa6, 0xcc, 0x2580,
> +    0xd3, 0xdf, 0xd4, 0xd2, 0xf5, 0xd5, 0xb5, 0xfe,
> +    0xde, 0xda, 0xdb, 0xd9, 0xfd, 0xdd, 0xaf, 0xb4,
> +    0xad, 0xb1, 0x2017, 0xbe, 0xb6, 0xa7, 0xf7, 0xb8,
> +    0xb0, 0xa8, 0xb7, 0xb9, 0xb3, 0xb2, 0x25a0, 0xa0 },
> +  /* CP862 */
> +  { 0x5d0, 0x5d1, 0x5d2, 0x5d3, 0x5d4, 0x5d5, 0x5d6, 0x5d7,
> +    0x5d8, 0x5d9, 0x5da, 0x5db, 0x5dc, 0x5dd, 0x5de, 0x5df,
> +    0x5e0, 0x5e1, 0x5e2, 0x5e3, 0x5e4, 0x5e5, 0x5e6, 0x5e7,
> +    0x5e8, 0x5e9, 0x5ea, 0xa2, 0xa3, 0xa5, 0x20a7, 0x192,
> +    0xe1, 0xed, 0xf3, 0xfa, 0xf1, 0xd1, 0xaa, 0xba,
> +    0xbf, 0x2310, 0xac, 0xbd, 0xbc, 0xa1, 0xab, 0xbb,
> +    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x2561, 0x2562, 0x2556,
> +    0x2555, 0x2563, 0x2551, 0x2557, 0x255d, 0x255c, 0x255b, 0x2510,
> +    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0x255e, 0x255f,
> +    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0x2567,
> +    0x2568, 0x2564, 0x2565, 0x2559, 0x2558, 0x2552, 0x2553, 0x256b,
> +    0x256a, 0x2518, 0x250c, 0x2588, 0x2584, 0x258c, 0x2590, 0x2580,
> +    0x3b1, 0xdf, 0x393, 0x3c0, 0x3a3, 0x3c3, 0xb5, 0x3c4,
> +    0x3a6, 0x398, 0x3a9, 0x3b4, 0x221e, 0x3c6, 0x3b5, 0x2229,
> +    0x2261, 0xb1, 0x2265, 0x2264, 0x2320, 0x2321, 0xf7, 0x2248,
> +    0xb0, 0x2219, 0xb7, 0x221a, 0x207f, 0xb2, 0x25a0, 0xa0 },
> +  /* CP866 */
> +  { 0x410, 0x411, 0x412, 0x413, 0x414, 0x415, 0x416, 0x417,
> +    0x418, 0x419, 0x41a, 0x41b, 0x41c, 0x41d, 0x41e, 0x41f,
> +    0x420, 0x421, 0x422, 0x423, 0x424, 0x425, 0x426, 0x427,
> +    0x428, 0x429, 0x42a, 0x42b, 0x42c, 0x42d, 0x42e, 0x42f,
> +    0x430, 0x431, 0x432, 0x433, 0x434, 0x435, 0x436, 0x437,
> +    0x438, 0x439, 0x43a, 0x43b, 0x43c, 0x43d, 0x43e, 0x43f,
> +    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x2561, 0x2562, 0x2556,
> +    0x2555, 0x2563, 0x2551, 0x2557, 0x255d, 0x255c, 0x255b, 0x2510,
> +    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0x255e, 0x255f,
> +    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0x2567,
> +    0x2568, 0x2564, 0x2565, 0x2559, 0x2558, 0x2552, 0x2553, 0x256b,
> +    0x256a, 0x2518, 0x250c, 0x2588, 0x2584, 0x258c, 0x2590, 0x2580,
> +    0x440, 0x441, 0x442, 0x443, 0x444, 0x445, 0x446, 0x447,
> +    0x448, 0x449, 0x44a, 0x44b, 0x44c, 0x44d, 0x44e, 0x44f,
> +    0x401, 0x451, 0x404, 0x454, 0x407, 0x457, 0x40e, 0x45e,
> +    0xb0, 0x2219, 0xb7, 0x221a, 0x2116, 0xa4, 0x25a0, 0xa0 },
> +  /* CP874 */
> +  { 0x20ac, 0x0, 0x0, 0x0, 0x0, 0x2026, 0x0, 0x0,
> +    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> +    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
> +    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> +    0xa0, 0xe01, 0xe02, 0xe03, 0xe04, 0xe05, 0xe06, 0xe07,
> +    0xe08, 0xe09, 0xe0a, 0xe0b, 0xe0c, 0xe0d, 0xe0e, 0xe0f,
> +    0xe10, 0xe11, 0xe12, 0xe13, 0xe14, 0xe15, 0xe16, 0xe17,
> +    0xe18, 0xe19, 0xe1a, 0xe1b, 0xe1c, 0xe1d, 0xe1e, 0xe1f,
> +    0xe20, 0xe21, 0xe22, 0xe23, 0xe24, 0xe25, 0xe26, 0xe27,
> +    0xe28, 0xe29, 0xe2a, 0xe2b, 0xe2c, 0xe2d, 0xe2e, 0xe2f,
> +    0xe30, 0xe31, 0xe32, 0xe33, 0xe34, 0xe35, 0xe36, 0xe37,
> +    0xe38, 0xe39, 0xe3a, 0x0, 0x0, 0x0, 0x0, 0xe3f,
> +    0xe40, 0xe41, 0xe42, 0xe43, 0xe44, 0xe45, 0xe46, 0xe47,
> +    0xe48, 0xe49, 0xe4a, 0xe4b, 0xe4c, 0xe4d, 0xe4e, 0xe4f,
> +    0xe50, 0xe51, 0xe52, 0xe53, 0xe54, 0xe55, 0xe56, 0xe57,
> +    0xe58, 0xe59, 0xe5a, 0xe5b, 0xfc, 0xfd, 0xfe, 0xff },
> +  /* CP1125 */
> +  { 0x410, 0x411, 0x412, 0x413, 0x414, 0x415, 0x416, 0x417,
> +    0x418, 0x419, 0x41a, 0x41b, 0x41c, 0x41d, 0x41e, 0x41f,
> +    0x420, 0x421, 0x422, 0x423, 0x424, 0x425, 0x426, 0x427,
> +    0x428, 0x429, 0x42a, 0x42b, 0x42c, 0x42d, 0x42e, 0x42f,
> +    0x430, 0x431, 0x432, 0x433, 0x434, 0x435, 0x436, 0x437,
> +    0x438, 0x439, 0x43a, 0x43b, 0x43c, 0x43d, 0x43e, 0x43f,
> +    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x2561, 0x2562, 0x2556,
> +    0x2555, 0x2563, 0x2551, 0x2557, 0x255d, 0x255c, 0x255b, 0x2510,
> +    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0x255e, 0x255f,
> +    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0x2567,
> +    0x2568, 0x2564, 0x2565, 0x2559, 0x2558, 0x2552, 0x2553, 0x256b,
> +    0x256a, 0x2518, 0x250c, 0x2588, 0x2584, 0x258c, 0x2590, 0x2580,
> +    0x440, 0x441, 0x442, 0x443, 0x444, 0x445, 0x446, 0x447,
> +    0x448, 0x449, 0x44a, 0x44b, 0x44c, 0x44d, 0x44e, 0x44f,
> +    0x401, 0x451, 0x490, 0x491, 0x404, 0x454, 0x406, 0x456,
> +    0x407, 0x457, 0xb7, 0x221a, 0x2116, 0xa4, 0x25a0, 0xa0 },
> +  /* CP1250 */
> +  { 0x20ac, 0x0, 0x201a, 0x0, 0x201e, 0x2026, 0x2020, 0x2021,
> +    0x0, 0x2030, 0x160, 0x2039, 0x15a, 0x164, 0x17d, 0x179,
> +    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
> +    0x0, 0x2122, 0x161, 0x203a, 0x15b, 0x165, 0x17e, 0x17a,
> +    0xa0, 0x2c7, 0x2d8, 0x141, 0xa4, 0x104, 0xa6, 0xa7,
> +    0xa8, 0xa9, 0x15e, 0xab, 0xac, 0xad, 0xae, 0x17b,
> +    0xb0, 0xb1, 0x2db, 0x142, 0xb4, 0xb5, 0xb6, 0xb7,
> +    0xb8, 0x105, 0x15f, 0xbb, 0x13d, 0x2dd, 0x13e, 0x17c,
> +    0x154, 0xc1, 0xc2, 0x102, 0xc4, 0x139, 0x106, 0xc7,
> +    0x10c, 0xc9, 0x118, 0xcb, 0x11a, 0xcd, 0xce, 0x10e,
> +    0x110, 0x143, 0x147, 0xd3, 0xd4, 0x150, 0xd6, 0xd7,
> +    0x158, 0x16e, 0xda, 0x170, 0xdc, 0xdd, 0x162, 0xdf,
> +    0x155, 0xe1, 0xe2, 0x103, 0xe4, 0x13a, 0x107, 0xe7,
> +    0x10d, 0xe9, 0x119, 0xeb, 0x11b, 0xed, 0xee, 0x10f,
> +    0x111, 0x144, 0x148, 0xf3, 0xf4, 0x151, 0xf6, 0xf7,
> +    0x159, 0x16f, 0xfa, 0x171, 0xfc, 0xfd, 0x163, 0x2d9 },
> +  /* CP1251 */
> +  { 0x402, 0x403, 0x201a, 0x453, 0x201e, 0x2026, 0x2020, 0x2021,
> +    0x20ac, 0x2030, 0x409, 0x2039, 0x40a, 0x40c, 0x40b, 0x40f,
> +    0x452, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
> +    0x0, 0x2122, 0x459, 0x203a, 0x45a, 0x45c, 0x45b, 0x45f,
> +    0xa0, 0x40e, 0x45e, 0x408, 0xa4, 0x490, 0xa6, 0xa7,
> +    0x401, 0xa9, 0x404, 0xab, 0xac, 0xad, 0xae, 0x407,
> +    0xb0, 0xb1, 0x406, 0x456, 0x491, 0xb5, 0xb6, 0xb7,
> +    0x451, 0x2116, 0x454, 0xbb, 0x458, 0x405, 0x455, 0x457,
> +    0x410, 0x411, 0x412, 0x413, 0x414, 0x415, 0x416, 0x417,
> +    0x418, 0x419, 0x41a, 0x41b, 0x41c, 0x41d, 0x41e, 0x41f,
> +    0x420, 0x421, 0x422, 0x423, 0x424, 0x425, 0x426, 0x427,
> +    0x428, 0x429, 0x42a, 0x42b, 0x42c, 0x42d, 0x42e, 0x42f,
> +    0x430, 0x431, 0x432, 0x433, 0x434, 0x435, 0x436, 0x437,
> +    0x438, 0x439, 0x43a, 0x43b, 0x43c, 0x43d, 0x43e, 0x43f,
> +    0x440, 0x441, 0x442, 0x443, 0x444, 0x445, 0x446, 0x447,
> +    0x448, 0x449, 0x44a, 0x44b, 0x44c, 0x44d, 0x44e, 0x44f },
> +  /* CP1252 */
> +  { 0x20ac, 0x0, 0x201a, 0x192, 0x201e, 0x2026, 0x2020, 0x2021,
> +    0x2c6, 0x2030, 0x160, 0x2039, 0x152, 0x0, 0x17d, 0x0,
> +    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
> +    0x2dc, 0x2122, 0x161, 0x203a, 0x153, 0x0, 0x17e, 0x178,
> +    0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
> +    0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
> +    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
> +    0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf,
> +    0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
> +    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
> +    0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7,
> +    0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf,
> +    0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
> +    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
> +    0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
> +    0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff },
> +  /* CP1253 */
> +  { 0x20ac, 0x0, 0x201a, 0x192, 0x201e, 0x2026, 0x2020, 0x2021,
> +    0x0, 0x2030, 0x0, 0x2039, 0x0, 0x0, 0x0, 0x0,
> +    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
> +    0x0, 0x2122, 0x0, 0x203a, 0x0, 0x0, 0x0, 0x0,
> +    0xa0, 0x385, 0x386, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
> +    0xa8, 0xa9, 0x0, 0xab, 0xac, 0xad, 0xae, 0x2015,
> +    0xb0, 0xb1, 0xb2, 0xb3, 0x384, 0xb5, 0xb6, 0xb7,
> +    0x388, 0x389, 0x38a, 0xbb, 0x38c, 0xbd, 0x38e, 0x38f,
> +    0x390, 0x391, 0x392, 0x393, 0x394, 0x395, 0x396, 0x397,
> +    0x398, 0x399, 0x39a, 0x39b, 0x39c, 0x39d, 0x39e, 0x39f,
> +    0x3a0, 0x3a1, 0x0, 0x3a3, 0x3a4, 0x3a5, 0x3a6, 0x3a7,
> +    0x3a8, 0x3a9, 0x3aa, 0x3ab, 0x3ac, 0x3ad, 0x3ae, 0x3af,
> +    0x3b0, 0x3b1, 0x3b2, 0x3b3, 0x3b4, 0x3b5, 0x3b6, 0x3b7,
> +    0x3b8, 0x3b9, 0x3ba, 0x3bb, 0x3bc, 0x3bd, 0x3be, 0x3bf,
> +    0x3c0, 0x3c1, 0x3c2, 0x3c3, 0x3c4, 0x3c5, 0x3c6, 0x3c7,
> +    0x3c8, 0x3c9, 0x3ca, 0x3cb, 0x3cc, 0x3cd, 0x3ce, 0xff },
> +  /* CP1254 */
> +  { 0x20ac, 0x0, 0x201a, 0x192, 0x201e, 0x2026, 0x2020, 0x2021,
> +    0x2c6, 0x2030, 0x160, 0x2039, 0x152, 0x0, 0x0, 0x0,
> +    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
> +    0x2dc, 0x2122, 0x161, 0x203a, 0x153, 0x0, 0x0, 0x178,
> +    0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
> +    0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
> +    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
> +    0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf,
> +    0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
> +    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
> +    0x11e, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7,
> +    0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0x130, 0x15e, 0xdf,
> +    0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
> +    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
> +    0x11f, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
> +    0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0x131, 0x15f, 0xff },
> +  /* CP1255 */
> +  { 0x20ac, 0x0, 0x201a, 0x192, 0x201e, 0x2026, 0x2020, 0x2021,
> +    0x2c6, 0x2030, 0x0, 0x2039, 0x0, 0x0, 0x0, 0x0,
> +    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
> +    0x2dc, 0x2122, 0x0, 0x203a, 0x0, 0x0, 0x0, 0x0,
> +    0xa0, 0xa1, 0xa2, 0xa3, 0x20aa, 0xa5, 0xa6, 0xa7,
> +    0xa8, 0xa9, 0xd7, 0xab, 0xac, 0xad, 0xae, 0xaf,
> +    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
> +    0xb8, 0xb9, 0xf7, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf,
> +    0x5b0, 0x5b1, 0x5b2, 0x5b3, 0x5b4, 0x5b5, 0x5b6, 0x5b7,
> +    0x5b8, 0x5b9, 0x0, 0x5bb, 0x5bc, 0x5bd, 0x5be, 0x5bf,
> +    0x5c0, 0x5c1, 0x5c2, 0x5c3, 0x5f0, 0x5f1, 0x5f2, 0x5f3,
> +    0x5f4, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> +    0x5d0, 0x5d1, 0x5d2, 0x5d3, 0x5d4, 0x5d5, 0x5d6, 0x5d7,
> +    0x5d8, 0x5d9, 0x5da, 0x5db, 0x5dc, 0x5dd, 0x5de, 0x5df,
> +    0x5e0, 0x5e1, 0x5e2, 0x5e3, 0x5e4, 0x5e5, 0x5e6, 0x5e7,
> +    0x5e8, 0x5e9, 0x5ea, 0x0, 0x0, 0x200e, 0x200f, 0xff },
> +  /* CP1256 */
> +  { 0x20ac, 0x67e, 0x201a, 0x192, 0x201e, 0x2026, 0x2020, 0x2021,
> +    0x2c6, 0x2030, 0x679, 0x2039, 0x152, 0x686, 0x698, 0x688,
> +    0x6af, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
> +    0x6a9, 0x2122, 0x691, 0x203a, 0x153, 0x200c, 0x200d, 0x6ba,
> +    0xa0, 0x60c, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
> +    0xa8, 0xa9, 0x6be, 0xab, 0xac, 0xad, 0xae, 0xaf,
> +    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
> +    0xb8, 0xb9, 0x61b, 0xbb, 0xbc, 0xbd, 0xbe, 0x61f,
> +    0x6c1, 0x621, 0x622, 0x623, 0x624, 0x625, 0x626, 0x627,
> +    0x628, 0x629, 0x62a, 0x62b, 0x62c, 0x62d, 0x62e, 0x62f,
> +    0x630, 0x631, 0x632, 0x633, 0x634, 0x635, 0x636, 0xd7,
> +    0x637, 0x638, 0x639, 0x63a, 0x640, 0x641, 0x642, 0x643,
> +    0xe0, 0x644, 0xe2, 0x645, 0x646, 0x647, 0x648, 0xe7,
> +    0xe8, 0xe9, 0xea, 0xeb, 0x649, 0x64a, 0xee, 0xef,
> +    0x64b, 0x64c, 0x64d, 0x64e, 0xf4, 0x64f, 0x650, 0xf7,
> +    0x651, 0xf9, 0x652, 0xfb, 0xfc, 0x200e, 0x200f, 0x6d2 },
> +  /* CP1257 */
> +  { 0x20ac, 0x0, 0x201a, 0x0, 0x201e, 0x2026, 0x2020, 0x2021,
> +    0x0, 0x2030, 0x0, 0x2039, 0x0, 0xa8, 0x2c7, 0xb8,
> +    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
> +    0x0, 0x2122, 0x0, 0x203a, 0x0, 0xaf, 0x2db, 0x0,
> +    0xa0, 0x0, 0xa2, 0xa3, 0xa4, 0x0, 0xa6, 0xa7,
> +    0xd8, 0xa9, 0x156, 0xab, 0xac, 0xad, 0xae, 0xc6,
> +    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
> +    0xf8, 0xb9, 0x157, 0xbb, 0xbc, 0xbd, 0xbe, 0xe6,
> +    0x104, 0x12e, 0x100, 0x106, 0xc4, 0xc5, 0x118, 0x112,
> +    0x10c, 0xc9, 0x179, 0x116, 0x122, 0x136, 0x12a, 0x13b,
> +    0x160, 0x143, 0x145, 0xd3, 0x14c, 0xd5, 0xd6, 0xd7,
> +    0x172, 0x141, 0x15a, 0x16a, 0xdc, 0x17b, 0x17d, 0xdf,
> +    0x105, 0x12f, 0x101, 0x107, 0xe4, 0xe5, 0x119, 0x113,
> +    0x10d, 0xe9, 0x17a, 0x117, 0x123, 0x137, 0x12b, 0x13c,
> +    0x161, 0x144, 0x146, 0xf3, 0x14d, 0xf5, 0xf6, 0xf7,
> +    0x173, 0x142, 0x15b, 0x16b, 0xfc, 0x17c, 0x17e, 0x2d9 },
> +  /* CP1258 */
> +  { 0x20ac, 0x0, 0x201a, 0x192, 0x201e, 0x2026, 0x2020, 0x2021,
> +    0x2c6, 0x2030, 0x0, 0x2039, 0x152, 0x0, 0x0, 0x0,
> +    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
> +    0x2dc, 0x2122, 0x0, 0x203a, 0x153, 0x0, 0x0, 0x178,
> +    0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
> +    0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
> +    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
> +    0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf,
> +    0xc0, 0xc1, 0xc2, 0x102, 0xc4, 0xc5, 0xc6, 0xc7,
> +    0xc8, 0xc9, 0xca, 0xcb, 0x300, 0xcd, 0xce, 0xcf,
> +    0x110, 0xd1, 0x309, 0xd3, 0xd4, 0x1a0, 0xd6, 0xd7,
> +    0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0x1af, 0x303, 0xdf,
> +    0xe0, 0xe1, 0xe2, 0x103, 0xe4, 0xe5, 0xe6, 0xe7,
> +    0xe8, 0xe9, 0xea, 0xeb, 0x301, 0xed, 0xee, 0xef,
> +    0x111, 0xf1, 0x323, 0xf3, 0xf4, 0x1a1, 0xf6, 0xf7,
> +    0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0x1b0, 0x20ab, 0xff }
> +};
> +#endif /* _MB_EXTENDED_CHARSETS_DOS */
> +
> +/* Handle one to five decimal digits.  Return -1 in any other case. */
> +static int
> +__micro_atoi (const char *s)
> +{
> +  int ret = 0;
> +
> +  if (!*s)
> +    return -1;
> +  while (*s)
> +    {
> +      if (*s < '0' || *s > '9' || ret >= 10000)
> +	return -1;
> +      ret = 10 * ret + (*s++ - '0');
> +    }
> +  return ret;
> +}
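
The digit limit in __micro_atoi is easy to misread, so here is a minimal standalone sketch of the same loop. The name micro_atoi_sketch is hypothetical and not part of the patch; the body mirrors the quoted lines.

```c
/* Standalone copy of the loop quoted above; micro_atoi_sketch is a
   hypothetical name used only for this illustration. */
static int
micro_atoi_sketch (const char *s)
{
  int ret = 0;

  if (!*s)
    return -1;                  /* empty string */
  while (*s)
    {
      /* Reject non-digits.  Also reject any further digit once ret has
         reached 10000, since the value already has five significant
         digits at that point. */
      if (*s < '0' || *s > '9' || ret >= 10000)
        return -1;
      ret = 10 * ret + (*s++ - '0');
    }
  return ret;
}
```

Note that the limit applies to the value, not to the string length: leading zeros are not counted, so "0000437" is accepted as 437 while "100000" is rejected.
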
> +
> +#ifdef _MB_EXTENDED_CHARSETS_ISO
> +int
> +__iso_8859_index (const char *charset_ext)
> +{
> +  int iso_idx = __micro_atoi (charset_ext);
> +  if (iso_idx >= 2 && iso_idx <= 16)
> +    {
> +      iso_idx -= 2;
> +      if (iso_idx > 10)
> +	--iso_idx;
> +      return iso_idx;
> +    }
> +  return -1;
> +}
> +#endif /* _MB_EXTENDED_CHARSETS_ISO */
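
To make the index arithmetic in __iso_8859_index explicit, here is a small sketch, assuming the part number has already been parsed. iso_8859_index_sketch is a hypothetical helper, not something in the patch.

```c
/* Hypothetical helper: same arithmetic as __iso_8859_index, but taking
   the already-parsed part number instead of the charset string. */
static int
iso_8859_index_sketch (int part)
{
  if (part >= 2 && part <= 16)
    {
      int idx = part - 2;
      if (idx > 10)             /* parts 13..16 shift down by one */
        --idx;
      return idx;
    }
  return -1;
}
```

Parts 2..11 map to 0..9 and parts 13..16 map to 10..13, which leaves no table slot for the nonexistent ISO-8859-12. As written, though, part 12 is not rejected: it yields 10, the same index as part 13, unless the caller filters it out beforehand.
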
> +
> +#ifdef _MB_EXTENDED_CHARSETS_DOS
> +int
> +__cp_index (const char *charset_ext)
> +{
> +  int cp_idx = __micro_atoi (charset_ext);
> +  switch (cp_idx)
> +    {
> +    case 437:
> +      cp_idx = 0;
> +      break;
> +    case 720:
> +      cp_idx = 1;
> +      break;
> +    case 737:
> +      cp_idx = 2;
> +      break;
> +    case 775:
> +      cp_idx = 3;
> +      break;
> +    case 850:
> +      cp_idx = 4;
> +      break;
> +    case 852:
> +      cp_idx = 5;
> +      break;
> +    case 855:
> +      cp_idx = 6;
> +      break;
> +    case 857:
> +      cp_idx = 7;
> +      break;
> +    case 858:
> +      cp_idx = 8;
> +      break;
> +    case 862:
> +      cp_idx = 9;
> +      break;
> +    case 866:
> +      cp_idx = 10;
> +      break;
> +    case 874:
> +      cp_idx = 11;
> +      break;
> +    case 1125:
> +      cp_idx = 12;
> +      break;
> +    case 1250:
> +      cp_idx = 13;
> +      break;
> +    case 1251:
> +      cp_idx = 14;
> +      break;
> +    case 1252:
> +      cp_idx = 15;
> +      break;
> +    case 1253:
> +      cp_idx = 16;
> +      break;
> +    case 1254:
> +      cp_idx = 17;
> +      break;
> +    case 1255:
> +      cp_idx = 18;
> +      break;
> +    case 1256:
> +      cp_idx = 19;
> +      break;
> +    case 1257:
> +      cp_idx = 20;
> +      break;
> +    case 1258:
> +      cp_idx = 21;
> +      break;
> +    default:
> +      cp_idx = -1;
> +      break;
> +    }
> +  return cp_idx;
> +}
> +#endif /* _MB_EXTENDED_CHARSETS_DOS */
> +#endif /* _MB_CAPABLE */
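
For comparison with the switch in __cp_index, here is a table-driven sketch of the same mapping. This is an alternative formulation, not what the patch does; cp_numbers_sketch and cp_index_sketch are hypothetical names, and the function takes the parsed number rather than the charset string.

```c
#include <stddef.h>

/* Codepage numbers in the order used by the switch above, which is
   also the order of the rows in __cp_conv. */
static const int cp_numbers_sketch[] = {
  437, 720, 737, 775, 850, 852, 855, 857, 858, 862, 866,
  874, 1125, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258
};

/* Return the row index for codepage cp, or -1 if it is not listed. */
static int
cp_index_sketch (int cp)
{
  size_t i;
  size_t n = sizeof cp_numbers_sketch / sizeof cp_numbers_sketch[0];

  for (i = 0; i < n; ++i)
    if (cp_numbers_sketch[i] == cp)
      return (int) i;
  return -1;
}
```

The array keeps the codepage list and the table order in one place, at the cost of a short linear scan instead of whatever the compiler generates for the switch.
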
> Index: libc/stdlib/wctomb_r.c
> ===================================================================
> RCS file: /cvs/src/src/newlib/libc/stdlib/wctomb_r.c,v
> retrieving revision 1.12
> diff -u -p -r1.12 wctomb_r.c
> --- libc/stdlib/wctomb_r.c	19 Mar 2009 19:47:52 -0000	1.12
> +++ libc/stdlib/wctomb_r.c	22 Mar 2009 16:25:07 -0000
> @@ -4,209 +4,338 @@
>  #include <wchar.h>
>  #include <locale.h>
>  #include "mbctype.h"
> +#include "local.h"
>  
> -extern char *__locale_charset ();
> +int (*__wctomb) (struct _reent *, char *, wchar_t, const char *charset,
> +		 mbstate_t *)
> +    = __ascii_wctomb;
>  
> +#ifdef _MB_CAPABLE
>  /* for some conversions, we use the __count field as a place to store a state value */
>  #define __state __count
>  
>  int
> -_DEFUN (_wctomb_r, (r, s, wchar, state),
> -        struct _reent *r     _AND 
> -        char          *s     _AND
> -        wchar_t        _wchar _AND
> +_DEFUN (__utf8_wctomb, (r, s, wchar, charset, state),
> +        struct _reent *r       _AND 
> +        char          *s       _AND
> +        wchar_t        _wchar  _AND
> +	const char    *charset _AND
>          mbstate_t     *state)
>  {
> -  /* Avoids compiler warnings about comparisons that are always false
> -     due to limited range when sizeof(wchar_t) is 2 but sizeof(wint_t)
> -     is 4, as is the case on cygwin.  */
>    wint_t wchar = _wchar;
>  
> -  if (strlen (__locale_charset ()) <= 1)
> -    { /* fall-through */ }
> -  else if (!strcmp (__locale_charset (), "UTF-8"))
> -    {
> -      if (s == NULL)
> -        return 0; /* UTF-8 encoding is not state-dependent */
> +  if (s == NULL)
> +    return 0; /* UTF-8 encoding is not state-dependent */
>  
> -      if (state->__count == -4 && (wchar < 0xdc00 || wchar >= 0xdfff))
> +  if (state->__count == -4 && (wchar < 0xdc00 || wchar > 0xdfff))
> +    {
> +      /* At this point only the second half of a surrogate pair is valid. */
> +      r->_errno = EILSEQ;
> +      return -1;
> +    }
> +  if (wchar <= 0x7f)
> +    {
> +      *s = wchar;
> +      return 1;
> +    }
> +  if (wchar >= 0x80 && wchar <= 0x7ff)
> +    {
> +      *s++ = 0xc0 | ((wchar & 0x7c0) >> 6);
> +      *s   = 0x80 |  (wchar &  0x3f);
> +      return 2;
> +    }
> +  if (wchar >= 0x800 && wchar <= 0xffff)
> +    {
> +      if (wchar >= 0xd800 && wchar <= 0xdfff)
>  	{
> -	  /* At this point only the second half of a surrogate pair is valid. */
> -	  r->_errno = EILSEQ;
> -	  return -1;
> -	}
> -      if (wchar <= 0x7f)
> -        {
> -          *s = wchar;
> -          return 1;
> -        }
> -      else if (wchar >= 0x80 && wchar <= 0x7ff)
> -        {
> -          *s++ = 0xc0 | ((wchar & 0x7c0) >> 6);
> -          *s   = 0x80 |  (wchar &  0x3f);
> -          return 2;
> -        }
> -      else if (wchar >= 0x800 && wchar <= 0xffff)
> -        {
> -          if (wchar >= 0xd800 && wchar <= 0xdfff)
> +	  wint_t tmp;
> +	  /* UTF-16 surrogates -- must not occur in normal UCS-4 data */
> +	  if (sizeof (wchar_t) != 2)
> +	    {
> +	      r->_errno = EILSEQ;
> +	      return -1;
> +	    }
> +	  if (wchar >= 0xdc00)
>  	    {
> -	      wint_t tmp;
> -	      /* UTF-16 surrogates -- must not occur in normal UCS-4 data */
> -	      if (sizeof (wchar_t) != 2)
> +	      /* Second half of a surrogate pair.  It's only valid if
> +		 we have already read the first half of a surrogate
> +		 pair before. */
> +	      if (state->__count != -4)
>  		{
>  		  r->_errno = EILSEQ;
>  		  return -1;
>  		}
> -	      if (wchar >= 0xdc00)
> -		{
> -		  /* Second half of a surrogate pair. It's not valid if
> -		     we don't have already read a first half of a surrogate
> -		     before. */
> -		  if (state->__count != -4)
> -		    {
> -		      r->_errno = EILSEQ;
> -		      return -1;
> -		    }
> -		  /* If it's valid, reconstruct the full Unicode value and
> -		     return the trailing three bytes of the UTF-8 char. */
> -		  tmp = (state->__value.__wchb[0] << 16)
> -			| (state->__value.__wchb[1] << 8)
> -			| (wchar & 0x3ff);
> -		  state->__count = 0;
> -		  *s++ = 0x80 | ((tmp &  0x3f000) >> 12);
> -		  *s++ = 0x80 | ((tmp &    0xfc0) >> 6);
> -		  *s   = 0x80 |  (tmp &     0x3f);
> -		  return 3;
> -	      	}
> -	      /* First half of a surrogate pair.  Store the state and return
> -	         the first byte of the UTF-8 char. */
> -	      tmp = ((wchar & 0x3ff) << 10) + 0x10000;
> -	      state->__value.__wchb[0] = (tmp >> 16) & 0xff;
> -	      state->__value.__wchb[1] = (tmp >> 8) & 0xff;
> -	      state->__count = -4;
> -	      *s = (0xf0 | ((tmp & 0x1c0000) >> 18));
> -	      return 1;
> +	      /* If it's valid, reconstruct the full Unicode value and
> +		 return the trailing three bytes of the UTF-8 char. */
> +	      tmp = (state->__value.__wchb[0] << 16)
> +		    | (state->__value.__wchb[1] << 8)
> +		    | (wchar & 0x3ff);
> +	      state->__count = 0;
> +	      *s++ = 0x80 | ((tmp &  0x3f000) >> 12);
> +	      *s++ = 0x80 | ((tmp &    0xfc0) >> 6);
> +	      *s   = 0x80 |  (tmp &     0x3f);
> +	      return 3;
>  	    }
> -          *s++ = 0xe0 | ((wchar & 0xf000) >> 12);
> -          *s++ = 0x80 | ((wchar &  0xfc0) >> 6);
> -          *s   = 0x80 |  (wchar &   0x3f);
> -          return 3;
> -        }
> -      else if (wchar >= 0x10000 && wchar <= 0x10ffff)
> -        {
> -          *s++ = 0xf0 | ((wchar & 0x1c0000) >> 18);
> -          *s++ = 0x80 | ((wchar &  0x3f000) >> 12);
> -          *s++ = 0x80 | ((wchar &    0xfc0) >> 6);
> -          *s   = 0x80 |  (wchar &     0x3f);
> -          return 4;
> -        }
> +	  /* First half of a surrogate pair.  Store the state and return
> +	     the first byte of the UTF-8 char. */
> +	  tmp = ((wchar & 0x3ff) << 10) + 0x10000;
> +	  state->__value.__wchb[0] = (tmp >> 16) & 0xff;
> +	  state->__value.__wchb[1] = (tmp >> 8) & 0xff;
> +	  state->__count = -4;
> +	  *s = (0xf0 | ((tmp & 0x1c0000) >> 18));
> +	  return 1;
> +	}
> +      *s++ = 0xe0 | ((wchar & 0xf000) >> 12);
> +      *s++ = 0x80 | ((wchar &  0xfc0) >> 6);
> +      *s   = 0x80 |  (wchar &   0x3f);
> +      return 3;
> +    }
> +  if (wchar >= 0x10000 && wchar <= 0x10ffff)
> +    {
> +      *s++ = 0xf0 | ((wchar & 0x1c0000) >> 18);
> +      *s++ = 0x80 | ((wchar &  0x3f000) >> 12);
> +      *s++ = 0x80 | ((wchar &    0xfc0) >> 6);
> +      *s   = 0x80 |  (wchar &     0x3f);
> +      return 4;
> +    }
> +
> +  r->_errno = EILSEQ;
> +  return -1;
> +}
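
The surrogate handling in __utf8_wctomb splits one four-byte UTF-8 sequence across two calls: one byte for the first half of the pair, three for the second. Here is a minimal sketch of just that part, using the same arithmetic as the quoted code. struct surrogate_state_sketch is a hypothetical stand-in for the two __wchb bytes and the __count marker in mbstate_t, and the sketch rejects anything that is not a surrogate.

```c
#include <stdint.h>

/* Hypothetical stand-in for the fields of mbstate_t used by the patch. */
struct surrogate_state_sketch
{
  unsigned char hi;             /* bits 16..23 of the pending code point */
  unsigned char mid;            /* bits 8..15 of the pending code point */
  int pending;                  /* nonzero after a first half was seen */
};

/* Convert one UTF-16 surrogate unit.  Returns the number of bytes
   written to out, or -1 for an out-of-sequence or non-surrogate unit. */
static int
surrogate_to_utf8_sketch (struct surrogate_state_sketch *st, uint16_t unit,
                          unsigned char *out)
{
  uint32_t tmp;

  if (unit >= 0xd800 && unit <= 0xdbff)
    {
      if (st->pending)
        return -1;              /* two first halves in a row */
      /* First half: remember the upper bits, emit the lead byte. */
      tmp = ((uint32_t) (unit & 0x3ff) << 10) + 0x10000;
      st->hi = (tmp >> 16) & 0xff;
      st->mid = (tmp >> 8) & 0xff;
      st->pending = 1;
      out[0] = (unsigned char) (0xf0 | ((tmp & 0x1c0000) >> 18));
      return 1;
    }
  if (unit >= 0xdc00 && unit <= 0xdfff)
    {
      if (!st->pending)
        return -1;              /* second half without a first half */
      /* Second half: rebuild the code point, emit the trailing bytes. */
      tmp = ((uint32_t) st->hi << 16) | ((uint32_t) st->mid << 8)
            | (uint32_t) (unit & 0x3ff);
      st->pending = 0;
      out[0] = (unsigned char) (0x80 | ((tmp & 0x3f000) >> 12));
      out[1] = (unsigned char) (0x80 | ((tmp & 0xfc0) >> 6));
      out[2] = (unsigned char) (0x80 | (tmp & 0x3f));
      return 3;
    }
  return -1;                    /* non-surrogates are out of scope here */
}
```

Feeding the pair 0xd83d, 0xde00 (U+1F600) through two calls produces the bytes f0 9f 98 80. Storing only two bytes of state works because the low eight bits of the intermediate value are always zero after the first half.
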
> +
> +/* Cygwin defines its own doublebyte charset conversion functions 
> +   because the underlying OS requires wchar_t == UTF-16. */
> +#ifndef __CYGWIN__
> +int
> +_DEFUN (__sjis_wctomb, (r, s, wchar, charset, state),
> +        struct _reent *r       _AND 
> +        char          *s       _AND
> +        wchar_t        _wchar  _AND
> +	const char    *charset _AND
> +        mbstate_t     *state)
> +{
> +  wint_t wchar = _wchar;
> +
> +  unsigned char char2 = (unsigned char)wchar;
> +  unsigned char char1 = (unsigned char)(wchar >> 8);
> +
> +  if (s == NULL)
> +    return 0;  /* not state-dependent */
> +
> +  if (char1 != 0x00)
> +    {
> +    /* first byte is non-zero..validate multi-byte char */
> +      if (_issjis1(char1) && _issjis2(char2)) 
> +	{
> +	  *s++ = (char)char1;
> +	  *s = (char)char2;
> +	  return 2;
> +	}
>        else
>  	{
>  	  r->_errno = EILSEQ;
>  	  return -1;
>  	}
>      }
> -  else if (!strcmp (__locale_charset (), "SJIS"))
> +  *s = (char) wchar;
> +  return 1;
> +}
> +
> +int
> +_DEFUN (__eucjp_wctomb, (r, s, wchar, charset, state),
> +        struct _reent *r       _AND 
> +        char          *s       _AND
> +        wchar_t        _wchar  _AND
> +	const char    *charset _AND
> +        mbstate_t     *state)
> +{
> +  wint_t wchar = _wchar;
> +  unsigned char char2 = (unsigned char)wchar;
> +  unsigned char char1 = (unsigned char)(wchar >> 8);
> +
> +  if (s == NULL)
> +    return 0;  /* not state-dependent */
> +
> +  if (char1 != 0x00)
>      {
> -      unsigned char char2 = (unsigned char)wchar;
> -      unsigned char char1 = (unsigned char)(wchar >> 8);
> +    /* first byte is non-zero..validate multi-byte char */
> +      if (_iseucjp (char1) && _iseucjp (char2)) 
> +	{
> +	  *s++ = (char)char1;
> +	  *s = (char)char2;
> +	  return 2;
> +	}
> +      else
> +	{
> +	  r->_errno = EILSEQ;
> +	  return -1;
> +	}
> +    }
> +  *s = (char) wchar;
> +  return 1;
> +}
>  
> -      if (s == NULL)
> -        return 0;  /* not state-dependent */
> +int
> +_DEFUN (__jis_wctomb, (r, s, wchar, charset, state),
> +        struct _reent *r       _AND 
> +        char          *s       _AND
> +        wchar_t        _wchar  _AND
> +	const char    *charset _AND
> +        mbstate_t     *state)
> +{
> +  wint_t wchar = _wchar;
> +  int cnt = 0; 
> +  unsigned char char2 = (unsigned char)wchar;
> +  unsigned char char1 = (unsigned char)(wchar >> 8);
>  
> -      if (char1 != 0x00)
> -        {
> -        /* first byte is non-zero..validate multi-byte char */
> -          if (_issjis1(char1) && _issjis2(char2)) 
> -            {
> -              *s++ = (char)char1;
> -              *s = (char)char2;
> -              return 2;
> -            }
> -          else
> +  if (s == NULL)
> +    return 1;  /* state-dependent */
> +
> +  if (char1 != 0x00)
> +    {
> +    /* first byte is non-zero..validate multi-byte char */
> +      if (_isjis (char1) && _isjis (char2)) 
> +	{
> +	  if (state->__state == 0)
>  	    {
> -	      r->_errno = EILSEQ;
> -	      return -1;
> +	      /* must switch from ASCII to JIS state */
> +	      state->__state = 1;
> +	      *s++ = ESC_CHAR;
> +	      *s++ = '$';
> +	      *s++ = 'B';
> +	      cnt = 3;
>  	    }
> -        }
> +	  *s++ = (char)char1;
> +	  *s = (char)char2;
> +	  return cnt + 2;
> +	}
> +      r->_errno = EILSEQ;
> +      return -1;
>      }
> -  else if (!strcmp (__locale_charset (), "EUCJP"))
> +  if (state->__state != 0)
>      {
> -      unsigned char char2 = (unsigned char)wchar;
> -      unsigned char char1 = (unsigned char)(wchar >> 8);
> +      /* must switch from JIS to ASCII state */
> +      state->__state = 0;
> +      *s++ = ESC_CHAR;
> +      *s++ = '(';
> +      *s++ = 'B';
> +      cnt = 3;
> +    }
> +  *s = (char)char2;
> +  return cnt + 1;
> +}
> +#endif /* !__CYGWIN__ */
>  
> -      if (s == NULL)
> -        return 0;  /* not state-dependent */
> -
> -      if (char1 != 0x00)
> -        {
> -        /* first byte is non-zero..validate multi-byte char */
> -          if (_iseucjp (char1) && _iseucjp (char2)) 
> -            {
> -              *s++ = (char)char1;
> -              *s = (char)char2;
> -              return 2;
> -            }
> -          else
> -	    {
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> -        }
> +#ifdef _MB_EXTENDED_CHARSETS_ISO
> +int
> +_DEFUN (__iso_wctomb, (r, s, wchar, charset, state),
> +        struct _reent *r       _AND 
> +        char          *s       _AND
> +        wchar_t        _wchar  _AND
> +	const char    *charset _AND
> +        mbstate_t     *state)
> +{
> +  wint_t wchar = _wchar;
> +
> +  if (s == NULL)
> +    return 0;
> +
> +  /* wchars <= 0x9f translate to all ISO charsets directly. */
> +  if (wchar >= 0xa0)
> +    {
> +      int iso_idx = __iso_8859_index (charset + 9);
> +      if (iso_idx >= 0)
> +	{
> +	  unsigned char mb;
> +
> +	  if (s == NULL)
> +	    return 0;
> +
> +	  for (mb = 0; mb < 0x60; ++mb)
> +	    if (__iso_8859_conv[iso_idx][mb] == wchar)
> +	      {
> +		*s = (char) (mb + 0xa0);
> +		return 1;
> +	      }
> +	  r->_errno = EILSEQ;
> +	  return -1;
> +	}
>      }
> -  else if (!strcmp (__locale_charset (), "JIS"))
> + 
> +  if ((size_t)wchar >= 0x100)
>      {
> -      int cnt = 0; 
> -      unsigned char char2 = (unsigned char)wchar;
> -      unsigned char char1 = (unsigned char)(wchar >> 8);
> -
> -      if (s == NULL)
> -        return 1;  /* state-dependent */
> -
> -      if (char1 != 0x00)
> -        {
> -        /* first byte is non-zero..validate multi-byte char */
> -          if (_isjis (char1) && _isjis (char2)) 
> -            {
> -              if (state->__state == 0)
> -                {
> -                  /* must switch from ASCII to JIS state */
> -                  state->__state = 1;
> -                  *s++ = ESC_CHAR;
> -                  *s++ = '$';
> -                  *s++ = 'B';
> -                  cnt = 3;
> -                }
> -              *s++ = (char)char1;
> -              *s = (char)char2;
> -              return cnt + 2;
> -            }
> -          else
> -	    {
> -	      r->_errno = EILSEQ;
> -	      return -1;
> -	    }
> -        }
> -      else
> -        {
> -          if (state->__state != 0)
> -            {
> -              /* must switch from JIS to ASCII state */
> -              state->__state = 0;
> -              *s++ = ESC_CHAR;
> -              *s++ = '(';
> -              *s++ = 'B';
> -              cnt = 3;
> -            }
> -          *s = (char)char2;
> -          return cnt + 1;
> -        }
> +      r->_errno = EILSEQ;
> +      return -1;
> +    }
> +
> +  *s = (char) wchar;
> +  return 1;
> +}
> +#endif /* _MB_EXTENDED_CHARSETS_ISO */
> +
> +#ifdef _MB_EXTENDED_CHARSETS_DOS
> +int
> +_DEFUN (__cp_wctomb, (r, s, wchar, charset, state),
> +        struct _reent *r       _AND 
> +        char          *s       _AND
> +        wchar_t        _wchar  _AND
> +	const char    *charset _AND
> +        mbstate_t     *state)
> +{
> +  wint_t wchar = _wchar;
> +
> +  if (s == NULL)
> +    return 0;
> +
> +  if (wchar >= 0x80)
> +    {
> +      int cp_idx = __cp_index (charset + 2);
> +      if (cp_idx >= 0)
> +	{
> +	  unsigned char mb;
> +
> +	  if (s == NULL)
> +	    return 0;
> +
> +	  for (mb = 0; mb < 0x80; ++mb)
> +	    if (__cp_conv[cp_idx][mb] == wchar)
> +	      {
> +		*s = (char) (mb + 0x80);
> +		return 1;
> +	      }
> +	  r->_errno = EILSEQ;
> +	  return -1;
> +	}
> +    }
> +
> +  if ((size_t)wchar >= 0x100)
> +    {
> +      r->_errno = EILSEQ;
> +      return -1;
>      }
>  
> +  *s = (char) wchar;
> +  return 1;
> +}
> +#endif /* _MB_EXTENDED_CHARSETS_DOS */
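
Both __iso_wctomb and __cp_wctomb do a reverse lookup: they scan the byte-to-wide-char table for the wanted wide character and return the matching byte. Here is a minimal sketch of that scan, assuming a truncated table that holds only the first 16 entries of the CP866 row quoted earlier. The names are hypothetical, and the real function reports failure through errno rather than a plain -1.

```c
/* First 16 entries of the CP866 row (bytes 0x80..0x8f).  The real
   table has 0x80 entries per codepage; this one is cut short on
   purpose to keep the sketch small. */
static const unsigned short cp866_head_sketch[16] = {
  0x410, 0x411, 0x412, 0x413, 0x414, 0x415, 0x416, 0x417,
  0x418, 0x419, 0x41a, 0x41b, 0x41c, 0x41d, 0x41e, 0x41f
};

/* Return the single-byte code for wc, or -1 if wc is not present in
   the truncated table. */
static int
cp_reverse_lookup_sketch (unsigned int wc)
{
  unsigned int mb;

  if (wc < 0x80)
    return (int) wc;            /* ASCII maps to itself */
  for (mb = 0; mb < 16; ++mb)
    if (cp866_head_sketch[mb] == wc)
      return (int) (mb + 0x80); /* table index 0 is byte 0x80 */
  return -1;
}
```

The scan is linear, so every non-ASCII character costs up to 0x80 comparisons (0x60 for the ISO tables). That keeps the data down to one table per charset instead of a forward and a reverse table.
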
> +#endif /* _MB_CAPABLE */
> +
> +int
> +_DEFUN (__ascii_wctomb, (r, s, wchar, charset, state),
> +        struct _reent *r       _AND 
> +        char          *s       _AND
> +        wchar_t        _wchar  _AND
> +	const char    *charset _AND
> +        mbstate_t     *state)
> +{
> +  /* Avoids compiler warnings about comparisons that are always false
> +     due to limited range when sizeof(wchar_t) is 2 but sizeof(wint_t)
> +     is 4, as is the case on cygwin.  */
> +  wint_t wchar = _wchar;
> +
>    if (s == NULL)
>      return 0;
>   
> -  /* otherwise we are dealing with a single byte character */
>    if ((size_t)wchar >= 0x100)
>      {
>        r->_errno = EILSEQ;
> @@ -216,4 +345,13 @@ _DEFUN (_wctomb_r, (r, s, wchar, state),
>    *s = (char) wchar;
>    return 1;
>  }
> -    
> +
> +int
> +_DEFUN (_wctomb_r, (r, s, wchar, state),
> +        struct _reent *r     _AND 
> +        char          *s     _AND
> +        wchar_t        _wchar _AND
> +        mbstate_t     *state)
> +{
> +  return __wctomb (r, s, _wchar, __locale_charset (), state);
> +}
>
>
>   
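
With this change _wctomb_r becomes a thin wrapper that calls whatever converter the __wctomb pointer currently refers to, with __ascii_wctomb as the default. Here is a minimal sketch of that dispatch pattern, with a deliberately simplified signature. All names are hypothetical, and I am assuming that code outside this hunk (setlocale, presumably) is what repoints the pointer.

```c
/* Hypothetical, simplified converter signature: the real one also
   takes struct _reent *, the charset name and an mbstate_t *. */
typedef int (*wctomb_fn_sketch) (char *s, unsigned int wc);

/* Single-byte converter, analogous to __ascii_wctomb in the patch. */
static int
ascii_conv_sketch (char *s, unsigned int wc)
{
  if (wc >= 0x100)
    return -1;
  *s = (char) wc;
  return 1;
}

/* Cut-down UTF-8 converter handling one- and two-byte sequences only. */
static int
utf8_conv_sketch (char *s, unsigned int wc)
{
  if (wc <= 0x7f)
    {
      *s = (char) wc;
      return 1;
    }
  if (wc <= 0x7ff)
    {
      s[0] = (char) (0xc0 | (wc >> 6));
      s[1] = (char) (0x80 | (wc & 0x3f));
      return 2;
    }
  return -1;                    /* longer sequences omitted here */
}

/* Current converter; starts out as the single-byte one. */
static wctomb_fn_sketch current_wctomb_sketch = ascii_conv_sketch;

/* Thin wrapper, analogous to _wctomb_r after the patch. */
static int
wctomb_sketch (char *s, unsigned int wc)
{
  return current_wctomb_sketch (s, wc);
}
```

Swapping the pointer once replaces the per-call strcmp chain on the charset name that the old _wctomb_r used, which is the main simplification visible in this hunk.
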


