[PATCH/RFA] Revamp wctomb/mbtowc conversion, add more charset support

Corinna Vinschen vinschen@redhat.com
Fri Mar 20 14:18:00 GMT 2009


Hi,

this is a rather big patch, which is split in two due to the 100K
mail size restriction on sourceware.  It does the following:

- Add support for correct ISO-8859-x charset handling.

- Add support for a couple of default ANSI codepages used on Windows,
  CP737, CP775, CP1125, CP1250, CP1251, CP1252, CP1253, CP1254, CP1255,
  CP1256, CP1257, CP1258.

These new charsets require a couple of new character conversion tables,
which I put into a new file called libc/stdlib/sb_charsets.c and which
are only built on _MB_CAPABLE systems.  The total size of these new
tables is 5760 bytes on systems with sizeof(wchar_t)==2 and 11520 bytes
on systems with sizeof(wchar_t)==4.  I think that's not too much.
Size-constrained systems will not build with _MB_CAPABLE anyway.
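
For reference, the sizes quoted above follow directly from the table
dimensions declared in libc/stdlib/local.h further down in this patch
(14 tables of 0x60 entries for ISO-8859-x, 12 tables of 0x80 entries for
the CPxxx codepages).  A minimal sketch of the arithmetic; the helper
name sb_tables_size is hypothetical and not part of the patch:

```c
#include <stddef.h>

/* Dimensions copied from the declarations in libc/stdlib/local.h:
     wchar_t __iso_8859_conv[14][0x60];
     wchar_t __cp_conv[12][0x80];  */
enum
{
  ISO_TABLES = 14, ISO_ENTRIES = 0x60,
  CP_TABLES = 12, CP_ENTRIES = 0x80
};

/* Hypothetical helper: combined size in bytes of both conversion
   tables for a given sizeof(wchar_t).  */
size_t
sb_tables_size (size_t wchar_size)
{
  size_t entries = (size_t) ISO_TABLES * ISO_ENTRIES
		   + (size_t) CP_TABLES * CP_ENTRIES;	/* 1344 + 1536 */
  return entries * wchar_size;
}
```

With 2880 entries in total, sb_tables_size (2) gives 5760 and
sb_tables_size (4) gives 11520.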

As for the question "why do you support Windows-specific charsets on
all systems", the answer is, other systems like Linux support them, too.

- On Cygwin, if no explicit charset is given as input to setlocale,
  look up the current ANSI codepage and use it as the current charset
  if it's one of the supported charsets.  Otherwise default to
  ISO-8859-1.

- The functions _wctomb_r and _mbtowc_r are now split into separate
  functions for the supported charsets, rather than calling strcmp
  multiple times on every conversion to determine which charset is
  used.

  To do that, the setlocale() function sets the function pointers
  __wctomb/__mbtowc according to the current charset.  On systems
  which are not _MB_CAPABLE, only two such functions exist,
  __ascii_wctomb and __ascii_mbtowc.

  This change also allows easy support for more charsets by simply
  adding new __FOO_wctomb/__FOO_mbtowc functions and tweaking setlocale.
  I'm planning to add more supported codepages over time.

- All iswXXX and towXXX functions have been changed so that on
  _MB_CAPABLE systems all wchar_t input is treated either as
  SJIS/JIS/EUCJP, which requires converting the character to Unicode
  first, or as Unicode already.  Unicode is the wchar_t representation
  for all other charsets anyway.
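
To illustrate the function pointer approach from the list above, here
is a stripped-down sketch.  It is not the code from the patch: all
demo_* names are made up, the signatures omit the struct _reent and
mbstate_t arguments, and the conversion table is a stub with a single
real CP1252 mapping (0x80 -> U+20AC) and every other slot left at 0,
which marks the byte as invalid.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Simplified converter signature (hypothetical).  */
typedef int (*demo_mbtowc_fn) (wchar_t *pwc, const char *s, size_t n);

/* Stub table for the bytes 0x80..0xff; 0 means "invalid character".  */
static const wchar_t demo_cp_conv[0x80] = { 0x20ac };

static int
demo_ascii_mbtowc (wchar_t *pwc, const char *s, size_t n)
{
  if (s == NULL)
    return 0;			/* not state-dependent */
  if (n == 0)
    return -2;
  *pwc = (wchar_t) (unsigned char) *s;
  return *s != '\0';
}

static int
demo_cp_mbtowc (wchar_t *pwc, const char *s, size_t n)
{
  unsigned char c;

  if (s == NULL)
    return 0;
  if (n == 0)
    return -2;
  c = (unsigned char) *s;
  if (c >= 0x80)
    {
      /* Upper half: look the byte up in the table.  */
      *pwc = demo_cp_conv[c - 0x80];
      return *pwc ? 1 : -1;
    }
  *pwc = c;
  return c != '\0';
}

/* The pointer which the setlocale stand-in below switches.  */
static demo_mbtowc_fn demo_mbtowc = demo_ascii_mbtowc;

/* Stand-in for the charset selection in setlocale: the string
   comparison happens once here, not on every conversion.  */
int
demo_set_charset (const char *charset)
{
  if (!strcmp (charset, "ASCII"))
    demo_mbtowc = demo_ascii_mbtowc;
  else if (!strcmp (charset, "CP1252"))
    demo_mbtowc = demo_cp_mbtowc;
  else
    return -1;
  return 0;
}

/* Stand-in for _mbtowc_r: just call through the pointer.  */
int
demo_mbtowc_r (wchar_t *pwc, const char *s, size_t n)
{
  return demo_mbtowc (pwc, s, n);
}
```

Adding another charset in this scheme means writing one more converter
and one more branch in the selection function; the callers of
demo_mbtowc_r stay untouched.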

Part 1 of the patch is below: everything except the changes in libc/ctype.

Ok to apply?

And here's a question.  Would it generally be appreciated if I added
correctly working support for all supported singlebyte charsets to the
singlebyte ctype functions, like isalpha, ispunct, etc.?  I'm willing to
do that.  In theory I would just add new ctype arrays and reset
__ctype_ptr to the new array in setlocale().
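
Roughly, the ctype idea could look like the following sketch.  Nothing
of this exists in the patch: the demo_* names are hypothetical, only an
"alpha" flag is modelled, and the Latin-1 table is deliberately
simplified (it treats 0xc0..0xff as alphabetic except for the
multiplication sign 0xd7 and the division sign 0xf7; a real table
would need the full set of classes).

```c
#define DEMO_ALPHA 0x01

/* One classification array per charset.  */
static unsigned char demo_ctype_ascii[256];
static unsigned char demo_ctype_latin1[256];

/* Plays the role of __ctype_ptr: points to the active array.  */
static const unsigned char *demo_ctype_ptr = demo_ctype_ascii;

/* Fill both arrays.  Real tables would be static initializers.  */
void
demo_ctype_init (void)
{
  int c;

  for (c = 0; c < 256; ++c)
    {
      int ascii_alpha = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
      int latin1_alpha = ascii_alpha
			 || (c >= 0xc0 && c != 0xd7 && c != 0xf7);

      demo_ctype_ascii[c] = ascii_alpha ? DEMO_ALPHA : 0;
      demo_ctype_latin1[c] = latin1_alpha ? DEMO_ALPHA : 0;
    }
}

/* What setlocale would do: repoint to the array for the charset.  */
void
demo_use_latin1 (int on)
{
  demo_ctype_ptr = on ? demo_ctype_latin1 : demo_ctype_ascii;
}

/* isalpha stand-in: a single array lookup, no charset check.  */
int
demo_isalpha (int c)
{
  return demo_ctype_ptr[(unsigned char) c] & DEMO_ALPHA;
}
```

The point is that isalpha and friends keep their single array lookup;
only the array behind the pointer changes when the locale changes.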


Corinna


	* libc/ctype/iswalpha.c: Handle all wchar_t as unicode on
	_MB_CAPABLE systems.
	* libc/ctype/iswblank.c: Ditto.
	* libc/ctype/iswcntrl.c: Ditto.
	* libc/ctype/iswprint.c: Ditto.
	* libc/ctype/iswpunct.c: Ditto.
	* libc/ctype/iswspace.c: Ditto.
	* libc/ctype/towlower.c: Ditto.
	* libc/ctype/towupper.c: Ditto.
	* libc/locale/locale.c: Add new charset support to documentation.
	Include ../stdio/local.h from here.
	(set_charset_from_codepage): New Cygwin-specific function.
	(loadlocale): Add Cygwin codepage support.  On _MB_CAPABLE
	systems, set __mbtowc and __wctomb function pointers to function
	corresponding with current charset.  Don't allow nonexistent
	ISO-8859-12 charset.  Add support for Windows singlebyte codepages.
	* libc/stdlib/Makefile.am (GENERAL_SOURCES): Add sb_charsets.c.
	* libc/stdlib/Makefile.in: Regenerate.
	* libc/stdlib/local.h: Add prototype for __locale_charset.
	Add prototypes for __mbtowc and __wctomb pointers.
	Add prototypes for charset-specific _wctomb_r and _mbtowc_r
	functions.
	Declare tables and functions from sb_charsets.c.
	* libc/stdlib/mbtowc_r.c (__mbtowc): Define.  Set to __iso_mbtowc,
	or __ascii_mbtowc on non-_MB_CAPABLE systems.
	(__iso_mbtowc): New function.
	(__cp_mbtowc): New function.
	(__utf8_mbtowc): New function.
	(__sjis_mbtowc): New function.
	(__eucjp_mbtowc): New function.
	(__jis_mbtowc): New function.
	(__ascii_mbtowc): New function.
	(_mbtowc_r): Just call __mbtowc from here.
	* libc/stdlib/sb_charsets.c: New file.
	* libc/stdlib/wctomb_r.c (__wctomb): Define.  Set to __iso_wctomb,
	or __ascii_wctomb on non-_MB_CAPABLE systems.
	(__utf8_wctomb): New function.
	(__sjis_wctomb): New function.
	(__eucjp_wctomb): New function.
	(__jis_wctomb): New function.
	(__iso_wctomb): New function.
	(__cp_wctomb): New function.
	(__ascii_wctomb): New function.
	(_wctomb_r): Just call __wctomb from here.


Index: libc/locale/locale.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/locale/locale.c,v
retrieving revision 1.9
diff -u -p -r1.9 locale.c
--- libc/locale/locale.c	3 Mar 2009 09:28:45 -0000	1.9
+++ libc/locale/locale.c	20 Mar 2009 11:56:10 -0000
@@ -47,11 +47,14 @@ and <<"C">> values for <[locale]>; strin
 honored unless _MB_CAPABLE is defined in which case POSIX locale strings
 are allowed, plus five extensions supported for backward compatibility with
 older implementations using newlib: <<"C-UTF-8">>, <<"C-JIS">>, <<"C-EUCJP">>,
-<<"C-SJIS">>, or <<"C-ISO-8859-x">> with 1 <= x <= 15.  Even when using
-POSIX locale strings, the only charsets allowed are <<"UTF-8">>, <<"JIS">>,
-<<"EUCJP">>, <<"SJIS">>, or <<"ISO-8859-x">> with 1 <= x <= 15.  (<<"">> is 
-also accepted; if given, the settings are read from the corresponding
-LC_* environment variables and $LANG according to POSIX rules.
+<<"C-SJIS">>, <<"C-ISO-8859-x">> with 1 <= x <= 15, or <<"C-CPxxx">> with
+xxx in [737, 775, 1125, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257,
+1258].  Even when using POSIX locale strings, the only charsets allowed
+are <<"UTF-8">>, <<"JIS">>, <<"EUCJP">>, <<"SJIS">>, <<"ISO-8859-x">> with
+1 <= x <= 15, or <<"CPxxx">> with xxx in [737, 775, 1125, 1250, 1251, 1252,
+1253, 1254, 1255, 1256, 1257, 1258].  (<<"">> is also accepted; if
+given, the settings are read from the corresponding LC_* environment variables
+and $LANG according to POSIX rules.
 
 If you use <<NULL>> as the <[locale]> argument, <<setlocale>> returns
 a pointer to the string representing the current locale (always
@@ -85,6 +88,9 @@ PORTABILITY
 ANSI C requires <<setlocale>>, but the only locale required across all
 implementations is the C locale.
 
+NOTES
+There is no ISO-8859-12 codepage.  It's also refused by this implementation.
+
 No supporting OS subroutines are required.
 */
 
@@ -129,6 +135,11 @@ No supporting OS subroutines are require
 #include <limits.h>
 #include <reent.h>
 #include <stdlib.h>
+#include <wchar.h>
+#include "../stdlib/local.h"
+#ifdef __CYGWIN__
+#include <windows.h>
+#endif
 
 #define _LC_LAST      7
 #define ENCODING_LEN 31
@@ -361,6 +372,63 @@ currentlocale()
 #endif
 
 #ifdef _MB_CAPABLE
+#ifdef __CYGWIN__
+void
+set_charset_from_codepage (char *charset)
+{
+  /* On Cygwin, set the charset to the current ANSI codepage by default,
+     if it's one of the supported singlebyte codepages. */
+  int cp = GetACP ();
+  switch (cp)
+    {
+    case 737:
+    case 775:
+    case 1125:
+    case 1250:
+    case 1251:
+    case 1252:
+    case 1253:
+    case 1254:
+    case 1255:
+    case 1256:
+    case 1257:
+    case 1258:
+      sprintf (charset, "CP%d", cp);
+      break;
+    case 28591:
+    case 28592:
+    case 28593:
+    case 28594:
+    case 28595:
+    case 28596:
+    case 28597:
+    case 28598:
+    case 28599:
+    case 28603:
+    case 28605:
+      sprintf (charset, "ISO-8859-%d", cp - 28590);
+      break;
+    case 932:
+      strcpy (charset, "SJIS");
+      break;
+    case 50220:
+    case 50221:
+    case 50222:
+      strcpy (charset, "JIS");
+      break;
+    case 51932:
+      strcpy (charset, "EUCJP");
+      break;
+    case 65001:
+      strcpy (charset, "UTF-8");
+      break;
+    default:
+      strcpy (charset, "ISO-8859-1");
+      break;
+    }
+}
+#endif
+
 static char *
 loadlocale(struct _reent *p, int category)
 {
@@ -382,7 +450,11 @@ loadlocale(struct _reent *p, int categor
   if (!strcmp (locale, "POSIX"))
     strcpy (locale, "C");
   if (!strcmp (locale, "C"))				/* Default "C" locale */
+#ifdef __CYGWIN__
+    set_charset_from_codepage (charset);
+#else
     strcpy (charset, "ISO-8859-1");
+#endif
   else if (locale[0] == 'C' && locale[1] == '-')	/* Old newlib style */
 	strcpy (charset, locale + 2);
   else							/* POSIX style */
@@ -414,7 +486,11 @@ loadlocale(struct _reent *p, int categor
 	}
       else if (c[0] == '\0' || c[0] == '@')
 	/* End of string or just a modifier */
+#ifdef __CYGWIN__
+	set_charset_from_codepage (charset);
+#else
 	strcpy (charset, "ISO-8859-1");
+#endif
       else
 	/* Invalid string */
       	return NULL;
@@ -426,32 +502,84 @@ loadlocale(struct _reent *p, int categor
       if (strcmp (charset, "UTF-8"))
 	return NULL;
       mbc_max = 6;
+#ifdef _MB_CAPABLE
+      __wctomb = __utf8_wctomb;
+      __mbtowc = __utf8_mbtowc;
+#endif
     break;
     case 'J':
       if (strcmp (charset, "JIS"))
 	return NULL;
       mbc_max = 8;
+#ifdef _MB_CAPABLE
+      __wctomb = __jis_wctomb;
+      __mbtowc = __jis_mbtowc;
+#endif
     break;
     case 'E':
       if (strcmp (charset, "EUCJP"))
 	return NULL;
       mbc_max = 2;
+#ifdef _MB_CAPABLE
+      __wctomb = __eucjp_wctomb;
+      __mbtowc = __eucjp_mbtowc;
+#endif
     break;
     case 'S':
       if (strcmp (charset, "SJIS"))
 	return NULL;
       mbc_max = 2;
+#ifdef _MB_CAPABLE
+      __wctomb = __sjis_wctomb;
+      __mbtowc = __sjis_mbtowc;
+#endif
     break;
     case 'I':
-    default:
-      /* Must be exactly one of ISO-8859-1, [...] ISO-8859-15. */
+      /* Must be exactly one of ISO-8859-1, [...] ISO-8859-16, except for
+         ISO-8859-12. */
       if (strncmp (charset, "ISO-8859-", 9))
 	return NULL;
       val = strtol (charset + 9, &end, 10);
-      if (val < 1 || val > 15 || *end)
+      if (val < 1 || val > 16 || val == 12 || *end)
 	return NULL;
       mbc_max = 1;
-      break;
+#ifdef _MB_CAPABLE
+      __wctomb = __iso_wctomb;
+      __mbtowc = __iso_mbtowc;
+#endif
+    break;
+    case 'C':
+      if (charset[1] != 'P')
+	return NULL;
+      val = strtol (charset + 2, &end, 10);
+      if (*end)
+	return NULL;
+      switch (val)
+	{
+	case 737:
+	case 775:
+	case 1125:
+	case 1250:
+	case 1251:
+	case 1252:
+	case 1253:
+	case 1254:
+	case 1255:
+	case 1256:
+	case 1257:
+	case 1258:
+	  mbc_max = 1;
+#ifdef _MB_CAPABLE
+	  __wctomb = __cp_wctomb;
+	  __mbtowc = __cp_mbtowc;
+#endif
+	  break;
+	default:
+	  return NULL;
+	}
+    break;
+    default:
+      return NULL;
     }
   if (category == LC_CTYPE)
     {
Index: libc/stdlib/Makefile.am
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/Makefile.am,v
retrieving revision 1.28
diff -u -p -r1.28 Makefile.am
--- libc/stdlib/Makefile.am	25 Feb 2009 21:33:17 -0000	1.28
+++ libc/stdlib/Makefile.am	20 Mar 2009 11:56:10 -0000
@@ -48,6 +48,7 @@ GENERAL_SOURCES = \
 	rand_r.c	\
 	realloc.c	\
 	reallocf.c	\
+	sb_charsets.c	\
 	strtod.c	\
 	strtol.c	\
 	strtoul.c	\
Index: libc/stdlib/local.h
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/local.h,v
retrieving revision 1.1.1.1
diff -u -p -r1.1.1.1 local.h
--- libc/stdlib/local.h	17 Feb 2000 19:39:47 -0000	1.1.1.1
+++ libc/stdlib/local.h	20 Mar 2009 11:56:10 -0000
@@ -5,4 +5,35 @@
 
 char *	_EXFUN(_gcvt,(struct _reent *, double , int , char *, char, int));
 
+char *__locale_charset ();
+
+int (*__wctomb) (struct _reent *, char *, wchar_t, mbstate_t *);
+int __ascii_wctomb (struct _reent *, char *, wchar_t, mbstate_t *);
+#ifdef _MB_CAPABLE
+int __utf8_wctomb (struct _reent *, char *, wchar_t, mbstate_t *);
+int __sjis_wctomb (struct _reent *, char *, wchar_t, mbstate_t *);
+int __eucjp_wctomb (struct _reent *, char *, wchar_t, mbstate_t *);
+int __jis_wctomb (struct _reent *, char *, wchar_t, mbstate_t *);
+int __iso_wctomb (struct _reent *, char *, wchar_t, mbstate_t *);
+int __cp_wctomb (struct _reent *, char *, wchar_t, mbstate_t *);
+#endif
+
+int (*__mbtowc) (struct _reent *r, wchar_t *, const char *s, size_t n,
+                  mbstate_t *);
+int __ascii_mbtowc (struct _reent *r, wchar_t *, const char *s, size_t n, mbstate_t *);
+#ifdef _MB_CAPABLE
+int __utf8_mbtowc (struct _reent *r, wchar_t *, const char *s, size_t n, mbstate_t *);
+int __sjis_mbtowc (struct _reent *r, wchar_t *, const char *s, size_t n, mbstate_t *);
+int __eucjp_mbtowc (struct _reent *r, wchar_t *, const char *s, size_t n, mbstate_t *);
+int __jis_mbtowc (struct _reent *r, wchar_t *, const char *s, size_t n, mbstate_t *);
+int __iso_mbtowc (struct _reent *r, wchar_t *, const char *s, size_t n, mbstate_t *);
+int __cp_mbtowc (struct _reent *r, wchar_t *, const char *s, size_t n, mbstate_t *);
+#endif
+
+wchar_t __iso_8859_conv[14][0x60];
+int __iso_8859_index (const char *);
+
+wchar_t __cp_conv[12][0x80];
+int __cp_index (const char *);
+
 #endif
Index: libc/stdlib/mbtowc_r.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/mbtowc_r.c,v
retrieving revision 1.11
diff -u -p -r1.11 mbtowc_r.c
--- libc/stdlib/mbtowc_r.c	19 Mar 2009 19:47:52 -0000	1.11
+++ libc/stdlib/mbtowc_r.c	20 Mar 2009 11:56:10 -0000
@@ -5,10 +5,17 @@
 #include <wchar.h>
 #include <string.h>
 #include <errno.h>
+#include "local.h"
 
+int (*__mbtowc) (struct _reent *r, wchar_t *, const char *s, size_t n,
+		 mbstate_t *)
 #ifdef _MB_CAPABLE
-extern char *__locale_charset ();
+   = __iso_mbtowc;
+#else
+   = __ascii_mbtowc;
+#endif
 
+#ifdef _MB_CAPABLE
 typedef enum { ESCAPE, DOLLAR, BRACKET, AT, B, J, 
                NUL, JIS_CHAR, OTHER, JIS_C_NUM } JIS_CHAR_TYPE;
 typedef enum { ASCII, JIS, A_ESC, A_ESC_DL, JIS_1, J_ESC, J_ESC_BR,
@@ -43,13 +50,12 @@ static JIS_ACTION JIS_action_table[JIS_S
 /* J_ESC */   { ERROR,   ERROR,    NOOP,     ERROR,   ERROR,   ERROR,   ERROR,   ERROR,   ERROR },
 /* J_ESC_BR */{ ERROR,   ERROR,    ERROR,    ERROR,   MAKE_A,  MAKE_A,  ERROR,   ERROR,   ERROR },
 };
-#endif /* _MB_CAPABLE */
 
 /* we override the mbstate_t __count field for more complex encodings and use it store a state value */
 #define __state __count
 
 int
-_DEFUN (_mbtowc_r, (r, pwc, s, n, state),
+_DEFUN (__iso_mbtowc, (r, pwc, s, n, state),
         struct _reent *r   _AND
         wchar_t       *pwc _AND 
         const char    *s   _AND        
@@ -62,190 +68,374 @@ _DEFUN (_mbtowc_r, (r, pwc, s, n, state)
   if (pwc == NULL)
     pwc = &dummy;
 
-  if (s != NULL && n == 0)
+  if (s == NULL)
+    return 0;
+
+  if (n == 0)
     return -2;
 
-#ifdef _MB_CAPABLE
-  if (strlen (__locale_charset ()) <= 1)
-    { /* fall-through */ }
-  else if (!strcmp (__locale_charset (), "UTF-8"))
-    {
-      int ch;
-      int i = 0;
-
-      if (s == NULL)
-        return 0; /* UTF-8 character encodings are not state-dependent */
-
-      if (state->__count == 4)
-	{
-	  /* Create the second half of the surrogate pair.  For a description
-	     see the comment below. */
-	  wint_t tmp = (wchar_t)((state->__value.__wchb[0] & 0x07) << 18)
-	    |   (wchar_t)((state->__value.__wchb[1] & 0x3f) << 12)
-	    |   (wchar_t)((state->__value.__wchb[2] & 0x3f) << 6)
-	    |   (wchar_t)(state->__value.__wchb[3] & 0x3f);
-	  state->__count = 0;
-	  *pwc = 0xdc00 | ((tmp - 0x10000) & 0x3ff);
-	  return 2;
-	}
-      if (state->__count == 0)
-	ch = t[i++];
-      else
+  if (*t >= 0xa0)
+    {
+      int iso_idx = __iso_8859_index (__locale_charset () + 9);
+      if (iso_idx >= 0)
 	{
-	  if (n < (size_t)-1)
-	    ++n;
-	  ch = state->__value.__wchb[0];
+	  *pwc = __iso_8859_conv[iso_idx][*t - 0xa0];
+	  if (*pwc == 0) /* Invalid character */
+	    {
+	      r->_errno = EILSEQ;
+	      return -1;
+	    }
+	  return 1;
 	}
+    }
+
+  *pwc = (wchar_t) *t;
+  
+  if (*t == '\0')
+    return 0;
+
+  return 1;
+}
+
+int
+_DEFUN (__cp_mbtowc, (r, pwc, s, n, state),
+        struct _reent *r   _AND
+        wchar_t       *pwc _AND 
+        const char    *s   _AND        
+        size_t         n   _AND
+        mbstate_t      *state)
+{
+  wchar_t dummy;
+  unsigned char *t = (unsigned char *)s;
+
+  if (pwc == NULL)
+    pwc = &dummy;
+
+  if (s == NULL)
+    return 0;
+
+  if (n == 0)
+    return -2;
 
-      if (ch == '\0')
+  if (*t >= 0x80)
+    {
+      int cp_idx = __cp_index (__locale_charset () + 2);
+      if (cp_idx >= 0)
 	{
-	  *pwc = 0;
-	  state->__count = 0;
-	  return 0; /* s points to the null character */
+	  *pwc = __cp_conv[cp_idx][*t - 0x80];
+	  if (*pwc == 0) /* Invalid character */
+	    {
+	      r->_errno = EILSEQ;
+	      return -1;
+	    }
+	  return 1;
 	}
+    }
+
+  *pwc = (wchar_t)*t;
+  
+  if (*t == '\0')
+    return 0;
+
+  return 1;
+}
 
-      if (ch >= 0x0 && ch <= 0x7f)
+int
+_DEFUN (__utf8_mbtowc, (r, pwc, s, n, state),
+        struct _reent *r   _AND
+        wchar_t       *pwc _AND 
+        const char    *s   _AND        
+        size_t         n   _AND
+        mbstate_t      *state)
+{
+  wchar_t dummy;
+  unsigned char *t = (unsigned char *)s;
+  int ch;
+  int i = 0;
+
+  if (pwc == NULL)
+    pwc = &dummy;
+
+  if (s == NULL)
+    return 0;
+
+  if (n == 0)
+    return -2;
+
+  if (state->__count == 4)
+    {
+      /* Create the second half of the surrogate pair.  For a description
+	 see the comment below. */
+      wint_t tmp = (wchar_t)((state->__value.__wchb[0] & 0x07) << 18)
+	|   (wchar_t)((state->__value.__wchb[1] & 0x3f) << 12)
+	|   (wchar_t)((state->__value.__wchb[2] & 0x3f) << 6)
+	|   (wchar_t)(state->__value.__wchb[3] & 0x3f);
+      state->__count = 0;
+      *pwc = 0xdc00 | ((tmp - 0x10000) & 0x3ff);
+      return 2;
+    }
+  if (state->__count == 0)
+    ch = t[i++];
+  else
+    {
+      if (n < (size_t)-1)
+	++n;
+      ch = state->__value.__wchb[0];
+    }
+
+  if (ch == '\0')
+    {
+      *pwc = 0;
+      state->__count = 0;
+      return 0; /* s points to the null character */
+    }
+
+  if (ch >= 0x0 && ch <= 0x7f)
+    {
+      /* single-byte sequence */
+      state->__count = 0;
+      *pwc = ch;
+      return 1;
+    }
+  if (ch >= 0xc0 && ch <= 0xdf)
+    {
+      /* two-byte sequence */
+      state->__value.__wchb[0] = ch;
+      state->__count = 1;
+      if (n < 2)
+	return -2;
+      ch = t[i++];
+      if (ch < 0x80 || ch > 0xbf)
 	{
-	  /* single-byte sequence */
-	  state->__count = 0;
-	  *pwc = ch;
-	  return 1;
+	  r->_errno = EILSEQ;
+	  return -1;
+	}
+      if (state->__value.__wchb[0] < 0xc2)
+	{
+	  /* overlong UTF-8 sequence */
+	  r->_errno = EILSEQ;
+	  return -1;
+	}
+      state->__count = 0;
+      *pwc = (wchar_t)((state->__value.__wchb[0] & 0x1f) << 6)
+	|    (wchar_t)(ch & 0x3f);
+      return i;
+    }
+  if (ch >= 0xe0 && ch <= 0xef)
+    {
+      /* three-byte sequence */
+      wchar_t tmp;
+      state->__value.__wchb[0] = ch;
+      if (state->__count == 0)
+	state->__count = 1;
+      else if (n < (size_t)-1)
+	++n;
+      if (n < 2)
+	return -2;
+      ch = (state->__count == 1) ? t[i++] : state->__value.__wchb[1];
+      if (state->__value.__wchb[0] == 0xe0 && ch < 0xa0)
+	{
+	  /* overlong UTF-8 sequence */
+	  r->_errno = EILSEQ;
+	  return -1;
+	}
+      if (ch < 0x80 || ch > 0xbf)
+	{
+	  r->_errno = EILSEQ;
+	  return -1;
+	}
+      state->__value.__wchb[1] = ch;
+      state->__count = 2;
+      if (n < 3)
+	return -2;
+      ch = t[i++];
+      if (ch < 0x80 || ch > 0xbf)
+	{
+	  r->_errno = EILSEQ;
+	  return -1;
+	}
+      state->__count = 0;
+      tmp = (wchar_t)((state->__value.__wchb[0] & 0x0f) << 12)
+	|    (wchar_t)((state->__value.__wchb[1] & 0x3f) << 6)
+	|     (wchar_t)(ch & 0x3f);
+    
+      if (tmp >= 0xd800 && tmp <= 0xdfff)
+	{
+	  r->_errno = EILSEQ;
+	  return -1;
+	}
+      *pwc = tmp;
+      return i;
+    }
+  if (ch >= 0xf0 && ch <= 0xf7)
+    {
+      /* four-byte sequence */
+      wint_t tmp;
+      state->__value.__wchb[0] = ch;
+      if (state->__count == 0)
+	state->__count = 1;
+      else if (n < (size_t)-1)
+	++n;
+      if (n < 2)
+	return -2;
+      ch = (state->__count == 1) ? t[i++] : state->__value.__wchb[1];
+      if (state->__value.__wchb[0] == 0xf0 && ch < 0x90)
+	{
+	  /* overlong UTF-8 sequence */
+	  r->_errno = EILSEQ;
+	  return -1;
+	}
+      if (ch < 0x80 || ch > 0xbf)
+	{
+	  r->_errno = EILSEQ;
+	  return -1;
+	}
+      state->__value.__wchb[1] = ch;
+      if (state->__count == 1)
+	state->__count = 2;
+      else if (n < (size_t)-1)
+	++n;
+      if (n < 3)
+	return -2;
+      ch = (state->__count == 2) ? t[i++] : state->__value.__wchb[2];
+      if (ch < 0x80 || ch > 0xbf)
+	{
+	  r->_errno = EILSEQ;
+	  return -1;
 	}
-      else if (ch >= 0xc0 && ch <= 0xdf)
+      state->__value.__wchb[2] = ch;
+      state->__count = 3;
+      if (n < 4)
+	return -2;
+      ch = t[i++];
+      if (ch < 0x80 || ch > 0xbf)
+	{
+	  r->_errno = EILSEQ;
+	  return -1;
+	}
+      tmp = (wint_t)((state->__value.__wchb[0] & 0x07) << 18)
+	|   (wint_t)((state->__value.__wchb[1] & 0x3f) << 12)
+	|   (wint_t)((state->__value.__wchb[2] & 0x3f) << 6)
+	|   (wint_t)(ch & 0x3f);
+      if (tmp > 0xffff && sizeof(wchar_t) == 2)
+	{
+	  /* On systems which have wchar_t being UTF-16 values, the value
+	     doesn't fit into a single wchar_t in this case.  So what we
+	     do here is to store the state with a special value of __count
+	     and return the first half of a surrogate pair.  As return
+	     value we choose to return the half of the actual UTF-8 char.
+	     The second half is returned in case we recognize the special
+	     __count value above. */
+	  state->__value.__wchb[3] = ch;
+	  state->__count = 4;
+	  *pwc = 0xd800 | (((tmp - 0x10000) >> 10) & 0x3ff);
+	  return 2;
+	}
+      *pwc = tmp;
+      state->__count = 0;
+      return i;
+    }
+
+  r->_errno = EILSEQ;
+  return -1;
+}
+
+int
+_DEFUN (__sjis_mbtowc, (r, pwc, s, n, state),
+        struct _reent *r   _AND
+        wchar_t       *pwc _AND 
+        const char    *s   _AND        
+        size_t         n   _AND
+        mbstate_t      *state)
+{
+  wchar_t dummy;
+  unsigned char *t = (unsigned char *)s;
+  int ch;
+  int i = 0;
+
+  if (pwc == NULL)
+    pwc = &dummy;
+
+  if (s == NULL)
+    return 0;  /* not state-dependent */
+
+  if (n == 0)
+    return -2;
+
+  ch = t[i++];
+  if (state->__count == 0)
+    {
+      if (_issjis1 (ch))
 	{
-	  /* two-byte sequence */
 	  state->__value.__wchb[0] = ch;
 	  state->__count = 1;
-	  if (n < 2)
+	  if (n <= 1)
 	    return -2;
 	  ch = t[i++];
-	  if (ch < 0x80 || ch > 0xbf)
-	    {
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
-	  if (state->__value.__wchb[0] < 0xc2)
-	    {
-	      /* overlong UTF-8 sequence */
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
-	  state->__count = 0;
-	  *pwc = (wchar_t)((state->__value.__wchb[0] & 0x1f) << 6)
-	    |    (wchar_t)(ch & 0x3f);
-	  return i;
 	}
-      else if (ch >= 0xe0 && ch <= 0xef)
+    }
+  if (state->__count == 1)
+    {
+      if (_issjis2 (ch))
 	{
-	  /* three-byte sequence */
-	  wchar_t tmp;
-	  state->__value.__wchb[0] = ch;
-	  if (state->__count == 0)
-	    state->__count = 1;
-	  else if (n < (size_t)-1)
-	    ++n;
-	  if (n < 2)
-	    return -2;
-	  ch = (state->__count == 1) ? t[i++] : state->__value.__wchb[1];
-	  if (state->__value.__wchb[0] == 0xe0 && ch < 0xa0)
-	    {
-	      /* overlong UTF-8 sequence */
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
-	  if (ch < 0x80 || ch > 0xbf)
-	    {
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
-	  state->__value.__wchb[1] = ch;
-	  state->__count = 2;
-	  if (n < 3)
-	    return -2;
-	  ch = t[i++];
-	  if (ch < 0x80 || ch > 0xbf)
-	    {
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
+	  *pwc = (((wchar_t)state->__value.__wchb[0]) << 8) + (wchar_t)ch;
 	  state->__count = 0;
-	  tmp = (wchar_t)((state->__value.__wchb[0] & 0x0f) << 12)
-	    |    (wchar_t)((state->__value.__wchb[1] & 0x3f) << 6)
-	    |     (wchar_t)(ch & 0x3f);
-	
-	  if (tmp >= 0xd800 && tmp <= 0xdfff)
-	    {
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
-	  *pwc = tmp;
 	  return i;
 	}
-      else if (ch >= 0xf0 && ch <= 0xf7)
+      else  
+	{
+	  r->_errno = EILSEQ;
+	  return -1;
+	}
+    }
+
+  *pwc = (wchar_t)*t;
+  
+  if (*t == '\0')
+    return 0;
+
+  return 1;
+}
+
+int
+_DEFUN (__eucjp_mbtowc, (r, pwc, s, n, state),
+        struct _reent *r   _AND
+        wchar_t       *pwc _AND 
+        const char    *s   _AND        
+        size_t         n   _AND
+        mbstate_t      *state)
+{
+  wchar_t dummy;
+  unsigned char *t = (unsigned char *)s;
+  int ch;
+  int i = 0;
+
+  if (pwc == NULL)
+    pwc = &dummy;
+
+  if (s == NULL)
+    return 0;
+
+  if (n == 0)
+    return -2;
+
+  ch = t[i++];
+  if (state->__count == 0)
+    {
+      if (_iseucjp (ch))
 	{
-	  /* four-byte sequence */
-	  wint_t tmp;
 	  state->__value.__wchb[0] = ch;
-	  if (state->__count == 0)
-	    state->__count = 1;
-	  else if (n < (size_t)-1)
-	    ++n;
-	  if (n < 2)
-	    return -2;
-	  ch = (state->__count == 1) ? t[i++] : state->__value.__wchb[1];
-	  if (state->__value.__wchb[0] == 0xf0 && ch < 0x90)
-	    {
-	      /* overlong UTF-8 sequence */
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
-	  if (ch < 0x80 || ch > 0xbf)
-	    {
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
-	  state->__value.__wchb[1] = ch;
-	  if (state->__count == 1)
-	    state->__count = 2;
-	  else if (n < (size_t)-1)
-	    ++n;
-	  if (n < 3)
-	    return -2;
-	  ch = (state->__count == 2) ? t[i++] : state->__value.__wchb[2];
-	  if (ch < 0x80 || ch > 0xbf)
-	    {
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
-	  state->__value.__wchb[2] = ch;
-	  state->__count = 3;
-	  if (n < 4)
+	  state->__count = 1;
+	  if (n <= 1)
 	    return -2;
 	  ch = t[i++];
-	  if (ch < 0x80 || ch > 0xbf)
-	    {
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
-	  tmp = (wint_t)((state->__value.__wchb[0] & 0x07) << 18)
-	    |   (wint_t)((state->__value.__wchb[1] & 0x3f) << 12)
-	    |   (wint_t)((state->__value.__wchb[2] & 0x3f) << 6)
-	    |   (wint_t)(ch & 0x3f);
-	  if (tmp > 0xffff && sizeof(wchar_t) == 2)
-	    {
-	      /* On systems which have wchar_t being UTF-16 values, the value
-		 doesn't fit into a single wchar_t in this case.  So what we
-		 do here is to store the state with a special value of __count
-		 and return the first half of a surrogate pair.  As return
-		 value we choose to return the half of the actual UTF-8 char.
-		 The second half is returned in case we recognize the special
-		 __count value above. */
-	      state->__value.__wchb[3] = ch;
-	      state->__count = 4;
-	      *pwc = 0xd800 | (((tmp - 0x10000) >> 10) & 0x3ff);
-	      return 2;
-	    }
-	  *pwc = tmp;
+	}
+    }
+  if (state->__count == 1)
+    {
+      if (_iseucjp (ch))
+	{
+	  *pwc = (((wchar_t)state->__value.__wchb[0]) << 8) + (wchar_t)ch;
 	  state->__count = 0;
 	  return i;
 	}
@@ -254,165 +444,138 @@ _DEFUN (_mbtowc_r, (r, pwc, s, n, state)
 	  r->_errno = EILSEQ;
 	  return -1;
 	}
-    }      
-  else if (!strcmp (__locale_charset (), "SJIS"))
+    }
+
+  *pwc = (wchar_t)*t;
+  
+  if (*t == '\0')
+    return 0;
+
+  return 1;
+}
+
+int
+_DEFUN (__jis_mbtowc, (r, pwc, s, n, state),
+        struct _reent *r   _AND
+        wchar_t       *pwc _AND 
+        const char    *s   _AND        
+        size_t         n   _AND
+        mbstate_t      *state)
+{
+  wchar_t dummy;
+  unsigned char *t = (unsigned char *)s;
+  JIS_STATE curr_state;
+  JIS_ACTION action;
+  JIS_CHAR_TYPE ch;
+  unsigned char *ptr;
+  unsigned int i;
+  int curr_ch;
+
+  if (pwc == NULL)
+    pwc = &dummy;
+
+  if (s == NULL)
     {
-      int ch;
-      int i = 0;
-      if (s == NULL)
-        return 0;  /* not state-dependent */
-      ch = t[i++];
-      if (state->__count == 0)
-	{
-	  if (_issjis1 (ch))
-	    {
-	      state->__value.__wchb[0] = ch;
-	      state->__count = 1;
-	      if (n <= 1)
-		return -2;
-	      ch = t[i++];
-	    }
-	}
-      if (state->__count == 1)
-	{
-	  if (_issjis2 (ch))
-	    {
-	      *pwc = (((wchar_t)state->__value.__wchb[0]) << 8) + (wchar_t)ch;
-	      state->__count = 0;
-	      return i;
-	    }
-	  else  
-	    {
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
-	}
+      state->__state = ASCII;
+      return 1;  /* state-dependent */
     }
-  else if (!strcmp (__locale_charset (), "EUCJP"))
+
+  if (n == 0)
+    return -2;
+
+  curr_state = state->__state;
+  ptr = t;
+
+  for (i = 0; i < n; ++i)
     {
-      int ch;
-      int i = 0;
-      if (s == NULL)
-        return 0;  /* not state-dependent */
-      ch = t[i++];
-      if (state->__count == 0)
-	{
-	  if (_iseucjp (ch))
-	    {
-	      state->__value.__wchb[0] = ch;
-	      state->__count = 1;
-	      if (n <= 1)
-		return -2;
-	      ch = t[i++];
-	    }
-	}
-      if (state->__count == 1)
+      curr_ch = t[i];
+      switch (curr_ch)
 	{
-	  if (_iseucjp (ch))
-	    {
-	      *pwc = (((wchar_t)state->__value.__wchb[0]) << 8) + (wchar_t)ch;
-	      state->__count = 0;
-	      return i;
-	    }
+	case ESC_CHAR:
+	  ch = ESCAPE;
+	  break;
+	case '$':
+	  ch = DOLLAR;
+	  break;
+	case '@':
+	  ch = AT;
+	  break;
+	case '(':
+	  ch = BRACKET;
+	  break;
+	case 'B':
+	  ch = B;
+	  break;
+	case 'J':
+	  ch = J;
+	  break;
+	case '\0':
+	  ch = NUL;
+	  break;
+	default:
+	  if (_isjis (curr_ch))
+	    ch = JIS_CHAR;
 	  else
-	    {
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
+	    ch = OTHER;
+	}
+
+      action = JIS_action_table[curr_state][ch];
+      curr_state = JIS_state_table[curr_state][ch];
+    
+      switch (action)
+	{
+	case NOOP:
+	  break;
+	case EMPTY:
+	  state->__state = ASCII;
+	  *pwc = (wchar_t)0;
+	  return 0;
+	case COPY_A:
+	  state->__state = ASCII;
+	  *pwc = (wchar_t)*ptr;
+	  return (i + 1);
+	case COPY_J1:
+	  state->__value.__wchb[0] = t[i];
+	  break;
+	case COPY_J2:
+	  state->__state = JIS;
+	  *pwc = (((wchar_t)state->__value.__wchb[0]) << 8) + (wchar_t)(t[i]);
+	  return (i + 1);
+	case MAKE_A:
+	  ptr = (unsigned char *)(t + i + 1);
+	  break;
+	case ERROR:
+	default:
+	  r->_errno = EILSEQ;
+	  return -1;
 	}
+
     }
-  else if (!strcmp (__locale_charset (), "JIS"))
-    {
-      JIS_STATE curr_state;
-      JIS_ACTION action;
-      JIS_CHAR_TYPE ch;
-      unsigned char *ptr;
-      unsigned int i;
-      int curr_ch;
- 
-      if (s == NULL)
-        {
-          state->__state = ASCII;
-          return 1;  /* state-dependent */
-        }
-
-      curr_state = state->__state;
-      ptr = t;
-
-      for (i = 0; i < n; ++i)
-        {
-          curr_ch = t[i];
-          switch (curr_ch)
-            {
-	    case ESC_CHAR:
-              ch = ESCAPE;
-              break;
-	    case '$':
-              ch = DOLLAR;
-              break;
-            case '@':
-              ch = AT;
-              break;
-            case '(':
-	      ch = BRACKET;
-              break;
-            case 'B':
-              ch = B;
-              break;
-            case 'J':
-              ch = J;
-              break;
-            case '\0':
-              ch = NUL;
-              break;
-            default:
-              if (_isjis (curr_ch))
-                ch = JIS_CHAR;
-              else
-                ch = OTHER;
-	    }
 
-          action = JIS_action_table[curr_state][ch];
-          curr_state = JIS_state_table[curr_state][ch];
-        
-          switch (action)
-            {
-            case NOOP:
-              break;
-            case EMPTY:
-              state->__state = ASCII;
-              *pwc = (wchar_t)0;
-              return 0;
-            case COPY_A:
-	      state->__state = ASCII;
-              *pwc = (wchar_t)*ptr;
-              return (i + 1);
-            case COPY_J1:
-              state->__value.__wchb[0] = t[i];
-	      break;
-            case COPY_J2:
-              state->__state = JIS;
-              *pwc = (((wchar_t)state->__value.__wchb[0]) << 8) + (wchar_t)(t[i]);
-              return (i + 1);
-            case MAKE_A:
-              ptr = (unsigned char *)(t + i + 1);
-              break;
-            case ERROR:
-            default:
-	      r->_errno = EILSEQ;
-              return -1;
-            }
+  state->__state = curr_state;
+  return -2;  /* n < bytes needed */
+}
+#endif /* _MB_CAPABLE */
 
-        }
+int
+_DEFUN (__ascii_mbtowc, (r, pwc, s, n, state),
+        struct _reent *r   _AND
+        wchar_t       *pwc _AND
+        const char    *s   _AND
+        size_t         n   _AND
+        mbstate_t     *state)
+{
+  wchar_t dummy;
+  unsigned char *t = (unsigned char *)s;
 
-      state->__state = curr_state;
-      return -2;  /* n < bytes needed */
-    }
-#endif /* _MB_CAPABLE */               
+  if (pwc == NULL)
+    pwc = &dummy;
 
-  /* otherwise this must be the "C" locale or unknown locale */
   if (s == NULL)
-    return 0;  /* not state-dependent */
+    return 0;
+
+  if (n == 0)
+    return -2;
 
   *pwc = (wchar_t)*t;
   
@@ -421,3 +584,14 @@ _DEFUN (_mbtowc_r, (r, pwc, s, n, state)
 
   return 1;
 }
+
+int
+_DEFUN (_mbtowc_r, (r, pwc, s, n, state),
+        struct _reent *r   _AND
+        wchar_t       *pwc _AND
+        const char    *s   _AND
+        size_t         n   _AND
+        mbstate_t     *state)
+{
+  return __mbtowc (r, pwc, s, n, state);
+}
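
For readers following the dispatch change: the wrapper above just forwards to whatever the __mbtowc pointer currently holds. A minimal, self-contained sketch of that pattern is below. It is not the patch's code. All sketch_* names are hypothetical, the converter signature is simplified (no struct _reent, no mbstate_t), and the two converters are stand-ins chosen only to make the difference observable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified converter signature (assumption: the real one
   also takes a struct _reent * and an mbstate_t *). */
typedef int (*sketch_mbtowc_fn) (wchar_t *pwc, const char *s, size_t n);

/* Strict ASCII: bytes above 0x7f are rejected with -1. */
static int
sketch_ascii_mbtowc (wchar_t *pwc, const char *s, size_t n)
{
  unsigned char c;

  if (s == NULL)
    return 0;			/* not state-dependent */
  if (n == 0)
    return -2;			/* n < bytes needed */
  c = (unsigned char) *s;
  if (c > 0x7f)
    return -1;
  if (pwc != NULL)
    *pwc = (wchar_t) c;
  return c != '\0';
}

/* ISO-8859-1: every byte maps to the code point of the same value. */
static int
sketch_latin1_mbtowc (wchar_t *pwc, const char *s, size_t n)
{
  unsigned char c;

  if (s == NULL)
    return 0;
  if (n == 0)
    return -2;
  c = (unsigned char) *s;
  if (pwc != NULL)
    *pwc = (wchar_t) c;
  return c != '\0';
}

/* The current converter; selected once, then called without any strcmp. */
static sketch_mbtowc_fn sketch_current = sketch_ascii_mbtowc;

/* Stand-in for the charset selection done in setlocale.  Returns 0 on
   success, -1 for a charset this sketch does not know. */
static int
sketch_set_charset (const char *name)
{
  if (strcmp (name, "ASCII") == 0)
    sketch_current = sketch_ascii_mbtowc;
  else if (strcmp (name, "ISO-8859-1") == 0)
    sketch_current = sketch_latin1_mbtowc;
  else
    return -1;
  return 0;
}

/* Stand-in for _mbtowc_r: a plain forward through the pointer. */
static int
sketch_mbtowc (wchar_t *pwc, const char *s, size_t n)
{
  return sketch_current (pwc, s, n);
}
```

The string comparison happens only in sketch_set_charset, so each later conversion costs one indirect call. Adding a charset means adding one converter function and one branch in the selection routine.
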
Index: libc/stdlib/sb_charsets.c
===================================================================
RCS file: libc/stdlib/sb_charsets.c
diff -N libc/stdlib/sb_charsets.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ libc/stdlib/sb_charsets.c	20 Mar 2009 11:56:10 -0000
@@ -0,0 +1,476 @@
+#include <newlib.h>
+#include <stdlib.h>
+#include <locale.h>
+#include "mbctype.h"
+#include <wchar.h>
+#include <string.h>
+#include <errno.h>
+
+#ifdef _MB_CAPABLE
+extern char *__locale_charset ();
+
+/* Tables for the ISO-8859-x to Unicode conversion.  The first index into the
+   table is a value computed from the value x (function __iso_8859_index),
+   the second index is the value of the incoming character - 0xa0.
+   Values < 0xa0 don't have to be converted anyway. */
+wchar_t __iso_8859_conv[14][0x60] = {
+  /* ISO-8859-2 */
+  { 0xa0, 0x104, 0x2d8, 0x141, 0xa4, 0x13d, 0x15a, 0xa7,
+    0xa8, 0x160, 0x15e, 0x164, 0x179, 0xad, 0x17d, 0x17b,
+    0xb0, 0x105, 0x2db, 0x142, 0xb4, 0x13e, 0x15b, 0x2c7,
+    0xb8, 0x161, 0x15f, 0x165, 0x17a, 0x2dd, 0x17e, 0x17c,
+    0x154, 0xc1, 0xc2, 0x102, 0xc4, 0x139, 0x106, 0xc7,
+    0x10c, 0xc9, 0x118, 0xcb, 0x11a, 0xcd, 0xce, 0x10e,
+    0x110, 0x143, 0x147, 0xd3, 0xd4, 0x150, 0xd6, 0xd7,
+    0x158, 0x16e, 0xda, 0x170, 0xdc, 0xdd, 0x162, 0xdf,
+    0x155, 0xe1, 0xe2, 0x103, 0xe4, 0x13a, 0x107, 0xe7,
+    0x10d, 0xe9, 0x119, 0xeb, 0x11b, 0xed, 0xee, 0x10f,
+    0x111, 0x144, 0x148, 0xf3, 0xf4, 0x151, 0xf6, 0xf7,
+    0x159, 0x16f, 0xfa, 0x171, 0xfc, 0xfd, 0x163, 0x2d9 },
+  /* ISO-8859-3 */
+  { 0xa0, 0x126, 0x2d8, 0xa3, 0xa4, 0x0, 0x124, 0xa7,
+    0xa8, 0x130, 0x15e, 0x11e, 0x134, 0xad, 0x0, 0x17b,
+    0xb0, 0x127, 0xb2, 0xb3, 0xb4, 0xb5, 0x125, 0xb7,
+    0xb8, 0x131, 0x15f, 0x11f, 0x135, 0xbd, 0x0, 0x17c,
+    0xc0, 0xc1, 0xc2, 0x0, 0xc4, 0x10a, 0x108, 0xc7,
+    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
+    0x0, 0xd1, 0xd2, 0xd3, 0xd4, 0x120, 0xd6, 0xd7,
+    0x11c, 0xd9, 0xda, 0xdb, 0xdc, 0x16c, 0x15c, 0xdf,
+    0xe0, 0xe1, 0xe2, 0x0, 0xe4, 0x10b, 0x109, 0xe7,
+    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
+    0x0, 0xf1, 0xf2, 0xf3, 0xf4, 0x121, 0xf6, 0xf7,
+    0x11d, 0xf9, 0xfa, 0xfb, 0xfc, 0x16d, 0x15d, 0x2d9 },
+  /* ISO-8859-4 */
+  { 0xa0, 0x104, 0x138, 0x156, 0xa4, 0x128, 0x13b, 0xa7,
+    0xa8, 0x160, 0x112, 0x122, 0x166, 0xad, 0x17d, 0xaf,
+    0xb0, 0x105, 0x2db, 0x157, 0xb4, 0x129, 0x13c, 0x2c7,
+    0xb8, 0x161, 0x113, 0x123, 0x167, 0x14a, 0x17e, 0x14b,
+    0x100, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0x12e,
+    0x10c, 0xc9, 0x118, 0xcb, 0x116, 0xcd, 0xce, 0x12a,
+    0x110, 0x145, 0x14c, 0x136, 0xd4, 0xd5, 0xd6, 0xd7,
+    0xd8, 0x172, 0xda, 0xdb, 0xdc, 0x168, 0x16a, 0xdf,
+    0x101, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0x12f,
+    0x10d, 0xe9, 0x119, 0xeb, 0x117, 0xed, 0xee, 0x12b,
+    0x111, 0x146, 0x14d, 0x137, 0xf4, 0xf5, 0xf6, 0xf7,
+    0xf8, 0x173, 0xfa, 0xfb, 0xfc, 0x169, 0x16b, 0x2d9 },
+  /* ISO-8859-5 */
+  { 0xa0, 0x401, 0x402, 0x403, 0x404, 0x405, 0x406, 0x407,
+    0x408, 0x409, 0x40a, 0x40b, 0x40c, 0xad, 0x40e, 0x40f,
+    0x410, 0x411, 0x412, 0x413, 0x414, 0x415, 0x416, 0x417,
+    0x418, 0x419, 0x41a, 0x41b, 0x41c, 0x41d, 0x41e, 0x41f,
+    0x420, 0x421, 0x422, 0x423, 0x424, 0x425, 0x426, 0x427,
+    0x428, 0x429, 0x42a, 0x42b, 0x42c, 0x42d, 0x42e, 0x42f,
+    0x430, 0x431, 0x432, 0x433, 0x434, 0x435, 0x436, 0x437,
+    0x438, 0x439, 0x43a, 0x43b, 0x43c, 0x43d, 0x43e, 0x43f,
+    0x440, 0x441, 0x442, 0x443, 0x444, 0x445, 0x446, 0x447,
+    0x448, 0x449, 0x44a, 0x44b, 0x44c, 0x44d, 0x44e, 0x44f,
+    0x2116, 0x451, 0x452, 0x453, 0x454, 0x455, 0x456, 0x457,
+    0x458, 0x459, 0x45a, 0x45b, 0x45c, 0xa7, 0x45e, 0x45f },
+  /* ISO-8859-6 */
+  { 0xa0, 0x0, 0x0, 0x0, 0xa4, 0x0, 0x0, 0x0,
+    0x0, 0x0, 0x0, 0x0, 0x60c, 0xad, 0x0, 0x0,
+    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
+    0x0, 0x0, 0x0, 0x61b, 0x0, 0x0, 0x0, 0x61f,
+    0x0, 0x621, 0x622, 0x623, 0x624, 0x625, 0x626, 0x627,
+    0x628, 0x629, 0x62a, 0x62b, 0x62c, 0x62d, 0x62e, 0x62f,
+    0x630, 0x631, 0x632, 0x633, 0x634, 0x635, 0x636, 0x637,
+    0x638, 0x639, 0x63a, 0x0, 0x0, 0x0, 0x0, 0x0,
+    0x640, 0x641, 0x642, 0x643, 0x644, 0x645, 0x646, 0x647,
+    0x648, 0x649, 0x64a, 0x64b, 0x64c, 0x64d, 0x64e, 0x64f,
+    0x650, 0x651, 0x652, 0x0, 0x0, 0x0, 0x0, 0x0,
+    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0 },
+  /* ISO-8859-7 */
+  { 0xa0, 0x2018, 0x2019, 0xa3, 0x20ac, 0x20af, 0xa6, 0xa7,
+    0xa8, 0xa9, 0x37a, 0xab, 0xac, 0xad, 0x0, 0x2015,
+    0xb0, 0xb1, 0xb2, 0xb3, 0x384, 0x385, 0x386, 0xb7,
+    0x388, 0x389, 0x38a, 0xbb, 0x38c, 0xbd, 0x38e, 0x38f,
+    0x390, 0x391, 0x392, 0x393, 0x394, 0x395, 0x396, 0x397,
+    0x398, 0x399, 0x39a, 0x39b, 0x39c, 0x39d, 0x39e, 0x39f,
+    0x3a0, 0x3a1, 0x0, 0x3a3, 0x3a4, 0x3a5, 0x3a6, 0x3a7,
+    0x3a8, 0x3a9, 0x3aa, 0x3ab, 0x3ac, 0x3ad, 0x3ae, 0x3af,
+    0x3b0, 0x3b1, 0x3b2, 0x3b3, 0x3b4, 0x3b5, 0x3b6, 0x3b7,
+    0x3b8, 0x3b9, 0x3ba, 0x3bb, 0x3bc, 0x3bd, 0x3be, 0x3bf,
+    0x3c0, 0x3c1, 0x3c2, 0x3c3, 0x3c4, 0x3c5, 0x3c6, 0x3c7,
+    0x3c8, 0x3c9, 0x3ca, 0x3cb, 0x3cc, 0x3cd, 0x3ce, 0x0 },
+  /* ISO-8859-8 */
+  { 0xa0, 0x0, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
+    0xa8, 0xa9, 0xd7, 0xab, 0xac, 0xad, 0xae, 0xaf,
+    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
+    0xb8, 0xb9, 0xf7, 0xbb, 0xbc, 0xbd, 0xbe, 0x0,
+    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
+    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
+    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
+    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2017,
+    0x5d0, 0x5d1, 0x5d2, 0x5d3, 0x5d4, 0x5d5, 0x5d6, 0x5d7,
+    0x5d8, 0x5d9, 0x5da, 0x5db, 0x5dc, 0x5dd, 0x5de, 0x5df,
+    0x5e0, 0x5e1, 0x5e2, 0x5e3, 0x5e4, 0x5e5, 0x5e6, 0x5e7,
+    0x5e8, 0x5e9, 0x5ea, 0x0, 0x0, 0x200e, 0x200f, 0x0 },
+  /* ISO-8859-9 */
+  { 0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
+    0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
+    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
+    0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf,
+    0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
+    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
+    0x11e, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7,
+    0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0x130, 0x15e, 0xdf,
+    0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
+    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
+    0x11f, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
+    0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0x131, 0x15f, 0xff },
+  /* ISO-8859-10 */
+  { 0xa0, 0x104, 0x112, 0x122, 0x12a, 0x128, 0x136, 0xa7,
+    0x13b, 0x110, 0x160, 0x166, 0x17d, 0xad, 0x16a, 0x14a,
+    0xb0, 0x105, 0x113, 0x123, 0x12b, 0x129, 0x137, 0xb7,
+    0x13c, 0x111, 0x161, 0x167, 0x17e, 0x2015, 0x16b, 0x14b,
+    0x100, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0x12e,
+    0x10c, 0xc9, 0x118, 0xcb, 0x116, 0xcd, 0xce, 0xcf,
+    0xd0, 0x145, 0x14c, 0xd3, 0xd4, 0xd5, 0xd6, 0x168,
+    0xd8, 0x172, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf,
+    0x101, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0x12f,
+    0x10d, 0xe9, 0x119, 0xeb, 0x117, 0xed, 0xee, 0xef,
+    0xf0, 0x146, 0x14d, 0xf3, 0xf4, 0xf5, 0xf6, 0x169,
+    0xf8, 0x173, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0x138 },
+  /* ISO-8859-11 */
+  { 0xa0, 0xe01, 0xe02, 0xe03, 0xe04, 0xe05, 0xe06, 0xe07,
+    0xe08, 0xe09, 0xe0a, 0xe0b, 0xe0c, 0xe0d, 0xe0e, 0xe0f,
+    0xe10, 0xe11, 0xe12, 0xe13, 0xe14, 0xe15, 0xe16, 0xe17,
+    0xe18, 0xe19, 0xe1a, 0xe1b, 0xe1c, 0xe1d, 0xe1e, 0xe1f,
+    0xe20, 0xe21, 0xe22, 0xe23, 0xe24, 0xe25, 0xe26, 0xe27,
+    0xe28, 0xe29, 0xe2a, 0xe2b, 0xe2c, 0xe2d, 0xe2e, 0xe2f,
+    0xe30, 0xe31, 0xe32, 0xe33, 0xe34, 0xe35, 0xe36, 0xe37,
+    0xe38, 0xe39, 0xe3a, 0x0, 0x0, 0x0, 0x0, 0xe3f,
+    0xe40, 0xe41, 0xe42, 0xe43, 0xe44, 0xe45, 0xe46, 0xe47,
+    0xe48, 0xe49, 0xe4a, 0xe4b, 0xe4c, 0xe4d, 0xe4e, 0xe4f,
+    0xe50, 0xe51, 0xe52, 0xe53, 0xe54, 0xe55, 0xe56, 0xe57,
+    0xe58, 0xe59, 0xe5a, 0xe5b, 0x0, 0x0, 0x0, 0x0 },
+  /* ISO-8859-12 doesn't exist.  The below code decrements the index
+     into the table by one for ISO numbers > 12. */
+  /* ISO-8859-13 */
+  { 0xa0, 0x201d, 0xa2, 0xa3, 0xa4, 0x201e, 0xa6, 0xa7,
+    0xd8, 0xa9, 0x156, 0xab, 0xac, 0xad, 0xae, 0xc6,
+    0xb0, 0xb1, 0xb2, 0xb3, 0x201c, 0xb5, 0xb6, 0xb7,
+    0xf8, 0xb9, 0x157, 0xbb, 0xbc, 0xbd, 0xbe, 0xe6,
+    0x104, 0x12e, 0x100, 0x106, 0xc4, 0xc5, 0x118, 0x112,
+    0x10c, 0xc9, 0x179, 0x116, 0x122, 0x136, 0x12a, 0x13b,
+    0x160, 0x143, 0x145, 0xd3, 0x14c, 0xd5, 0xd6, 0xd7,
+    0x172, 0x141, 0x15a, 0x16a, 0xdc, 0x17b, 0x17d, 0xdf,
+    0x105, 0x12f, 0x101, 0x107, 0xe4, 0xe5, 0x119, 0x113,
+    0x10d, 0xe9, 0x17a, 0x117, 0x123, 0x137, 0x12b, 0x13c,
+    0x161, 0x144, 0x146, 0xf3, 0x14d, 0xf5, 0xf6, 0xf7,
+    0x173, 0x142, 0x15b, 0x16b, 0xfc, 0x17c, 0x17e, 0x2019 },
+  /* ISO-8859-14 */
+  { 0xa0, 0x1e02, 0x1e03, 0xa3, 0x10a, 0x10b, 0x1e0a, 0xa7,
+    0x1e80, 0xa9, 0x1e82, 0x1e0b, 0x1ef2, 0xad, 0xae, 0x178,
+    0x1e1e, 0x1e1f, 0x120, 0x121, 0x1e40, 0x1e41, 0xb6, 0x1e56,
+    0x1e81, 0x1e57, 0x1e83, 0x1e60, 0x1ef3, 0x1e84, 0x1e85, 0x1e61,
+    0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
+    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
+    0x174, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0x1e6a,
+    0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0x176, 0xdf,
+    0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
+    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
+    0x175, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0x1e6b,
+    0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0x177, 0xff },
+  /* ISO-8859-15 */
+  { 0xa0, 0xa1, 0xa2, 0xa3, 0x20ac, 0xa5, 0x160, 0xa7,
+    0x161, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
+    0xb0, 0xb1, 0xb2, 0xb3, 0x17d, 0xb5, 0xb6, 0xb7,
+    0x17e, 0xb9, 0xba, 0xbb, 0x152, 0x153, 0x178, 0xbf,
+    0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
+    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
+    0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7,
+    0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf,
+    0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
+    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
+    0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
+    0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff },
+  /* ISO-8859-16 */
+  { 0xa0, 0x104, 0x105, 0x141, 0x20ac, 0x201e, 0x160, 0xa7,
+    0x161, 0xa9, 0x218, 0xab, 0x179, 0xad, 0x17a, 0x17b,
+    0xb0, 0xb1, 0x10c, 0x142, 0x17d, 0x201d, 0xb6, 0xb7,
+    0x17e, 0x10d, 0x219, 0xbb, 0x152, 0x153, 0x178, 0x17c,
+    0xc0, 0xc1, 0xc2, 0x102, 0xc4, 0x106, 0xc6, 0xc7,
+    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
+    0x110, 0x143, 0xd2, 0xd3, 0xd4, 0x150, 0xd6, 0x15a,
+    0x170, 0xd9, 0xda, 0xdb, 0xdc, 0x118, 0x21a, 0xdf,
+    0xe0, 0xe1, 0xe2, 0x103, 0xe4, 0x107, 0xe6, 0xe7,
+    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
+    0x111, 0x144, 0xf2, 0xf3, 0xf4, 0x151, 0xf6, 0x15b,
+    0x171, 0xf9, 0xfa, 0xfb, 0xfc, 0x119, 0x21b, 0xff }
+};
+
+/* Tables for the Windows default single-byte ANSI codepage conversion.
+   The first index into the table is a value computed from the codepage
+   value (function __cp_index), the second index is the value of the
+   incoming character - 0x80.
+   Values < 0x80 don't have to be converted anyway. */
+wchar_t __cp_conv[12][0x80] = {
+  /* CP737 */
+  { 0x391, 0x392, 0x393, 0x394, 0x395, 0x396, 0x397, 0x398,
+    0x399, 0x39a, 0x39b, 0x39c, 0x39d, 0x39e, 0x39f, 0x3a0,
+    0x3a1, 0x3a3, 0x3a4, 0x3a5, 0x3a6, 0x3a7, 0x3a8, 0x3a9,
+    0x3b1, 0x3b2, 0x3b3, 0x3b4, 0x3b5, 0x3b6, 0x3b7, 0x3b8,
+    0x3b9, 0x3ba, 0x3bb, 0x3bc, 0x3bd, 0x3be, 0x3bf, 0x3c0,
+    0x3c1, 0x3c3, 0x3c2, 0x3c4, 0x3c5, 0x3c6, 0x3c7, 0x3c8,
+    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x2561, 0x2562, 0x2556,
+    0x2555, 0x2563, 0x2551, 0x2557, 0x255d, 0x255c, 0x255b, 0x2510,
+    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0x255e, 0x255f,
+    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0x2567,
+    0x2568, 0x2564, 0x2565, 0x2559, 0x2558, 0x2552, 0x2553, 0x256b,
+    0x256a, 0x2518, 0x250c, 0x2588, 0x2584, 0x258c, 0x2590, 0x2580,
+    0x3c9, 0x3ac, 0x3ad, 0x3ae, 0x3ca, 0x3af, 0x3cc, 0x3cd,
+    0x3cb, 0x3ce, 0x386, 0x388, 0x389, 0x38a, 0x38c, 0x38e,
+    0x38f, 0xb1, 0x2265, 0x2264, 0x3aa, 0x3ab, 0xf7, 0x2248,
+    0xb0, 0x2219, 0xb7, 0x221a, 0x207f, 0xb2, 0x25a0, 0xa0 },
+  /* CP775 */
+  { 0x106, 0xfc, 0xe9, 0x101, 0xe4, 0x123, 0xe5, 0x107,
+    0x142, 0x113, 0x156, 0x157, 0x12b, 0x179, 0xc4, 0xc5,
+    0xc9, 0xe6, 0xc6, 0x14d, 0xf6, 0x122, 0xa2, 0x15a,
+    0x15b, 0xd6, 0xdc, 0xf8, 0xa3, 0xd8, 0xd7, 0xa4,
+    0x100, 0x12a, 0xf3, 0x17b, 0x17c, 0x17a, 0x201d, 0xa6,
+    0xa9, 0xae, 0xac, 0xbd, 0xbc, 0x141, 0xab, 0xbb,
+    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x104, 0x10c, 0x118,
+    0x116, 0x2563, 0x2551, 0x2557, 0x255d, 0x12e, 0x160, 0x2510,
+    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0x172, 0x16a,
+    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0x17d,
+    0x105, 0x10d, 0x119, 0x117, 0x12f, 0x161, 0x173, 0x16b,
+    0x17e, 0x2518, 0x250c, 0x2588, 0x2584, 0x258c, 0x2590, 0x2580,
+    0xd3, 0xdf, 0x14c, 0x143, 0xf5, 0xd5, 0xb5, 0x144,
+    0x136, 0x137, 0x13b, 0x13c, 0x146, 0x112, 0x145, 0x2019,
+    0xad, 0xb1, 0x201c, 0xbe, 0xb6, 0xa7, 0xf7, 0x201e,
+    0xb0, 0x2219, 0xb7, 0xb9, 0xb3, 0xb2, 0x25a0, 0xa0 },
+  /* CP1125 */
+  { 0x410, 0x411, 0x412, 0x413, 0x414, 0x415, 0x416, 0x417,
+    0x418, 0x419, 0x41a, 0x41b, 0x41c, 0x41d, 0x41e, 0x41f,
+    0x420, 0x421, 0x422, 0x423, 0x424, 0x425, 0x426, 0x427,
+    0x428, 0x429, 0x42a, 0x42b, 0x42c, 0x42d, 0x42e, 0x42f,
+    0x430, 0x431, 0x432, 0x433, 0x434, 0x435, 0x436, 0x437,
+    0x438, 0x439, 0x43a, 0x43b, 0x43c, 0x43d, 0x43e, 0x43f,
+    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x2561, 0x2562, 0x2556,
+    0x2555, 0x2563, 0x2551, 0x2557, 0x255d, 0x255c, 0x255b, 0x2510,
+    0x2514, 0x2534, 0x252c, 0x251c, 0x2500, 0x253c, 0x255e, 0x255f,
+    0x255a, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256c, 0x2567,
+    0x2568, 0x2564, 0x2565, 0x2559, 0x2558, 0x2552, 0x2553, 0x256b,
+    0x256a, 0x2518, 0x250c, 0x2588, 0x2584, 0x258c, 0x2590, 0x2580,
+    0x440, 0x441, 0x442, 0x443, 0x444, 0x445, 0x446, 0x447,
+    0x448, 0x449, 0x44a, 0x44b, 0x44c, 0x44d, 0x44e, 0x44f,
+    0x401, 0x451, 0x490, 0x491, 0x404, 0x454, 0x406, 0x456,
+    0x407, 0x457, 0xb7, 0x221a, 0x2116, 0xa4, 0x25a0, 0xa0 },
+  /* CP1250 */
+  { 0x20ac, 0x0, 0x201a, 0x0, 0x201e, 0x2026, 0x2020, 0x2021,
+    0x0, 0x2030, 0x160, 0x2039, 0x15a, 0x164, 0x17d, 0x179,
+    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
+    0x0, 0x2122, 0x161, 0x203a, 0x15b, 0x165, 0x17e, 0x17a,
+    0xa0, 0x2c7, 0x2d8, 0x141, 0xa4, 0x104, 0xa6, 0xa7,
+    0xa8, 0xa9, 0x15e, 0xab, 0xac, 0xad, 0xae, 0x17b,
+    0xb0, 0xb1, 0x2db, 0x142, 0xb4, 0xb5, 0xb6, 0xb7,
+    0xb8, 0x105, 0x15f, 0xbb, 0x13d, 0x2dd, 0x13e, 0x17c,
+    0x154, 0xc1, 0xc2, 0x102, 0xc4, 0x139, 0x106, 0xc7,
+    0x10c, 0xc9, 0x118, 0xcb, 0x11a, 0xcd, 0xce, 0x10e,
+    0x110, 0x143, 0x147, 0xd3, 0xd4, 0x150, 0xd6, 0xd7,
+    0x158, 0x16e, 0xda, 0x170, 0xdc, 0xdd, 0x162, 0xdf,
+    0x155, 0xe1, 0xe2, 0x103, 0xe4, 0x13a, 0x107, 0xe7,
+    0x10d, 0xe9, 0x119, 0xeb, 0x11b, 0xed, 0xee, 0x10f,
+    0x111, 0x144, 0x148, 0xf3, 0xf4, 0x151, 0xf6, 0xf7,
+    0x159, 0x16f, 0xfa, 0x171, 0xfc, 0xfd, 0x163, 0x2d9 },
+  /* CP1251 */
+  { 0x402, 0x403, 0x201a, 0x453, 0x201e, 0x2026, 0x2020, 0x2021,
+    0x20ac, 0x2030, 0x409, 0x2039, 0x40a, 0x40c, 0x40b, 0x40f,
+    0x452, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
+    0x0, 0x2122, 0x459, 0x203a, 0x45a, 0x45c, 0x45b, 0x45f,
+    0xa0, 0x40e, 0x45e, 0x408, 0xa4, 0x490, 0xa6, 0xa7,
+    0x401, 0xa9, 0x404, 0xab, 0xac, 0xad, 0xae, 0x407,
+    0xb0, 0xb1, 0x406, 0x456, 0x491, 0xb5, 0xb6, 0xb7,
+    0x451, 0x2116, 0x454, 0xbb, 0x458, 0x405, 0x455, 0x457,
+    0x410, 0x411, 0x412, 0x413, 0x414, 0x415, 0x416, 0x417,
+    0x418, 0x419, 0x41a, 0x41b, 0x41c, 0x41d, 0x41e, 0x41f,
+    0x420, 0x421, 0x422, 0x423, 0x424, 0x425, 0x426, 0x427,
+    0x428, 0x429, 0x42a, 0x42b, 0x42c, 0x42d, 0x42e, 0x42f,
+    0x430, 0x431, 0x432, 0x433, 0x434, 0x435, 0x436, 0x437,
+    0x438, 0x439, 0x43a, 0x43b, 0x43c, 0x43d, 0x43e, 0x43f,
+    0x440, 0x441, 0x442, 0x443, 0x444, 0x445, 0x446, 0x447,
+    0x448, 0x449, 0x44a, 0x44b, 0x44c, 0x44d, 0x44e, 0x44f },
+  /* CP1252 */
+  { 0x20ac, 0x0, 0x201a, 0x192, 0x201e, 0x2026, 0x2020, 0x2021,
+    0x2c6, 0x2030, 0x160, 0x2039, 0x152, 0x0, 0x17d, 0x0,
+    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
+    0x2dc, 0x2122, 0x161, 0x203a, 0x153, 0x0, 0x17e, 0x178,
+    0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
+    0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
+    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
+    0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf,
+    0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
+    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
+    0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7,
+    0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf,
+    0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
+    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
+    0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
+    0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff },
+  /* CP1253 */
+  { 0x20ac, 0x0, 0x201a, 0x192, 0x201e, 0x2026, 0x2020, 0x2021,
+    0x0, 0x2030, 0x0, 0x2039, 0x0, 0x0, 0x0, 0x0,
+    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
+    0x0, 0x2122, 0x0, 0x203a, 0x0, 0x0, 0x0, 0x0,
+    0xa0, 0x385, 0x386, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
+    0xa8, 0xa9, 0x0, 0xab, 0xac, 0xad, 0xae, 0x2015,
+    0xb0, 0xb1, 0xb2, 0xb3, 0x384, 0xb5, 0xb6, 0xb7,
+    0x388, 0x389, 0x38a, 0xbb, 0x38c, 0xbd, 0x38e, 0x38f,
+    0x390, 0x391, 0x392, 0x393, 0x394, 0x395, 0x396, 0x397,
+    0x398, 0x399, 0x39a, 0x39b, 0x39c, 0x39d, 0x39e, 0x39f,
+    0x3a0, 0x3a1, 0x0, 0x3a3, 0x3a4, 0x3a5, 0x3a6, 0x3a7,
+    0x3a8, 0x3a9, 0x3aa, 0x3ab, 0x3ac, 0x3ad, 0x3ae, 0x3af,
+    0x3b0, 0x3b1, 0x3b2, 0x3b3, 0x3b4, 0x3b5, 0x3b6, 0x3b7,
+    0x3b8, 0x3b9, 0x3ba, 0x3bb, 0x3bc, 0x3bd, 0x3be, 0x3bf,
+    0x3c0, 0x3c1, 0x3c2, 0x3c3, 0x3c4, 0x3c5, 0x3c6, 0x3c7,
+    0x3c8, 0x3c9, 0x3ca, 0x3cb, 0x3cc, 0x3cd, 0x3ce, 0xff },
+  /* CP1254 */
+  { 0x20ac, 0x0, 0x201a, 0x192, 0x201e, 0x2026, 0x2020, 0x2021,
+    0x2c6, 0x2030, 0x160, 0x2039, 0x152, 0x0, 0x0, 0x0,
+    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
+    0x2dc, 0x2122, 0x161, 0x203a, 0x153, 0x0, 0x0, 0x178,
+    0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
+    0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
+    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
+    0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf,
+    0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
+    0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
+    0x11e, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7,
+    0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0x130, 0x15e, 0xdf,
+    0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
+    0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
+    0x11f, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
+    0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0x131, 0x15f, 0xff },
+  /* CP1255 */
+  { 0x20ac, 0x0, 0x201a, 0x192, 0x201e, 0x2026, 0x2020, 0x2021,
+    0x2c6, 0x2030, 0x0, 0x2039, 0x0, 0x0, 0x0, 0x0,
+    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
+    0x2dc, 0x2122, 0x0, 0x203a, 0x0, 0x0, 0x0, 0x0,
+    0xa0, 0xa1, 0xa2, 0xa3, 0x20aa, 0xa5, 0xa6, 0xa7,
+    0xa8, 0xa9, 0xd7, 0xab, 0xac, 0xad, 0xae, 0xaf,
+    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
+    0xb8, 0xb9, 0xf7, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf,
+    0x5b0, 0x5b1, 0x5b2, 0x5b3, 0x5b4, 0x5b5, 0x5b6, 0x5b7,
+    0x5b8, 0x5b9, 0x0, 0x5bb, 0x5bc, 0x5bd, 0x5be, 0x5bf,
+    0x5c0, 0x5c1, 0x5c2, 0x5c3, 0x5f0, 0x5f1, 0x5f2, 0x5f3,
+    0x5f4, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
+    0x5d0, 0x5d1, 0x5d2, 0x5d3, 0x5d4, 0x5d5, 0x5d6, 0x5d7,
+    0x5d8, 0x5d9, 0x5da, 0x5db, 0x5dc, 0x5dd, 0x5de, 0x5df,
+    0x5e0, 0x5e1, 0x5e2, 0x5e3, 0x5e4, 0x5e5, 0x5e6, 0x5e7,
+    0x5e8, 0x5e9, 0x5ea, 0x0, 0x0, 0x200e, 0x200f, 0xff },
+  /* CP1256 */
+  { 0x20ac, 0x67e, 0x201a, 0x192, 0x201e, 0x2026, 0x2020, 0x2021,
+    0x2c6, 0x2030, 0x679, 0x2039, 0x152, 0x686, 0x698, 0x688,
+    0x6af, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
+    0x6a9, 0x2122, 0x691, 0x203a, 0x153, 0x200c, 0x200d, 0x6ba,
+    0xa0, 0x60c, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
+    0xa8, 0xa9, 0x6be, 0xab, 0xac, 0xad, 0xae, 0xaf,
+    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
+    0xb8, 0xb9, 0x61b, 0xbb, 0xbc, 0xbd, 0xbe, 0x61f,
+    0x6c1, 0x621, 0x622, 0x623, 0x624, 0x625, 0x626, 0x627,
+    0x628, 0x629, 0x62a, 0x62b, 0x62c, 0x62d, 0x62e, 0x62f,
+    0x630, 0x631, 0x632, 0x633, 0x634, 0x635, 0x636, 0xd7,
+    0x637, 0x638, 0x639, 0x63a, 0x640, 0x641, 0x642, 0x643,
+    0xe0, 0x644, 0xe2, 0x645, 0x646, 0x647, 0x648, 0xe7,
+    0xe8, 0xe9, 0xea, 0xeb, 0x649, 0x64a, 0xee, 0xef,
+    0x64b, 0x64c, 0x64d, 0x64e, 0xf4, 0x64f, 0x650, 0xf7,
+    0x651, 0xf9, 0x652, 0xfb, 0xfc, 0x200e, 0x200f, 0x6d2 },
+  /* CP1257 */
+  { 0x20ac, 0x0, 0x201a, 0x0, 0x201e, 0x2026, 0x2020, 0x2021,
+    0x0, 0x2030, 0x0, 0x2039, 0x0, 0xa8, 0x2c7, 0xb8,
+    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
+    0x0, 0x2122, 0x0, 0x203a, 0x0, 0xaf, 0x2db, 0x0,
+    0xa0, 0x0, 0xa2, 0xa3, 0xa4, 0x0, 0xa6, 0xa7,
+    0xd8, 0xa9, 0x156, 0xab, 0xac, 0xad, 0xae, 0xc6,
+    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
+    0xf8, 0xb9, 0x157, 0xbb, 0xbc, 0xbd, 0xbe, 0xe6,
+    0x104, 0x12e, 0x100, 0x106, 0xc4, 0xc5, 0x118, 0x112,
+    0x10c, 0xc9, 0x179, 0x116, 0x122, 0x136, 0x12a, 0x13b,
+    0x160, 0x143, 0x145, 0xd3, 0x14c, 0xd5, 0xd6, 0xd7,
+    0x172, 0x141, 0x15a, 0x16a, 0xdc, 0x17b, 0x17d, 0xdf,
+    0x105, 0x12f, 0x101, 0x107, 0xe4, 0xe5, 0x119, 0x113,
+    0x10d, 0xe9, 0x17a, 0x117, 0x123, 0x137, 0x12b, 0x13c,
+    0x161, 0x144, 0x146, 0xf3, 0x14d, 0xf5, 0xf6, 0xf7,
+    0x173, 0x142, 0x15b, 0x16b, 0xfc, 0x17c, 0x17e, 0x2d9 },
+  /* CP1258 */
+  { 0x20ac, 0x0, 0x201a, 0x192, 0x201e, 0x2026, 0x2020, 0x2021,
+    0x2c6, 0x2030, 0x0, 0x2039, 0x152, 0x0, 0x0, 0x0,
+    0x0, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
+    0x2dc, 0x2122, 0x0, 0x203a, 0x153, 0x0, 0x0, 0x178,
+    0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
+    0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
+    0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
+    0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf,
+    0xc0, 0xc1, 0xc2, 0x102, 0xc4, 0xc5, 0xc6, 0xc7,
+    0xc8, 0xc9, 0xca, 0xcb, 0x300, 0xcd, 0xce, 0xcf,
+    0x110, 0xd1, 0x309, 0xd3, 0xd4, 0x1a0, 0xd6, 0xd7,
+    0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0x1af, 0x303, 0xdf,
+    0xe0, 0xe1, 0xe2, 0x103, 0xe4, 0xe5, 0xe6, 0xe7,
+    0xe8, 0xe9, 0xea, 0xeb, 0x301, 0xed, 0xee, 0xef,
+    0x111, 0xf1, 0x323, 0xf3, 0xf4, 0x1a1, 0xf6, 0xf7,
+    0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0x1b0, 0x20ab, 0xff }
+};
+
+int
+__iso_8859_index (const char *charset_ext)
+{
+  int iso_idx = atoi (charset_ext);
+  if (iso_idx >= 2 && iso_idx <= 16 && iso_idx != 12)
+    {
+      iso_idx -= 2;
+      if (iso_idx > 10)
+	--iso_idx;
+      return iso_idx;
+    }
+  return -1;
+}
+
+int
+__cp_index (const char *charset_ext)
+{
+  int cp_idx = atoi (__locale_charset () + 2);
+  switch (cp_idx)
+    {
+    case 737:
+      cp_idx = 0;
+      break;
+    case 775:
+      cp_idx = 1;
+      break;
+    case 1125:
+      cp_idx = 2;
+      break;
+    case 1250:
+      cp_idx = 3;
+      break;
+    case 1251:
+      cp_idx = 4;
+      break;
+    case 1252:
+      cp_idx = 5;
+      break;
+    case 1253:
+      cp_idx = 6;
+      break;
+    case 1254:
+      cp_idx = 7;
+      break;
+    case 1255:
+      cp_idx = 8;
+      break;
+    case 1256:
+      cp_idx = 9;
+      break;
+    case 1257:
+      cp_idx = 10;
+      break;
+    case 1258:
+      cp_idx = 11;
+      break;
+    default:
+      cp_idx = -1;
+    }
+  return cp_idx;
+}
+#endif /* _MB_CAPABLE */
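
To illustrate how the tables and index helpers in sb_charsets.c fit together, here is a small, self-contained sketch of the lookup. It is an illustration under assumptions, not the conversion code from this patch. All sketch_* names are hypothetical, the row is only the first eight ISO-8859-15 entries copied from the table above, and it assumes that a 0x0 entry marks an unassigned position, as the tables suggest.

```c
#include <stddef.h>
#include <stdlib.h>

/* Maps the "x" of "ISO-8859-x" to a table row.  Part 12 does not exist,
   so rows for parts above 12 move down by one; this sketch rejects "12"
   explicitly. */
static int
sketch_iso_8859_index (const char *charset_ext)
{
  int iso_idx = atoi (charset_ext);

  if (iso_idx < 2 || iso_idx > 16 || iso_idx == 12)
    return -1;
  iso_idx -= 2;
  if (iso_idx > 10)
    --iso_idx;
  return iso_idx;
}

/* Excerpt: ISO-8859-15 positions 0xa0..0xa7, taken from the table above. */
static const wchar_t sketch_iso_8859_15_head[8] = {
  0xa0, 0xa1, 0xa2, 0xa3, 0x20ac, 0xa5, 0x160, 0xa7
};

/* Converts one byte using a table row whose first entry belongs to byte
   value `first` (0xa0 for the ISO tables, 0x80 for the codepage tables).
   Returns 1 on success, -1 if the byte is outside the row or unassigned
   (assumption: 0 means unassigned). */
static int
sketch_sb_lookup (const wchar_t *row, size_t row_len, unsigned char first,
		  unsigned char c, wchar_t *pwc)
{
  size_t off;

  if (c < first)
    {
      *pwc = (wchar_t) c;	/* low half needs no conversion */
      return 1;
    }
  off = (size_t) (c - first);
  if (off >= row_len || row[off] == 0)
    return -1;
  *pwc = row[off];
  return 1;
}
```

With this excerpt, byte 0xa4 yields 0x20ac (the Euro sign), while an ASCII byte such as 'A' is passed through unchanged.
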
Index: libc/stdlib/wctomb_r.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/wctomb_r.c,v
retrieving revision 1.12
diff -u -p -r1.12 wctomb_r.c
--- libc/stdlib/wctomb_r.c	19 Mar 2009 19:47:52 -0000	1.12
+++ libc/stdlib/wctomb_r.c	20 Mar 2009 11:56:10 -0000
@@ -4,209 +4,326 @@
 #include <wchar.h>
 #include <locale.h>
 #include "mbctype.h"
+#include "local.h"
 
-extern char *__locale_charset ();
+int (*__wctomb) (struct _reent *, char *, wchar_t, mbstate_t *)
+#ifdef _MB_CAPABLE
+    = __iso_wctomb;
+#else
+    = __ascii_wctomb;
+#endif
 
+#ifdef _MB_CAPABLE
 /* for some conversions, we use the __count field as a place to store a state value */
 #define __state __count
 
 int
-_DEFUN (_wctomb_r, (r, s, wchar, state),
+_DEFUN (__utf8_wctomb, (r, s, wchar, state),
         struct _reent *r     _AND 
         char          *s     _AND
         wchar_t        _wchar _AND
         mbstate_t     *state)
 {
-  /* Avoids compiler warnings about comparisons that are always false
-     due to limited range when sizeof(wchar_t) is 2 but sizeof(wint_t)
-     is 4, as is the case on cygwin.  */
   wint_t wchar = _wchar;
 
-  if (strlen (__locale_charset ()) <= 1)
-    { /* fall-through */ }
-  else if (!strcmp (__locale_charset (), "UTF-8"))
-    {
-      if (s == NULL)
-        return 0; /* UTF-8 encoding is not state-dependent */
+  if (s == NULL)
+    return 0; /* UTF-8 encoding is not state-dependent */
 
-      if (state->__count == -4 && (wchar < 0xdc00 || wchar >= 0xdfff))
+  if (state->__count == -4 && (wchar < 0xdc00 || wchar > 0xdfff))
+    {
+      /* At this point only the second half of a surrogate pair is valid. */
+      r->_errno = EILSEQ;
+      return -1;
+    }
+  if (wchar <= 0x7f)
+    {
+      *s = wchar;
+      return 1;
+    }
+  if (wchar >= 0x80 && wchar <= 0x7ff)
+    {
+      *s++ = 0xc0 | ((wchar & 0x7c0) >> 6);
+      *s   = 0x80 |  (wchar &  0x3f);
+      return 2;
+    }
+  if (wchar >= 0x800 && wchar <= 0xffff)
+    {
+      if (wchar >= 0xd800 && wchar <= 0xdfff)
 	{
-	  /* At this point only the second half of a surrogate pair is valid. */
-	  r->_errno = EILSEQ;
-	  return -1;
-	}
-      if (wchar <= 0x7f)
-        {
-          *s = wchar;
-          return 1;
-        }
-      else if (wchar >= 0x80 && wchar <= 0x7ff)
-        {
-          *s++ = 0xc0 | ((wchar & 0x7c0) >> 6);
-          *s   = 0x80 |  (wchar &  0x3f);
-          return 2;
-        }
-      else if (wchar >= 0x800 && wchar <= 0xffff)
-        {
-          if (wchar >= 0xd800 && wchar <= 0xdfff)
+	  wint_t tmp;
+	  /* UTF-16 surrogates -- must not occur in normal UCS-4 data */
+	  if (sizeof (wchar_t) != 2)
+	    {
+	      r->_errno = EILSEQ;
+	      return -1;
+	    }
+	  if (wchar >= 0xdc00)
 	    {
-	      wint_t tmp;
-	      /* UTF-16 surrogates -- must not occur in normal UCS-4 data */
-	      if (sizeof (wchar_t) != 2)
+	      /* Second half of a surrogate pair.  It's not valid
+		 unless the first half of a surrogate pair has
+		 already been read. */
+	      if (state->__count != -4)
 		{
 		  r->_errno = EILSEQ;
 		  return -1;
 		}
-	      if (wchar >= 0xdc00)
-		{
-		  /* Second half of a surrogate pair. It's not valid if
-		     we don't have already read a first half of a surrogate
-		     before. */
-		  if (state->__count != -4)
-		    {
-		      r->_errno = EILSEQ;
-		      return -1;
-		    }
-		  /* If it's valid, reconstruct the full Unicode value and
-		     return the trailing three bytes of the UTF-8 char. */
-		  tmp = (state->__value.__wchb[0] << 16)
-			| (state->__value.__wchb[1] << 8)
-			| (wchar & 0x3ff);
-		  state->__count = 0;
-		  *s++ = 0x80 | ((tmp &  0x3f000) >> 12);
-		  *s++ = 0x80 | ((tmp &    0xfc0) >> 6);
-		  *s   = 0x80 |  (tmp &     0x3f);
-		  return 3;
-	      	}
-	      /* First half of a surrogate pair.  Store the state and return
-	         the first byte of the UTF-8 char. */
-	      tmp = ((wchar & 0x3ff) << 10) + 0x10000;
-	      state->__value.__wchb[0] = (tmp >> 16) & 0xff;
-	      state->__value.__wchb[1] = (tmp >> 8) & 0xff;
-	      state->__count = -4;
-	      *s = (0xf0 | ((tmp & 0x1c0000) >> 18));
-	      return 1;
+	      /* If it's valid, reconstruct the full Unicode value and
+		 return the trailing three bytes of the UTF-8 char. */
+	      tmp = (state->__value.__wchb[0] << 16)
+		    | (state->__value.__wchb[1] << 8)
+		    | (wchar & 0x3ff);
+	      state->__count = 0;
+	      *s++ = 0x80 | ((tmp &  0x3f000) >> 12);
+	      *s++ = 0x80 | ((tmp &    0xfc0) >> 6);
+	      *s   = 0x80 |  (tmp &     0x3f);
+	      return 3;
 	    }
-          *s++ = 0xe0 | ((wchar & 0xf000) >> 12);
-          *s++ = 0x80 | ((wchar &  0xfc0) >> 6);
-          *s   = 0x80 |  (wchar &   0x3f);
-          return 3;
-        }
-      else if (wchar >= 0x10000 && wchar <= 0x10ffff)
-        {
-          *s++ = 0xf0 | ((wchar & 0x1c0000) >> 18);
-          *s++ = 0x80 | ((wchar &  0x3f000) >> 12);
-          *s++ = 0x80 | ((wchar &    0xfc0) >> 6);
-          *s   = 0x80 |  (wchar &     0x3f);
-          return 4;
-        }
+	  /* First half of a surrogate pair.  Store the state and return
+	     the first byte of the UTF-8 char. */
+	  tmp = ((wchar & 0x3ff) << 10) + 0x10000;
+	  state->__value.__wchb[0] = (tmp >> 16) & 0xff;
+	  state->__value.__wchb[1] = (tmp >> 8) & 0xff;
+	  state->__count = -4;
+	  *s = (0xf0 | ((tmp & 0x1c0000) >> 18));
+	  return 1;
+	}
+      *s++ = 0xe0 | ((wchar & 0xf000) >> 12);
+      *s++ = 0x80 | ((wchar &  0xfc0) >> 6);
+      *s   = 0x80 |  (wchar &   0x3f);
+      return 3;
+    }
+  if (wchar >= 0x10000 && wchar <= 0x10ffff)
+    {
+      *s++ = 0xf0 | ((wchar & 0x1c0000) >> 18);
+      *s++ = 0x80 | ((wchar &  0x3f000) >> 12);
+      *s++ = 0x80 | ((wchar &    0xfc0) >> 6);
+      *s   = 0x80 |  (wchar &     0x3f);
+      return 4;
+    }
+
+  r->_errno = EILSEQ;
+  return -1;
+}
+
+int
+_DEFUN (__sjis_wctomb, (r, s, wchar, state),
+        struct _reent *r     _AND 
+        char          *s     _AND
+        wchar_t        _wchar _AND
+        mbstate_t     *state)
+{
+  wint_t wchar = _wchar;
+
+  unsigned char char2 = (unsigned char)wchar;
+  unsigned char char1 = (unsigned char)(wchar >> 8);
+
+  if (s == NULL)
+    return 0;  /* not state-dependent */
+
+  if (char1 != 0x00)
+    {
+      /* first byte is non-zero: validate multi-byte char */
+      if (_issjis1(char1) && _issjis2(char2))
+	{
+	  *s++ = (char)char1;
+	  *s = (char)char2;
+	  return 2;
+	}
       else
 	{
 	  r->_errno = EILSEQ;
 	  return -1;
 	}
     }
-  else if (!strcmp (__locale_charset (), "SJIS"))
+  *s = (char) wchar;
+  return 1;
+}
+
+int
+_DEFUN (__eucjp_wctomb, (r, s, wchar, state),
+        struct _reent *r     _AND 
+        char          *s     _AND
+        wchar_t        _wchar _AND
+        mbstate_t     *state)
+{
+  wint_t wchar = _wchar;
+  unsigned char char2 = (unsigned char)wchar;
+  unsigned char char1 = (unsigned char)(wchar >> 8);
+
+  if (s == NULL)
+    return 0;  /* not state-dependent */
+
+  if (char1 != 0x00)
     {
-      unsigned char char2 = (unsigned char)wchar;
-      unsigned char char1 = (unsigned char)(wchar >> 8);
+      /* First byte is non-zero: validate the multi-byte char.  */
+      if (_iseucjp (char1) && _iseucjp (char2)) 
+	{
+	  *s++ = (char)char1;
+	  *s = (char)char2;
+	  return 2;
+	}
+      else
+	{
+	  r->_errno = EILSEQ;
+	  return -1;
+	}
+    }
+  *s = (char) wchar;
+  return 1;
+}
 
-      if (s == NULL)
-        return 0;  /* not state-dependent */
+int
+_DEFUN (__jis_wctomb, (r, s, wchar, state),
+        struct _reent *r     _AND 
+        char          *s     _AND
+        wchar_t        _wchar _AND
+        mbstate_t     *state)
+{
+  wint_t wchar = _wchar;
+  int cnt = 0; 
+  unsigned char char2 = (unsigned char)wchar;
+  unsigned char char1 = (unsigned char)(wchar >> 8);
 
-      if (char1 != 0x00)
-        {
-        /* first byte is non-zero..validate multi-byte char */
-          if (_issjis1(char1) && _issjis2(char2)) 
-            {
-              *s++ = (char)char1;
-              *s = (char)char2;
-              return 2;
-            }
-          else
+  if (s == NULL)
+    return 1;  /* state-dependent */
+
+  if (char1 != 0x00)
+    {
+      /* First byte is non-zero: validate the multi-byte char.  */
+      if (_isjis (char1) && _isjis (char2)) 
+	{
+	  if (state->__state == 0)
 	    {
-	      r->_errno = EILSEQ;
-	      return -1;
+	      /* must switch from ASCII to JIS state */
+	      state->__state = 1;
+	      *s++ = ESC_CHAR;
+	      *s++ = '$';
+	      *s++ = 'B';
+	      cnt = 3;
 	    }
-        }
+	  *s++ = (char)char1;
+	  *s = (char)char2;
+	  return cnt + 2;
+	}
+      r->_errno = EILSEQ;
+      return -1;
+    }
+  if (state->__state != 0)
+    {
+      /* must switch from JIS to ASCII state */
+      state->__state = 0;
+      *s++ = ESC_CHAR;
+      *s++ = '(';
+      *s++ = 'B';
+      cnt = 3;
     }
-  else if (!strcmp (__locale_charset (), "EUCJP"))
+  *s = (char)char2;
+  return cnt + 1;
+}
+
+int
+_DEFUN (__iso_wctomb, (r, s, wchar, state),
+        struct _reent *r     _AND 
+        char          *s     _AND
+        wchar_t        _wchar _AND
+        mbstate_t     *state)
+{
+  wint_t wchar = _wchar;
+
+  if (s == NULL)
+    return 0;
+
+  /* wchars <= 0x9f translate to all ISO charsets directly. */
+  if (wchar >= 0xa0)
     {
-      unsigned char char2 = (unsigned char)wchar;
-      unsigned char char1 = (unsigned char)(wchar >> 8);
+      int iso_idx = __iso_8859_index (__locale_charset () + 9);
+      if (iso_idx >= 0)
+	{
+	  unsigned char mb;
 
-      if (s == NULL)
-        return 0;  /* not state-dependent */
-
-      if (char1 != 0x00)
-        {
-        /* first byte is non-zero..validate multi-byte char */
-          if (_iseucjp (char1) && _iseucjp (char2)) 
-            {
-              *s++ = (char)char1;
-              *s = (char)char2;
-              return 2;
-            }
-          else
-	    {
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
-        }
+	  if (s == NULL)
+	    return 0;
+
+	  for (mb = 0; mb < 0x60; ++mb)
+	    if (__iso_8859_conv[iso_idx][mb] == wchar)
+	      {
+		*s = (char) (mb + 0xa0);
+		return 1;
+	      }
+	  r->_errno = EILSEQ;
+	  return -1;
+	}
     }
-  else if (!strcmp (__locale_charset (), "JIS"))
+ 
+  if ((size_t)wchar >= 0x100)
     {
-      int cnt = 0; 
-      unsigned char char2 = (unsigned char)wchar;
-      unsigned char char1 = (unsigned char)(wchar >> 8);
-
-      if (s == NULL)
-        return 1;  /* state-dependent */
-
-      if (char1 != 0x00)
-        {
-        /* first byte is non-zero..validate multi-byte char */
-          if (_isjis (char1) && _isjis (char2)) 
-            {
-              if (state->__state == 0)
-                {
-                  /* must switch from ASCII to JIS state */
-                  state->__state = 1;
-                  *s++ = ESC_CHAR;
-                  *s++ = '$';
-                  *s++ = 'B';
-                  cnt = 3;
-                }
-              *s++ = (char)char1;
-              *s = (char)char2;
-              return cnt + 2;
-            }
-          else
-	    {
-	      r->_errno = EILSEQ;
-	      return -1;
-	    }
-        }
-      else
-        {
-          if (state->__state != 0)
-            {
-              /* must switch from JIS to ASCII state */
-              state->__state = 0;
-              *s++ = ESC_CHAR;
-              *s++ = '(';
-              *s++ = 'B';
-              cnt = 3;
-            }
-          *s = (char)char2;
-          return cnt + 1;
-        }
+      r->_errno = EILSEQ;
+      return -1;
+    }
+
+  *s = (char) wchar;
+  return 1;
+}
+
+int
+_DEFUN (__cp_wctomb, (r, s, wchar, state),
+        struct _reent *r     _AND 
+        char          *s     _AND
+        wchar_t        _wchar _AND
+        mbstate_t     *state)
+{
+  wint_t wchar = _wchar;
+
+  if (s == NULL)
+    return 0;
+
+  if (wchar >= 0x80)
+    {
+      int cp_idx = __cp_index (__locale_charset () + 2);
+      if (cp_idx >= 0)
+	{
+	  unsigned char mb;
+
+	  if (s == NULL)
+	    return 0;
+
+	  for (mb = 0; mb < 0x80; ++mb)
+	    if (__cp_conv[cp_idx][mb] == wchar)
+	      {
+		*s = (char) (mb + 0x80);
+		return 1;
+	      }
+	  r->_errno = EILSEQ;
+	  return -1;
+	}
     }
 
+  if ((size_t)wchar >= 0x100)
+    {
+      r->_errno = EILSEQ;
+      return -1;
+    }
+
+  *s = (char) wchar;
+  return 1;
+}
+#endif /* _MB_CAPABLE */
+
+int
+_DEFUN (__ascii_wctomb, (r, s, wchar, state),
+        struct _reent *r     _AND 
+        char          *s     _AND
+        wchar_t        _wchar _AND
+        mbstate_t     *state)
+{
+  /* Avoids compiler warnings about comparisons that are always false
+     due to limited range when sizeof(wchar_t) is 2 but sizeof(wint_t)
+     is 4, as is the case on cygwin.  */
+  wint_t wchar = _wchar;
+
   if (s == NULL)
     return 0;
  
-  /* otherwise we are dealing with a single byte character */
   if ((size_t)wchar >= 0x100)
     {
       r->_errno = EILSEQ;
@@ -216,4 +333,13 @@ _DEFUN (_wctomb_r, (r, s, wchar, state),
   *s = (char) wchar;
   return 1;
 }
-    
+
+int
+_DEFUN (_wctomb_r, (r, s, wchar, state),
+        struct _reent *r     _AND 
+        char          *s     _AND
+        wchar_t        _wchar _AND
+        mbstate_t     *state)
+{
+  return __wctomb (r, s, _wchar, state);
+}
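
For anyone skimming the diff, here is a minimal standalone sketch of the
two ideas in this last part of the patch: the reverse table scan used by
__iso_wctomb/__cp_wctomb, and the dispatch through a function pointer
which setlocale() selects.  It is not the newlib code.  All toy_* names
and the toy table are hypothetical, and the signatures are simplified
(no struct _reent, no mbstate_t, -1 instead of setting EILSEQ).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified conversion-function type.  The functions in
   the patch additionally take a struct _reent * and an mbstate_t *.  */
typedef int (*toy_wctomb_fn) (char *s, unsigned long wchar);

/* Toy table for the bytes 0xa0..0xff.  Identity mapping, except that
   byte 0xa4 maps to U+20AC (EURO SIGN), as it does in ISO-8859-15.  */
static unsigned short toy_conv[0x60];

static void
toy_conv_init (void)
{
  unsigned mb;

  for (mb = 0; mb < 0x60; ++mb)
    toy_conv[mb] = (unsigned short) (mb + 0xa0);
  toy_conv[0x04] = 0x20ac;
}

/* Fallback: any value below 0x100 is stored as a single byte.  */
static int
toy_byte_wctomb (char *s, unsigned long wchar)
{
  if (s == NULL)
    return 0;			/* not state-dependent */
  if (wchar >= 0x100)
    return -1;			/* the patch sets EILSEQ here */
  *s = (char) wchar;
  return 1;
}

/* Table-driven variant: values below 0xa0 map directly, everything
   else is found by scanning the table for a matching wide char.  */
static int
toy_table_wctomb (char *s, unsigned long wchar)
{
  unsigned mb;

  assert (toy_conv[0] != 0);	/* table must have been initialised */
  if (s == NULL)
    return 0;
  if (wchar < 0xa0)
    {
      *s = (char) wchar;
      return 1;
    }
  for (mb = 0; mb < 0x60; ++mb)
    if (toy_conv[mb] == wchar)
      {
	*s = (char) (mb + 0xa0);
	return 1;
      }
  return -1;
}

/* The pointer which the (hypothetical) locale switch updates.  */
static toy_wctomb_fn toy_wctomb = toy_byte_wctomb;

void
toy_setlocale (int use_table)
{
  if (use_table)
    {
      toy_conv_init ();
      toy_wctomb = toy_table_wctomb;
    }
  else
    toy_wctomb = toy_byte_wctomb;
}

/* Thin wrapper, in the spirit of the new _wctomb_r.  */
int
toy_wctomb_r (char *s, unsigned long wchar)
{
  return toy_wctomb (s, wchar);
}
```

With the table selected, toy_wctomb_r maps 0x20ac to the byte 0xa4 and
rejects 0xa4 itself, because no table slot holds that value.  With the
fallback selected, 0x20ac is rejected and 0xa4 passes through unchanged.
The charset is examined once, when the pointer is set, instead of on
every conversion call.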


-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat


