[RFC] Refresh iswblank and iswspace (was Re: Update wctype functions to Unicode 5.2?)
Corinna Vinschen
vinschen@redhat.com
Sat Feb 13 14:38:00 GMT 2010
On Feb 12 21:57, Corinna Vinschen wrote:
> Additionally the functions iswblank, iswspace, towlower and towupper
> could need some revamp. If an update of the aforementioned tables to
> Unicode 5.2 is not a big deal, I'd volunteer to update these functions
> as required.
For a start, here are patches to iswblank and iswspace to add the missing
characters U+180e (MONGOLIAN VOWEL SEPARATOR), U+2007 (FIGURE SPACE), and
U+202f (NARROW NO-BREAK SPACE). I also took out U+200b (ZERO WIDTH SPACE),
which is explicitly not marked as a space character in the Unicode
database, despite its name.
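As a rough illustration only (this is not part of the patch), the resulting
character sets can be sketched as stand-alone predicates. The names
sketch_iswblank, sketch_iswspace and sketch_check_changes are made up for
this example, the functions take a Unicode code point directly, and the
_jp2uc conversion done by the real functions is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-alone version of the patched iswblank test.
   Assumes C is already a Unicode code point (no _jp2uc call).  */
int
sketch_iswblank (uint32_t c)
{
  return (c == 0x0009 || c == 0x0020 || c == 0x00a0
          || c == 0x1680 || c == 0x180e
          || (c >= 0x2000 && c <= 0x200a)
          || c == 0x202f || c == 0x205f || c == 0x3000);
}

/* Hypothetical stand-alone version of the patched iswspace test.
   In the patch this is the blank set plus U+000a..U+000d,
   U+2028 and U+2029.  */
int
sketch_iswspace (uint32_t c)
{
  return (sketch_iswblank (c)
          || (c >= 0x000a && c <= 0x000d)
          || c == 0x2028 || c == 0x2029);
}

/* Checks the four changes described in this mail.  */
void
sketch_check_changes (void)
{
  assert (sketch_iswblank (0x180e));   /* newly added */
  assert (sketch_iswblank (0x2007));   /* newly added */
  assert (sketch_iswblank (0x202f));   /* newly added */
  assert (!sketch_iswblank (0x200b));  /* taken out */
  assert (!sketch_iswspace (0x200b));  /* taken out */
}
```

Calling sketch_check_changes () should pass silently if the sets match the
description above. Writing the space test in terms of the blank test is just
a convenience for the sketch; the patch itself keeps the two lists separate.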
I also changed the formatting to GNU style and lowercased 0x00A0, since it
contained the only uppercase hex digit in both files.
Ok to apply?
Thanks,
Corinna
* libc/ctype/iswblank.c (iswblank): Add missing chars from more
recent Unicode character table. Reformat slightly.
* libc/ctype/iswspace.c (iswspace): Ditto.
Index: libc/ctype/iswblank.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/ctype/iswblank.c,v
retrieving revision 1.8
diff -u -p -r1.8 iswblank.c
--- libc/ctype/iswblank.c 24 Aug 2009 16:59:35 -0000 1.8
+++ libc/ctype/iswblank.c 13 Feb 2010 14:27:59 -0000
@@ -67,11 +67,11 @@ _DEFUN(iswblank,(c), wint_t c)
{
#ifdef _MB_CAPABLE
c = _jp2uc (c);
- return (c == 0x0009 || c == 0x0020 ||
- c == 0x00A0 || c == 0x1680 ||
- (c >= 0x2000 && c <= 0x2006) ||
- (c >= 0x2008 && c <= 0x200b) ||
- c == 0x205f || c == 0x3000);
+ /* Based on Unicode 5.2 */
+ return (c == 0x0009 || c == 0x0020 || c == 0x00a0
+ || c == 0x1680 || c == 0x180e
+ || (c >= 0x2000 && c <= 0x200a)
+ || c == 0x202f || c == 0x205f || c == 0x3000);
#else
return (c < 0x100 ? isblank (c) : 0);
#endif /* _MB_CAPABLE */
Index: libc/ctype/iswspace.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/ctype/iswspace.c,v
retrieving revision 1.8
diff -u -p -r1.8 iswspace.c
--- libc/ctype/iswspace.c 24 Aug 2009 16:59:35 -0000 1.8
+++ libc/ctype/iswspace.c 13 Feb 2010 14:27:59 -0000
@@ -67,12 +67,13 @@ _DEFUN(iswspace,(c), wint_t c)
{
#ifdef _MB_CAPABLE
c = _jp2uc (c);
- return ((c >= 0x0009 && c <= 0x000d) || c == 0x0020 ||
- c == 0x00A0 || c == 0x1680 ||
- (c >= 0x2000 && c <= 0x2006) ||
- (c >= 0x2008 && c <= 0x200b) ||
- c == 0x2028 || c == 0x2029 ||
- c == 0x205f || c == 0x3000);
+ /* Based on Unicode 5.2 */
+ return ((c >= 0x0009 && c <= 0x000d)
+ || c == 0x0020 || c == 0x00a0
+ || c == 0x1680 || c == 0x180e
+ || (c >= 0x2000 && c <= 0x200a)
+ || c == 0x2028 || c == 0x2029 || c == 0x202f
+ || c == 0x205f || c == 0x3000);
#else
return (c < 0x100 ? isspace (c) : 0);
#endif /* _MB_CAPABLE */
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat