isspace() & i18n

Earnie Boyd earnie_boyd@yahoo.com
Wed May 30 08:49:00 GMT 2001


egor duda wrote:
> 
> Hi!
> 
> Wednesday, 30 May, 2001 Christopher Faylor cgf@redhat.com wrote:
> 
> CF> On Wed, May 30, 2001 at 06:11:56PM +0400, egor duda wrote:
> >>Wednesday, 30 May, 2001 Christopher Faylor cgf@redhat.com wrote:
> >>CF> On Wed, May 30, 2001 at 02:57:56PM +0400, egor duda wrote:
> >>>>  cygwin calls newlib's isspace() passing it a signed char. this works
> >>>>ok for ascii symbols 0x00-0x7f, but fails with, say, cyrillic symbols
> >>>>with codes >= 0x80. As a result `cd dir-with-last-cyrillic-letter'
> >>>>fails as chdir strips the last symbols, thinking they're spaces --
> >>>>isspace() is called with a negative parameter.
> >>>>
> >>>>Any thoughts as of how we should handle this?
> >>
> >>CF> Maybe we just need a cygwin_isspace which checks for just tabs and spaces?
> >>
> >>it's possible, of course, but the problem with referencing negative
> >>array indices in is*() remains.
> >>
> >>i think we should either conform to standard and explicitly convert
> >>types or define appropriate strings as unsigned char*, (typedef PATH_STR,
> >>perhaps), or define cygwin_is*() as macros that do the conversion, or,
> >>as glibc does, expand _ctype to allow indices in range [-128,256].
> 
> CF> IMO, "we" should convert the arguments to the is* functions to unsigned
> CF> char. This is a decision for the newlib folks though, isn't it?
> 
> in actual calls, not in prototypes. newlib currently takes int as the
> is*() argument, and is absolutely right because the standard says so. it's
> the caller's responsibility to provide correct parameters. so the conversion
> should take place in cygwin code, not in newlib's. mimicking glibc
> behavior -- accepting negative arguments to support broken callers -- is
> the newlib folks' decision.
> 

So does this solve the problem?

-- 
Earnie.
Index: path.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/path.cc,v
retrieving revision 1.139
diff -u -p -r1.139 path.cc
--- path.cc	2001/05/14 02:52:12	1.139
+++ path.cc	2001/05/30 15:45:49
@@ -2929,7 +2929,7 @@ chdir (const char *dir)
      whitespace to SetCurrentDirectory.  This doesn't work too well
      with other parts of the API, though, apparently.  So nuke trailing
      white space. */
-  for (s = strchr (dir, '\0'); --s >= dir && isspace (*s); )
+  for (s = strchr (dir, '\0'); --s >= dir && isspace ((unsigned char)*s); )
     *s = '\0';
 
   if (path.error)
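
For illustration, here is a minimal sketch of the caller-side conversion
discussed in this thread, assuming a platform where plain char is signed and
8 bits wide. The helper names safe_isspace and strip_trailing_space are
hypothetical and are not part of cygwin or newlib. The value has to pass
through unsigned char: converting a negative char directly to unsigned int
yields a value close to UINT_MAX, which is still outside the range that the
is*() functions accept.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper (not a cygwin or newlib function): classify a plain
   char safely.  Converting to unsigned char first guarantees the value
   handed to isspace() lies in 0..UCHAR_MAX, as the C standard requires.
   A cast to unsigned int would not help: a char holding -32 becomes
   UINT_MAX - 31, which is still out of range. */
static int
safe_isspace (char c)
{
  return isspace ((unsigned char) c) != 0;
}

/* Hypothetical helper: remove trailing whitespace from dir in place.
   Working with a length avoids forming a pointer before the start of
   the string when dir is empty or all whitespace. */
static void
strip_trailing_space (char *dir)
{
  size_t len = strlen (dir);

  while (len > 0 && safe_isspace (dir[len - 1]))
    dir[--len] = '\0';
}
```

With this conversion in place, and in the default "C" locale, a name ending
in a byte such as 0xE0 is left alone, while trailing blanks and tabs are
still removed.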


More information about the Cygwin-developers mailing list