Bug in mbsrtowcs?

Corinna Vinschen vinschen@redhat.com
Fri Feb 13 18:35:00 GMT 2009


While looking into implementing the new SUSv4 functions wcsnrtombs
and mbsnrtowcs, I started puzzling over a strange piece of code in
mbsrtowcs:

  while (n > 0)
    {
      bytes = _mbrtowc_r (r, ptr, *src, nms, ps);
      if (bytes > 0)
        [...]
      else if (bytes == -2)
        *src += MB_CUR_MAX;
      else [...]

So, if the byte sequence starting at *src is an incomplete multibyte
char, *src is skipped by MB_CUR_MAX and the loop continues.

Hang on.  If _mbrtowc_r encounters an incomplete MB char then it does
not form an invalid character so there's no reason to return with -1 and
set errno to EILSEQ.  However, it also doesn't form a *valid* character,
it's just incomplete.  Thus it must be the start of the last character
at the end of the input string.

If that's correct, then it appears incorrect to me to move *src by
MB_CUR_MAX because that implies that a mb to wc conversion has been
successful.  Worse, it also means that *src has been skipped over the
incomplete MB sequence.  The returned value of *src does not allow the
calling function to see exactly where the conversion has stopped.
Instead it points to an arbitrary character MB_CUR_MAX bytes after the
start of the incomplete character where the conversion stopped.  This
could lead to all sorts of incorrect results in subsequent character
handling functions, AFAICS.  You don't even know where you're pointing
after doing that; it could be well past the trailing \0.
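
To make the overshoot concrete, here is a small sketch of the pointer
arithmetic only.  The helper names are hypothetical and not part of
newlib, and the MB_CUR_MAX value is passed in as a plain parameter, so
any concrete value used with it is an assumption:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the offset *src would have after the old code
   skips mb_cur_max bytes from the start of an incomplete character. */
static size_t
offset_after_skip (size_t start, size_t mb_cur_max)
{
  return start + mb_cur_max;
}

/* Returns 1 if that skip lands beyond the terminating \0 of s,
   0 otherwise.  `start` is the offset of the incomplete character. */
static int
skip_overshoots (const char *s, size_t start, size_t mb_cur_max)
{
  assert (s != NULL);
  return offset_after_skip (start, mb_cur_max) > strlen (s);
}
```

With the input "a\xE2\x82" (an 'a' followed by the first two bytes of a
three-byte UTF-8 character) the incomplete character starts at offset 1
and the terminating \0 sits at offset 3.  Assuming an MB_CUR_MAX of 4,
the skip ends at offset 5, two bytes past the terminator.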

If I'm not entirely off-track, that means we can immediately exit from
mbsrtowcs in this case, returning the current value of count without
moving *src from its current position:

Index: mbsrtowcs.c
RCS file: /cvs/src/src/newlib/libc/stdlib/mbsrtowcs.c,v
retrieving revision 1.5
diff -u -p -r1.5 mbsrtowcs.c
--- mbsrtowcs.c	23 Apr 2004 21:44:22 -0000	1.5
+++ mbsrtowcs.c	13 Feb 2009 14:12:56 -0000
@@ -48,9 +48,7 @@ _DEFUN (_mbsrtowcs_r, (r, dst, src, n, p
       else if (bytes == -2)
-	{
-	  *src += MB_CUR_MAX;
-	}
+	return count;
       else if (bytes == 0)
 	  *src = NULL;
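
The intended behaviour can be sketched in a self-contained way.  This is
not the newlib code: toy_mbrtowc and toy_mbsrtowcs are hypothetical
stand-ins, the decoder is stateless, handles only 1 to 3 byte UTF-8
sequences, does not reject overlong encodings, and treats a \0 inside a
multibyte sequence as "incomplete".  Unlike the real function, dst must
not be NULL here.  Only the return convention (0, -1, -2) and the early
return are meant to mirror the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for _mbrtowc_r.  Returns the number of bytes
   consumed, 0 for the terminating \0, (size_t) -1 for an invalid
   sequence and (size_t) -2 for an incomplete one. */
static size_t
toy_mbrtowc (wchar_t *pwc, const char *s, size_t n)
{
  const unsigned char *u = (const unsigned char *) s;
  unsigned long wc;
  size_t len, i;

  if (n == 0)
    return (size_t) -2;
  if (u[0] < 0x80)
    { len = 1; wc = u[0]; }
  else if ((u[0] & 0xE0) == 0xC0)
    { len = 2; wc = u[0] & 0x1F; }
  else if ((u[0] & 0xF0) == 0xE0)
    { len = 3; wc = u[0] & 0x0F; }
  else
    return (size_t) -1;

  for (i = 1; i < len; i++)
    {
      if (i >= n || u[i] == '\0')
        return (size_t) -2;     /* input ends inside the character */
      if ((u[i] & 0xC0) != 0x80)
        return (size_t) -1;     /* not a continuation byte */
      wc = (wc << 6) | (u[i] & 0x3F);
    }
  *pwc = (wchar_t) wc;
  return wc == 0 ? 0 : len;
}

/* Hypothetical loop with the proposed handling of -2: stop, return the
   number of wide characters converted so far, and leave *src pointing
   at the start of the incomplete character. */
static size_t
toy_mbsrtowcs (wchar_t *dst, const char **src, size_t n)
{
  size_t count = 0;

  assert (dst != NULL && src != NULL && *src != NULL);
  while (n > 0)
    {
      wchar_t wc;
      size_t bytes = toy_mbrtowc (&wc, *src, 3);

      if (bytes == (size_t) -1)
        return (size_t) -1;
      if (bytes == (size_t) -2)
        return count;           /* proposed fix: *src is not moved */
      dst[count] = wc;
      if (bytes == 0)
        {
          *src = NULL;          /* whole string converted */
          return count;
        }
      *src += bytes;
      count++;
      n--;
    }
  return count;
}
```

For the input "a\xC3\xA9\xE2\x82" this converts two characters and
leaves *src at offset 3, the first byte of the incomplete sequence, so
the caller can see exactly where the conversion stopped.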

Still under the assumption that I'm not off-track, you don't have to
fix that, since I'll fix it when sending my wcsnrtombs/mbsnrtowcs
patch.


Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
