This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: should mbrtowc(&wc, "", 1, &ps) set wc?
- To: linux-utf8 at nl dot linux dot org
- Subject: Re: should mbrtowc(&wc, "", 1, &ps) set wc?
- From: Markus Kuhn <Markus dot Kuhn at cl dot cam dot ac dot uk>
- Date: Sat, 11 Nov 2000 18:45:36 +0000
- cc: libc-alpha at sourceware dot cygnus dot com
Edmund GRIMLEY EVANS wrote on 2000-11-11 16:56 UTC:
> Should mbrtowc(&wc, "", 1, &ps) set wc?
Yes. It must set wc=0 and return 0 (at least in all ISO 8859 and UTF-8
locales), because the next 1 byte is (and therefore completes) "the
multibyte character that corresponds to the null wide character".
ISO/IEC 9899:1999:

  7.24.6.3.2 The mbrtowc function
  [...]
  [#4] The mbrtowc function returns the first of the following
  that applies (given the current conversion state):

  0      if the next n or fewer bytes complete the
         multibyte character that corresponds to the
         null wide character (which is the value
                             ^^^^^^^^^^^^^^^^^^^
         stored).
         ^^^^^^
  [...]
> Apparently it does with glibc-2.1 and with Bruno's libutf8, but it
> doesn't with glibc-2.2 (from CVS a few days ago).
That does indeed sound like a glibc 2.2 bug.
> Program:
>
> #include <locale.h>
> #include <stdio.h>
> #include <string.h>   /* memset */
> #include <wchar.h>
>
> int main(void)
> {
>   mbstate_t ps;
>   size_t r;
>   wchar_t wc = 333;   /* sentinel: shows whether mbrtowc stores to wc */
>
>   setlocale(LC_ALL, "");
>
>   memset(&ps, 0, sizeof(ps));   /* initial conversion state */
>   r = mbrtowc(&wc, "a", 1, &ps);
>   printf("%d %d\n", (int)r, (int)wc);
>   r = mbrtowc(&wc, "", 1, &ps);
>   printf("%d %d\n", (int)r, (int)wc);
>
>   return 0;
> }
>
> Result with glibc-2.1 or libutf8:
>
> 1 97
> 0 0
>
> Result with glibc-2.2:
>
> 1 97
> 0 97
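
Code that has to run on a library showing the second result could store
the null wide character itself whenever mbrtowc returns 0. The following
is only a minimal sketch of that idea; the wrapper name
mbrtowc_store_null is hypothetical and not part of glibc, libutf8, or
the C standard.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical wrapper (not part of any library): behaves like mbrtowc,
   but stores the null wide character itself when mbrtowc returns 0, so
   the caller sees wc == 0 even if the library left *pwc untouched. */
static size_t mbrtowc_store_null(wchar_t *pwc, const char *s, size_t n,
                                 mbstate_t *ps)
{
    size_t r = mbrtowc(pwc, s, n, ps);

    /* A return value of 0 means the null wide character was converted.
       With s == NULL the pwc argument is ignored, so leave it alone. */
    if (r == 0 && s != NULL && pwc != NULL)
        *pwc = L'\0';
    return r;
}
```

On a conforming library the extra store is redundant and harmless, so
the same call site works with either behaviour.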
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>