This is the mail archive of the cygwin mailing list for the Cygwin project.

Re: [1.7] bug in printf and %ls

On May 15 13:30, Alexey Borzenkov wrote:
> [...]
> It appears that there's a bug in printf with %ls: it refuses to print
> anything at all if the wide string for %ls cannot be represented in
> the current charset. It's interesting that sometimes it behaves
> differently. For example:
> $ mkpasswd -C
> NDGAMES\aborzenkov:unused:11721:10513:U-NDGAMES\aborzenkov,*sidremoved*:/home/aborzenkov:/bin/bash
> $ mkgroup -C
> Notice that in the second case it somehow managed to print the domain
> name and the separator before failing.
> Another example:
> #include <stdio.h>
> #include <locale.h>
> int main(int argc, char** argv)
> {
>   setlocale(LC_ALL, "en_US.CP1252");
>   printf("'%ls'", L"\u0410\u0411\u0412");
>   return 0;
> }
> Prints nothing, i.e. it prints neither of the single quotes. If it
> couldn't represent those characters, I think it should either ignore
> them or try to display them with SO-UTF-8. Making the printf call fail
> like that is, imho, really unexpected.

printf must not decide on its own which charset to use for the widechar
to multibyte conversion.  If you run the same on Linux, you also get
broken output.  It only manages to print the leading quote character.  It
does not print the second quote character, because the widechar to
multibyte conversion failed.  If you check the return value of printf,
you see why:

  if (printf("'%ls'xxx", L"\u0410\u0411\u0412") < 0)
    perror ("\nprintf");

prints "printf: Invalid or incomplete multibyte or wide character"
on Linux as well as on Cygwin.

I'll change mkgroup and mkpasswd to call setlocale and to fall back to
UTF-8 if the locale is "C".
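
Roughly, such a fallback could look like the sketch below.  This is not the
actual mkgroup/mkpasswd change; the function names and the candidate locale
names "C.UTF-8" and "en_US.UTF-8" are assumptions, and which of them exist
differs from system to system.  It only touches LC_CTYPE, which is the
category that governs the %ls conversion.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Nonzero if the locale name is missing or denotes the plain "C"/"POSIX" locale. */
int is_c_locale(const char *name)
{
    return name == NULL || strcmp(name, "C") == 0 || strcmp(name, "POSIX") == 0;
}

/*
 * Hypothetical helper: adopt the user's LC_CTYPE from the environment and,
 * if that yields the "C" locale, try a UTF-8 locale instead.
 * The candidate names are assumptions; availability differs per system.
 * Returns the name of the LC_CTYPE locale in effect afterwards.
 */
const char *init_ctype_with_utf8_fallback(void)
{
    static const char *const candidates[] = { "C.UTF-8", "en_US.UTF-8" };
    const char *cur = setlocale(LC_CTYPE, "");

    if (is_c_locale(cur)) {
        for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
            const char *name = setlocale(LC_CTYPE, candidates[i]);
            if (name != NULL)
                return name;
        }
    }
    /* Query the current setting; this also covers a failed setlocale(""). */
    return setlocale(LC_CTYPE, NULL);
}
```

If none of the candidates is available, the program simply stays in the
"C" locale and %ls keeps failing for non-ASCII strings, so the printf
return value still needs to be checked.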


Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
