Inconsistent command-line parsing in case of UTF-8 quoted arguments

Jérôme Froissart software@froissart.eu
Tue Oct 6 21:36:07 GMT 2020

Thanks for your replies.
This issue only happens when a program is run from cmd.exe, not from a
Cygwin bash shell.
This is important for me, since I discovered this bug in a project
that must be run from the Windows graphical shell (i.e. there is no
sensible way to run it through Cygwin and Bash).

> Please show us the output from "uname -a" and "locale" run from the bash prompt.

> Please provide the results of "locale" command right before running your test
> binary.
Here are more detailed steps to reproduce the issue (along with
answers to your requests about `uname`, `locale`, etc.).
(I mostly reproduced what billziss-gh had done before, so I do not take
all the credit :D)

Here is an example C file
    $ cat example.c
    #include <stdio.h>

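    /* Win32 API: returns the raw command line of the current process. */
    const char *GetCommandLineA(void);

    int main(int argc, char *argv[])
    {
        /* Print the unparsed command line first... */
        const char *s = GetCommandLineA();
        printf("C=%s\n", s);

        /* ...then each argument as parsed into argv. */
        for (int i = 0; i < argc; i++)
            printf("%d=%s\n", i, argv[i]);

        return 0;
    }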

I have built it with gcc from Cygwin
    $ gcc -o binary example.c

Running it from the same Cygwin bash prompt works as expected
    $ uname -a
    CYGWIN_NT-10.0 XPS 3.1.5(0.340/5/3) 2020-06-01 08:59 x86_64 Cygwin
    # (XPS is my Windows machine name)

    $ locale

    $ which gcc

    # The following runs as expected
    $ ./binary.exe "foo bar" "Jérôme"
    1=foo bar

Now, let's start a Windows shell (cmd.exe)
Note that I had to copy cygwin1.dll from my Cygwin installation
directory, otherwise binary.exe would not start.
I do not know whether there is a `locale` equivalent in Windows
command prompt, so I merely ran my program.
    C:\Users\Public>binary.exe "foo bar" "Jérôme"
    C=binary.exe  "foo bar" "J□r□me"
    1=foo bar

This behaviour is not expected and is quite inconsistent with what
happened through Bash.
Besides the "strange squares" that appear on the first line, and the
extra space after binary.exe, I especially did not expect "Jérôme" to
remain quoted as a second argument.

Sorry for the delay in my answer. I hope this is now clear; please ask
me for more examples or further investigation if you need them.
Thanks for your help.


