This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: Formatting command line arguments when starting a Cygwin process from a native process
- From: Peter Rosin <peda at lysator dot liu dot se>
- To: cygwin at cygwin dot com, David Allsopp <dra-news at metastack dot com>
- Date: Mon, 9 May 2016 11:43:19 +0200
- Subject: Re: Formatting command line arguments when starting a Cygwin process from a native process
- References: <005c01d1a6e2$30270ba0$907522e0$ at metastack dot com> <CACoZoo1LObZ0zu9X5O6dV4cO4jN+GO28bdRbuDkTMdaKHXpVbQ at mail dot gmail dot com> <000101d1a76d$c37c6b80$4a754280$ at metastack dot com> <967954968 dot 20160506172040 at yandex dot ru> <006301d1a834$6ccd1380$46673a80$ at cantab dot net>
Hi!
On 2016-05-07 09:45, David Allsopp wrote:
> Andrey Repin wrote:
>> Greetings, David Allsopp!
>
> And greetings to you, too!
>
> <snip>
>
>>> I'm not using cmd, or any shell for that matter (that's actually the
>>> point) - I am in a native Win32 process invoking a Cygwin process
>>> directly using the Windows API's CreateProcess call. As it happens,
>>> the program I have already has the arguments for the Cygwin process
>>> in an array, but Windows internally requires a single command line
>>> string (which is not in any way related to Cmd).
>>
>> Then all you need is a rudimentary quoting.
>
> Yes, but the question still remains what that rudimentary quoting is - i.e.
> I can see how to quote spaces which appear in elements of argv, but I cannot
> see how to quote double quotes!
>
>> The rest will be handled by getopt when the command line is parsed.
>
> That's outside my required level - I'm interested in Cygwin's emulation
> handling the difference between an operating system which actually passes
> argc and argv when creating processes (Posix exec/spawn) and Windows (which
> only passes a single string command line). The Microsoft C Runtime and
> Windows have a "clear" (at least by MS standards) specification of how that
> single string gets converted to argv, I'm trying to determine Cygwin's -
> getopt definitely isn't part of that.
>
>>>> However, I've found Windows's interpretation to be inconsistent, so
>>>> often have to play with it to find what the "right combination" is
>>>> for a particular instance.
>>>>
>>>> I find echoing the parameters to a temporary text file and then
>>>> using the file as input to be more reliable and easier to
>>>> troubleshoot, and it breaks apart whether it is Windows cli
>>>> inconsistencies or receiving program issues very nicely with the
>>>> text file content as an intermediary
>>
>>> This is an OK tack, but I don't wish to do this by experimentation
>>> and get caught out later by a case I didn't think of, so what I'm
>>> trying to determine is *exactly* how the Cygwin DLL processes the
>>> command line via its source code so that I can present it with my
>>> argv array converted to a single command line and be certain that
>>> the Cygwin DLL will recover the same argv.
>>
>>> My reading of the relevant sources suggests that with globbing
>>> disabled, backslash escape sequences are *never* interpreted (since
>>> the quote function returns early - dcrt0.cc, line 171). If there is
>>> no way of encoding the double quote character, then perhaps I have
>>> to run with globbing enabled but ensure that the globify function
>>> will never actually expand anything - but as that's a lot of work, I
>>> was wondering if I was missing something with the simpler "noglob" case.
>>
>> The point being, when you bypass the shell and go to direct process
>> execution, you don't need much shell magic at all.
>> Shell conventions are designed to ease interaction between system and
>> operator.
>> But when you have a system talking to a system, you can be very literal.
>
> Indeed, which is why I'm trying to avoid the shell! But I can't be entirely
> literal, because Posix and Windows are not compatible, so I need to
> determine precisely how Cygwin's emulation works... and so far, it doesn't
> seem to be a terribly clearly defined animal!
>
> So, resorting to C files to try to demonstrate it further. spawn.cc seems to
> suggest that there should be some kind of escaping available, but I'm
> struggling to follow the code. Consider these two:
>
> callee.c
> #include <stdio.h>
> int main (int argc, char* argv[])
> {
>   int i;
>
>   printf("argc = %d\n", argc);
>   for (i = 0; i < argc; i++) {
>     printf("argv[%d] = %s\n", i, *argv++);
>   }
>   return 0;
> }
>
> caller.c
> #include <windows.h>
> #include <stdio.h>
>
> int main (void)
> {
>   LPTSTR commandLine;
>   STARTUPINFO startupInfo = {sizeof(STARTUPINFO), NULL, NULL, NULL, 0, 0,
>                              0, 0, 0, 0, 0, 0, 0, 0, NULL, NULL, NULL, NULL};
>   PROCESS_INFORMATION process = {NULL, NULL, 0, 0};
>
>   commandLine = "callee.exe \"@\"te\"\n\"st fo@o bar\" \"baz baz *";
>   if (!CreateProcess("callee.exe", commandLine, NULL, NULL, FALSE, 0,
>                      NULL, NULL, &startupInfo, &process)) {
>     printf("Error spawning process!\n");
>     return 1;
>   } else {
>     WaitForSingleObject(process.hProcess, INFINITE);
>     CloseHandle(process.hThread);
>     CloseHandle(process.hProcess);
>     return 0;
>   }
> }
>
> If you compile and run them as follows:
>
> $ gcc -o callee callee.c
> $ i686-w64-mingw32-gcc -o caller caller.c
> $ export CYGWIN=noglob # Or the * will be expanded
> $ ./caller
>
> then the output is as required:
> argc = 6
> argv[0] = callee
> argv[1] = @te
> st
> argv[2] = fo@o
> argv[3] = bar baz
> argv[4] = baz
> argv[5] = *
>
> But if I want to embed an actual " character in any of those arguments, I
> cannot see any way to escape it which actually works at the moment. For
> example, if you change commandLine in caller.c to be "callee.exe test\\\"
> argument" then the erroneous output is:
>
> argc = 2
> argv[0] = callee
> argv[1] = test\ argument
>
> where the required output is
>
> argc = 3
> argv[0] = callee
> argv[1] = test"
> argv[2] = argument
>
> Any further clues appreciated. Is it actually even a bug?!
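For reference, the Microsoft C runtime convention that the quoted message contrasts with Cygwin can be sketched as below. This is only a minimal sketch of the documented MS rules (a run of n backslashes before an embedded double quote becomes 2n+1 backslashes, and backslashes before the closing quote are doubled); the function name quote_arg_msvcrt is hypothetical, and nothing here claims that the Cygwin DLL parses this encoding the same way. The test\" attempt above is exactly this encoding.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of any library): quote one argument so
   that a program using the Microsoft C runtime's argv parsing recovers
   it unchanged.  Returns a malloc'd string, or NULL if malloc fails. */
char *quote_arg_msvcrt(const char *arg)
{
    size_t len = strlen(arg);
    /* Worst case: every character doubled, two quotes, and a NUL. */
    char *out = malloc(2 * len + 3);
    char *p = out;
    size_t backslashes = 0;  /* length of the current run of backslashes */
    size_t i;

    if (out == NULL)
        return NULL;

    /* Non-empty arguments without whitespace or quotes pass through. */
    if (len > 0 && strpbrk(arg, " \t\n\v\"") == NULL) {
        memcpy(out, arg, len + 1);
        return out;
    }

    *p++ = '"';
    for (i = 0; i < len; i++) {
        if (arg[i] == '\\') {
            backslashes++;
            *p++ = '\\';
        } else if (arg[i] == '"') {
            /* Double the preceding backslashes, then escape the quote. */
            while (backslashes > 0) {
                *p++ = '\\';
                backslashes--;
            }
            *p++ = '\\';
            *p++ = '"';
        } else {
            backslashes = 0;
            *p++ = arg[i];
        }
    }
    /* Double trailing backslashes so the closing quote is not escaped. */
    while (backslashes > 0) {
        *p++ = '\\';
        backslashes--;
    }
    *p++ = '"';
    *p = '\0';
    return out;
}
```

Calling quote_arg_msvcrt on the five-character argument test" gives "test\"" (including the outer quotes); the caller frees the result.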
I think Cygwin emulates POSIX shell style command line parsing when
invoked from a Win32 process (as you are doing). So, try single quotes:
commandLine = "callee.exe \"@\"te\"\n\"st fo@o bar\" \"baz baz '*' '\"\\'\"'";
I get this (w/o noglob):
argc = 7
argv[0] = callee
argv[1] = @te
st
argv[2] = fo@o
argv[3] = bar baz
argv[4] = baz
argv[5] = *
argv[6] = "'"
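If the single-quote approach holds up, building the command line from an argv array could look roughly like the sketch below. It is based only on the behaviour observed above (run without noglob), not on a reading of the Cygwin sources; the function name build_single_quoted_cmdline is hypothetical. It assumes argv[0] needs no quoting and, rather than guess at escaping rules, it refuses any argument that contains a single quote or a backslash.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join argv into one command line string, wrapping
   every argument after argv[0] in single quotes.  Returns a malloc'd
   string, or NULL if an argument contains ' or \ (deliberately not
   handled here), if argc < 1, or if malloc fails. */
char *build_single_quoted_cmdline(int argc, const char *const argv[])
{
    size_t total = 1;  /* terminating NUL */
    char *out, *p;
    int i;

    if (argc < 1)
        return NULL;

    for (i = 0; i < argc; i++) {
        /* Assumption: only these two characters are unsafe in quotes. */
        if (i > 0 && strpbrk(argv[i], "'\\") != NULL)
            return NULL;
        total += strlen(argv[i]) + 3;  /* separator and two quotes */
    }

    out = malloc(total);
    if (out == NULL)
        return NULL;

    p = out;
    for (i = 0; i < argc; i++) {
        size_t n = strlen(argv[i]);

        if (i > 0) {
            *p++ = ' ';
            *p++ = '\'';
        }
        memcpy(p, argv[i], n);  /* argv[0] is copied verbatim */
        p += n;
        if (i > 0)
            *p++ = '\'';
    }
    *p = '\0';
    return out;
}
```

For the argv that David wanted (callee.exe, test", argument) this yields callee.exe 'test"' 'argument', which can then be passed to CreateProcess. Whether it also works with CYGWIN=noglob set is not something I have checked.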
Cheers,
Peter