This is the mail archive of the cygwin mailing list for the Cygwin project.
RE: Formatting command line arguments when starting a Cygwin process from a native process
- From: "David Allsopp" <dra27 at cantab dot net>
- To: <cygwin at cygwin dot com>
- Date: Sat, 7 May 2016 09:45:27 +0200
- Subject: RE: Formatting command line arguments when starting a Cygwin process from a native process
- References: <005c01d1a6e2$30270ba0$907522e0$ at metastack dot com> <CACoZoo1LObZ0zu9X5O6dV4cO4jN+GO28bdRbuDkTMdaKHXpVbQ at mail dot gmail dot com> <000101d1a76d$c37c6b80$4a754280$ at metastack dot com> <967954968 dot 20160506172040 at yandex dot ru>
Andrey Repin wrote:
> Greetings, David Allsopp!
And greetings to you, too!
<snip>
> > I'm not using cmd, or any shell for that matter (that's actually the
> > point) - I am in a native Win32 process invoking a Cygwin process
> > directly using the Windows API's CreateProcess call. As it happens,
> > the program I have already has the arguments for the Cygwin process
> > in an array, but Windows internally requires a single command line
> > string (which is not in any way related to Cmd).
>
> Then all you need is a rudimentary quoting.
Yes, but the question still remains what that rudimentary quoting is - i.e.
I can see how to quote spaces which appear in elements of argv, but I cannot
see how to quote double quotes!
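A minimal sketch of that rudimentary quoting, for illustration only. The helper name join_args_simple is hypothetical (it is not part of Cygwin or the Windows API), and it assumes that wrapping an argument in double quotes is enough to protect embedded spaces. It deliberately rejects any argument containing a double quote, because that is exactly the case with no known encoding:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join argv into one command line string by wrapping
   every argument in double quotes and separating them with spaces.
   Returns a malloc'd string the caller must free, or NULL if allocation
   fails or if any argument contains '"' (the unsolved case). */
static char *join_args_simple(int argc, const char *const argv[])
{
    size_t len = 1; /* terminating NUL */
    char *out, *p;
    int i;

    assert(argv != NULL);

    for (i = 0; i < argc; i++) {
        if (strchr(argv[i], '"') != NULL)
            return NULL; /* no encoding assumed for embedded quotes */
        len += strlen(argv[i]) + 3; /* two quotes plus a separator */
    }

    out = malloc(len);
    if (out == NULL)
        return NULL;

    p = out;
    for (i = 0; i < argc; i++) {
        size_t n = strlen(argv[i]);
        if (i > 0)
            *p++ = ' ';
        *p++ = '"';
        memcpy(p, argv[i], n);
        p += n;
        *p++ = '"';
    }
    *p = '\0';
    return out;
}
```

For the three arguments callee.exe, bar baz and fo@o this produces "callee.exe" "bar baz" "fo@o" (with the quotes as part of the string), which could then be passed as the command line to CreateProcess.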
> The rest will be handled by getopt when the command line is parsed.
That's at a different level from what I need - I'm interested in how Cygwin's
emulation handles the difference between an operating system which actually
passes argc and argv when creating processes (Posix exec/spawn) and Windows
(which only passes a single command line string). The Microsoft C Runtime and
Windows have a "clear" (at least by MS standards) specification of how that
single string gets converted to argv; I'm trying to determine Cygwin's
equivalent - getopt definitely isn't part of that.
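For comparison, here is a sketch of quoting one argument under the Microsoft C runtime's documented rules: backslashes are literal unless they precede a double quote, 2n backslashes before a quote yield n backslashes and a delimiter, and 2n+1 backslashes before a quote yield n backslashes and a literal quote. The function name msvcrt_quote_arg is hypothetical. This describes the Microsoft convention only; whether the Cygwin DLL accepts the same encoding is the open question here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: quote a single argument so that a parser following
   the Microsoft C runtime rules recovers it unchanged. Returns a malloc'd
   string the caller must free, or NULL on allocation failure.
   NOT verified against Cygwin's own command line parser. */
static char *msvcrt_quote_arg(const char *arg)
{
    size_t n, i = 0;
    char *out, *p;

    assert(arg != NULL);
    n = strlen(arg);

    /* Worst case every input character becomes two output characters,
       plus the two surrounding quotes and the terminating NUL. */
    out = malloc(2 * n + 3);
    if (out == NULL)
        return NULL;

    p = out;
    *p++ = '"';
    while (i < n) {
        size_t slashes = 0;

        /* Count a run of backslashes. */
        while (i < n && arg[i] == '\\') {
            slashes++;
            i++;
        }

        if (i == n) {
            /* Run precedes the closing quote: double it. */
            memset(p, '\\', 2 * slashes);
            p += 2 * slashes;
        } else if (arg[i] == '"') {
            /* Run precedes a literal quote: double it and add one more. */
            memset(p, '\\', 2 * slashes + 1);
            p += 2 * slashes + 1;
            *p++ = '"';
            i++;
        } else {
            /* Backslashes not followed by a quote stay as they are. */
            memset(p, '\\', slashes);
            p += slashes;
            *p++ = arg[i++];
        }
    }
    *p++ = '"';
    *p = '\0';
    return out;
}
```

Joining the quoted arguments with single spaces gives a command line that a program built against the Microsoft C runtime should split back into the original argv.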
> >> However, I've found Windows's interpretation to be inconsistent, so
> >> often have to play with it to find what the "right combination" is
> >> for a particular instance.
> >>
> >> I find echoing the parameters to a temporary text file and then
> >> using the file as input to be more reliable and easier to
> >> troubleshoot, and it breaks apart whether it is Windows cli
> >> inconsistencies or receiving program issues very nicely with the
> >> text file content as an intermediary
>
> > This is an OK tack, but I don't wish to do this by experimentation
> > and get caught out later by a case I didn't think of, so what I'm
> > trying to determine is *exactly* how the Cygwin DLL processes the
> > command line via its source code so that I can present it with my
> > argv array converted to a single command line and be certain that
> > the Cygwin DLL will recover the same argv.
>
> > My reading of the relevant sources suggests that with globbing
> > disabled, backslash escape sequences are *never* interpreted (since
> > the quote function returns early - dcrt0.cc, line 171). If there is
> > no way of encoding the double quote character, then perhaps I have
> > to run with globbing enabled but ensure that the globify function
> > will never actually expand anything - but as that's a lot of work, I
> > was wondering if I was missing something with the simpler "noglob"
> > case.
>
> The point being, when you bypass the shell and enter direct process
> execution, you don't need much shell magic at all.
> Shell conventions are designed to ease interaction between system and
> operator.
> But when you have a system talking to a system, you can be very literal.
Indeed, which is why I'm trying to avoid the shell! But I can't be entirely
literal, because Posix and Windows are not compatible, so I need to
determine precisely how Cygwin's emulation works... and so far, it doesn't
seem to be a terribly clearly defined animal!
So, I'm resorting to C files to try to demonstrate the problem further.
spawn.cc seems to suggest that there should be some kind of escaping
available, but I'm struggling to follow the code. Consider these two:
callee.c
  #include <stdio.h>

  /* Print argc and each element of argv exactly as received. */
  int main (int argc, char* argv[])
  {
    int i;
    printf("argc = %d\n", argc);
    for (i = 0; i < argc; i++) {
      printf("argv[%d] = %s\n", i, *argv++);
    }
    return 0;
  }
caller.c
  #include <windows.h>
  #include <stdio.h>

  int main (void)
  {
    LPTSTR commandLine;
    STARTUPINFO startupInfo = {sizeof(STARTUPINFO), NULL, NULL, NULL, 0, 0,
                               0, 0, 0, 0, 0, 0, 0, 0, NULL, NULL, NULL, NULL};
    PROCESS_INFORMATION process = {NULL, NULL, 0, 0};

    /* Double quotes protect the @, the newline and the space. */
    commandLine = "callee.exe \"@\"te\"\n\"st fo@o bar\" \"baz fliggle *";

    /* Start the Cygwin program directly, with no shell in between. */
    if (!CreateProcess("callee.exe", commandLine, NULL, NULL, FALSE, 0,
                       NULL, NULL, &startupInfo, &process)) {
      printf("Error spawning process!\n");
      return 1;
    } else {
      WaitForSingleObject(process.hProcess, INFINITE);
      CloseHandle(process.hThread);
      CloseHandle(process.hProcess);
      return 0;
    }
  }
If you compile and run them as follows:
  $ gcc -o callee callee.c
  $ i686-w64-mingw32-gcc -o caller caller.c
  $ export CYGWIN=noglob # Or the * will be expanded
  $ ./caller
then the output is as required:
argc = 6
argv[0] = callee
argv[1] = @te
st
argv[2] = fo@o
argv[3] = bar baz
argv[4] = fliggle
argv[5] = *
But if I want to embed an actual " character in any of those arguments, I
cannot see any way to escape it which actually works at the moment. For
example, if you change commandLine in caller.c to be
"callee.exe test\\\" argument"
then the erroneous output is:
argc = 2
argv[0] = callee
argv[1] = test\ argument
where the required output is
argc = 3
argv[0] = callee
argv[1] = test"
argv[2] = argument
Any further clues appreciated. Is it actually even a bug?!
David