This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules


On 2019-08-30 14:59, Stephen Provine wrote:
>> Cygwin command line parsing has to match Unix shell command line processing,
>> like argument splitting, joining within single or double quotes or after a
>> backslash escaped white space characters, globbing, and other actions normally
>> performed by a shell, when any Cygwin program is invoked from any Windows
>> program e.g. cmd, without those Windows limitations which exclude any use of a
>> backslash escape character except preceding another or a double quote.

> I guess my assumption was that the "winshell" parameter would be used to determine
> when a Cygwin process is called from a non-Cygwin process and that it would be more
> appropriate to use standard Windows command line processing (as limiting as it may
> be) in that case. Once in the Cygwin environment, calls from one process to another
> should obviously process command lines according to Unix shell rules.
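
To make the difference between the two rule sets concrete, here is a rough sketch in Python rather than the actual dcrt0.cc code. The function name split_windows is made up for illustration. It implements only the documented Microsoft backslash and double quote rules, and skips the special handling of the program name and of doubled quotes inside a quoted string. Python's shlex.split stands in for Unix shell word splitting, without globbing.

```python
import shlex


def split_windows(cmdline):
    """Simplified Microsoft-style splitting (hypothetical helper, not Cygwin code)."""
    args, current = [], []
    in_quotes = False
    have_arg = False  # lets an empty "" still count as an argument
    i, n = 0, len(cmdline)
    while i < n:
        ch = cmdline[i]
        if ch == "\\":
            # Measure the run of backslashes starting here.
            j = i
            while j < n and cmdline[j] == "\\":
                j += 1
            count = j - i
            if j < n and cmdline[j] == '"':
                # Before a quote, each pair of backslashes yields one backslash.
                current.append("\\" * (count // 2))
                if count % 2:
                    current.append('"')        # odd count: literal quote
                else:
                    in_quotes = not in_quotes  # even count: quote delimits
                j += 1
            else:
                # Not before a quote: backslashes are ordinary characters.
                current.append("\\" * count)
            have_arg = True
            i = j
        elif ch == '"':
            in_quotes = not in_quotes
            have_arg = True
            i += 1
        elif ch in " \t" and not in_quotes:
            if have_arg:
                args.append("".join(current))
                current, have_arg = [], False
            i += 1
        else:
            current.append(ch)
            have_arg = True
            i += 1
    if have_arg:
        args.append("".join(current))
    return args


line = r'a\ b "c d" e\"f'
print(split_windows(line))  # backslash before the space stays literal
print(shlex.split(line))    # backslash escapes the space, joining "a b"
```

The same input gives four arguments under the Windows-style rules and three under the Unix-style rules, which is the mismatch this thread is about.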

Not being in the same Cygwin process group, and lacking the appropriate interface
info, indicates that the invoker was not Cygwin.

Cygwin command line file name globs can match names containing any UTF-8
character except nulls and forward or backward slashes (the latter excluded for
Windows compatibility). That includes characters Windows does not support, such
as leading and trailing spaces and dots, and a single glob can expand to
thousands of file name arguments on the command line, e.g.

	$ echo /var/log/* | wc -lwmcL
	      1   66858 2903078 2903078 2903077

shows I need to clean up my /var/log directory as it contains 64K+ files with
names totalling 2234498 chars/bytes, plus 668579 for paths and spaces, plus a
newline terminator.
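
As a quick check of that arithmetic, here is a small Python sketch. It assumes every expanded argument is the prefix "/var/log/" plus a file name, with single spaces between arguments, which is what echo with that glob would print. The variable names are only for illustration.

```python
file_count = 66858      # word count reported by wc -w
name_chars = 2234498    # characters in the file names alone
prefix = "/var/log/"    # assumed common prefix of every argument

path_chars = file_count * len(prefix)  # one prefix per file
space_chars = file_count - 1           # separators between arguments
newline_chars = 1                      # echo's trailing newline

overhead = path_chars + space_chars
total = name_chars + overhead + newline_chars

print(overhead)  # 668579
print(total)     # 2903078
```

The total agrees with the character and byte counts from wc, and the longest line is one less because it excludes the newline.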

Some file names contain characters that Windows does not support, and those
characters are mapped into the UTF-16LE BMP Private Use Area (PUA) by adding
0xF000. Characters that a non-UTF-8 interface encoding cannot represent are
instead written as ^X (CAN, 0x18) followed by a BMP UTF-8 sequence. Both allow
conversion to UTF-16LE, at the cost of weird characters in the displayed names.
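
The PUA offset can be illustrated with a minimal Python sketch. This is not the Cygwin implementation. The helper names to_pua and from_pua are hypothetical, and the set of reserved characters and the treatment of trailing spaces and dots are assumptions made for the example.

```python
PUA_OFFSET = 0xF000

# Assumption: characters Windows rejects in file names; not taken from the post.
ASSUMED_RESERVED = set('"*:<>?|')


def to_pua(name):
    """Shift reserved characters, and trailing spaces/dots, into the PUA."""
    chars = [chr(ord(c) + PUA_OFFSET) if c in ASSUMED_RESERVED else c
             for c in name]
    i = len(chars) - 1
    while i >= 0 and chars[i] in " .":
        chars[i] = chr(ord(chars[i]) + PUA_OFFSET)
        i -= 1
    return "".join(chars)


def from_pua(name):
    """Undo to_pua by shifting PUA code points in the ASCII range back down."""
    return "".join(
        chr(ord(c) - PUA_OFFSET) if PUA_OFFSET < ord(c) < PUA_OFFSET + 0x80 else c
        for c in name
    )


stored = to_pua("a:b")
print([hex(ord(c)) for c in stored])  # the colon becomes U+F03A
print(stored.encode("utf-16-le"))     # bytes as stored in UTF-16LE
print(from_pua(stored))               # round trip back to the original name
```

A tool that does not know about the mapping would display the U+F03A code point as an odd glyph, which is the cost mentioned above.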

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.


