bug report: shell expansion in argv[] processing sensitive to LANG, e.g. "ls: cannot access '*.pdf': No such file or directory", but works okay in bash

Jay Libove libove@felines.org
Tue Mar 24 07:18:04 GMT 2020


Hi Cygwin team,
Here is a consolidated bug report based on the discussion of recent days, which I started under the subject "shell expansion produces e.g. "ls: cannot access '*.pdf': No such file or directory" in Windows CMD shell, but works okay in bash" (thread starter: https://cygwin.com/pipermail/cygwin/2020-March/244161.html ).
Many thanks to Paul, Andrey, and others for helping me nail down where and how it seems to be happening.
My apologies in advance that my coding days are long behind me, so I'm not in a position to include a proposed code fix.

cygcheck output attached (lightly modified to redact a couple of personal items).

Problem:
Under certain circumstances (see Steps to Reproduce, below), Cygwin programs' built-in argv[] globbing fails with an unexpected error of the form:
"{programName}: cannot access '{glob pattern}': No such file or directory"
e.g.
"ls: cannot access '*.pdf': No such file or directory"
.. despite the fact that files matching the pattern (e.g. *.pdf) definitely exist.

Steps to Reproduce:
* Have some files in the local directory with accented characters in their names, e.g.:
C:> mkdir c:\temp\test
C:> cd c:\temp\test
C:> touch héllo.pdf
C:> touch gòodbye.pdf
C:> touch normal.pdf
* DON'T have the LANG= environment variable set to anything
* NOT in bash or Cygwin Terminal, but rather within Windows CMD.exe, execute a Cygwin command which needs to do its own file name globbing because the Windows CMD.exe shell does not do so for it, e.g.
C:> ls *.pdf
C:> cat *.pdf
These will produce "ls: cannot access '*.pdf': No such file or directory"
Although, curiously,
C:> ls *or*
does correctly produce:
normal.pdf

Also, display output of the áccènted characters is incomplete:
C:> ls
'g'$'\303\262''odbye.pdf'  'h'$'\303\251''llo.pdf'   normal.pdf
C:> bash
jay_l@DESKTOP-I9MRIE3 /cygdrive/c/Temp
$ ls
'g'$'\303\262''odbye.pdf'  'h'$'\303\251''llo.pdf'   normal.pdf
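
As an aside, the octal escapes in the listing above can be checked by hand. The following is only an illustrative Python sketch, not anything from Cygwin itself: it treats \303\262 and \303\251 as raw bytes and decodes them first as UTF-8, then as plain ASCII. The helper name decode_or_none is made up for this example, and the idea that an unset LANG leaves programs with an ASCII-only character set is my assumption, not something confirmed in the thread.

```python
# Raw file name bytes, exactly as ls printed them in its $'\303\262' escapes.
raw_names = [b"g\303\262odbye.pdf", b"h\303\251llo.pdf", b"normal.pdf"]


def decode_or_none(raw, encoding):
    """Hypothetical helper: decode raw bytes, or return None if they are invalid."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Interpreted as UTF-8, every name is valid and the accents come back.
utf8_names = [decode_or_none(name, "utf-8") for name in raw_names]

# Interpreted as ASCII (assumed stand-in for "no LANG set"), only the plain name survives.
ascii_names = [decode_or_none(name, "ascii") for name in raw_names]

print(utf8_names)   # ['gòodbye.pdf', 'héllo.pdf', 'normal.pdf']
print(ascii_names)  # [None, None, 'normal.pdf']
```

So the bytes on disk are ordinary UTF-8 for ò and é; what changes is whether the program displaying them is willing to treat them as characters.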


Analysis:
I've verified that it's not about case sensitivity. That is, it's not a matter of ls *.pdf vs. ls *.PDF.
If these test commands are run either under bash.exe or within a Cygwin Terminal window, the problem does not occur.
I've verified that the Windows system locale (per Windows' Region setting) actually doesn't matter. (I've reproduced this both on systems in Region Spain with language English-International and English-Ireland, and in a VM with a bog standard vanilla US English Windows).

Credits to Paul for suggesting deleting files one by one until the problem goes away, and to Andrey for pointing out `locale` and the LANG= setting.

Set LANG=en_US.UTF-8, e.g.
C:> set LANG=en_US.UTF-8
.. and the problem goes away.
C:> ls *.pdf
gòodbye.pdf
héllo.pdf
normal.pdf
C:> ls
gòodbye.pdf
héllo.pdf
normal.pdf
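
For anyone launching Cygwin programs from a script rather than from an interactive CMD window, the same workaround could probably be applied per process. This is a rough Python sketch under a few assumptions: the helper name env_with_default_lang is invented for this example, en_US.UTF-8 is simply the value that worked for me above, and the commented-out call assumes a Cygwin ls.exe is on PATH.

```python
import os


def env_with_default_lang(base_env=None, lang="en_US.UTF-8"):
    """Hypothetical helper: copy an environment and fill in LANG only if it is unset or empty."""
    env = dict(os.environ if base_env is None else base_env)
    if not env.get("LANG"):
        env["LANG"] = lang
    return env


# Possible usage from native Windows Python (assumes Cygwin's ls is on PATH).
# The pattern is passed through unexpanded, so the Cygwin program does the globbing:
#
#   import subprocess
#   subprocess.run(["ls", "*.pdf"], env=env_with_default_lang())
```

The copy leaves the caller's environment untouched, so only the launched process sees the LANG setting.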

Interestingly, Andrey mentioned that he sets LANG=ru_RU.CP866 and he doesn't see the problem. When I tried that exact setting, I still had the problem.
So it's maybe not just that LANG must be set to *something*, but that somehow LANG must be set to something that matches something in Windows? (Sorry, I know that's nearly uselessly vague).


In summary, it appears that the argv[] globbing code which gets compiled into Cygwin programs behaves a bit differently from the shell globbing code within bash.exe.
And this produces unexpected globbing failures.


Thanks to all the Cygwin maintainers for this amazing software, for so many years!
-Jay


-------------- next part --------------
A non-text attachment was scrubbed...
Name: cygcheck.out
Type: application/octet-stream
Size: 59952 bytes
Desc: cygcheck.out
URL: <http://cygwin.com/pipermail/cygwin/attachments/20200324/d88b70be/attachment.obj>

