Bug 25659

Summary: glob("/foo/*/") may also match regular & other kind of files, not just directories
Product: glibc Reporter: Șpagoveanu <spagoveanu>
Component: glob Assignee: Not yet assigned to anyone <unassigned>
Status: UNCONFIRMED ---    
Severity: normal CC: gadelat, stephane+sourceware
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:

Description Șpagoveanu 2020-03-12 07:45:41 UTC
On Linux, glob(3) assumes that a regular file cannot have d_type == DT_UNKNOWN, irrespective of filesystem support or whether the GLOB_ALTDIRFUNC feature was used.

This results in a pattern like "/foo/*/" also matching non-directories, which is contrary to what the standard requires ("The <slash> character in a pathname shall be explicitly matched by using one or more <slash> characters in the pattern"). At least on Linux, you CANNOT refer to a regular file as "/path/to/file/".
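For illustration only, here is a minimal sketch of the kind of check that does not trust d_type blindly. It is not glibc's code; the helper name entry_is_dir() and its dfd parameter are made up for this example. It falls back to fstatat() whenever the filesystem reports DT_UNKNOWN, and also for DT_LNK, since a symlink has to be followed to know whether it leads to a directory.

```c
#define _DEFAULT_SOURCE   /* exposes d_type and the DT_* macros on glibc */
#include <dirent.h>
#include <fcntl.h>
#include <sys/stat.h>

/* Hypothetical helper: return 1 if entry `e` of the directory open as
 * `dfd` is a directory (or a symlink resolving to one), 0 otherwise. */
int entry_is_dir(int dfd, const struct dirent *e)
{
    struct stat st;

    switch (e->d_type) {
    case DT_DIR:
        return 1;
    case DT_UNKNOWN:  /* the filesystem did not report a type */
    case DT_LNK:      /* the link must be followed to find out */
        /* With flags == 0, fstatat() follows symlinks and fails
         * on a broken one, so such entries are rejected. */
        return fstatat(dfd, e->d_name, &st, 0) == 0 && S_ISDIR(st.st_mode);
    default:
        return 0;     /* DT_REG, DT_FIFO, ...: not a directory */
    }
}
```

A caller would typically pass dirfd(dirp) as dfd while iterating with readdir(dirp), so the extra system call is only paid for entries whose type is not already known.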

See the simple testcase below, using a minix filesystem, which does not support d_type.

You can also check these, about a (fixed) bug in GNU make, caused by its use of GLOB_ALTDIRFUNC:
https://www.mail-archive.com/bug-make@gnu.org/msg11073.html
https://lists.gnu.org/archive/html/bug-make/2018-06/msg00009.html

I once started writing a patch, but gave up because it required rewriting the whole logic of glob_in_dir(). FWIW, the glibc documentation claims that GLOB_ONLYDIR is only a "hint", but the implementation itself assumes more than that.
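Until this is fixed, a caller can treat the result as a hint and verify each match itself. The following is a sketch of such a caller-side workaround, not a proposed glibc patch; path_is_dir() and print_dir_matches() are names invented for this example.

```c
#include <glob.h>
#include <stdio.h>
#include <sys/stat.h>

/* Return 1 if `path` names a directory (following symlinks), else 0. */
int path_is_dir(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/* Expand `pattern`, print to `out` only the matches that really are
 * directories, and return how many were printed. */
size_t print_dir_matches(const char *pattern, FILE *out)
{
    glob_t g = {0};
    size_t i, n = 0;

    if (glob(pattern, 0, NULL, &g) == 0) {
        for (i = 0; i < g.gl_pathc; i++) {
            if (path_is_dir(g.gl_pathv[i])) {
                fprintf(out, "%s\n", g.gl_pathv[i]);
                n++;
            }
        }
    }
    globfree(&g);  /* g was zero-initialized, so this is safe on failure too */
    return n;
}
```

The extra stat() per match costs one system call each, but it rejects both regular files and broken symlinks regardless of what the filesystem reports in d_type.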

--------------x----------------
 # cat >glob.c <<'EOT'; cc -Wall -O2 glob.c -o glob
#include <stdio.h>
#include <glob.h>
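int main(int ac, char **av){
        int i;
        size_t j;       /* gl_pathc is a size_t */
        for(i = 1; i < ac; i++){
                glob_t g = {0};
                if(glob(av[i], 0, 0, &g) == 0){
                        /* print every path the pattern matched */
                        for(j = 0; j < g.gl_pathc; j++)
                                printf("%s\n", g.gl_pathv[j]);
                        globfree(&g);
                }
        }
        return 0;
}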
EOT

# truncate -s1G minix.img; mkfs.minix minix.img
...
# mkdir -p dir; mount minix.img dir
# touch dir/file
# ./glob 'dir/*/'
dir/file
--------------x----------------
Comment 1 Stephane Chazelas 2021-03-13 08:52:14 UTC
Note that this was also reported at https://unix.stackexchange.com/questions/638955/what-could-be-a-cause-for-getdents-returning-different-results-on-2-systems, in that case with a "prl_fs" filesystem (likely related to the Parallels virtualisation software).

I can reproduce it with glibc 2.33, and also with broken symlinks:


$ mkdir -p testdir/dir
$ ln -s /no/such/file testdir/broken
$ ./glob 'testdir/*/'
testdir/broken
testdir/dir/

Note that while the issue can be reproduced with ./*/ or b*/, I can't reproduce it with */:

$ cd testdir
$ ../glob './*/'
./broken
./dir/
$ ../glob 'b*/'
broken
$ ../glob '*/'
dir/
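In the outputs above, the bogus matches are printed without the trailing slash ("./broken") while real directories keep it ("./dir/"). Assuming that observation holds in general, a quick way to flag affected patterns is to count the matches that lack the slash. This is only a sketch; has_trailing_slash() and count_slashless_matches() are hypothetical names.

```c
#include <glob.h>
#include <string.h>

/* Return 1 if `s` is non-empty and ends in '/'. */
int has_trailing_slash(const char *s)
{
    size_t n = strlen(s);
    return n > 0 && s[n - 1] == '/';
}

/* Expand `pattern` (expected to end in '/') and return the number of
 * matches that lack the trailing slash. Returns 0 if nothing matched. */
size_t count_slashless_matches(const char *pattern)
{
    glob_t g = {0};
    size_t i, n = 0;

    if (glob(pattern, 0, NULL, &g) == 0)
        for (i = 0; i < g.gl_pathc; i++)
            if (!has_trailing_slash(g.gl_pathv[i]))
                n++;
    globfree(&g);
    return n;
}
```

A non-zero result for a pattern such as './*/' would indicate that glob() returned something it did not treat as a directory.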