Bug 25659 - glob("/foo/*/") may also match regular & other kinds of files, not just directories
Status: UNCONFIRMED
Alias: None
Product: glibc
Classification: Unclassified
Component: glob
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2020-03-12 07:45 UTC by Șpagoveanu
Modified: 2021-03-13 09:16 UTC

Description Șpagoveanu 2020-03-12 07:45:41 UTC
On Linux, glob(3) assumes that a regular file cannot have d_type == DT_UNKNOWN, irrespective of filesystem support or whether the GLOB_ALTDIRFUNC feature was used.

This results in a pattern like "/foo/*/" also matching non-directories, which is contrary to what the standard requires ("The <slash> character in a pathname shall be explicitly matched by using one or more <slash> characters in the pattern"). At least on Linux, you CANNOT refer to a regular file as "/path/to/file/".

See the simple testcase below, using a minix filesystem, which does not support d_type.
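To check whether a given filesystem really leaves d_type unset, the directory can be read directly with readdir(3). This is only a minimal sketch, assuming glibc on Linux; count_unknown_dtype is a hypothetical helper name, not something glob(3) or glibc provides.

```c
#define _DEFAULT_SOURCE   /* exposes d_type and DT_UNKNOWN on glibc */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: count the entries of `dir` whose d_type is
 * DT_UNKNOWN, optionally listing them on `out` (may be NULL).
 * Returns -1 if the directory cannot be opened.
 */
static int count_unknown_dtype(const char *dir, FILE *out)
{
        DIR *d;
        struct dirent *e;
        int unknown = 0;

        assert(dir != NULL);
        d = opendir(dir);
        if (d == NULL)
                return -1;
        while ((e = readdir(d)) != NULL) {
                /* "." and ".." are not interesting here */
                if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
                        continue;
                if (e->d_type == DT_UNKNOWN) {
                        if (out != NULL)
                                fprintf(out, "%s: DT_UNKNOWN\n", e->d_name);
                        unknown++;
                }
        }
        closedir(d);
        return unknown;
}
```

A non-zero result for a directory that contains regular files would suggest the filesystem is one where the behaviour described here can show up.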

You can also check these, about a (fixed) bug in GNU make, caused by its use of GLOB_ALTDIRFUNC:
https://www.mail-archive.com/bug-make@gnu.org/msg11073.html
https://lists.gnu.org/archive/html/bug-make/2018-06/msg00009.html

Some time ago I started writing a patch, but I gave up because I would have had to rewrite the whole logic of glob_in_dir(). FWIW, the glibc documentation claims that GLOB_ONLYDIR is only a "hint", but the implementation itself assumes more than that.
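Until glob(3) itself is fixed, a caller can treat the result of a "/foo/*/" pattern as a hint too, and confirm each match with stat(2). The following is a rough caller-side sketch, not a patch for glob_in_dir(); is_directory, glob_dirs and the dir_cb callback type are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <glob.h>
#include <stddef.h>
#include <sys/stat.h>

/* Return 1 if `path` names a directory (symlinks are followed), else 0. */
static int is_directory(const char *path)
{
        struct stat st;

        assert(path != NULL);
        return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/* Hypothetical callback type: invoked once per confirmed directory. */
typedef void (*dir_cb)(const char *path, void *ctx);

/*
 * Hypothetical helper: expand `pattern` with glob(3), then keep only the
 * matches that stat(2) confirms to be directories. `cb` may be NULL.
 * Returns the number of directories found, or -1 on a glob error other
 * than GLOB_NOMATCH.
 */
static int glob_dirs(const char *pattern, dir_cb cb, void *ctx)
{
        glob_t g = {0};
        size_t i;
        int n = 0;
        int rc = glob(pattern, 0, NULL, &g);

        if (rc == 0) {
                for (i = 0; i < g.gl_pathc; i++) {
                        /* regular files and broken symlinks fail this check */
                        if (!is_directory(g.gl_pathv[i]))
                                continue;
                        if (cb != NULL)
                                cb(g.gl_pathv[i], ctx);
                        n++;
                }
        }
        globfree(&g);
        return (rc == 0 || rc == GLOB_NOMATCH) ? n : -1;
}
```

The extra stat() per match costs one system call each, but it does not depend on what the filesystem reports in d_type.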

--------------x----------------
 # cat >glob.c <<'EOT'; cc -Wall -O2 glob.c -o glob
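#include <stdio.h>
#include <glob.h>
/* Print every path matched by each pattern given on the command line. */
int main(int ac, char **av){
        int i;
        size_t j;       /* gl_pathc is a size_t */
        for(i = 1; i < ac; i++){
                glob_t g = {0};
                if(glob(av[i], 0, 0, &g) == 0)
                        for(j = 0; j < g.gl_pathc; j++)
                                printf("%s\n", g.gl_pathv[j]);
                globfree(&g);
        }
        return 0;
}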
EOT

# truncate -s1G minix.img; mkfs.minix minix.img
...
# mkdir -p dir; mount minix.img dir
# touch dir/file
# ./glob 'dir/*/'
dir/file
--------------x----------------
Comment 1 Stephane Chazelas 2021-03-13 08:52:14 UTC
Note that this was also reported at https://unix.stackexchange.com/questions/638955/what-could-be-a-cause-for-getdents-returning-different-results-on-2-systems, in that case on a "prl_fs" filesystem (likely related to the Parallels virtualisation software).

I can reproduce it with glibc 2.33, and also with broken symlinks:


$ mkdir -p testdir/dir
$ ln -s /no/such/file testdir/broken
$ ./glob 'testdir/*/'
testdir/broken
testdir/dir/

Note that while the issue can be reproduced with ./*/ or b*/, I can't reproduce it with */:

$ cd testdir
$ ../glob './*/'
./broken
./dir/
$ ../glob 'b*/'
broken
$ ../glob '*/'
dir/