This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
fnmatch() behaves oddly with *s and FNM_LEADING_DIR
- To: libc-alpha at sourceware dot cygnus dot com
- Subject: fnmatch() behaves oddly with *s and FNM_LEADING_DIR
- From: cjw44 at flatline dot org dot uk
- Date: Sun, 15 Oct 2000 16:13:39 +0100
- Cc: 59829 at bugs dot debian dot org
>Submitter-Id: net
>Originator: Colin Watson
>Organization: riva.ucam.org
>Confidential: no
>Synopsis: fnmatch() with FNM_LEADING_DIR matches * inconsistently
>Severity: non-critical
>Priority: low
>Category: libc
>Class: sw-bug
>Release: libc-2.1.95
>Environment:
Host type: i386-pc-linux-gnu
System: Linux riva 2.4.0-test2 #1 Sun Jun 25 22:05:08 BST 2000 i686 unknown
Architecture: i686
Addons: linuxthreads
Build CC: gcc
Compiler version: 2.95.2 20000220 (Debian GNU/Linux)
Kernel headers: UTS_RELEASE
Symbol versioning: yes
Build static: yes
Build shared: yes
Build pic-default: no
Build profile: yes
Build omitfp: no
Build bounded: no
Build static-nss: no
Stdio: libio
>Description:
This bug was originally reported in Debian bug #59829,
<URL:http://bugs.debian.org/59829>. I'll repeat the problem description
here, with a little editing.
Ryan Tracey <ryant@thawte.com> wrote:
| Tar no longer excludes the files and directories that previous versions
| used to exclude (sorry, I have no idea with which version the change
| occurred, but it was within the past month or so). For example, to
| exclude all the MS Frontpage extensions files and directories in the
| 'webspace' directory tree, I used to do this:
|
| tar xcf /var/tmp/website.tgz --exclude=_\* webspace\
|
| This used to exclude all the _vti_cnf/ and _private/ directories that
| don't need to be on the main website.
tar checks whether a name is excluded by calling the libc function
fnmatch() with FNM_FILE_NAME and FNM_LEADING_DIR. With these flags, a
pattern ending in a wildcard, such as "_*", only matches a string made
up of a slash-free component matching the pattern, followed by exactly
one slash and one further component: that is, the pattern is matched
against everything but the final component of the file name and the
preceding slash. "*" will match "foo/bar", but not "foo" or
"foo/bar/baz", using these flags, even though the pattern "foo" matches
all three of these strings using the same flags. This causes tar a good
deal of confusion.
There are two other places in tar's names.c where FNM_LEADING_DIR is
used; however, they don't use FNM_FILE_NAME, and there the behaviour of
fnmatch() is consistent whether or not the pattern contains wildcards.
The pattern "foo" matches "foo", "foo/bar", and "foo/bar/baz", as does
the pattern "*".
In other words, the behaviour of fnmatch() when both of the flags
FNM_FILE_NAME and FNM_LEADING_DIR are specified is counter-intuitive.
The documentation for FNM_LEADING_DIR says that it ignores "a trailing
sequence of characters starting with a `/' in STRING", and the fact that
"x/y/z" matches the pattern "x" seems to confirm that this is indeed "a
trailing sequence" rather than "the shortest trailing sequence".
However, "x/y/z" does not match the pattern "*", even though there is an
available leading directory containing no slashes that matches that
pattern.
>How-To-Repeat:
Have a look at the output of the following program (0 indicates a
successful match, 1 indicates a failure):
===== cut here =====
#include <fnmatch.h>
#include <stdio.h>

int main(void)
{
    printf("%d %d %d\n",
           fnmatch("x", "x", FNM_FILE_NAME | FNM_LEADING_DIR),
           fnmatch("x", "x/y", FNM_FILE_NAME | FNM_LEADING_DIR),
           fnmatch("x", "x/y/z", FNM_FILE_NAME | FNM_LEADING_DIR));
    printf("%d %d %d\n",
           fnmatch("*", "x", FNM_FILE_NAME | FNM_LEADING_DIR),
           fnmatch("*", "x/y", FNM_FILE_NAME | FNM_LEADING_DIR),
           fnmatch("*", "x/y/z", FNM_FILE_NAME | FNM_LEADING_DIR));
    printf("%d %d %d\n",
           fnmatch("*x", "x", FNM_FILE_NAME | FNM_LEADING_DIR),
           fnmatch("*x", "x/y", FNM_FILE_NAME | FNM_LEADING_DIR),
           fnmatch("*x", "x/y/z", FNM_FILE_NAME | FNM_LEADING_DIR));
    printf("%d %d %d\n",
           fnmatch("x*", "x", FNM_FILE_NAME | FNM_LEADING_DIR),
           fnmatch("x*", "x/y", FNM_FILE_NAME | FNM_LEADING_DIR),
           fnmatch("x*", "x/y/z", FNM_FILE_NAME | FNM_LEADING_DIR));
    return 0;
}
===== cut here =====
(Incidentally, if you put the two #includes the other way round, you
get:
[cjw44@riva ~/src/fnmatch-bug]$ gcc -c -g test.c
test.c: In function `main':
test.c:7: `FNM_LEADING_DIR' undeclared (first use in this function)
test.c:7: (Each undeclared identifier is reported only once
test.c:7: for each function it appears in.)
test.c:23: `FNM_FILE_NAME' undeclared (first use in this function)
This has got to be a bug too.)
This program outputs:
0 0 0
1 0 1
0 0 0
1 0 1
Thus, in the presence of FNM_FILE_NAME and FNM_LEADING_DIR, a final *
matches only when exactly one trailing slash and file name component
follow it, whereas other atoms in patterns don't care how many there
are. If you remove the FNM_FILE_NAME flag so that *s can match slashes,
then all these matches succeed, as you might expect.
>Fix:
If this (strange, I think) behaviour really is by design and/or POSIX,
then it needs to be documented. If not, the following patch fixes it. It
has the effect that a wildcard at the end of a pattern with
FNM_FILE_NAME and FNM_LEADING_DIR trivially matches anything, as long as
everything before it matched. That is, it will munch up to the first
slash and then declare that it has found a matching leading directory.
--- glibc-2.1.95/posix/fnmatch_loop.c.orig Mon Sep 25 16:23:06 2000
+++ glibc-2.1.95/posix/fnmatch_loop.c Sun Oct 15 16:03:12 2000
@@ -99,25 +99,18 @@
if (c == L('\0'))
/* The wildcard(s) is/are the last element of the pattern.
If the name is a file name and contains another slash
- this does mean it cannot match. If the FNM_LEADING_DIR
- flag is set and exactly one slash is following, we have
- a match. */
+ this means it cannot match, unless the FNM_LEADING_DIR
+ flag is set. */
{
int result = (flags & FNM_FILE_NAME) == 0 ? 0 : FNM_NOMATCH;
if (flags & FNM_FILE_NAME)
{
- const CHAR *slashp = STRCHR (n, L('/'));
-
if (flags & FNM_LEADING_DIR)
- {
- if (slashp != NULL
- && STRCHR (slashp + 1, L('/')) == NULL)
- result = 0;
- }
+ result = 0;
else
{
- if (slashp == NULL)
+ if (STRCHR (n, L('/')) == NULL)
result = 0;
}
}