This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



fnmatch() behaves oddly with *s and FNM_LEADING_DIR


>Submitter-Id:	net
>Originator:	Colin Watson
>Organization:  riva.ucam.org
>Confidential:	no
>Synopsis:	fnmatch() with FNM_LEADING_DIR matches * inconsistently
>Severity:	non-critical
>Priority:	low
>Category:	libc
>Class:		sw-bug
>Release:	libc-2.1.95
>Environment:
	
Host type: i386-pc-linux-gnu
System: Linux riva 2.4.0-test2 #1 Sun Jun 25 22:05:08 BST 2000 i686 unknown
Architecture: i686

Addons: linuxthreads

Build CC: gcc
Compiler version: 2.95.2 20000220 (Debian GNU/Linux)
Kernel headers: UTS_RELEASE
Symbol versioning: yes
Build static: yes
Build shared: yes
Build pic-default: no
Build profile: yes
Build omitfp: no
Build bounded: no
Build static-nss: no
Stdio: libio

>Description:

This bug was originally reported in Debian bug #59829,
<URL:http://bugs.debian.org/59829>. I'll repeat the problem description
here, with a little editing.

Ryan Tracey <ryant@thawte.com> wrote:
| Tar no longer excludes the files and directories that previous versions
| used to exclude (sorry, I have no idea with which version the change
| occurred, but it was within the past month or so). For example, to
| exclude all the MS Frontpage extensions files and directories in the
| 'webspace' directory tree, I used to do this:
| 
|         tar xcf /var/tmp/website.tgz --exclude=_\* webspace\
| 
| This used to exclude all the _vti_cnf/ and _private/ directories that
| don't need to be on the main website.

tar checks whether a name is excluded by calling the libc function
fnmatch() with FNM_FILE_NAME and FNM_LEADING_DIR. With these flags, a
pattern like "_*" matches only a string consisting of a slash-free part
that matches "_*", followed by a remainder containing exactly one slash.
In effect, a pattern ending in a wildcard is matched against everything
but the final component of the file name and the slash before it. So
"*" matches "foo/bar", but not "foo" or "foo/bar/baz", even though the
pattern "foo" matches all three of these strings with the same flags.
This confuses tar a good deal.
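
For illustration, here is roughly what such an exclusion check looks
like. This is only a sketch, not tar's actual code: the helper name
name_is_excluded is made up for this example, and _GNU_SOURCE is defined
on the assumption that the headers only expose the GNU-specific
FNM_FILE_NAME and FNM_LEADING_DIR flags when it is set.

```c
/* Must come before any #include so the GNU-only flags are visible. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1
#endif
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical helper (not taken from tar): returns 1 if NAME is
   excluded by PATTERN, 0 otherwise.  */
int name_is_excluded(const char *pattern, const char *name)
{
    assert(pattern != NULL && name != NULL);
    return fnmatch(pattern, name, FNM_FILE_NAME | FNM_LEADING_DIR) == 0;
}
```

Going by the results reported under How-To-Repeat, on glibc 2.1.95 this
helper would accept "_private/index.html" for the pattern "_*" but
reject both "_private" and "_private/cgi/form.html", while the literal
pattern "_private" would accept all three.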

There are two other places in tar's names.c where FNM_LEADING_DIR is
used; however, they don't use FNM_FILE_NAME, and there fnmatch() behaves
the same way with and without wildcards. The pattern "foo" matches
"foo", "foo/bar", and "foo/bar/baz", and so does the pattern "*".

In other words, the behaviour of fnmatch() when both of the flags
FNM_FILE_NAME and FNM_LEADING_DIR are specified is counter-intuitive.
The documentation for FNM_LEADING_DIR says that it ignores "a trailing
sequence of characters starting with a `/' in STRING", and the fact that
"x/y/z" matches the pattern "x" seems to confirm that this is indeed "a
trailing sequence" rather than "the shortest trailing sequence".
However, "x/y/z" does not match the pattern "*", even though there is an
available leading directory containing no slashes that matches that
pattern.
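
The reading of FNM_LEADING_DIR argued for here ("a trailing sequence",
not "the shortest trailing sequence") can be written down as a small
reference function. This is a sketch under that assumption, not glibc
code: match_any_leading_dir is a hypothetical name, and it uses only the
POSIX FNM_PATHNAME flag (the standard name for FNM_FILE_NAME), so a "*"
never crosses a slash.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference helper: returns 0 if PATTERN matches NAME or
   any leading directory of NAME, a nonzero fnmatch() result (normally
   FNM_NOMATCH) otherwise, or -1 if memory runs out.  */
int match_any_leading_dir(const char *pattern, const char *name)
{
    assert(pattern != NULL && name != NULL);

    size_t len = strlen(name);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, len + 1);

    /* Try the whole name first. */
    int result = fnmatch(pattern, copy, FNM_PATHNAME);

    /* Then cut at each slash from the right:
       "x/y/z" -> "x/y" -> "x".  */
    char *slash;
    while (result != 0 && (slash = strrchr(copy, '/')) != NULL) {
        *slash = '\0';
        result = fnmatch(pattern, copy, FNM_PATHNAME);
    }

    free(copy);
    return result;
}
```

Under this definition "*" matches "x", "x/y", and "x/y/z" alike, because
the leading directory "x" is always available as a candidate.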

>How-To-Repeat:

Have a look at the output of the following program (0 indicates a
successful match; 1, the value of FNM_NOMATCH in glibc, indicates no
match):

===== cut here =====
#include <fnmatch.h>
#include <stdio.h>

int main(void)
{
    printf("%d %d %d\n",
	    fnmatch("x", "x", FNM_FILE_NAME | FNM_LEADING_DIR),
	    fnmatch("x", "x/y", FNM_FILE_NAME | FNM_LEADING_DIR),
	    fnmatch("x", "x/y/z", FNM_FILE_NAME | FNM_LEADING_DIR));
    printf("%d %d %d\n",
	    fnmatch("*", "x", FNM_FILE_NAME | FNM_LEADING_DIR),
	    fnmatch("*", "x/y", FNM_FILE_NAME | FNM_LEADING_DIR),
	    fnmatch("*", "x/y/z", FNM_FILE_NAME | FNM_LEADING_DIR));
    printf("%d %d %d\n",
	    fnmatch("*x", "x", FNM_FILE_NAME | FNM_LEADING_DIR),
	    fnmatch("*x", "x/y", FNM_FILE_NAME | FNM_LEADING_DIR),
	    fnmatch("*x", "x/y/z", FNM_FILE_NAME | FNM_LEADING_DIR));
    printf("%d %d %d\n",
	    fnmatch("x*", "x", FNM_FILE_NAME | FNM_LEADING_DIR),
	    fnmatch("x*", "x/y", FNM_FILE_NAME | FNM_LEADING_DIR),
	    fnmatch("x*", "x/y/z", FNM_FILE_NAME | FNM_LEADING_DIR));
    return 0;
}
===== cut here =====

(Incidentally, if you put the two #includes the other way round, you
get:

[cjw44@riva ~/src/fnmatch-bug]$ gcc -c -g test.c
test.c: In function `main':
test.c:7: `FNM_LEADING_DIR' undeclared (first use in this function)
test.c:7: (Each undeclared identifier is reported only once
test.c:7: for each function it appears in.)
test.c:23: `FNM_FILE_NAME' undeclared (first use in this function)

This has got to be a bug too.)

This program outputs:

0 0 0
1 0 1
0 0 0
1 0 1

Thus, a final * requires exactly one trailing slash and file name
component when both FNM_FILE_NAME and FNM_LEADING_DIR are given, whereas
other atoms in patterns don't care how many there are. If you remove the
FNM_FILE_NAME flag so that *s can match slashes, then all these matches
succeed, as you might expect.
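
To compare flag combinations without reading rows of digits, a small
counting helper can be used. This is a sketch: count_matches is a
made-up name, and _GNU_SOURCE is defined on the assumption that the
headers need it to expose FNM_FILE_NAME and FNM_LEADING_DIR.

```c
/* Must come before any #include so the GNU-only flags are visible. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1
#endif
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical helper: counts how many of "x", "x/y" and "x/y/z"
   match PATTERN under FLAGS.  */
int count_matches(const char *pattern, int flags)
{
    static const char *const names[] = { "x", "x/y", "x/y/z" };
    int matched = 0;

    assert(pattern != NULL);
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++)
        if (fnmatch(pattern, names[i], flags) == 0)
            matched++;
    return matched;
}
```

With FNM_LEADING_DIR alone, each of the four patterns above should give
a count of 3. With FNM_FILE_NAME | FNM_LEADING_DIR, the output above
corresponds to a count of 3 for "x" and "*x" but only 1 for "*" and
"x*".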

>Fix:

If this (strange, I think) behaviour really is by design and/or POSIX,
then it needs to be documented. If not, the following patch fixes it. It
has the effect that a wildcard at the end of a pattern with
FNM_FILE_NAME and FNM_LEADING_DIR trivially matches anything, as long as
everything before it matched. That is, it will munch up to the first
slash and then declare that it has found a matching leading directory.

--- glibc-2.1.95/posix/fnmatch_loop.c.orig	Mon Sep 25 16:23:06 2000
+++ glibc-2.1.95/posix/fnmatch_loop.c	Sun Oct 15 16:03:12 2000
@@ -99,25 +99,18 @@
 	  if (c == L('\0'))
 	    /* The wildcard(s) is/are the last element of the pattern.
 	       If the name is a file name and contains another slash
-	       this does mean it cannot match.  If the FNM_LEADING_DIR
-	       flag is set and exactly one slash is following, we have
-	       a match.  */
+	       this means it cannot match, unless the FNM_LEADING_DIR
+	       flag is set.  */
 	    {
 	      int result = (flags & FNM_FILE_NAME) == 0 ? 0 : FNM_NOMATCH;
 
 	      if (flags & FNM_FILE_NAME)
 		{
-		  const CHAR *slashp = STRCHR (n, L('/'));
-
 		  if (flags & FNM_LEADING_DIR)
-		    {
-		      if (slashp != NULL
-			  && STRCHR (slashp + 1, L('/')) == NULL)
-			result = 0;
-		    }
+		    result = 0;
 		  else
 		    {
-		      if (slashp == NULL)
+		      if (STRCHR (n, L('/')) == NULL)
 			result = 0;
 		    }
 		}
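
Stripped of the surrounding matching loop, the hunk above amounts to the
following before/after decision for a pattern whose last element is a
wildcard. This is a standalone sketch rather than the glibc source: the
function names are invented, plain char and strchr() stand in for the
CHAR and STRCHR macros, and "rest" is the part of the string that
remains when the trailing wildcard is reached.

```c
/* Must come before any #include so the GNU-only flags are visible. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1
#endif
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/* Behaviour before the patch (hypothetical standalone rendering). */
int trailing_star_before(const char *rest, int flags)
{
    assert(rest != NULL);
    if ((flags & FNM_FILE_NAME) == 0)
        return 0;                   /* "*" may match slashes */

    const char *slashp = strchr(rest, '/');
    if (flags & FNM_LEADING_DIR)
        /* Match only if exactly one slash follows. */
        return (slashp != NULL && strchr(slashp + 1, '/') == NULL)
               ? 0 : FNM_NOMATCH;
    return slashp == NULL ? 0 : FNM_NOMATCH;
}

/* Behaviour after the patch (hypothetical standalone rendering). */
int trailing_star_after(const char *rest, int flags)
{
    assert(rest != NULL);
    if ((flags & FNM_FILE_NAME) == 0)
        return 0;                   /* "*" may match slashes */
    if (flags & FNM_LEADING_DIR)
        return 0;                   /* any remainder counts as ignored tail */
    return strchr(rest, '/') == NULL ? 0 : FNM_NOMATCH;
}
```

For the pattern "*" with both flags set, the first function reproduces
the "1 0 1" row from the test program for "x", "x/y", and "x/y/z", and
the second returns 0 for all three.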

