Questions on fnmatch() and case folding

Nick Stoughton nstoughton@logitech.com
Thu Jan 26 19:27:00 GMT 2017


Some questions have arisen in Austin Group (the POSIX maintainers)
meetings about adding support in POSIX for case-insensitive file name
matching (see http://austingroupbugs.net/view.php?id=1031).

It was observed that the glibc implementation of fnmatch() with the
FNM_CASEFOLD flag does NOT do case folding when given an explicit
character class. That is to say, the string "A" does not match the
pattern "[[:lower:]]" even with FNM_CASEFOLD.

I've checked the current master branch on git, and the issue (if
indeed it is an issue) is still present there.

There's also a question about range expressions such as "[Z-a]"
(assuming a POSIX locale): should this match characters such as '_'
(which in ASCII at least lies between upper case Z and lower case a),
and should case insensitivity affect the answer?

My personal expectation is that "[[:lower:]]" should match an
uppercase character when case folding is in effect; in glibc it does
not. Is this a bug?

In the POSIX locale, [:lower:] is the character set
abcdefghijklmnopqrstuvwxyz, and [:upper:] is a similar (upper case)
set. Thus we might expect
[[:upper:]-[:lower:]] to be the same as
[ABCDEFGHIJKLMNOPQRSTUVWXYZ-abcdefghijklmnopqrstuvwxyz]
... but it isn't!

The program below demonstrates...
-- 
Nick

#define _GNU_SOURCE     /* assumption: glibc may need this for FNM_CASEFOLD */
#include <stdio.h>
#include <fnmatch.h>

#define ARRAY_SIZE(a)   (sizeof(a)/sizeof((a)[0]))

int
main(void)
{
        /* Patterns under test: literals, classes, ranges, equivalence classes. */
        const char *pattern[] = {
                "aa", "AA", "[[:lower:]][[:lower:]]", "[a-z][a-z]",
                "[[=a=]][[=a=]]",
                "[[:upper:]-[:lower:]][[:upper:]-[:lower:]]", "[Z-a][Z-a]",
        };
        /* Names matched against every pattern. */
        const char *name[] = {
                "aa", "AA", "aA", "Aa", "aB", "__",
        };
        /* Each pattern/name pair is tried without and with case folding. */
        int flags[] = { FNM_PATHNAME, FNM_CASEFOLD | FNM_PATHNAME };

        for (size_t i = 0; i < ARRAY_SIZE(pattern); i++) {
                for (size_t j = 0; j < ARRAY_SIZE(name); j++) {
                        for (size_t k = 0; k < ARRAY_SIZE(flags); k++) {
                                int match = fnmatch(pattern[i], name[j],
                                                    flags[k]);
                                printf("%s %s %s case %s\n", pattern[i],
                                        match == 0 ? "matches" : "does not match",
                                        name[j],
                                        flags[k] & FNM_CASEFOLD ?
                                        "insensitively" : "sensitively");
                        }
                }
                printf("\n");
        }
        return 0;
}


