This is the mail archive of the
libc-help@sourceware.org
mailing list for the glibc project.
Questions on fnmatch() and case folding
- From: Nick Stoughton <nstoughton at logitech dot com>
- To: libc-help at sourceware dot org
- Date: Thu, 26 Jan 2017 11:27:25 -0800
- Subject: Questions on fnmatch() and case folding
Some questions have arisen during the Austin Group (the POSIX
maintainers) meetings about adding support in POSIX for
case-insensitive file name matching (see
http://austingroupbugs.net/view.php?id=1031).
It was observed that the glibc implementation of fnmatch() with the
FNM_CASEFOLD flag does NOT do case folding when given an explicit
character class. That is to say, the string "A" does not match the
pattern "[[:lower:]]" even with FNM_CASEFOLD.
I've checked the current master branch on git, and the issue (if
indeed it is an issue) is still present there.
There is also a question about range expressions such as "[Z-a]"
(assuming the POSIX locale): should this match characters such as '_'
(which, in ASCII at least, lies between upper case Z and lower case a),
and should case insensitivity affect the result?
My personal expectation is that "[[:lower:]]" should match an
uppercase character when case folding is in effect; in glibc it does
not. Is this a bug?
In the POSIX locale, [:lower:] is the character set
abcdefghijklmnopqrstuvwxyz, and [:upper:] is a similar (upper case)
set. Thus we might expect
[[:upper:]-[:lower:]] to be the same as
[ABCDEFGHIJKLMNOPQRSTUVWXYZ-abcdefghijklmnopqrstuvwxyz]
... but it isn't!
The program below demonstrates...
--
Nick
#define _GNU_SOURCE	/* glibc exposes FNM_CASEFOLD as a GNU extension */
#include <stdio.h>
#include <fnmatch.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

int
main(void)
{
	/* Patterns under test: literals, character classes, ranges,
	   and an equivalence class. */
	const char *pattern[] = {
		"aa", "AA", "[[:lower:]][[:lower:]]", "[a-z][a-z]",
		"[[=a=]][[=a=]]",
		"[[:upper:]-[:lower:]][[:upper:]-[:lower:]]", "[Z-a][Z-a]",
	};
	/* Names matched against every pattern. */
	const char *name[] = {
		"aa", "AA", "aA", "Aa", "aB", "__",
	};
	/* Each pair is tried case-sensitively, then with FNM_CASEFOLD. */
	int flags[] = { FNM_PATHNAME, FNM_CASEFOLD | FNM_PATHNAME };

	for (size_t i = 0; i < ARRAY_SIZE(pattern); i++) {
		for (size_t j = 0; j < ARRAY_SIZE(name); j++) {
			for (size_t k = 0; k < ARRAY_SIZE(flags); k++) {
				/* fnmatch() returns 0 on a match. */
				int match = fnmatch(pattern[i],
				    name[j], flags[k]);

				printf("%s %s %s case %s\n", pattern[i],
				    match == 0 ? "matches" : "does not match",
				    name[j],
				    flags[k] & FNM_CASEFOLD ?
				    "insensitively" : "sensitively");
			}
		}
		/* Blank line between the results for each pattern. */
		printf("\n");
	}
	return 0;
}