This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: [PATCH] Fix fnmatch escape handling in brackets (BZ #361)
- From: Jakub Jelinek <jakub at redhat dot com>
- To: "Markus F.X.J. Oberhumer" <markus at oberhumer dot com>
- Cc: Ulrich Drepper <drepper at redhat dot com>, Glibc <libc-alpha at sources dot redhat dot com>
- Date: Thu, 2 Sep 2004 04:22:43 -0400
- Subject: Re: [PATCH] Fix fnmatch escape handling in brackets (BZ #361)
- References: <20040901190912.GM30497@sunsite.ms.mff.cuni.cz> <200409020551.18238.markus@oberhumer.com>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Thu, Sep 02, 2004 at 05:51:18AM +0200, Markus F.X.J. Oberhumer wrote:
> Jakub,
>
> I have not been able to test your patch yet, but the program attached below
> prints some errors for me (I originally thought this was caused by the
> backslash at the end but I now see it is completely unrelated).
>
> Not sure if this is actually a bug - at least it is confusing that when
> enlarging the pattern-range or adding FNM_CASEFOLD an "A" does
> not match a range starting with "A" anymore.
I don't think it is a bug.
Unlike regcomp with RE_ICASE, which converts to uppercase, FNM_CASEFOLD
converts to lowercase.
And a range whose start character is greater than its end character is invalid.
You can play with regex RE_ICASE too:
$ echo | LC_ALL=C sed -n '/[a-[]/Ip'
$ echo | LC_ALL=C sed -n '/[A-[]/Ip'
$ echo | LC_ALL=C sed -n '/[[-a]/Ip'
sed: -e expression #1, char 8: Invalid range end
$ echo | LC_ALL=C sed -n '/[[-A]/Ip'
sed: -e expression #1, char 8: Invalid range end
$ echo | LC_ALL=C sed -n '/[[-A]/p'
sed: -e expression #1, char 7: Invalid range end
$ echo | LC_ALL=C sed -n '/[[-a]/p'
$ echo | LC_ALL=C sed -n '/[a-[]/p'
sed: -e expression #1, char 7: Invalid range end
$ echo | LC_ALL=C sed -n '/[A-[]/p'
and fnmatch behaves very similarly to this (the only difference being
that it converts to lowercase instead of uppercase).
Note that while [[-a] is a valid range without RE_ICASE, it is not valid
with RE_ICASE (whereas [a-[] is valid with RE_ICASE but not without it).
So, when you are using FNM_CASEFOLD, it is always better to write in
lowercase any range whose ends are not both letters (a-z or A-Z).
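
The endpoint comparison described above can be modelled with a small
sketch. This is only an illustration of the reasoning, not the actual
glibc or sed code: the names fold_mode, fold_char and range_is_valid are
hypothetical, and it assumes plain byte ordering as in the C locale.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical model: how a matcher folds case before comparing
   the two ends of a bracket range. */
enum fold_mode { FOLD_NONE, FOLD_UPPER, FOLD_LOWER };

/* Apply the chosen case conversion to a single character. */
unsigned char fold_char(unsigned char c, enum fold_mode mode)
{
    switch (mode) {
    case FOLD_UPPER: return (unsigned char)toupper(c);
    case FOLD_LOWER: return (unsigned char)tolower(c);
    default:         return c;
    }
}

/* A range start-end is treated as valid only if, after folding,
   the start does not sort after the end (byte order, C locale assumed). */
bool range_is_valid(unsigned char start, unsigned char end,
                    enum fold_mode mode)
{
    return fold_char(start, mode) <= fold_char(end, mode);
}
```

In this model, range_is_valid('A', '[', FOLD_NONE) is true, but
range_is_valid('A', '[', FOLD_LOWER) is false, because 'a' sorts after
'['. With FOLD_UPPER the result for '[' to 'a' flips the other way,
which is the pattern the sed runs above show.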
I have checked Solaris, and there your testcase (with
#define FNM_CASEFOLD FNM_IGNORECASE) behaves the same as on Linux.
Jakub