This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.
This is a corner case in the POSIX specification that the matcher does not handle correctly. The corner case is exposed by the regex (A?){3,6} matched against the string "AAAA". Here, the register must match the fourth "A", and not the empty string, because a braced expression must match as few times as possible when the extra matches would be empty. In other words, you must not match (A?) to an empty string from the fourth iteration on.

The fix that the patch implements works backwards: it keeps the empty matches and rejects them when setting the registers. Simply checking whether the register was already set fails on "AA", where the third match is compulsory and the register must be set to the empty string.

To do so, since parse_dup_op converts the above regex to (A?)(A?)(A?)(A?)?(A?)?(A?)?, I need to specially mark the fake (A?)? nodes when they are duplicated, by setting the new OPT_SUBEXP flag.

When a node is marked OPT_SUBEXP, update_regs treats it specially. A marked OP_OPEN_SUBEXP must follow a (not necessarily marked) OP_CLOSE_SUBEXP, so we know that the start of a marked subexpression is the same as the end of the previous occurrence of the subexpression. So update_regs does nothing for a marked OP_OPEN_SUBEXP, delaying the update until the OP_CLOSE_SUBEXP is found. Upon a marked OP_CLOSE_SUBEXP, we check for an empty match and discard it; otherwise we copy rm_eo into rm_so and set the new rm_eo.

This patch is against my other patch, but should apply cleanly even without it. I'll send a testcase on Monday. Interestingly, Perl and PCRE behave like the current glibc implementation, so this may trigger a fake failure in the new PCRE-based tests that Jakub wrote.

Thanks very much,

Paolo

2003-12-13  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regex_internal.h (re_token_t): Add the OPT_SUBEXP bitfield.
	* posix/regcomp.c (duplicate_tree_1): Extract out of duplicate_tree.
	(duplicate_tree): Add new FL_OPT parameter.
	(parse_dup_op): Pass it when compiling (RE){M,} with M > 0 or
	(RE){M,N} with N > M > 0.
	* posix/regexec.c (update_regs): Honor OPT_SUBEXP.
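
The register rule described in the message can be illustrated with a small standalone model. This is only a sketch, not the code from the patch: the types `struct event` and `struct reg` and the functions `model_update_reg` and `model_match` are hypothetical names invented here, and the model assumes each (A?) copy greedily consumes one "A" when one is available.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for the matcher's node types. */
enum ev_type { EV_OPEN_SUBEXP, EV_CLOSE_SUBEXP };

struct event {
    enum ev_type type;
    int opt_subexp;   /* models the OPT_SUBEXP bit on the node */
    int pos;          /* string index at which the node is crossed */
};

struct reg { int rm_so, rm_eo; };

/* Apply the register-update rule from the message to one event. */
void model_update_reg(struct reg *r, const struct event *ev)
{
    if (ev->type == EV_OPEN_SUBEXP) {
        if (ev->opt_subexp)
            return;   /* start equals the previous end; defer to the close */
        r->rm_so = ev->pos;
        r->rm_eo = -1;
    } else if (!ev->opt_subexp) {
        r->rm_eo = ev->pos;
    } else if (ev->pos != r->rm_eo) {
        /* Non-empty optional iteration: it starts where the last one ended. */
        r->rm_so = r->rm_eo;
        r->rm_eo = ev->pos;
    }
    /* An empty optional iteration leaves the register untouched. */
}

/* Walk (A?)(A?)(A?)(A?)?(A?)?(A?)? over a string of n_a 'A's and
   return the final register for the group. */
struct reg model_match(int n_a)
{
    struct reg r = { -1, -1 };
    int pos = 0;

    assert(n_a >= 0 && n_a <= 6);
    for (int copy = 0; copy < 6; copy++) {
        int opt = copy >= 3;   /* copies 4..6 are the duplicated optional ones */
        struct event ev_open = { EV_OPEN_SUBEXP, opt, pos };
        model_update_reg(&r, &ev_open);
        if (pos < n_a)
            pos++;             /* A? consumes one 'A' if available */
        struct event ev_close = { EV_CLOSE_SUBEXP, opt, pos };
        model_update_reg(&r, &ev_close);
    }
    return r;
}
```

Under these assumptions, `model_match(4)` ends with the register on the fourth "A" (rm_so 3, rm_eo 4), and `model_match(2)` ends with the compulsory empty third match (rm_so 2, rm_eo 2), which are the two outcomes the message asks for.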
Attachment:
regex-fix-repeated-empty-regex.patch
Description: Binary data
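
To see how a given C library behaves on this corner case, a small probe along the following lines may help. It uses only the standard POSIX regcomp/regexec/regfree interface; the helper names `probe_group1` and `report` are made up for this sketch and are not part of glibc or of the attached patch.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Match (A?){3,6} against `subject`; on success store the whole match
   and the first subexpression register. Returns 0 on a match. */
int probe_group1(const char *subject, regmatch_t *whole, regmatch_t *group)
{
    regex_t re;
    regmatch_t pm[2];
    int rc;

    assert(subject != NULL);
    if (regcomp(&re, "(A?){3,6}", REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, subject, 2, pm, 0);
    regfree(&re);
    if (rc != 0)
        return rc;
    *whole = pm[0];
    *group = pm[1];
    return 0;
}

/* Print the registers so different libc versions can be compared. */
void report(const char *subject)
{
    regmatch_t whole, group;

    if (probe_group1(subject, &whole, &group) != 0) {
        printf("\"%s\": no match\n", subject);
        return;
    }
    printf("\"%s\": whole=(%d,%d) group1=(%d,%d)\n", subject,
           (int) whole.rm_so, (int) whole.rm_eo,
           (int) group.rm_so, (int) group.rm_eo);
}
```

Calling `report("AAAA")` and `report("AA")` from a `main` prints the registers for the two strings discussed in the message. According to the message, the POSIX-conforming values for the subexpression are the fourth "A" for "AAAA" and an empty match at the end for "AA"; what an actual library prints depends on its version.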