This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.
> Does this fix also the:
>
> a(b|c?)+d abcd 0 4 2 3
> [etc]

No, it did not. I seemed to recall that everything went through the same lowering path, but this happens in PCRE and not in the glibc matcher; the attached follow-up patch instead catches these cases too.

This requires some care in update_regs, because nodes that are marked opt_subexp now may be the first occurrence of the subexpression, for example in (b|c?)*.

The patch also lowers OP_DUP_PLUS into OP_DUP_ASTERISK, simplifying first/follow calculation and freeing one epsilon-matching value in re_token_type_t.

Paolo

2003-12-14  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regcomp.c (parse_dup_op): Process OP_DUP_PLUS,
	OP_DUP_ASTERISK, and OP_DUP_QUESTION like OP_OPEN_DUP_NUM,
	in order to lower OP_DUP_PLUS and mark subexpressions
	as OPT_SUBEXP.
	(optimize_utf8, calc_first, calc_next, calc_epsdest): Don't
	consider the OP_DUP_PLUS case.
	* posix/regexec.c (update_regs): OPT_SUBEXP subexpression
	may now happen even when PMATCH[REG_NUM].RM_SO == -1.
	(NUMBER_OF_STATES): Unused, remove it.
	* posix/regex_internal.h (re_token_type_t): Move OP_DUP_PLUS
	among the tokens rather than among the epsilon-transiting
	nodes.
Attachment:
regex-fix-more-optional-subexps.patch
Description: Binary data