This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



[PATCH] fix more repeated subexpression testcases


> Does this fix also the:
>
> a(b|c?)+d abcd 0 4 2 3
> [etc]

No, it did not.  I seemed to recall that everything went through the same
lowering path, but this happens in PCRE and not in the glibc matcher; the
attached follow-up patch instead catches these cases too.  This requires
some care in update_regs, because a node marked opt_subexp may now be the
first occurrence of the subexpression (that is, no earlier match has been
recorded for it), for example in (b|c?)*.

The patch also lowers OP_DUP_PLUS into OP_DUP_ASTERISK, simplifying
first/follow calculation and freeing one epsilon-matching value in
re_token_type_t.


Paolo

2003-12-14 Paolo Bonzini <bonzini@gnu.org>

        * posix/regcomp.c (parse_dup_op): Process OP_DUP_PLUS,
        OP_DUP_ASTERISK, and OP_DUP_QUESTION like OP_OPEN_DUP_NUM,
        in order to lower OP_DUP_PLUS and mark subexpressions as
        OPT_SUBEXP.
        (optimize_utf8, calc_first, calc_next, calc_epsdest):
        Don't consider the OP_DUP_PLUS case.
        * posix/regexec.c (update_regs): OPT_SUBEXP subexpression
        may now happen even when PMATCH[REG_NUM].RM_SO == -1.
        (NUMBER_OF_STATES): Unused, remove it.
        * posix/regex_internal.h (re_token_type_t): Move
        OP_DUP_PLUS among the tokens rather than among the
        epsilon-transiting nodes.





Attachment: regex-fix-more-optional-subexps.patch
Description: Binary data

