This is sources Bugzilla
Bugzilla Version 2.17.5
Bugzilla Bug 6395
  regex ^$ is not detected as anchored Last modified: 2008-05-15 03:07:46
Bug#: 6395   Reporter: Paolo Bonzini <bonzini@gnu.org>
Status: RESOLVED
Resolution: FIXED
Assigned To: roland@redhat.com

Attachment                          Description               Type    Created
glibc-speedup-double-anchor.patch   patch against glibc CVS   patch   2008-04-11 12:45



Description:   Opened: 2008-04-11 12:44
Regex matching has an optimization: a regex anchored to the beginning of the
buffer is tried at only one position. While other anchors are resolved with
the fastmap, this one allows further optimization and is special-cased. However,
because of a bug in create_cd_newstate, ^$ is mistakenly treated as an
unanchored regex, so re_search_internal tries to match it at every position.
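To make the cost concrete, here is a minimal toy sketch of that shortcut. It is
not glibc code: struct toy_pattern, toy_match_at, toy_search and toy_attempts
are hypothetical names invented for illustration, and the "pattern" is just an
empty body with optional ^ and $ anchors. The detected_anchored flag stands in
for whatever the regex compiler records about the pattern being anchored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a compiled "^$"-style pattern; not glibc's re_dfa_t. */
struct toy_pattern {
    int must_start_at_0;    /* pattern semantics: '^' is present */
    int must_end_at_len;    /* pattern semantics: '$' is present */
    int detected_anchored;  /* what the compiler noticed; drives the shortcut */
};

/* Number of match attempts made so far, kept only to make the cost visible. */
size_t toy_attempts;

/* Try to match the empty pattern body at offset pos, honoring both anchors. */
int toy_match_at(const struct toy_pattern *p, size_t len, size_t pos)
{
    toy_attempts++;
    if (p->must_start_at_0 && pos != 0)
        return 0;
    if (p->must_end_at_len && pos != len)
        return 0;
    return 1;
}

/* Return the first matching offset in a buffer of length len, or -1. */
long toy_search(const struct toy_pattern *p, size_t len)
{
    assert(p != NULL);
    /* Shortcut: a pattern known to be anchored is tried at offset 0 only. */
    size_t last = p->detected_anchored ? 0 : len;
    for (size_t pos = 0; pos <= last; pos++)
        if (toy_match_at(p, len, pos))
            return (long)pos;
    return -1;
}
```

In this toy, searching a 6-byte buffer with the flag set makes one attempt,
while the same pattern with the flag clear makes seven. Both searches fail, so
the difference is purely in the work done, which is the kind of slowdown
described above.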

In fact, the bug is (almost) fixed by this hunk:


@@ -1682,8 +1680,6 @@ create_cd_newstate (const re_dfa_t *dfa,
        newstate->halt = 1;
       else if (type == OP_BACK_REF)
        newstate->has_backref = 1;
-      else if (type == ANCHOR)
-       constraint = node->opr.ctx_type;

       if (constraint)
        {


However, some complications in building the NFA prevent this hunk alone from
fixing the problem. Therefore, this patch cleans up the handling of anchors so
that tests on type == ANCHOR are no longer necessary. When creating the NFA
(calc_first), I move opr.ctx_type to the constraint field of re_token_t, and
then I always read that field unconditionally, without special-casing ANCHORs.
This also allows some simplification of duplicate_node_closure.
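The cleanup can be pictured with a simplified sketch. This is an illustration
under stated assumptions, not the attached patch: struct toy_token and the
toy_* functions are hypothetical, and ctx_type is modeled as a plain field
rather than as the opr.ctx_type member named above. It only shows the general
idea of copying the anchor context into one field at construction time so that
later code no longer branches on the node type.

```c
#include <assert.h>
#include <stddef.h>

enum toy_type { TOY_CHARACTER, TOY_ANCHOR };

/* Hypothetical, simplified token; the real re_token_t has more fields. */
struct toy_token {
    enum toy_type type;
    unsigned int ctx_type;    /* context carried by an anchor node */
    unsigned int constraint;  /* context constraint checked at match time */
};

/* Before: every consumer has to special-case anchors. */
unsigned int toy_constraint_special_cased(const struct toy_token *t)
{
    assert(t != NULL);
    unsigned int constraint = t->constraint;
    if (t->type == TOY_ANCHOR)
        constraint = t->ctx_type;
    return constraint;
}

/* After, step 1: done once per token while the NFA is being built. */
void toy_normalize(struct toy_token *t)
{
    assert(t != NULL);
    if (t->type == TOY_ANCHOR)
        t->constraint = t->ctx_type;
}

/* After, step 2: consumers read the field unconditionally. */
unsigned int toy_constraint_uniform(const struct toy_token *t)
{
    assert(t != NULL);
    return t->constraint;
}
```

Once every token has been normalized, toy_constraint_uniform returns the same
value the special-cased version did, but the type test lives in one place
instead of in each consumer.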

------- Additional Comment #1 From Paolo Bonzini 2008-04-11 12:45 -------
Created an attachment (id=2690)
patch against glibc CVS

------- Additional Comment #2 From Paolo Bonzini 2008-04-25 14:26 -------
Patch tested on the GNU sed trunk testsuite (which also includes those parts of
the glibc testsuite that do not require full internationalization).

Roland, I was told to reassign it to either you or Ulrich.

------- Additional Comment #3 From Ulrich Drepper 2008-05-15 03:07 -------
Added to cvs.
