Bug 26653 - regex mishandles \b inside interval expressions
Status: NEW
Alias: None
Product: glibc
Classification: Unclassified
Component: regex
Version: 2.32
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-09-23 00:38 UTC by eggert
Modified: 2020-09-23 00:38 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
test program illustrating the \b-inside-interval regex bug (288 bytes, text/x-csrc)
2020-09-23 00:38 UTC, eggert

Description eggert 2020-09-23 00:38:44 UTC
Created attachment 12855
test program illustrating the \b-inside-interval regex bug

The glibc regular expression code mishandles extended regular expressions such as:

  (.*\band){2}

although it correctly processes equivalent expressions such as:

  (.*(\<|\>)and){2}

There is a similar problem with the same regular expression in BRE syntax:

  \(.*\band\)\{2\}

To reproduce the problem, compile and run the attached file b-interval-bug.c. With the bug present it exits with status 1, whereas the correct exit status is 0.

This bug was reported against GNU 'sed' here:

https://bugs.gnu.org/41558

and the original posting is from StackExchange, here:

https://unix.stackexchange.com/questions/579889/why-doesnt-this-sed-command-replace-the-3rd-to-last-and

The original posting mentions backreferences, but the bug occurs even without them.