For every repeated + in an extended regex, parse_dup_op seems to duplicate the parse tree. dualbus@hp:~/v$ ulimit -a | grep cpu cpu time (seconds, -t) 1 dualbus@hp:~/v$ grep -E '.++++++++++++++++++++++++++++++++' <<< . Killed This seems to be special to +, since * doesn't behave that way. My guess is that: .+ is expanded to ..* So .+++ is expanded to ........* And so on. Is this documented somewhere?
This seems to be related to bug 17150
*** Bug 25814 has been marked as a duplicate of this bug. ***
This behaviour can rapidly exhaust memory (Bug 25814, Bug 28864, Bug 20095 , Bug 29642), which seems unhelpful when ".++" is not even a valid regex. POSIX says it's undefined: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_06 Why doesn't regcomp just fail to compile it with REG_BADRPT? Similarly for ".**" etc. GNU grep seems to have tests for these that expect BADRPT: https://git.savannah.gnu.org/cgit/grep.git/tree/tests/tests?h=v3.11#n234
I am not sure who exactly GNU grep handles this since it also uses gnulib regex code. Is this code really being tested by GNU grep? I deleted the fil, make check, and it seems not to affect the test's outcome. I also tried to sync with gnulib master, but it also does not help.
(In reply to Adhemerval Zanella from comment #4) > Is this code really being tested by GNU grep? No idea - I grepped the grep code for cases of '++' and found those tests, I don't know if they're actually run or not. FWIW, Solaris 2.11 and AIX 7.3 both have the same behaviour for "a++" jwakely@gcc-solaris11:~$ /usr/xpg4/bin/grep -E a++ <<< a a $ /usr/bin/grep -E a++ <<< a a So maybe POSIX says it's undefined to allow for this traditional/common behaviour. Glibc's support for it seems poor though, given the memory exhaustion problems.