Another RFC: regex in libiberty

Eli Zaretskii eliz@is.elta.co.il
Fri Jun 8 10:37:00 GMT 2001


> From: "Zack Weinberg" <zackw@stanford.edu>
> Date: Fri, 8 Jun 2001 09:59:32 -0700
> 
> On Fri, Jun 08, 2001 at 10:06:51AM +0300, Eli Zaretskii wrote:
> > 
> > One notorious problem with GNU regex is that it is quite slow for many
> > simple jobs, such as matching a simple regular expression with no
> > backtracking.  It seems that the main reason for this slowness is the
> > fact that GNU regex supports null characters in strings.  For
> > example, Sed 3.02 compiled with GNU regex is about 2-4 times slower
> > on simple jobs than the same Sed compiled with Spencer's regex
> > library.
> 
> I think the null characters are a red herring.

It's possible; I never had time to look into it far enough to be
sure.  All I know is that the slow-down happened between two specific
versions of GNU regex, and the support for null characters was
introduced between those two versions.

> The regex.c that came with GDB 4.18, which I think is the one that got
> spread around widely, had a bug in its implementation of the POSIX
> regcomp/regexec interface, which caused a major performance hit.  That
> bug has been fixed in GNU libc for a long time.  When I replaced
> fixincludes' copy of regex.c with a more recent version from glibc,
> fixincludes was sped up by a factor of nine.  That same bug affects
> Sed 3.02 - replace the regex.c it ships with with the one from glibc
> 2.2.x and I bet you'll see better performance.
> 
> There's some discussion in these messages:
> 
> http://gcc.gnu.org/ml/gcc-patches/2000-01/msg00764.html
> http://gcc.gnu.org/ml/gcc-patches/2000-01/msg00765.html

Thanks for the pointers.


