This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: Regex performance improvements
- To: libc-alpha at sources dot redhat dot com
- Subject: Re: Regex performance improvements
- From: Paolo Bonzini <bonzini at pc-amo3 dot elet dot polimi dot it>
- Date: Thu, 17 May 2001 10:37:17 +0200 (CEST)
- Reply-To: bonzini at gnu dot org
WRT the code to remove gapped support: it is still present by default,
but if you #define SINGLE_STRING it is compiled out. So glibc can still
export it, while clients that supply their own version of GNU regex for
compatibility can #define SINGLE_STRING and compile a faster version.
As I said, I did this for GNU grep; it is neither harmful nor
backwards-incompatible.
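To illustrate the idea only (this is not the actual regex source), here is a minimal sketch of how a compile-time switch like SINGLE_STRING can drop the per-character gap test. The macros SUBJECT_LEN and CHAR_AT and the function count_char are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not the actual GNU regex source.  The subject is
   either one string, or two strings separated by a gap.  */
#ifdef SINGLE_STRING
/* Single-string build: no per-character test for the gap.  */
# define SUBJECT_LEN(n1, n2) (n1)
# define CHAR_AT(s1, n1, s2, i) ((s1)[i])
#else
/* Default build: every access checks which side of the gap it is on.  */
# define SUBJECT_LEN(n1, n2) ((n1) + (n2))
# define CHAR_AT(s1, n1, s2, i) \
  ((i) < (n1) ? (s1)[i] : (s2)[(i) - (n1)])
#endif

/* Count occurrences of C in the subject.  The second string is ignored
   when SINGLE_STRING is defined.  */
size_t
count_char (const char *s1, size_t n1, const char *s2, size_t n2, char c)
{
  size_t i, count = 0;

  assert (s1 != NULL);
  (void) s2;  /* Unused in the single-string build.  */
  (void) n2;

  for (i = 0; i < SUBJECT_LEN (n1, n2); i++)
    if (CHAR_AT (s1, n1, s2, i) == c)
      count++;
  return count;
}
```

Built without the define, count_char ("ab", 2, "ca", 2, 'a') walks both strings; built with -DSINGLE_STRING the same call only looks at the first one, and the inner loop loses a branch.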
I also thought of adding a fixed-string optimization, which anchors
abc.*def to the last occurrence of def in the pattern. This could be done
with a fast Boyer-Moore search in single-byte mode; in multi-byte mode
one could still use Boyer-Moore and encode the skip table in some cunning
way, or maybe employ Knuth-Morris-Pratt. This optimization is done in
Ruby's adaptation of regex.
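For the single-byte case, a rough sketch of the search step might look like the following. It uses the Horspool simplification of Boyer-Moore (one skip table, no good-suffix rule) and scans from the right so that it finds the last occurrence of the fixed string. The function name last_fixed_string is hypothetical; nothing here is taken from regex.c or from Ruby's version.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GNU regex: return the offset of the
   last occurrence of NEEDLE (length M) in HAY (length N), or -1 if
   there is none.  Single-byte characters only.  */
long
last_fixed_string (const char *hay, size_t n, const char *needle, size_t m)
{
  size_t shift[UCHAR_MAX + 1];
  size_t i, pos;

  assert (hay != NULL && needle != NULL);
  if (m == 0)
    return (long) n;   /* The empty string matches at the very end.  */
  if (m > n)
    return -1;

  /* shift[c] is how far the window may move left when its first byte
     is c: the smallest index i >= 1 with needle[i] == c, else m.  */
  for (i = 0; i <= UCHAR_MAX; i++)
    shift[i] = m;
  for (i = m - 1; i >= 1; i--)
    shift[(unsigned char) needle[i]] = i;

  /* Start with the window flush against the end and slide it left.  */
  pos = n - m;
  for (;;)
    {
      size_t s;

      if (memcmp (hay + pos, needle, m) == 0)
        return (long) pos;
      s = shift[(unsigned char) hay[pos]];
      if (s > pos)
        return -1;     /* The window would fall off the left edge.  */
      pos -= s;
    }
}
```

For example, last_fixed_string ("abc def xyz def!", 16, "def", 3) returns 12, the offset of the second def. A matcher could then refuse to try any match that would have to end after that point. The table has one entry per byte value, which is why the multi-byte case needs a different encoding.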