This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: [PATCH] improve regex performance
- To: Isamu Hasegawa <isamu at yamato dot ibm dot com>
- Subject: Re: [PATCH] improve regex performance
- From: Ulrich Drepper <drepper at redhat dot com>
- Date: 25 Jun 2001 17:05:01 -0700
- Cc: libc-alpha at sources dot redhat dot com, shoji at jp dot ibm dot com
- References: <20010621.165322.01363770.isamu@yamato.ibm.com>
- Reply-To: drepper at cygnus dot com (Ulrich Drepper)
Isamu Hasegawa <isamu@yamato.ibm.com> writes:
> The performance of current implementation of regex has a problem,
> if we use the re_search() interface in multibyte environments.
Thanks, I looked at it. Actually, I did more. I had problems
applying the patch since it wasn't against the latest CVS version.
This might be the root of the problems I've seen then.
Take a look at the posix/tst-regex.c program. It's currently not
built, but you can compile it. The good news is that it works for
single-byte charsets now (didn't in 2.2.3). But the multi-byte
handling (I'm using UTF-8) seems to be broken. Please take a look.
The regex implementation with and without your latest patch doesn't
produce any output. Either the code is ***really*** slow and it'll
get there eventually or the regexec() function is caught in an endless
loop.
This is really a high-priority problem, I would appreciate it if you
could look at it ASAP. Thanks,
[PS: the tst-regex program also contains some timing code so you can
substantiate speedup gains.]
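For anyone who wants a quick check without compiling the test program, a minimal sketch of this kind of timing run might look like the following. It is not the code in posix/tst-regex.c; the helper name time_regexec, its parameters, and its return convention are hypothetical and only serve the illustration. It selects a locale, compiles a pattern with regcomp(), and measures the CPU time spent in repeated regexec() calls.

```c
#include <assert.h>
#include <locale.h>
#include <regex.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper, not taken from posix/tst-regex.c.
   Switch to LOCALE_NAME, compile PATTERN as an extended regex, run
   regexec() on TEXT for ITERATIONS rounds, and store the elapsed CPU
   time in *SECONDS.  Returns the number of rounds that matched, or -1
   if the locale is unavailable or regcomp() rejects the pattern.  */
static long
time_regexec (const char *locale_name, const char *pattern,
              const char *text, long iterations, double *seconds)
{
  regex_t re;
  long matched = 0;

  assert (iterations > 0 && seconds != NULL);

  /* The locale decides whether the matcher takes the multibyte path.  */
  if (setlocale (LC_ALL, locale_name) == NULL)
    return -1;

  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;

  clock_t start = clock ();
  for (long i = 0; i < iterations; ++i)
    {
      regmatch_t match[1];
      if (regexec (&re, text, 1, match, 0) == 0)
        ++matched;
    }
  *seconds = (double) (clock () - start) / CLOCKS_PER_SEC;

  regfree (&re);
  return matched;
}
```

Calling it once with "C" and once with a UTF-8 locale name on the same pattern and text gives two timings to compare. Which UTF-8 locale names are installed depends on the system, so the name passed in is an assumption the caller has to check. If the matcher really is caught in an endless loop, the call will simply never return, so run it with a small iteration count first.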
--
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------