This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: [PATCH] improve regex performance


Isamu Hasegawa <isamu@yamato.ibm.com> writes:

> The performance of current implementation of regex has a problem,
> if we use the re_search() interface in multibyte environments.

Thanks, I looked at it.  Actually, I did more.  I had problems
applying the patch since it wasn't against the latest CVS version.
This might be the root of the problems I've seen then.

Take a look at the posix/tst-regex.c program.  It's currently not
built but you can compile it.  The good news is that it works for
single-byte charsets now (didn't in 2.2.3).  But the multi-byte
handling (I'm using UTF-8) seems to be broken.  Please take a look.
The regex implementation, both with and without your latest patch,
doesn't produce any output.  Either the code is ***really*** slow and
it'll get there eventually or the regexec() function is caught in an
endless loop.

This is really a high-priority problem, I would appreciate it if you
could look at it ASAP.  Thanks,


[PS: the tst-regex program also contains some timing code so you can
substantiate speedup gains.]
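
For illustration only, here is a minimal sketch of the kind of timing
check meant above.  It is not the code in posix/tst-regex.c; the helper
name timed_regexec and its interface are hypothetical.  It compiles a
pattern with regcomp(), times a single regexec() call with clock(), and
reports the elapsed CPU time:

```c
#include <assert.h>
#include <regex.h>
#include <time.h>

/* Hypothetical helper, not part of posix/tst-regex.c: compile PATTERN,
   run regexec() once on TEXT, and store the CPU seconds spent inside
   regexec() in *SECS.  Returns 0 on a match, REG_NOMATCH if nothing
   matched, or the regcomp() error code if compilation failed.  */
static int
timed_regexec (const char *pattern, const char *text, double *secs)
{
  regex_t re;
  regmatch_t match[1];
  clock_t start, stop;
  int err;

  assert (pattern != NULL && text != NULL && secs != NULL);

  *secs = 0.0;
  err = regcomp (&re, pattern, REG_EXTENDED);
  if (err != 0)
    return err;

  /* Time only the matching step, not the compilation.  */
  start = clock ();
  err = regexec (&re, text, 1, match, 0);
  stop = clock ();

  *secs = (double) (stop - start) / CLOCKS_PER_SEC;
  regfree (&re);
  return err;
}
```

To exercise the multibyte code path, the caller would first select a
UTF-8 locale, for example with setlocale (LC_ALL, "") under a UTF-8
environment, before calling the helper.  A call that never returns
points to a loop rather than slow code, which a plain timing run like
this cannot distinguish on its own.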

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

