This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Optimize strstr, strcasestr and memmem

On Fri, May 18, 2012 at 02:25:47PM +1200, Maxim Kuvyrkov wrote:
> On 18/05/2012, at 10:20 AM, Joseph S. Myers wrote:
> > Anyone interested in performance of these functions may also be interested 
> > in bug 12100, where the SSE4 version of strstr reintroduces the unwanted 
> > quadratic asymptotic performance.  (This is not a comment on your patch 
> > itself, but a mention of something people looking at this area might also 
> > be interested in.)
> Thanks for the pointer.  For avoidance of doubt, I've benchmarked the patch on a Core2 machine without SSE4.
> I will benchmark the SSE4 implementation against the normal + this patch on short needles.  The benchmark that motivated this patch is libosip message parsing, which heavily uses string functions with small strings.

I posted strstr implementation at this list and got no response. 

I use only SSE2 instructions and my algorithm is faster than using SSE4 instructions.
With trivial modifications I could also use plain arithmetic/AVX2

A trick is to check first two characters or zero terminator in parallel.
This is about 25 times faster on core2 vs glibc one, on i7 its only 3
times faster.

I use these trick for my extension of regular expression engine a
partialy autogenerated strstr follows.

> --
> Maxim Kuvyrkov
> CodeSourcery / Mentor Graphics


transient bus protocol violation

Attachment: strstr2.c
Description: Text document

Attachment: sse2.h
Description: Text document

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]