Re: Proposal to handle __strstr_sse42 and friends issue on x86

On Wed, Dec 11, 2013 at 10:55:01AM +1000, Allan McRae wrote:
> Hi all,
> For those who need some background, see [1].  In short, there is an
> issue with __strstr_sse42 on x86 which has a variety of workarounds.
> Some distributions re-add the inline statement, which is clearly fragile
> and not a fix. Others remove the sse42 string functions - see [2].
> I am going to propose we adopt the removal of the SSE42 routines.  We
> can not ensure that binaries are built with a new enough compiler (gcc
> after 2000) and keep backwards compatibility.  Also, ensuring the stack
> is aligned when entering these functions would be a performance hit that
> would likely remove any advantage of the sse42 routine (not tested...),
> and there are proposals to remove the sse42 routines for both x86 and
> x86_64 due to quadratic complexity anyway [3,4].
> So applying the patch in [2] seems the best approach to me?  Any
> comments/objections?
I sent a patch that improves strstr performance. It got acked by Liubov
Dmitrieva; I asked if there were more comments and then forgot about it.

I have applied it now.
The sse42 routines are quite inefficient in that regard; with plain sse2
you can get around five times faster. I planned to add a version that
avoids unaligned loads for older processors.

You can also use this one on older processors: if you expand the
unaligned loads into aligned ones, you just improve performance 15 times
instead of 30.
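
To illustrate what avoiding unaligned loads means in practice, here is a
minimal sketch. It is not the patch itself: find_byte_aligned is a
hypothetical name, it only searches for a single byte (the kind of
first-character filter a strstr might start with), and it assumes gcc or
clang on x86 with SSE2 (__builtin_ctz is a compiler builtin). The pointer is
rounded down to a 16-byte boundary, the block is read with an aligned load,
and matches that lie before the start of the string are masked off.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a glibc function: find the first occurrence of
   byte c in the NUL-terminated string s using only aligned 16-byte loads.
   Returns NULL if c does not occur before the terminator. */
static const char *find_byte_aligned(const char *s, char c)
{
    assert(s != NULL);
    const __m128i needle = _mm_set1_epi8(c);
    const __m128i zero = _mm_setzero_si128();

    /* Round the pointer down to a 16-byte boundary. */
    uintptr_t addr = (uintptr_t)s;
    unsigned skip = (unsigned)(addr & 15);
    const char *p = (const char *)(addr - skip);

    /* First block: aligned load, then drop any match located before s. */
    __m128i v = _mm_load_si128((const __m128i *)p);
    unsigned hit = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(v, needle));
    unsigned end = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(v, zero));
    hit = (hit >> skip) << skip;
    end = (end >> skip) << skip;

    for (;;) {
        unsigned any = hit | end;
        if (any) {
            /* Lowest set bit is the first match or the terminator. */
            unsigned i = (unsigned)__builtin_ctz(any);
            return ((hit >> i) & 1) ? p + i : NULL;
        }
        /* Every later block is already aligned. */
        p += 16;
        v = _mm_load_si128((const __m128i *)p);
        hit = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(v, needle));
        end = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(v, zero));
    }
}
```

The first load may touch bytes before s, but they sit in the same aligned
16-byte block, so the read cannot cross into an unmapped page.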

