This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: PATH: removing quadratic complexity strstr, strcasestr and strcasestr-nonascii
- From: Maxim Kuvyrkov <maxim at codesourcery dot com>
- To: Dmitrieva Liubov <liubov dot dmitrieva at gmail dot com>
- Cc: <libc-alpha at sourceware dot org>
- Date: Tue, 3 Jul 2012 07:07:54 +1200
- Subject: Re: PATH: removing quadratic complexity strstr, strcasestr and strcasestr-nonascii
- References: <CAHjhQ92pjrpesn+ai7N71V4Cqs_Ty5nO+dqtCCSzevyrDNmqfA@mail.gmail.com>
On 30/06/2012, at 3:31 AM, Dmitrieva Liubov wrote:
> This patch removes SSE42 strstr, strcasestr and strcasestr-nonascii
> because these SSE42 algorithms have quadratic complexity.
>
> The patch was checked on x86_64 and x86_32.
>
> ChangeLog:
>
> 2012-06-29 Liubov Dmitrieva <liubov.dmitrieva@gmail.com>
>
> * sysdeps/i386/i686/multiarch/Makefile: Update
> Remove strstr, strcasesrt, strcasestr-nonascii
> SSE42 optimizations since algos have quadratic complexity
> * sysdeps/i386/i686/multiarch/strcasestr-c.c: Delete
> * sysdeps/i386/i686/multiarch/strcasestr-nonascii.c: Delete
> * sysdeps/i386/i686/multiarch/strstr-c.c: Delete
> * sysdeps/i386/i686/multiarch/strstr.c: Delete
>
> * sysdeps/x86_64/multiarch/Makefile: Update
> Remove strstr, strcasesrt, strcasestr-nonascii
> SSE42 optimizations since algos have quadratic complexity
> * sysdeps/x86_64/multiarch/strcasestr-c.c: Delete
> * sysdeps/x86_64/multiarch/strcasestr-nonascii.c: Delete
> * sysdeps/x86_64/multiarch/strstr-c.c: Delete
> * sysdeps/x86_64/multiarch/strstr.c: Delete
>
While I don't have strong opinion about this patch, it doesn't seem right to throw away a perfectly good SSE42 implementation that outperforms the generic one by a factor of 2 on small needles.
As suggested in http://sourceware.org/bugzilla/show_bug.cgi?id=11607#c4, we can use a hybrid approach and use SSE42 implementation for short needles while falling back to generic O(n) implementation for long needles. String/str[case]str.c has
if (needle_len < LONG_NEEDLE_THRESHOLD)
return two_way_short_needle ((const unsigned char *) haystack,
haystack_len,
(const unsigned char *) needle, needle_len);
return two_way_long_needle ((const unsigned char *) haystack, haystack_len,
(const unsigned char *) needle, needle_len);
So, if you renamed current __strstr_sse42 to two_way_short_needle_sse42, and adjusted ifunc definition to resolve two_way_short_needle to either SSE42 implementation or to the generic one -- you would get the best of both worlds.
--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics