This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: Testing 2.16 release candidate with gnulib - quadratic behaviourdetected in SSE42 strstr?
- From: Dmitrieva Liubov <liubov dot dmitrieva at gmail dot com>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>, libc-alpha at sourceware dot org, carlos_odonell at mentor dot com
- Date: Thu, 28 Jun 2012 15:12:58 +0400
- Subject: Re: Testing 2.16 release candidate with gnulib - quadratic behaviourdetected in SSE42 strstr?
- References: <4FEBDA04.6010709@mentor.com><CAMe9rOqDXGo69Pa9ZxurQ0YwAPbr258B=Rc0UpOmx5yGo4uC3g@mail.gmail.com>
Hello,
Yes, sse42 strstr shows definitely quadratic time in your example, so
2 times decreasing / increasing of m gives 4 times impact on an
execution time.
This sse42 strstr algorithm was announced here
http://sources.redhat.com/ml/libc-alpha/2009-07/msg00052.html as
Knuth–Morris–Pratt algorithm, but correct KMP has linear time
estimation at worst and can't give quadratic complexity at all.
So, for my machine sse42 strstr without alarm gives:
real 2m17.983s
user 2m16.951s
sys 0m0.107s
And I also have correct KMP algorithm (attached for information)
written in plain C and it easily passes configure test with 5 sec
alarm. :
real 0m0.015s
user 0m0.011s
sys 0m0.004s
Anyway sse42 strstr looks rather difficult and I can't say at the
moment if it's possible to make it always linear like KMP.
More time for investigation is required here.
Should I file a bug or anybody already did it?
--
Liubov Dmitireva
Software Engineer
Intel Corporation
> ---------- Forwarded message ----------
> From: "Carlos O'Donell" <carlos_odonell@mentor.com>
> Date: Jun 27, 2012 9:14 PM
> Subject: Testing 2.16 release candidate with gnulib - quadratic behaviour
> detected in SSE42 strstr?
> To: "libc-alpha" <libc-alpha@sourceware.org>, "Bruno Haible"
> <bruno@clisp.org>
>
> Community,
>
> http://sourceware.org/glibc/wiki/Testing/Gnulib
>
> Running the 2.16 release candidate against the gnulib
> testsuite has turned up three object files being build
> by gnulib that aren't listed in the known issues:
>
> (a) logf
> (b) log10f
> (c) remove
> (d) strstr
>
> I've run out of time just getting the build bits setup
> and reviewing 2.15 backports. I'm hoping that some of
> my European friends can look at these as the earth turns
> to their part of the world...
>
> Initial analysis:
>
> (a) logf
> (b) log10f
>
> Why wasn't -lm added to the link?
>
> This looks like an environment/configuration issue with gnulib.
>
> Adding `-lm' to CC fixes this, but that can't be right.
>
> configure:84449: checking whether log10f works according to ISO C 99 with
> IEC 60559
> configure:84504: gcc
> -Wl,-rpath=/scratch/carloso/build4-lucid-cs/install//lib64:/scratch/carloso/build4-lucid-cs/install//usr/lib64
> -Wl,--dynamic-linker=/scratch/carloso/build4-lucid-cs/install//lib64/ld-linux-x86-64.so.2
> -std=gnu99 -o conftest -g -O2 -nostdinc
> -I/usr/local/tools/gcc-4.3.3/lib/gcc/x86_64-unknown-linux-gnu/4.3.2/include-fixed
> -I/scratch/carloso/build4-lucid-cs/install//usr/include
> -I/usr/local/tools/gcc-4.3.3/lib/gcc/x86_64-unknown-linux-gnu/4.3.2/include
> -Wall conftest.c >&5
> /tmp/cceWScoH.o: In function `main':
> /tmp/testdir/conftest.c:632: undefined reference to `log10f'
> collect2: ld returned 1 exit status
> configure:84508: $? = 1
> configure: program exited with status 1
> configure: failed program was:
> ...
> configure:84537: result: no
>
> (c) remove
>
> This function is added automatically.
>
> e.g.
> ~~~ config.log ~~~
> gl_LIBOBJS=' asnprintf.o chdir-long.o dprintf.o fclose.o fcntl.o fflush.o
> fprintf.o fpurge.o fseek.o fseeko.o fseterr.o futimens.o glob.o ioctl.o
> isfinite.o isnand.o isnanf.o isnanl.o linkat.o nanosleep.o openat-proc.o
> printf.o printf-args.o printf-parse.o remove.o snprintf.o sprintf.o
> strerror_r.o strstr.o utimensat.o vasnprintf.o vdprintf.o vfprintf.o
> vprintf.o vsnprintf.o vsprintf.o'
> gl_LTLIBOBJS=' asnprintf.lo chdir-long.lo dprintf.lo fclose.lo fcntl.lo
> fflush.lo fprintf.lo fpurge.lo fseek.lo fseeko.lo fseterr.lo futimens.lo
> glob.lo ioctl.lo isfinite.lo isnand.lo isnanf.lo isnanl.lo linkat.lo
> nanosleep.lo openat-proc.lo printf.lo printf-args.lo printf-parse.lo
> remove.lo snprintf.lo sprintf.lo strerror_r.lo strstr.lo utimensat.lo
> vasnprintf.lo vdprintf.lo vfprintf.lo vprintf.lo vsnprintf.lo vsprintf.lo'
> ~~~
>
> There is no mention of this in the known issues list here:
> http://sourceware.org/glibc/wiki/Testing/Gnulib
>
> (d) strstr
>
> This is the one that worries me.
>
> configure:121073: checking whether strstr works in linear time
> configure:121161: gcc -lm
> -Wl,-rpath=/scratch/carloso/build4-lucid-cs/install//lib64:/scratch/carloso/build4-lucid-cs/install//usr/lib64
> -Wl,--dynamic-linker=/scratch/carloso/build4-lucid-cs/install//lib64/ld-linux-x86-64.so.2
> -std=gnu99 -o conftest -g -O2 -nostdinc
> -I/usr/local/tools/gcc-4.3.3/lib/gcc/x86_64-unknown-linux-gnu/4.3.2/include-fixed
> -I/scratch/carloso/build4-lucid-cs/install//usr/include
> -I/usr/local/tools/gcc-4.3.3/lib/gcc/x86_64-unknown-linux-gnu/4.3.2/include
> -Wall conftest.c >&5
> configure:121165: $? = 0
> configure:121171: ./conftest
> configure:121175: $? = 142
> configure: program exited with status 142
> configure: failed program was:
> | /* confdefs.h. */
> ...
> | /* end confdefs.h. */
> |
> | #include <signal.h> /* for signal */
> | #include <string.h> /* for strstr */
> | #include <stdlib.h> /* for malloc */
> | #include <unistd.h> /* for alarm */
> | static void quit (int sig) { exit (sig + 128); }
> |
> | int
> | main ()
> | {
> |
> | int result = 0;
> | size_t m = 1000000;
> | char *haystack = (char *) malloc (2 * m + 2);
> | char *needle = (char *) malloc (m + 2);
> | /* Failure to compile this test due to missing alarm is okay,
> | since all such platforms (mingw) also have quadratic strstr. */
> | signal (SIGALRM, quit);
> | alarm (5);
> | /* Check for quadratic performance. */
> | if (haystack && needle)
> | {
> | memset (haystack, 'A', 2 * m);
> | haystack[2 * m] = 'B';
> | haystack[2 * m + 1] = 0;
> | memset (needle, 'A', m);
> | needle[m] = 'B';
> | needle[m + 1] = 0;
> | if (!strstr (haystack, needle))
> | result |= 1;
> | }
> | return result;
> |
> | ;
> | return 0;
> | }
> configure:121193: result: no
>
> This looks like it might be a real bug.
>
> Removing the alarm(5); and attaching to the long running process shows:
>
> GNU gdb (Sourcery CodeBench 2012.09-1 - Preview) 7.2.50.20100908-cvs
> Copyright (C) 2010 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu".
> For bug reporting instructions, please see:
> <https://support.codesourcery.com/GNUToolchain/>.
> (gdb) attach 8102
> Attaching to process 8102
> Reading symbols from /tmp/testdir/conftest...done.
> warning: Could not load shared library symbols for linux-vdso.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> Reading symbols from
> /scratch/carloso/build4-lucid-cs/install//lib64/libm.so.6...done.
> Loaded symbols for /scratch/carloso/build4-lucid-cs/install//lib64/libm.so.6
> Reading symbols from
> /scratch/carloso/build4-lucid-cs/install//lib64/libc.so.6...done.
> Loaded symbols for /scratch/carloso/build4-lucid-cs/install//lib64/libc.so.6
> Reading symbols from
> /scratch/carloso/build4-lucid-cs/install//lib64/ld-linux-x86-64.so.2...done.
> Loaded symbols for
> /scratch/carloso/build4-lucid-cs/install//lib64/ld-linux-x86-64.so.2
> 0x00007fcc57c6c272 in __m128i_strloadu (s1=0x7fcc582c625d 'A' <repeats 200
> times>..., s2=0x7fcc57a56010 'A' <repeats 200 times>...) at
> ../sysdeps/x86_64/multiarch/strstr.c:92
> 92 if (__builtin_expect ((int) ((size_t) p & 0xfff) > 0xff0, 0))
> (gdb) bt
> #0 0x00007fcc57c6c272 in __m128i_strloadu (s1=0x7fcc582c625d 'A' <repeats
> 200 times>..., s2=0x7fcc57a56010 'A' <repeats 200 times>...) at
> ../sysdeps/x86_64/multiarch/strstr.c:92
> #1 __strstr_sse42 (s1=0x7fcc582c625d 'A' <repeats 200 times>...,
> s2=0x7fcc57a56010 'A' <repeats 200 times>...) at
> ../sysdeps/x86_64/multiarch/strstr.c:317
> #2 0x0000000000400734 in main () at conftest.c:1019
> (gdb) c
> Continuing.
>
> Does this mean we've picked up quadratic behaviour in the supposedly fast
> SSE42 optimized strstr? :-(
>
> The test finishes in an awesome 5 minutes :-)
>
> carloso@build4-lucid-cs:/tmp/testdir$ time ./conftest
> real 5m22.318s
> user 4m21.060s
> sys 0m0.000s
>
> Could someone please verify this?
>
> Cheers,
> Carlos.
> --
> Carlos O'Donell
> Mentor Graphics / CodeSourcery
> carlos_odonell@mentor.com
> carlos@codesourcery.com
> +1 (613) 963 1026
>
#include <stdlib.h>
#include <string.h>
static int *compute_prefix_function(char *pattern, int psize)
{
int k = -1;
int i = 1;
int *pi = malloc(sizeof(int)*psize);
if (!pi)
return NULL;
pi[0] = k;
for (i = 1; i < psize; i++) {
while (k > -1 && pattern[k+1] != pattern[i])
k = pi[k];
if (pattern[i] == pattern[k+1])
k++;
pi[i] = k;
}
return pi;
}
static int kmp(char *target, int tsize, char *pattern, int psize)
{
int i;
int *pi = compute_prefix_function(pattern, psize);
int k = -1;
if (!pi)
return -1;
for (i = 0; i < tsize; i++) {
while (k > -1 && pattern[k+1] != target[i])
k = pi[k];
if (target[i] == pattern[k+1])
k++;
if (k == psize - 1) {
free(pi);
return i-k;
}
}
free(pi);
return -1;
}
char *strstr_kmp (char * haystack, char * needle) {
int index = kmp(haystack, strlen(haystack), needle, strlen(needle));
if (index == -1) return NULL;
return haystack + index;
}