This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Improve strstr performance of short needles
- From: "Gabriel F. T. Gomes" <gabriel at inconstante dot eti dot br>
- To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- Cc: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, nd <nd at arm dot com>
- Date: Mon, 1 Oct 2018 17:46:34 -0300
- Subject: Re: [PATCH] Improve strstr performance of short needles
- References: <DB5PR08MB1030D79438AF0E17282F3E1783020@DB5PR08MB1030.eurprd08.prod.outlook.com> <20180928183018.551042ff@tereshkova> <DB5PR08MB1030F8FBED4FB20D34E2A1CE83EC0@DB5PR08MB1030.eurprd08.prod.outlook.com> <DB5PR08MB1030E149C92B9037E030E3FD83EF0@DB5PR08MB1030.eurprd08.prod.outlook.com>
On Mon, 01 Oct 2018, Wilco Dijkstra wrote:
>>> I ran some tests on a powerpc64le machine and I noticed that, as the
>>> haystack becomes larger, the search for a needle that is *not* in the
>>> haystack, became slower with this change.
>>
>> That seems odd, particularly since the Two-Way algorithm is pretty slow
>> for small needles (large preprocessing overhead and small skips).
>
>I looked at bench-strstr again. There are cases where Two-Way wins by
>10-15% when the string is found but for short strings the linear loop wins.
>On average I can't see any performance difference.
I ran bench tests with your patch on an x86 machine, and I noticed the
same behaviour that I described previously, however on fewer occasions
(never when needle is 2-chars long, and less frequently for 3- and 4-chars
long needles).
Both on powerpc64le and x86, the overall performance gain for bench tests
looks positive, so I don't actually see a compelling reason to not commit
this patch. The intention of my question was to know if you have noticed
something similar in your tests, and I'm glad to hear that you have used a
different benchmark.
>However when iterating through 16K needles randomly chosen from natural
>text I get the following speedups for different haystack and needle sizes:
>
> H64 H256 H4096
>N2: 64% 73% 46%
>N3: 74% 66% 37%
>N4: 94% 92% 66%
Could you add this information (data and explanation) to the commit
message? It would be good to be able to read it some years in the future
and know that a different benchmark (not bench tests) was used to produce
these numbers.
>As you can see the speedup is larger for small needles since the preprocessing
>overhead of the Two-Way algorithm is significant. This is completely hidden by
>the GLIBC test due to using just 1 needle.
Agreed.
Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>