Re: [PATCH] Use Quicksearch in strstr
- From: Rich Felker <dalias at libc dot org>
- To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- Cc: Alexander Monakov <amonakov at ispras dot ru>, 'GNU C Library' <libc-alpha at sourceware dot org>, nd <nd at arm dot com>
- Date: Thu, 1 Nov 2018 12:42:35 -0400
- Subject: Re: [PATCH] Use Quicksearch in strstr
- References: <DB5PR08MB1030E559B3818F52CA48A63583F30@DB5PR08MB1030.eurprd08.prod.outlook.com> <alpine.LNX.2.20.13.1810291936040.19896@monopod.intra.ispras.ru> <DB5PR08MB1030B4FB720644283130A84A83CC0@DB5PR08MB1030.eurprd08.prod.outlook.com>
On Tue, Oct 30, 2018 at 05:32:12PM +0000, Wilco Dijkstra wrote:
> Alexander Monakov wrote:
> > What does the worst-case change look like (I think it's searching for "a{251}ba"
> > in "aaa...aaa" if memcmp works left-to-right)? Does it make sense to move the
> > needle-length cutoff further below 254?
This "quicksearch" is fundamentally a bad algorithm. If it looks fast
in some cases, it's just a matter of getting lucky, and most likely
due to bad tuning or missed optimizations in the implementation of the
algorithm you're comparing it against. It's fundamentally a naive
O(nm) search that will be this bad in all cases except where the
bad-character table happens to let you advance a lot.
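For reference, here is a minimal sketch of the quicksearch (Sunday)
scheme as I understand it -- my own illustration, not the code from the
patch. The only sublinearity comes from the shift table: when the
haystack byte just past the window occurs near the end of the needle,
every shift is 1 and memcmp rescans the needle at each position, which
is exactly the O(nm) case.

#include <string.h>
#include <stddef.h>

/* Illustrative quicksearch/Sunday sketch, not the patch under review.  */
static const char *
quicksearch (const char *hs, size_t n, const char *ne, size_t m)
{
  size_t shift[256];

  /* Byte absent from the needle: skip the whole window plus one.  */
  for (size_t i = 0; i < 256; i++)
    shift[i] = m + 1;
  /* Otherwise shift by the distance from its last occurrence.  */
  for (size_t i = 0; i < m; i++)
    shift[(unsigned char) ne[i]] = m - i;

  for (size_t j = 0; j + m <= n; )
    {
      if (memcmp (hs + j, ne, m) == 0)  /* O(m) rescan on near-misses.  */
        return hs + j;
      if (j + m == n)
        break;
      /* Shift on the byte just past the window; this is 1 whenever
         that byte equals the last byte of the needle.  */
      j += shift[(unsigned char) hs[j + m]];
    }
  return NULL;
}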
> To be fair you have to compare the worst case of one algorithm vs the worst
> case of another since they are different... Two-way is slow when searching for
> ba{254}b in a long sequence of ba{253}ba{253}. Two-way is only about 3 times
> faster in this case despite having a theoretical factor of 256/2 advantage.
Do you have a basis for calling this "slow"? It should be roughly one
comparison per byte. "Only 3x faster" doesn't sound slow to me.
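Both inputs are easy to construct; here is a hypothetical driver (mine,
with sizes taken from the messages above, not code from the thread)
that builds them, so the claims can actually be timed against glibc's
strstr or against a naive scheme like the sketch above.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main (void)
{
  size_t n = 1 << 20;
  char *hs = malloc (n + 1);
  if (hs == NULL)
    return 1;

  /* Quicksearch killer: "a"x251 "ba" in "aaa...".  Every shift is 1
     and a left-to-right memcmp rescans ~251 bytes per position.  */
  memset (hs, 'a', n);
  hs[n] = '\0';
  char ne1[254];
  memset (ne1, 'a', 251);
  ne1[251] = 'b';
  ne1[252] = 'a';
  ne1[253] = '\0';
  printf ("quicksearch killer: %s\n",
          strstr (hs, ne1) ? "found" : "not found");

  /* Two-way's bad case from above: ba{254}b in repeated ba{253}.
     Roughly two comparisons per haystack byte, but still O(n).  */
  for (size_t i = 0; i < n; i++)
    hs[i] = (i % 254 == 0) ? 'b' : 'a';
  char ne2[257];
  ne2[0] = 'b';
  memset (ne2 + 1, 'a', 254);
  ne2[255] = 'b';
  ne2[256] = '\0';
  printf ("two-way worst case: %s\n",
          strstr (hs, ne2) ? "found" : "not found");

  free (hs);
  return 0;
}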
> I don't think there is any gain in slowing down common cases in order to
> reduce the worst-case difference (assuming that is feasible).
This argument gets made about strstr every couple of years, and every
time it's wrong. It's not that "common cases" (as a general class) are
faster with any naive algorithm. Only very specific cases will ever be
faster. Most common cases are roughly the same speed as with a good
algorithm, and some common cases, plus a lot more uncommon ones, are
catastrophically slower with a bad algorithm.
To my knowledge there is no good test for a case that won't become
catastrophically slower with a bad algorithm. If there is, I would be
very surprised if the same test doesn't naturally translate into an
optimized setup for two-way.
> > +  do
> > +    {
> > +      hs--;
>
> > Doesn't this move hs out of bounds and thus invoke undefined behavior?
>
> That's a theoretical issue given *(a+(i-1)), *((a-1)+i) and *((a+i)-1) behave
> identically.
Calling blatant undefined behavior "a theoretical issue" is not
something we should be doing in 2018.
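To spell this out (a sketch of my own, not the patch code): C makes
merely computing a pointer before the start of an object undefined
(C11 6.5.6p8), whether or not it is ever dereferenced, so those three
spellings are only "identical" until a compiler exploits the
difference.

#include <stddef.h>

/* Undefined variant: when no match exists (or len == 0), the loop
   forms buf - 1, which is UB even though it is never dereferenced.  */
const char *
scan_back_ub (const char *buf, size_t len, char c)
{
  for (const char *p = buf + len - 1; p >= buf; p--)
    if (*p == c)
      return p;
  return NULL;
}

/* Well-defined variant: the cursor stays within [buf, buf + len]
   and is decremented only after the bounds check.  */
const char *
scan_back_ok (const char *buf, size_t len, char c)
{
  for (const char *p = buf + len; p != buf; )
    {
      p--;
      if (*p == c)
        return p;
    }
  return NULL;
}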
Rich