This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Use Quicksearch in strstr

On Wed, Nov 07, 2018 at 04:01:51PM +0000, Wilco Dijkstra wrote:
> Rich Felker wrote:
> > On Tue, Nov 06, 2018 at 06:26:57PM +0000, Wilco Dijkstra wrote:
> >> That's not true at all - in reality there is hardly any difference in worst case 
> >> performance. Comparing a{253}ba on Quicksearch vs the ba{252}b on 
> >> Two-way using GLIBC 2.27 (original version of Two-way) shows Two-way
> >> is only about 16% faster. If the worst-case performance of Two-way was
> >> deemed acceptable, then the worst-case of Quicksearch must be too.
> >
> > I don't know what figures you're talking about, but the worst-case for
> > "quicksearch" with a bound of 256 bytes is easily over 100x slower
> > than the worst case of two-way.
> Not a chance. You do realize I am measuring this stuff for real rather than
> just making up random numbers? Worst-case performance of both algorithms
> is very similar for needles up to 256. It's not possible to get 10x difference, let
> alone 100x.

I'm not either, but we are measuring different implementations,
particularly of the underlying memcmp. But a better memcmp is not
going to make it 100x faster. Maybe 10x at best, leaving a 10x
difference still. And not all archs have a ridiculously fast memcmp.

Indeed, just re-testing on a glibc box, "quicksearch" takes 33 whole
microseconds to search (and not find) a{90}ba in (ba{89}){9}. Two-way
can do the same in 3 microseconds on this box, although glibc's (2.24
from Debian) included two-way on this box is much slower and takes 12.
So that's a 10x difference at 92 bytes (limitation of the measurement
code from a third party I was using). At 256 bytes I'd expect more
like a 25x difference.

> >> Getting another 2-3x speedup after doubling performance is a huge factor..
> >> The previous implementation was doing effectively nothing useful 75-85%
> >> of the time...
> >
> > Then fix implementation bugs. Don't replace good algorithms with awful
> > ones. For any given needle, it's definitely possible to make it such
> > that,...
> I already did that, remember? If you believe you can do better then post further
> performance improvements.

Nobody has replied to my branch of this thread about limiting shift in
the case of memory. I'm pretty confident it's right now so maybe I
should just submit a patch. But it's frustrating that there is more
interest in discussing bad algorithms than in improving the
implementation of a good one.

> > Quicksort performs *utterly abysmally* on plenty of needles shorter
> > than 256 or even shorter than 50. You cannot use it based on the
> > length of the needle except possibly for very short needles (maybe
> > <16), and even then you will do worse on a lot of real-world cases.
> Citation needed. You clearly haven't ever tried Quicksearch. It beats Two-way 
> in every single case on real inputs. The length of the needle matters only for the
> worst case, and by limiting the needle size, you avoid quadratic behaviour.

See above. I have tried it and it performs abysmally.

> > You can do this if you're implementing a custom search for your
> > application. You cannot do this for the libc strstr because it does
> > not know what the application is. It needs to be able to perform
> > non-pathologically for any inputs.
> Limiting the needle size to avoid quadratic behaviour is perfectly fine for GLIBC
> since the goal is to avoid denial of service via specially crafted inputs.
> It's impossible for strstr to always have identical performance on all possible
> inputs - practically every algorithm has fast and slow cases. Rabin-Karp is
> perhaps an exception given a good hash function. It's also 10x faster than
> Two-way slow cases, so it's a candidate to replace Two-way.

It's also quadratic, so it's not a candidate, unless you cap the
length for which it's used very low.

> >> >> > Calling blatant undefined behavior "a theoretical issue" is not
> >> >> > something we should be doing in 2018.
> >> >> 
> >> >> It's 100% theoretical - or you can show an example that could fail?
> >> >
> >> > With UBSan and LTO it should trap and abort.
> >> 
> >> I mean a real-world failure on an actual supported target, not something
> >> that's not actually possible.
> >
> > This is a real-world near-future target, and introducing gratuitous
> > breakage for it now is utterly backwards, especially when it's so easy
> > to avoid (skipping the -- ahead of time and just adding a -1 to each
> > []).
> It seems unlikely it would ever work across dynamic library boundaries, but 

Static linking is valid and supported usage.

> even if you replace the haystack input in strstr with a hardcoded array, UBSan
> does not report a problem. The only way I could make it report anything at all
> was to use hs = &s[-1].

If it fails to detect it, that's a bug in UBSan that will hopefully be
fixed at some point. I've actually had serious memory corruption bugs
that would have been caught by correctly trapping invalid pointer
arithmetic outside the bounds of the object, but weren't, so I'm aware
that it's not perfect and have a significant interest in its being
improved in the future.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]