This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v3] powerpc: strstr optimization
- From: Steven Munroe <munroesj at linux dot vnet dot ibm dot comcom>
- To: Joseph Myers <joseph at codesourcery dot com>
- Cc: Tulio Magno Quites Machado Filho <tuliom at linux dot vnet dot ibm dot com>, "Carlos O'Donell" <carlos at redhat dot com>, Steve Munroe <sjmunroe at us dot ibm dot com>, Florian Weimer <fweimer at redhat dot com>, OndÅej BÃlka <neleai at seznam dot cz>, GNU C Library <libc-alpha at sourceware dot org>, Rajalakshmi Srinivasaraghavan <raji at linux dot vnet dot ibm dot com>
- Date: Wed, 22 Jul 2015 15:10:09 -0500
- Subject: Re: [PATCH v3] powerpc: strstr optimization
- Authentication-results: sourceware.org; auth=none
- References: <558A5642 dot 5020107 at linux dot vnet dot ibm dot com> <558A5761 dot 2000409 at linux dot vnet dot ibm dot com> <87oajpm8nc dot fsf at totoro dot br dot ibm dot com> <871tgijuri dot fsf at linux dot vnet dot ibm dot com> <55A6FE3F dot 6090701 at redhat dot com> <55A70B70 dot 6090607 at redhat dot com> <20150716195538 dot GA5140 at domone> <55A8110C dot 7000209 at redhat dot com> <alpine dot DEB dot 2 dot 10 dot 1507221607370 dot 21570 at digraph dot polyomino dot org dot uk> <55AFD91C dot 30404 at redhat dot com> <87d1zk9hc7 dot fsf at linux dot vnet dot ibm dot com> <alpine dot DEB dot 2 dot 10 dot 1507221933190 dot 19567 at digraph dot polyomino dot org dot uk>
- Reply-to: munroesj at linux dot vnet dot ibm dot com
On Wed, 2015-07-22 at 19:41 +0000, Joseph Myers wrote:
> On Wed, 22 Jul 2015, Tulio Magno Quites Machado Filho wrote:
>
> > >> If there's a quadratic worst case newly introduced for 2.22, I'd consider
> > >> that a security hole (denial of service) that needs to block the release
> > >> of 2.22 until it's fixed (possibly by removing the implementation in
> > >> question).
> >
> > Could you elaborate why a quadratic worst case in a string function can be
> > considered a denial of service, please?
>
> Because code may use strstr on untrusted inputs, meaning small inputs
> (e.g. MB in size) cause large (effectively infinite) CPU usage - any case
> where an attacker can cause your resource usage to grow more than linearly
> with their usage is a denial-of-service problem. Thus gnulib considers
> such strstr implementations buggy and disables them (so preventing any of
> your optimizations from being effective for any program using gnulib), but
> not everything uses gnulib.
>
> (There are cases, e.g. untrusted regular expressions, where glibc can't
> reasonably avoid such resource usage, and so anyone using untrusted
> regular expressions must apply resource limits externally. But strstr
> isn't such a case.)
>
> > > Tulio,
> > >
> > > Could you please review the possibility of quadratic behaviour and respond
> > > prompty? I don't want this to hold up the release.
> >
> > I confirm this algorithm does have a quadratic worst case, which appears if
> > needle <= 2048.
> > If the needle > 2048, it falls back to the previous implementation.
>
> strstr should be O(m+n) (for needle length m, haystack length n); a naive
> implementation could be O(mn). Are you saying that, for constant m <=
> 2048, it's O(n^2)? Because O(mn) for bounded m is fine (the threshold
> 2048 would preferably have been determined by some sort of benchmark, but
> the precise choice of threshold isn't something to block a release).
>
This is not an objective standard because real code in the real world
does not behave in a mathematically precise way.
What is good enough and what not good enough has to be measured and
measurable.
So if the big nettle is 4X slower then the small nettle, it what a
denial of service? I don't think so!
What about 8x, or 10x? may be? but I still don't think so because
example given are extremely unlikely in the real world, and if a
deliberately bad case you may just see a slightly higher peak load.
On a server with 100-1000s processors who would notice?
100x well that might be a problem.
So what is you number? Because quadratic and O(this) and O(that) are
meaningless in this context.