This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH v3] powerpc: strstr optimization


On Wed, 2015-07-22 at 19:41 +0000, Joseph Myers wrote:
> On Wed, 22 Jul 2015, Tulio Magno Quites Machado Filho wrote:
> 
> > >> If there's a quadratic worst case newly introduced for 2.22, I'd consider 
> > >> that a security hole (denial of service) that needs to block the release 
> > >> of 2.22 until it's fixed (possibly by removing the implementation in 
> > >> question).
> > 
> > Could you elaborate why a quadratic worst case in a string function can be
> > considered a denial of service, please?
> 
> Because code may use strstr on untrusted inputs, meaning small inputs 
> (e.g. MB in size) cause large (effectively infinite) CPU usage - any case 
> where an attacker can cause your resource usage to grow more than linearly 
> with their usage is a denial-of-service problem.  Thus gnulib considers 
> such strstr implementations buggy and disables them (so preventing any of 
> your optimizations from being effective for any program using gnulib), but 
> not everything uses gnulib.
> 
> (There are cases, e.g. untrusted regular expressions, where glibc can't 
> reasonably avoid such resource usage, and so anyone using untrusted 
> regular expressions must apply resource limits externally.  But strstr 
> isn't such a case.)
> 
> > > Tulio,
> > >
> > > Could you please review the possibility of quadratic behaviour and respond
> > > promptly? I don't want this to hold up the release.
> > 
> > I confirm this algorithm does have a quadratic worst case, which appears when
> > the needle length is <= 2048.
> > If the needle is longer than 2048, it falls back to the previous implementation.
> 
> strstr should be O(m+n) (for needle length m, haystack length n); a naive 
> implementation could be O(mn).  Are you saying that, for constant m <= 
> 2048, it's O(n^2)?  Because O(mn) for bounded m is fine (the threshold 
> 2048 would preferably have been determined by some sort of benchmark, but 
> the precise choice of threshold isn't something to block a release).
> 

This is not an objective standard because real code in the real world
does not behave in a mathematically precise way.

What is good enough and what is not good enough has to be measured and
measurable.

So if the big needle case is 4x slower than the small needle case, is that
a denial of service? I don't think so!

What about 8x, or 10x? Maybe? I still don't think so, because the
examples given are extremely unlikely in the real world, and for a
deliberately bad case you may just see a slightly higher peak load.

On a server with 100s to 1000s of processors, who would notice?

100x, well, that might be a problem.

So what is your number? Because quadratic and O(this) and O(that) are
meaningless in this context.


