strstr, strcasestr speedup; add memmem
Eric Blake
ebb9@byu.net
Fri Jan 11 02:22:00 GMT 2008
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
According to Duane Ellis on 1/10/2008 6:08 PM:
| Eric Blake wrote:
|> OK to commit?
|>
| to be clear, i did not try these, just a quick 2 minute glance.
|
| Optimizations like this, for some are helpful, for others - painful.
| Newlib is still found on _tiny_ systems with very limited code space.
|
| Thank you for including _OPTIMIZE_SPEED_ in this.
Actually, I used this idiom, borrowed from strchr:
#if !defined(PREFER_SIZE_OVER_SPEED) && !defined(__OPTIMIZE_SIZE__)
~ lots of code but linear speed on all possible input
#else
~ compact code, but quadratic speed on worst-case input
#endif
but yes, I intentionally made my patch have the smallest code footprint
possible for platforms that are concerned about code size. Quadratic
effects only come into play when the needle is long and its prefix is
repetitive; oftentimes, if you are on a memory-constrained system and want
the __OPTIMIZE_SIZE__ version, you are unlikely to be using arbitrary
needles in the first place, so you wouldn't notice much speed difference
between the two versions because you are never calling these functions
with problematic arguments.
On the flip side, newlib clients that don't care much about code size, but
do care about searching through arbitrary data not known at compile time
(for example, cygwin), get the benefit of the guaranteed linear execution
time; without this patch, you could launch a form of denial-of-service
attack by crafting strings that will lock the processor up in needless
searching loops. For an example of sniffing whether your system's strstr
is quadratic (and note that even glibc 2.6.1 has this bug), you can do
things like:
$ time perl -e "print \"index(\" . 'a'x200000 . \"b, \" . 'a'x100000 .
\"b)\"" | m4
100000
real 0m14.436s
user 0m13.889s
sys 0m0.061s
$ time perl -e "print \"index(\" . 'a'x400000 . \"b, \" . 'a'x100000 .
\"b)\"" | m4
300000
real 0m43.363s
user 0m42.405s
sys 0m0.091s
Ouch - doubling the search space resulted in more than triple the search
time. But with a linear strstr, the above example completes in less than
a second.
- --
Don't work too hard, make some time for fun as well!
Eric Blake ebb9@byu.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFHhsdI84KuGfSFAYARAvzxAKCHgrDoQfexkWh7Qy7+sjeSm/yY4QCgrMyF
2i3ynRY5RHLzot10WYbADwI=
=nwtN
-----END PGP SIGNATURE-----
More information about the Newlib
mailing list