This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Faster strlen
- From: Andi Kleen <andi at firstfloor dot org>
- To: Ondřej Bílka <neleai at seznam dot cz>
- Cc: libc-alpha at sourceware dot org
- Date: Tue, 09 Oct 2012 06:51:15 -0700
- Subject: Re: [PATCH] Faster strlen
- References: <20121007172752.GA22344@domone.kolej.mff.cuni.cz>
Ondřej Bílka <neleai@seznam.cz> writes:
>
> I also benchmarked atom and added variant which is identical to
> strlen-sse2-pminub except bsf is replaced by table lookup.
Is your microbenchmark just a tight loop, or does it fill the caches?
I have doubts that table lookups are a good idea if they evict
the application's working set from L1.
Microbenchmarks that barely touch the caches can be very misleading
here. Even if it is slightly slower, avoiding table lookups
is usually preferred for functions like this, simply because it lessens
the impact on the caches.
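
To make the trade-off concrete, here is a minimal sketch of the two ways to find the first set bit in a 16-bit byte mask (the kind of mask pmovmskb produces). It is not the code from the patch: the table layout, the function names, and the use of the GCC/Clang builtin `__builtin_ctz` in place of a raw bsf are all assumptions made for illustration. The point is only that the table variant adds 256 bytes of data (four lines, assuming 64-byte cache lines) that compete with the application for L1.

```c
#include <stdint.h>

/* Hypothetical 256-entry table: index of the lowest set bit in a byte.
   Entry 0 is 8, meaning "no bit set in this byte". */
static uint8_t first_bit_table[256];

static void init_table(void)
{
    first_bit_table[0] = 8;
    for (unsigned v = 1; v < 256; v++) {
        unsigned n = 0;
        while (!((v >> n) & 1u))
            n++;
        first_bit_table[v] = (uint8_t)n;
    }
}

/* Bit-scan variant: no data memory touched. mask must be nonzero.
   __builtin_ctz typically compiles to bsf/tzcnt on x86. */
static unsigned first_set_scan(unsigned mask)
{
    return (unsigned)__builtin_ctz(mask);
}

/* Table variant: one or two byte loads from first_bit_table.
   mask must be a nonzero 16-bit value. */
static unsigned first_set_table(unsigned mask)
{
    unsigned lo = mask & 0xffu;
    if (lo)
        return first_bit_table[lo];
    return 8u + first_bit_table[(mask >> 8) & 0xffu];
}
```

Both functions return the same index for every nonzero 16-bit mask once init_table() has run; they differ only in whether they touch data memory.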
I would recommend measuring what happens when the microbenchmark
stresses both the data cache and the icache. Otherwise you risk winning
benchmarks but making real apps slower.
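
As a rough sketch of the data-cache half of that measurement, the harness below times strlen either in a tight loop or with a large scratch buffer walked before every call, so that each call starts with cold data. Everything here is an assumption for illustration: the helper names, the 64-byte line size, and the choice of clock_gettime for timing. It does not stress the icache, and for short strings the timer overhead will dominate, so only the difference between the two modes is meaningful.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define CACHE_LINE 64u          /* assumed cache line size */

static volatile size_t sink;    /* keeps the strlen results live */

/* Touch one byte per cache line of a large buffer so that earlier
   data (including any lookup table) is pushed out of the caches. */
static void evict_dcache(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i += CACHE_LINE)
        buf[i]++;
}

static double now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Returns the total nanoseconds spent around `iters` strlen calls.
   If evict is nonzero, the scratch buffer is walked before each call
   (cold-cache case); the eviction itself is not timed. */
static double bench_strlen(const char *s, int iters, int evict,
                           unsigned char *scratch, size_t scratch_len)
{
    double total = 0.0;

    assert(s != NULL);
    for (int i = 0; i < iters; i++) {
        if (evict)
            evict_dcache(scratch, scratch_len);
        double t0 = now_ns();
        sink += strlen(s);
        total += now_ns() - t0;
    }
    return total;
}
```

Run it once with evict set to 0 and once with evict set to 1, using a scratch buffer larger than the last-level cache of the machine under test, and compare the two totals for each strlen variant.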
-Andi
--
ak@linux.intel.com -- Speaking for myself only