This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
RE: String Functions for x86-64
- From: "Menezes, Evandro" <evandro dot menezes at amd dot com>
- To: "Ulrich Drepper" <drepper at redhat dot com>
- Cc: libc-alpha at sourceware dot org, "Meissner, Michael" <michael dot meissner at amd dot com>
- Date: Tue, 2 May 2006 14:52:14 -0500
- Subject: RE: String Functions for x86-64
Hi, Ulrich.
> > Here are the remaining routines bundled with the others
> > previously submitted. Their performance is generally better
> > than the existing ones on x86-64 processors.
>
> Where is the (reproducible) data to support that claim? I
> have no reason to believe you.
Just run the tst-* and test-* routines. I have other test programs which test other cases too, but I didn't think these results were necessary as GLIBC's test routines already support my statement. I'd be glad to summarize the results for either test suite for you, if you'd like.
> > strncmp, however, can perform a tad slower for some tiny
> > blocks, but is faster, up to several times, for blocks 16
> > bytes long or larger.
>
> But the strings passed to strncmp during normal operation are
> in 99% of all cases <= 16 chars. I have data to prove that.
> The fact that you gloss over this problem means you haven't
> done your homework. You are optimizing for the wrong use cases.
Actually, it just means that we looked at different benchmarks. If you use GLIBC's own test routines, you'll find out that in that range the new strncmp is sometimes faster, sometimes slower, depending on the alignments.
For example, for length 8/12 and alignment 0/0, the current strncmp takes 43 cycles on an Opteron and the new one takes 22 cycles. For length 8/12 and alignment 4/4, they take 43 cycles and 96 cycles, respectively. On a P4, the figures are 104 cycles and 40 cycles for alignment 0/0, and 40 cycles and 112 cycles for alignment 4/4.
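For anyone who wants to try this kind of length/alignment comparison outside the GLIBC test suite, here is a rough sketch in C. It is not the GLIBC test code. The helper names (make_string, time_strncmp, print_strncmp_timings), the buffer size and the iteration count are all made up for illustration. It also assumes that "length 8/12" means 8-character strings compared with a limit of 12, and it reports nanoseconds per call from clock() rather than cycles, so the numbers are only comparable to each other, not to the figures above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define BUF_SIZE 256  /* illustrative size, large enough for the offsets used here */

/* Two 64-byte-aligned buffers, so a byte offset into them fixes the
   alignment of the string that starts there. */
static _Alignas(64) char buf_a[BUF_SIZE];
static _Alignas(64) char buf_b[BUF_SIZE];

/* Hypothetical helper: build a NUL-terminated string of `len` copies of
   `fill`, starting `align` bytes into `base`. */
static char *make_string(char *base, size_t align, size_t len, char fill)
{
    assert(align + len < BUF_SIZE);
    char *s = base + align;
    memset(s, fill, len);
    s[len] = '\0';
    return s;
}

/* Hypothetical helper: mean CPU time in nanoseconds for one call of
   strncmp(a, b, n), averaged over `iters` calls. */
static double time_strncmp(const char *a, const char *b, size_t n, long iters)
{
    /* Calling through a volatile function pointer keeps the compiler from
       folding the calls or hoisting them out of the loop. */
    int (*volatile fn)(const char *, const char *, size_t) = strncmp;
    volatile int sink = 0;

    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        sink += fn(a, b, n);
    clock_t stop = clock();

    (void)sink;
    return (double)(stop - start) / CLOCKS_PER_SEC * 1e9 / (double)iters;
}

/* Time equal 8-character strings with n = 12 at alignments 0/0 and 4/4. */
static void print_strncmp_timings(void)
{
    const size_t aligns[] = { 0, 4 };
    const size_t len = 8, n = 12;
    const long iters = 10000000L;  /* arbitrary; raise it for steadier numbers */

    for (size_t i = 0; i < sizeof aligns / sizeof aligns[0]; i++) {
        char *a = make_string(buf_a, aligns[i], len, 'a');
        char *b = make_string(buf_b, aligns[i], len, 'a');
        printf("length %zu/%zu, alignment %zu/%zu: %.2f ns/call\n",
               len, n, aligns[i], aligns[i], time_strncmp(a, b, n, iters));
    }
}
```

Calling print_strncmp_timings() from main prints one line per alignment. To compare two implementations, the same loop would have to be run against each of them (for instance by linking against each library build in turn), and cycle counts like those quoted above would need a cycle counter or the clock frequency.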
Thanks,
_______________________________________________________
Evandro Menezes GNU Tools Team
512-602-9940 Advanced Micro Devices
evandro.menezes@amd.com Austin, TX