This is the mail archive of the
mailing list for the glibc project.
Re: PATCH: Optimize memcmp for ia32
On Tue, Feb 10, 2004 at 03:48:19PM +0100, Jakub Jelinek wrote:
> On Wed, Feb 04, 2004 at 04:11:26PM -0800, H. J. Lu wrote:
> > This patch optimizes memcmp for ia32. I got an average speedup of around
> > 400%.
> If nothing else, you should certainly handle PIC vs. !PIC differently
> (for !PIC you don't need to call the thunk etc.).
I can change it.
> Also, why do you need to use %ebx register when for example %eax is always
I will take a look.
> Why do you need 4 separate L(Nbytes) sequences when the only difference between
> them is in the last few instructions? The bigger the routine is, the more
> other instructions will be kicked out of the caches (especially for a
> routine which is not the topmost in the benchmarks).
> I'd say avoiding the table_32bytes table altogether, using just one of the
> 4 sequences (with adjusted start) and computing the jump destination in
> registers shouldn't slow things down.
The adjustment may cause the slowdown. With the jump table, we don't
need to adjust anything at all for memory blocks smaller than 32 bytes.
That is where the speedup comes from.
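
To illustrate the idea in C rather than in the ia32 assembly of the patch,
here is a rough sketch. It assumes a compiler that lowers a dense switch
into a jump table, and it covers only lengths below 8 instead of 32 to keep
it short. The name small_memcmp and the byte-at-a-time comparisons are
hypothetical and are not taken from the patch.

```c
#include <stddef.h>

/* Hypothetical sketch, not the glibc routine: dispatch on the length so
   that a short block enters the unrolled compare sequence at exactly the
   right point, with no pointer or counter adjustment beforehand. */
int small_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *s1 = a;
    const unsigned char *s2 = b;
    /* Point one past the end so every case uses a fixed negative offset. */
    const unsigned char *p = s1 + n;
    const unsigned char *q = s2 + n;
    size_t i;

    switch (n) {  /* a dense switch like this is typically a jump table */
    case 7: if (p[-7] != q[-7]) return p[-7] - q[-7]; /* fall through */
    case 6: if (p[-6] != q[-6]) return p[-6] - q[-6]; /* fall through */
    case 5: if (p[-5] != q[-5]) return p[-5] - q[-5]; /* fall through */
    case 4: if (p[-4] != q[-4]) return p[-4] - q[-4]; /* fall through */
    case 3: if (p[-3] != q[-3]) return p[-3] - q[-3]; /* fall through */
    case 2: if (p[-2] != q[-2]) return p[-2] - q[-2]; /* fall through */
    case 1: if (p[-1] != q[-1]) return p[-1] - q[-1]; /* fall through */
    case 0: return 0;
    default:
        /* Longer blocks: plain byte loop, standing in for the main path. */
        for (i = 0; i < n; i++)
            if (s1[i] != s2[i])
                return s1[i] - s2[i];
        return 0;
    }
}
```

A call such as small_memcmp(x, y, 3) jumps directly to case 3 and compares
only the three bytes involved, which is the "no adjustment" property
described above.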
> And if you really need the table, shouldn't it go into .rodata and not
I will do that.
BTW, I will be out of office until Feb. 23.