This is the mail archive of the mailing list for the glibc project.

Re: PATCH: Optimize memcmp for ia32

On Tue, Feb 10, 2004 at 09:18:30AM -0800, H. J. Lu wrote:
> On Tue, Feb 10, 2004 at 03:48:19PM +0100, Jakub Jelinek wrote:
> > On Wed, Feb 04, 2004 at 04:11:26PM -0800, H. J. Lu wrote:
> > > This patch optimizes memcmp for ia32. I got an average speedup of around
> > > 400%.
> > 
> > If nothing else, you should certainly handle PIC vs. !PIC differently
> > (for !PIC you don't need to call thunk etc.).
> I can change it.
> > Also, why do you need to use %ebx register when for example %eax is always
> > available?
> I will take a look.
> > Why do you need 4 separate L(Nbytes) sequences, the only difference between
> > them is in the last few instructions?  The bigger the routine is, the more
> > other instructions will be kicked out of the caches (especially for a
> > routine which is not the topmost in the benchmarks).
> > I'd say avoiding the table_32bytes table altogether, using just one of the
> > 4 sequences (with adjusted start) and computing the jump destination in
> > registers shouldn't slow things down.
> The adjustment may cause a slowdown. With the jump table, we don't
> need to adjust anything at all for memory blocks smaller than 32 bytes.
> That is where the speedup comes from.

I meant that instead of
        addl    %ecx, %edx
        addl    %ecx, %esi
you would do
        andl    $-4, %ecx
        addl    %ecx, %edx
        addl    %ecx, %esi
or something like that (then you'd just start with -28(%esi) for all 4
cases).  The previous value of %ecx & 3 would need to be preserved until
the end, e.g. in the %ebx register (which could be replaced with %eax),
and you could hardcode the jump to L(28bytes) + 14 * (INDEX / 4).
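
To make the idea concrete, here is a rough C model of it (a sketch only,
not the code from the patch).  The names small_memcmp and cmp_bytes are
made up for illustration, and the switch with fall-through merely stands
in for the computed jump into a single unrolled sequence.  It assumes
len < 32, as in the small-block path discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte-wise compare of n bytes, returns -1, 0 or 1. */
static int cmp_bytes(const unsigned char *a, const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    return 0;
}

/* Hypothetical model of the single-sequence approach for len < 32. */
static int small_memcmp(const void *s1, const void *s2, size_t len)
{
    const unsigned char *p = s1, *q = s2;
    size_t rem, words;
    int r;

    assert(len < 32);
    rem = len & 3;              /* kept until the end, like %ecx & 3 */
    words = len & ~(size_t)3;   /* andl $-4, %ecx */
    p += words;                 /* addl %ecx, %edx */
    q += words;                 /* addl %ecx, %esi */

    /* One unrolled sequence; the entry point depends on words / 4,
       which plays the role of the computed jump destination. */
    switch (words / 4) {
    case 7: if ((r = cmp_bytes(p - 28, q - 28, 4)) != 0) return r; /* fall through */
    case 6: if ((r = cmp_bytes(p - 24, q - 24, 4)) != 0) return r; /* fall through */
    case 5: if ((r = cmp_bytes(p - 20, q - 20, 4)) != 0) return r; /* fall through */
    case 4: if ((r = cmp_bytes(p - 16, q - 16, 4)) != 0) return r; /* fall through */
    case 3: if ((r = cmp_bytes(p - 12, q - 12, 4)) != 0) return r; /* fall through */
    case 2: if ((r = cmp_bytes(p - 8, q - 8, 4)) != 0) return r;   /* fall through */
    case 1: if ((r = cmp_bytes(p - 4, q - 4, 4)) != 0) return r;   /* fall through */
    default: break;
    }

    /* Only the last 0..3 bytes differ between the four variants. */
    return cmp_bytes(p, q, rem);
}
```

The point of the model is that only the final cmp_bytes(p, q, rem) step
depends on len & 3; everything before it is shared, which is why one
sequence with an adjusted start might be enough.  Whether the extra
andl costs anything measurable would still have to be benchmarked.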

