This is the mail archive of the libc-alpha mailing list for the glibc project.
Re: [PATCH] PPC optimize strncmp for PPC32/64
- From: Paul Mackerras <paulus at samba dot org>
- To: sjmunroe at vnet dot ibm dot com
- Cc: libc-alpha at sources dot redhat dot com
- Date: Thu, 23 Oct 2003 09:27:44 +1000
- Subject: Re: [PATCH] PPC optimize strncmp for PPC32/64
- References: <3F96FDF2.email@example.com>
Steve Munroe writes:
> Currently PPC32/64 use the generic/strncmp.c implementation. This shows
> up in a recent benchmark as 25% of the total execution. The attached asm
> implementations give a 15-18% improvement over the generic code.
This looks great, but I wonder if we can do better in the unaligned
case than just falling back to the byte-by-byte comparison. I
strongly suspect that doing a word-by-word comparison with unaligned
word loads on one of the strings would be faster than the byte-by-byte
code. I suggest word-by-word (32 bit) rather than doubleword (64 bit)
because doubleword loads that are not 4-byte aligned will trap on
POWER3 (though not on POWER4).
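
To make the idea concrete, here is a rough sketch in C rather than PPC assembly. It is an illustration only, not code from the patch: the name strncmp_words, the 4096-byte PAGE_SIZE, and the choice to drop back to a byte loop whenever a word differs are all assumptions. It aligns the first string, then compares 32 bits at a time using an aligned load from the first string and a possibly unaligned load from the second. The unaligned load can read up to 3 bytes past the second string's terminator, so the sketch refuses to let that load straddle a page boundary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u   /* assumption: smallest page size in use */

/* Nonzero if any byte of w is zero. */
static int has_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}

/* Hypothetical helper name; same contract as strncmp. */
int strncmp_words(const char *s1, const char *s2, size_t n)
{
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    /* Compare bytes until p1 is 4-byte aligned. */
    while (n > 0 && ((uintptr_t)p1 & 3) != 0) {
        if (*p1 != *p2 || *p1 == 0)
            return *p1 - *p2;
        p1++; p2++; n--;
    }

    /* Word loop: aligned load from p1, possibly unaligned load from p2. */
    while (n >= 4) {
        uint32_t w1, w2;

        /* Stop if the load from p2 would cross a page boundary.  A real
           implementation would step over the boundary and resume. */
        if (((uintptr_t)p2 & (PAGE_SIZE - 1)) > PAGE_SIZE - 4)
            break;

        memcpy(&w1, p1, 4);   /* aligned word load */
        memcpy(&w2, p2, 4);   /* unaligned word load */

        /* On a mismatch or a terminator, let the byte loop below
           locate the exact byte. */
        if (w1 != w2 || has_zero_byte(w1))
            break;

        p1 += 4; p2 += 4; n -= 4;
    }

    /* Byte loop for the tail and for the word that stopped the loop. */
    while (n > 0) {
        if (*p1 != *p2 || *p1 == 0)
            return *p1 - *p2;
        p1++; p2++; n--;
    }
    return 0;
}
```

On big-endian PPC the result could be computed directly from the two differing words, since an unsigned word compare orders them the same way as a byte-by-byte compare; the sketch re-scans with the byte loop only to stay endian-neutral.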