This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] powerpc: power7 strncmp optimization
- From: Will Schmidt <will_schmidt at vnet dot ibm dot com>
- To: GNU C Library <libc-alpha at sourceware dot org>
- Date: Mon, 22 Aug 2011 16:00:29 -0500
- Subject: Re: [PATCH] powerpc: power7 strncmp optimization
- References: <1312464973.1806.78.camel@farscape>
- Reply-to: will_schmidt at vnet dot ibm dot com
Reposting. This patch has been checked into the ibm/2.13/master branch,
commit a7e0baec8c61a6bdf3b8fcb4ccb725477254f1d3,
and has been running in our internal testing, etc.
Hi,
The following code change boosts the throughput of the 64-bit
power7 strncmp code by approximately 15%. The 32-bit throughput is not
notably affected by this change, so the change to the 32-bit code is made
only to keep the two files in sync with each other.
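For readers who want to take a comparable measurement, here is a minimal sketch of a strncmp throughput timer. It is not the harness behind the figure above. The function name strncmp_throughput, the buffer size, and the use of clock() are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not part of glibc): time `iters` calls of
   strncmp over two identical strings of length `len` and return the
   number of bytes compared per second of CPU time. */
double strncmp_throughput(size_t len, unsigned long iters)
{
    static char a[4096], b[4096];
    assert(len < sizeof a);          /* leave room for the terminator */

    memset(a, 'x', len); a[len] = '\0';
    memset(b, 'x', len); b[len] = '\0';

    /* Call through a volatile pointer and accumulate into a volatile
       sink so the compiler cannot fold or drop the calls. */
    int (*volatile cmp)(const char *, const char *, size_t) = strncmp;
    volatile int sink = 0;

    clock_t start = clock();
    for (unsigned long i = 0; i < iters; i++)
        sink += cmp(a, b, len);
    clock_t end = clock();

    double secs = (double)(end - start) / CLOCKS_PER_SEC;
    if (secs <= 0.0)
        return 0.0;                  /* run too short to measure */
    return (double)len * (double)iters / secs;
}
```

Calling, say, strncmp_throughput(1024, 1000000UL) before and after applying the patch, on the same machine, gives two numbers whose ratio approximates the throughput change. Equal inputs force every call to scan the full length, which is the case a throughput figure is most sensitive to.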
2011-08-04 Will Schmidt <will_schmidt@vnet.ibm.com>
* sysdeps/powerpc/powerpc32/power7/strncmp.S: Adjust the alignment
and add nop instructions for throughput optimization.
* sysdeps/powerpc/powerpc64/power7/strncmp.S: Adjust the alignment
and add nop instructions for throughput optimization.
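To illustrate the alignment adjustment, the sketch below assumes that the second argument of EALIGN is the base-2 logarithm of the requested alignment, as with a power-of-two .align directive. Under that assumption, changing 4 to 5 moves the function entry from a 16-byte to a 32-byte boundary. The helper names alignment_bytes and align_up are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: alignment in bytes for a log2 alignment value
   (assumed meaning of the second EALIGN argument). */
size_t alignment_bytes(unsigned log2_align)
{
    assert(log2_align < sizeof(size_t) * 8);
    return (size_t)1 << log2_align;
}

/* Hypothetical helper: round `addr` up to the next multiple of
   2^log2_align, i.e. where an aligned entry point would land. */
uintptr_t align_up(uintptr_t addr, unsigned log2_align)
{
    assert(log2_align < sizeof(uintptr_t) * 8);
    uintptr_t mask = ((uintptr_t)1 << log2_align) - 1;
    return (addr + mask) & ~mask;
}
```

For example, align_up(0x1004, 4) yields 0x1010, while align_up(0x1004, 5) yields 0x1020.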
diff --git a/sysdeps/powerpc/powerpc32/power7/strncmp.S b/sysdeps/powerpc/powerpc32/power7/strncmp.S
index 7ee9e03..db466f0 100644
--- a/sysdeps/powerpc/powerpc32/power7/strncmp.S
+++ b/sysdeps/powerpc/powerpc32/power7/strncmp.S
@@ -27,7 +27,7 @@
const char *s2 [r4],
size_t size [r5]) */
-EALIGN (BP_SYM(strncmp),4,0)
+EALIGN (BP_SYM(strncmp),5,0)
#define rTMP r0
#define rRTN r3
@@ -47,9 +47,11 @@ EALIGN (BP_SYM(strncmp),4,0)
#define rBITDIF r11 /* bits that differ in s1 & s2 words */
dcbt 0,rSTR1
+ nop
or rTMP,rSTR2,rSTR1
lis r7F7F,0x7f7f
dcbt 0,rSTR2
+ nop
clrlwi. rTMP,rTMP,30
cmplwi cr1,rN,0
lis rFEFE,-0x101
diff --git a/sysdeps/powerpc/powerpc64/power7/strncmp.S b/sysdeps/powerpc/powerpc64/power7/strncmp.S
index 5ee5e2e..eace179 100644
--- a/sysdeps/powerpc/powerpc64/power7/strncmp.S
+++ b/sysdeps/powerpc/powerpc64/power7/strncmp.S
@@ -27,7 +27,7 @@
const char *s2 [r4],
size_t size [r5]) */
-EALIGN (BP_SYM(strncmp),4,0)
+EALIGN (BP_SYM(strncmp),5,0)
CALL_MCOUNT 3
#define rTMP r0
@@ -48,9 +48,11 @@ EALIGN (BP_SYM(strncmp),4,0)
#define rBITDIF r11 /* bits that differ in s1 & s2 words */
dcbt 0,rSTR1
+ nop
or rTMP,rSTR2,rSTR1
lis r7F7F,0x7f7f
dcbt 0,rSTR2
+ nop
clrldi. rTMP,rTMP,61
cmpldi cr1,rN,0
lis rFEFE,-0x101