This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] powerpc: strcasecmp/strncasecmp optimization for power8 [BZ 20327]




On 07/05/2016 07:31 PM, Tulio Magno Quites Machado Filho wrote:
Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> writes:

On 07/04/2016 07:46 PM, Florian Weimer wrote:
On 06/14/2016 11:45 AM, Rajalakshmi Srinivasaraghavan wrote:
Subject: [PATCH] POWER8: Fix return code of strcasecmp for unaligned inputs
Could you replace POWER8 with powerpc, please?

If the inputs are unaligned and there are null characters in the
memory before the starting address of either input, strcasecmp
returns an incorrect result. Fix this by masking the bits that
are not part of the string.

Tested on ppc64 and ppc64le.
Despite this being a bug fix, I believe we need the approval from Adhemerval
before integrating it during the freeze window.

	[BZ #20327]
	* sysdeps/powerpc/powerpc64/power8/strcasecmp.S: Mask bits that
	are not part of the string.
This is a very important case.  Can we improve the current testcase to
validate this scenario too?

---
  sysdeps/powerpc/powerpc64/power8/strcasecmp.S | 8 ++++++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
index 63f6217..d6a4df2 100644
--- a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
+++ b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
@@ -44,7 +44,9 @@
  #ifdef __LITTLE_ENDIAN__
  #define GET16BYTES(reg1, reg2, reg3) \
  	lvx	reg1, 0, reg2; \
-	vcmpequb.	v8, v0, reg1; \
+	vspltisb	v8, -1; \
+	vperm	v8, v8, reg1, reg3; \
+	vcmpequb.	v8, v0, v8; \
  	beq	cr6, 1f; \
  	vspltisb	v9, 0; \
  	b	2f; \
@@ -57,7 +59,9 @@
  #else
  #define GET16BYTES(reg1, reg2, reg3) \
  	lvx	reg1, 0, reg2; \
-	vcmpequb.	v8, v0, reg1; \
+	vspltisb	 v8, -1; \
+	vperm	v8, reg1, v8,  reg3; \
+	vcmpequb.	v8, v0, v8; \
  	beq	cr6, 1f; \
  	vspltisb	v9, 0; \
  	b	2f; \
Although this code is simple, I believe this macro needs more comments.

I suggest explaining the following:
  - How does this macro use reg1, reg2, reg3 and v8?
  - Why is it setting v9 to 0?

Added comments and committed it as
30e4cc5413f72c2c728a544389da0c48500d9904

--
Thanks
Rajalakshmi S

