This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] powerpc: strcasecmp/strncasecmp optmization for power8 [BZ 20327]


Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> writes:

> On 07/04/2016 07:46 PM, Florian Weimer wrote:
>> On 06/14/2016 11:45 AM, Rajalakshmi Srinivasaraghavan wrote:
> Subject: [PATCH] POWER8: Fix return code of strcasecmp for unaligned inputs

Could you replace POWER8 with powerpc, please?

> If the input values are unaligned and there are null characters in the
> memory before the starting address of the input values, strcasecmp
> gives an incorrect return code. Fix it by masking the bits that
> are not part of the string.
>
> Tested on ppc64 and ppc64le.

Although this is a bug fix, I believe we need approval from Adhemerval
before integrating it during the freeze window.

> 	[BZ #20327]
> 	* sysdeps/powerpc/powerpc64/power8/strcasecmp.S: Mask bits that
> 	are not part of the string.

This is a very important case.  Can we improve the current testcase to
validate this scenario too?
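
As a rough illustration of what such a check could look like (a sketch only,
not the existing glibc testcase): the helper names place_unaligned and
count_unaligned_failures below are hypothetical, and the 64-byte buffers and
16-byte alignment are assumptions chosen so that NUL bytes sit between the
aligned address and the start of each string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper, not part of glibc's test suite: zero-fill a
   64-byte buffer and copy S to byte OFFSET, so that every byte before
   the string (back to the aligned start) is a NUL.  */
static char *
place_unaligned (char *buf, size_t offset, const char *s)
{
  assert (offset + strlen (s) < 64);
  memset (buf, 0, 64);
  strcpy (buf + offset, s);
  return buf + offset;
}

/* Reduce a comparison result to -1, 0 or 1.  */
static int
sign (int x)
{
  return (x > 0) - (x < 0);
}

/* Try every pair of offsets within a 16-byte block and count the
   cases where strcasecmp returns a result with the wrong sign.  */
int
count_unaligned_failures (void)
{
  static _Alignas (16) char b1[64];
  static _Alignas (16) char b2[64];
  int failures = 0;

  for (size_t o1 = 0; o1 < 16; o1++)
    for (size_t o2 = 0; o2 < 16; o2++)
      {
        char *s1 = place_unaligned (b1, o1, "Hello, World");

        /* Same string up to case: must compare equal.  */
        char *s2 = place_unaligned (b2, o2, "hELLO, wORLD");
        if (strcasecmp (s1, s2) != 0)
          failures++;

        /* Last character differs ('d' < 'e'): s1 sorts first.  */
        s2 = place_unaligned (b2, o2, "hello, worle");
        if (sign (strcasecmp (s1, s2)) != -1)
          failures++;
        if (sign (strcasecmp (s2, s1)) != 1)
          failures++;
      }
  return failures;
}
```

A correct strcasecmp should make count_unaligned_failures return 0; whether
this particular layout actually triggers BZ #20327 on an unpatched build is
not something I have verified.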

> ---
>  sysdeps/powerpc/powerpc64/power8/strcasecmp.S | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
> index 63f6217..d6a4df2 100644
> --- a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
> +++ b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
> @@ -44,7 +44,9 @@
>  #ifdef __LITTLE_ENDIAN__
>  #define GET16BYTES(reg1, reg2, reg3) \
>  	lvx	reg1, 0, reg2; \
> -	vcmpequb.	v8, v0, reg1; \
> +	vspltisb	v8, -1; \
> +	vperm	v8, v8, reg1, reg3; \
> +	vcmpequb.	v8, v0, v8; \
>  	beq	cr6, 1f; \
>  	vspltisb	v9, 0; \
>  	b	2f; \
> @@ -57,7 +59,9 @@
>  #else
>  #define GET16BYTES(reg1, reg2, reg3) \
>  	lvx	reg1, 0, reg2; \
> -	vcmpequb.	v8, v0, reg1; \
> +	vspltisb	 v8, -1; \
> +	vperm	v8, reg1, v8,  reg3; \
> +	vcmpequb.	v8, v0, v8; \
>  	beq	cr6, 1f; \
>  	vspltisb	v9, 0; \
>  	b	2f; \

Although this code is simple, I believe this macro needs more comments.

I suggest explaining the following:
 - How does this macro use reg1, reg2, reg3 and v8?
 - Why is it setting v9 to 0?
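
To make the kind of comment I have in mind more concrete, here is a scalar C
model of how I read the masking step. It is a sketch under assumptions, not a
description of the actual vector semantics: I assume reg3 holds a permute
control that substitutes the 0xff filler for the bytes preceding the string,
and the model ignores where vperm places the bytes, keeping only the question
"is there a NUL among the bytes that belong to the string?". The function
names are hypothetical, and the model says nothing about v9.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of the old check: look for a NUL anywhere in the aligned
   16-byte block, including bytes that precede the string.  */
bool
block_has_nul_unmasked (const uint8_t block[16])
{
  for (size_t i = 0; i < 16; i++)
    if (block[i] == 0)
      return true;
  return false;
}

/* Model of the patched check: OFFSET is the position of the string's
   first byte inside the aligned block.  Bytes before OFFSET are
   replaced by 0xff (the role assumed here for "vspltisb v8, -1"
   followed by vperm), so they can never match the zero vector.  */
bool
block_has_nul_masked (const uint8_t block[16], size_t offset)
{
  uint8_t masked[16];

  assert (offset < 16);
  memset (masked, 0xff, sizeof masked);
  memcpy (masked + offset, block + offset, 16 - offset);
  return block_has_nul_unmasked (masked);
}
```

With a block whose first three bytes are NUL and whose remaining thirteen
bytes are string data, the unmasked model reports a NUL while the masked one
(offset 3) does not, which is the distinction the fix relies on.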

-- 
Tulio Magno

