This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] powerpc: strcasecmp/strncasecmp optmization for power8 [BZ 20327]
- From: "Tulio Magno Quites Machado Filho" <tuliom at linux dot vnet dot ibm dot com>
- To: Rajalakshmi Srinivasaraghavan <raji at linux dot vnet dot ibm dot com>, Florian Weimer <fweimer at redhat dot com>, libc-alpha at sourceware dot org, Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- Date: Tue, 05 Jul 2016 11:01:12 -0300
- Subject: Re: [PATCH] powerpc: strcasecmp/strncasecmp optmization for power8 [BZ 20327]
- References: <1461919871-30348-1-git-send-email-raji@linux.vnet.ibm.com> <87eg809a1i.fsf@totoro.br.ibm.com> <575FD251.7010705@linux.vnet.ibm.com> <6b38e864-0607-0bba-ff18-617f536f4146@redhat.com> <577BA36D.5060601@linux.vnet.ibm.com>
Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> writes:
> On 07/04/2016 07:46 PM, Florian Weimer wrote:
>> On 06/14/2016 11:45 AM, Rajalakshmi Srinivasaraghavan wrote:
> Subject: [PATCH] POWER8: Fix return code of strcasecmp for unaligned inputs
Could you replace POWER8 with powerpc, please?
> If the input values are unaligned and there are null characters in the
> memory before the starting address of the input values, strcasecmp
> returns an incorrect result. Fix it by masking the bits that
> are not part of the string.
>
> Tested on ppc64 and ppc64le.
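To make the failure mode concrete, here is a scalar C model of the idea, not the vector code itself. It assumes a 16-byte aligned block that contains the start of the string at index `start`; the function names are hypothetical. The patch appears to get the same effect by combining an all-0xFF vector (`vspltisb v8, -1`) with the loaded data through `vperm` before comparing against zero.

```c
#include <assert.h>
#include <string.h>

/* Naive check: reports a NUL anywhere in the aligned 16-byte block,
   including bytes that lie before the start of the string.  */
static int has_nul_unmasked(const unsigned char block[16])
{
    for (unsigned i = 0; i < 16; i++)
        if (block[i] == 0)
            return 1;
    return 0;
}

/* Masked check: bytes before `start` are replaced with 0xFF so they
   can never compare equal to zero.  Only bytes that belong to the
   string can then trigger the NUL detection.  */
static int has_nul_masked(const unsigned char block[16], unsigned start)
{
    unsigned char tmp[16];

    for (unsigned i = 0; i < 16; i++)
        tmp[i] = (i < start) ? 0xFF : block[i];
    return has_nul_unmasked(tmp);
}
```

With a NUL at index 1 and the string starting at index 3, the unmasked check reports a terminator that is not part of the string, while the masked check does not.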
Although this is a bug fix, I believe we need Adhemerval's approval
before integrating it during the freeze window.
> [BZ #20327]
> * sysdeps/powerpc/powerpc64/power8/strcasecmp.S: Mask bits that
> are not part of the string.
This is a very important case. Can we improve the current testcase to
validate this scenario too?
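Something along these lines might cover it. This is only a sketch in plain C, not code written against the glibc test harness; `ref_strcasecmp` and `count_sign_mismatches` are hypothetical helpers, and it assumes the default "C" locale. It zero-fills two 16-byte aligned buffers, so that null bytes sit right before each string, and tries every unaligned starting offset.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>
#include <strings.h>

/* Byte-at-a-time reference used only to obtain the expected sign.  */
static int ref_strcasecmp(const char *a, const char *b)
{
    for (;; a++, b++) {
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb || ca == 0)
            return ca - cb;
    }
}

static int sign(int x) { return (x > 0) - (x < 0); }

/* Copies s1 and s2 to every unaligned offset (1..15) inside aligned,
   zero-filled buffers and counts how often strcasecmp disagrees in
   sign with the reference.  Both strings must be shorter than 48 bytes.  */
static int count_sign_mismatches(const char *s1, const char *s2)
{
    static _Alignas(16) char buf1[64];
    static _Alignas(16) char buf2[64];
    int mismatches = 0;

    for (size_t off1 = 1; off1 < 16; off1++) {
        for (size_t off2 = 1; off2 < 16; off2++) {
            /* The zero fill leaves NUL bytes before each start address.  */
            memset(buf1, 0, sizeof buf1);
            memset(buf2, 0, sizeof buf2);
            strcpy(buf1 + off1, s1);
            strcpy(buf2 + off2, s2);

            int got = strcasecmp(buf1 + off1, buf2 + off2);
            int want = ref_strcasecmp(buf1 + off1, buf2 + off2);
            if (sign(got) != sign(want))
                mismatches++;
        }
    }
    return mismatches;
}
```

A return value of zero for pairs such as ("Hello, World", "hELLO, wORLD") and ("abc", "ABCD") would indicate that the preceding null bytes do not affect the result.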
> ---
> sysdeps/powerpc/powerpc64/power8/strcasecmp.S | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
> index 63f6217..d6a4df2 100644
> --- a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
> +++ b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
> @@ -44,7 +44,9 @@
> #ifdef __LITTLE_ENDIAN__
> #define GET16BYTES(reg1, reg2, reg3) \
> lvx reg1, 0, reg2; \
> - vcmpequb. v8, v0, reg1; \
> + vspltisb v8, -1; \
> + vperm v8, v8, reg1, reg3; \
> + vcmpequb. v8, v0, v8; \
> beq cr6, 1f; \
> vspltisb v9, 0; \
> b 2f; \
> @@ -57,7 +59,9 @@
> #else
> #define GET16BYTES(reg1, reg2, reg3) \
> lvx reg1, 0, reg2; \
> - vcmpequb. v8, v0, reg1; \
> + vspltisb v8, -1; \
> + vperm v8, reg1, v8, reg3; \
> + vcmpequb. v8, v0, v8; \
> beq cr6, 1f; \
> vspltisb v9, 0; \
> b 2f; \
Although this code is simple, I believe this macro needs more comments.
I suggest explaining the following:
- How does this macro use reg1, reg2, reg3 and v8?
- Why is it setting v9 to 0?
--
Tulio Magno