This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH] powerpc: Optimized strcmp for POWER8/PPC64

From: Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>
To: Ondrej Bilka <ondra at kam dot mff dot cuni dot cz>
Cc: "GNU C. Library" <libc-alpha at sourceware dot org>
Date: Wed, 07 Jan 2015 15:10:25 -0200
Subject: Re: [PATCH] powerpc: Optimized strcmp for POWER8/PPC64
Authentication-results: sourceware.org; auth=none
References: <54AD48A6 dot 9000105 at linux dot vnet dot ibm dot com> <20150107163026 dot GA24290 at kam dot mff dot cuni dot cz>

On 07-01-2015 14:30, Ondrej Bilka wrote:
>> +	/* For short string up to 16 bytes, load both s1 and s2 using
>> +	   unaligned dwords and compare.  */
>> +	ld	r8,0(r3)
>> +	ld	r10,0(r4)
>> +	li	r9,0
>> +	cmpb	r7,r8,r9
>> +	cmpdi	cr7,r7,0
>> +	mr	r9,r7
>> +	bne 	cr7,L(null_found)
>> +	cmpld	cr7,r8,r10
>> +	bne	cr7,L(different)
>> +	ld	r8,8(r3)
>> +	ld	r10,8(r4)
>> +	cmpb	r9,r8,r7
>> +	cmpdi	cr7,r9,r0
>> +	bne	cr7,L(null_found)
>> +	cmpld	cr7,r8,r10
>> +	bne	cr7,L(different)
>> +	addi	r7,r3,16
>> +	addi	r4,r4,16
> It makes no sense to do two separate checks which create pretty unpredictable branches.
>
> Just or these two check and look at first nonzero byte. Either they differ at that offset or both are zero and easily get result from that.
>
Which two checks are you referring exactly? The first two:

+	ld	r8,0(r3)
+	ld	r10,0(r4)
+	li	r9,0
+	cmpb	r7,r8,r9
+	cmpdi	cr7,r7,0
+	mr	r9,r7
+	bne 	cr7,L(null_found)
+	cmpld	cr7,r8,r10
+	bne	cr7,L(different)

First cmpb instruction is not a branch instruction (and thus has no affect on 
branch prediction).  Also, in this code is it has to check for NULL first 
before start to check different bytes at second dword.  For instance, for
strings:

source1:
0x3fffb7fbffe8: 0x01    0x18    0x2f    0x46    0x5d    0x74    0x0c    0x23
0x3fffb7fbfff0: 0x00    0x17    0xa5    0xa5    0xa5    0xa5    0xa5    0xa5

source2:
0x3fffb7f7ffe8: 0x01    0x18    0x2f    0x46    0x5d    0x74    0x0c    0x23
0x3fffb7f7fff0: 0x00    0x18    0x5a    0x5a    0x5a    0x5a    0x5a    0x5a

If it omits the first '\0' check the second check (0x3fffb7fbfff0 vs 
0x3fffb7f7fff0) will indicate different string and handled by L(different)
label. And the result would be -1 (0x17-0x18), instead of intended 0.  
I could add another '\0' on L(different) label, but it will be just move
the branch to elsewhere, not eliminating it.

Follow-Ups:
- Re: [PATCH] powerpc: Optimized strcmp for POWER8/PPC64
  - From: OndÅej BÃlka

References:
- [PATCH] powerpc: Optimized strcmp for POWER8/PPC64
  - From: Adhemerval Zanella

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]