[PATCH]: Optimization for strpbrk on PowerPC
Adhemerval Zanella
azanella@linux.vnet.ibm.com
Tue Mar 4 20:43:00 GMT 2014
On 03-03-2014 08:35, Adhemerval Zanella wrote:
> +EALIGN (strpbrk, 4, 0)
> + CALL_MCOUNT 3
> +
> + /* The idea to speed up the algorithm is to create a lookup table
> + for fast check if input character should be considered. For ASCII
> + or ISO-8859-X character sets it has 256 positions. */
> + lbz r10,0(r4)
> +
> + /* First the table should be cleared and to avoid unaligned accesses
> + when using the VSX stores the table address is aligned to 16
> + bytes. */
> + xxlxor v0,v0,v0
> + addi r9,r1,-272 /* Allocates ALIGN(256 + 15, 8) bytes */
I realized that this stack allocation does not actually align the table address to 16
bytes; it only allocates 256+16 bytes. Below is a correct way to do it (I changed the patch accordingly):
@@ -33,7 +33,12 @@ EALIGN (strpbrk, 4, 0)
when using the VSX stores the table address is aligned to 16
bytes. */
xxlxor v0,v0,v0
- addi r9,r1,-272 /* Allocates ALIGN(256 + 15, 8) bytes */
+
+ /* PPC64 ELF ABI stack is aligned to 8 bytes, so allocate 264 bytes
+ (256 for the table plus 8) and align the address to 16 bytes. */
+ addi r9,r1,-264
+ rldicr r9,r9,0,59
+
li r5,16
li r6,32
li r8,48