This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: PowerPC LE memchr and memrchr
- From: Will Schmidt <will_schmidt at vnet dot ibm dot com>
- To: Alan Modra <amodra at gmail dot com>, "Ryan S. Arnold" <ryan dot arnold at gmail dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Wed, 14 Aug 2013 10:32:52 -0500
- Subject: Re: PowerPC LE memchr and memrchr
- References: <20130809053000 dot GP3294 at bubble dot grove dot modra dot org>
- Reply-to: will_schmidt at vnet dot ibm dot com
On Fri, 2013-08-09 at 15:00 +0930, Alan Modra wrote:
> Like strnlen, memchr and memrchr had a number of defects fixed by this
> patch as well as adding little-endian support. The first one I
> noticed was that the entry to the main loop needlessly checked for
> "are we done yet?" when we know the size is large enough that we can't
> be done. The second defect I noticed was that the main loop count was
> wrong, which in turn meant that the small loop needed to handle an
> extra word. Thirdly, there is nothing to say that the string can't
> wrap around zero, except of course that we'd normally hit a segfault
> on trying to read from address zero. Fixing that simplified a number
> of places:
Thanks for the detailed explanations. :-) A question or two below.
Thanks,
-Will
> - /* Main loop to look for BYTE backwards in the string. Since
> - it's a small loop (< 8 instructions), align it to 32-bytes. */
> - .p2align 5
> +
> + /* Main loop to look for BYTE in the string. Since
> + it's a small loop (8 instructions), align it to 32-bytes. */
> + .align 5
".p2align alignment" versus ".align alignment" - Cosmetic or deliberate
to avoid some unintended .p2align alignment[,fill[,max]] behavior?
Looks like we initially have a mix of the two, so would be good to be
clean those up regardless.
> diff --git a/sysdeps/powerpc/powerpc64/power7/memrchr.S b/sysdeps/powerpc/powerpc64/power7/memrchr.S
> index d24fbbb..7a0050e 100644
> --- a/sysdeps/powerpc/powerpc64/power7/memrchr.S
> +++ b/sysdeps/powerpc/powerpc64/power7/memrchr.S
<...>
> L(loop_setup):
> - li r0,-16
> - sub r5,r8,r7
> - srdi r9,r5,4 /* Number of loop iterations. */
> + /* The last dword we want to read in the loop below is the one
> + containing the first byte of the string, ie. the dword at
> + s & ~7, or r0. The first dword read is at r8 - 8, we
> + read 2 * cnt dwords, so the last dword read will be at
> + r8 - 8 - 16 * cnt + 8. Solving for cnt gives
> + cnt = (r8 - r0) / 16 */
> + sub r5,r8,r0
> + addi r8,r8,-8
> + srdi r9,r5,4 /* Number of loop iterations. */
> mtctr r9 /* Setup the counter. */
> - b L(loop)
> - /* Main loop to look for BYTE backwards in the string. Since it's a
> - small loop (< 8 instructions), align it to 32-bytes. */
> - .p2align 5
> +
> + /* Main loop to look for BYTE backwards in the string. */
> + .align 4
deliberate switch from 5 to 4 ?
I've read over the rest of this patch - Nothing else jumps out at me, as
long as this passed test, should be good.
Thanks,
-Will