This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [RFC PATCH] aarch64: improve memset
- From: Richard Henderson <rth at twiddle dot net>
- To: Wilco Dijkstra <wdijkstr at arm dot com>
- Cc: will dot newton at linaro dot org, marcus dot shawcroft at gmail dot com, libc-alpha at sourceware dot org
- Date: Tue, 11 Nov 2014 15:30:33 +0100
- Subject: Re: [RFC PATCH] aarch64: improve memset
- References: <002701cffaa0$77623570$6626a050$ at com> <002801cffaa5$eb2852f0$c178f8d0$ at com> <545F237A dot 8070808 at twiddle dot net> <002901cffd22$3fa9ed10$befdc730$ at com> <5461C50C dot 1020508 at twiddle dot net> <000001cffdae$4ae656a0$e0b303e0$ at com>
On 11/11/2014 01:52 PM, Wilco Dijkstra wrote:
> No - in the worst case we need to write 64 bytes. The proof is trivial,
> dst = x0 & -64, tmp2 = x0 & -16, so tmp2 = dst + (x0 & 0x30) or tmp2 >= dst.
> Since we start doing the dc's at dst + 64, the stp to [tmp2 + 64] is redundant.
Quite right, my mistake.
r~
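
The alignment argument quoted above can be checked by brute force. The following is only a sketch in C, not code from the patch: the helper names (`align_down_64`, `align_down_16`, `identity_holds`) are made up for illustration, and it assumes a 64-byte DC ZVA block and that at least one dc is issued starting at dst + 64. Under those assumptions it confirms that tmp2 = dst + (x0 & 0x30), that tmp2 >= dst, and that the 16 bytes an stp would write at [tmp2 + 64] fall inside the first block the dc's zero.

```c
#include <stdint.h>

/* dst = x0 & -64: round x0 down to a 64-byte boundary. */
static uint64_t align_down_64(uint64_t x0) { return x0 & ~UINT64_C(63); }

/* tmp2 = x0 & -16: round x0 down to a 16-byte boundary. */
static uint64_t align_down_16(uint64_t x0) { return x0 & ~UINT64_C(15); }

/*
 * Hypothetical checker: returns 1 if, for every x0 in [base, base + count),
 *   tmp2 == dst + (x0 & 0x30)   (bits 4 and 5 are all that differ),
 *   tmp2 >= dst,
 *   and [tmp2 + 64, tmp2 + 80) lies within [dst + 64, dst + 128),
 * i.e. within the first 64-byte block zeroed by the dc's (assumed size).
 * The caller must pick base and count so that the sums do not wrap.
 */
static int identity_holds(uint64_t base, uint64_t count)
{
    for (uint64_t i = 0; i < count; i++) {
        uint64_t x0 = base + i;
        uint64_t dst = align_down_64(x0);
        uint64_t tmp2 = align_down_16(x0);

        if (tmp2 != dst + (x0 & 0x30))
            return 0;
        if (tmp2 < dst)
            return 0;
        if (tmp2 + 64 < dst + 64 || tmp2 + 80 > dst + 128)
            return 0;
    }
    return 1;
}
```

Calling `identity_holds(0, 4096)` from a test harness covers every residue of x0 modulo 64 many times over, which is all the identity depends on.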