This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [RFC PATCH] aarch64: improve memset
- From: Richard Henderson <rth at twiddle dot net>
- To: Wilco Dijkstra <wdijkstr at arm dot com>
- Cc: will dot newton at linaro dot org, marcus dot shawcroft at gmail dot com, libc-alpha at sourceware dot org
- Date: Tue, 11 Nov 2014 09:13:00 +0100
- Subject: Re: [RFC PATCH] aarch64: improve memset
- Authentication-results: sourceware.org; auth=none
- References: <002701cffaa0$77623570$6626a050$ at com> <002801cffaa5$eb2852f0$c178f8d0$ at com> <545F237A dot 8070808 at twiddle dot net> <002901cffd22$3fa9ed10$befdc730$ at com>
On 11/10/2014 09:09 PM, Wilco Dijkstra wrote:
> I spotted one issue in the alignment code:
>
> + stp xzr, xzr, [tmp2, #64]
> +
> + /* Store up to first SIZE, aligned 16. */
> +.ifgt \size - 64
> + stp xzr, xzr, [tmp2, #80]
> + stp xzr, xzr, [tmp2, #96]
> + stp xzr, xzr, [tmp2, #112]
> + stp xzr, xzr, [tmp2, #128]
> +.ifgt \size - 128
> +.err
> +.endif
> +.endif
>
> This should be:
>
> + /* Store up to first SIZE, aligned 16. */
> +.ifgt \size - 64
> + stp xzr, xzr, [tmp2, #64]
> + stp xzr, xzr, [tmp2, #80]
> + stp xzr, xzr, [tmp2, #96]
> + stp xzr, xzr, [tmp2, #112]
> +.ifgt \size - 128
> +.err
> +.endif
> +.endif
Incorrect.
tmp2 is aligned backward (rounded down to 16) from dst_in, which means that
tmp2+0 may be up to 15 bytes before dst_in. Thus we write the first 16 bytes,
unaligned, at dst_in, then store at tmp2+16 through tmp2+N to clear bytes
through tmp2+N+16. But dst_in+size may extend as far as tmp2+size+15, so if
we stop at tmp2+48 (or tmp2+112 for the larger size) we could be leaving up
to 15 bytes at the end uninitialized.
r~