This is the mail archive of the libc-alpha
mailing list for the glibc project.
RE: [RFC PATCH] aarch64: improve memset
- From: "Wilco Dijkstra" <wdijkstr at arm dot com>
- To: "'Marcus Shawcroft'" <marcus dot shawcroft at gmail dot com>
- Cc: "Richard Henderson" <rth at twiddle dot net>, <will dot newton at linaro dot org>, "GNU C Library" <libc-alpha at sourceware dot org>
- Date: Tue, 11 Nov 2014 14:16:04 -0000
- Subject: RE: [RFC PATCH] aarch64: improve memset
- Authentication-results: sourceware.org; auth=none
- References: <002701cffaa0$77623570$6626a050$ at com> <002801cffaa5$eb2852f0$c178f8d0$ at com> <CAFqB+Pw4oEhmORJGSjBNtaTn9ZOgWS6-25p=4AYFwGuv72jddg at mail dot gmail dot com>
> Marcus Shawcroft wrote:
> On 7 November 2014 16:14, Wilco Dijkstra <firstname.lastname@example.org> wrote:
> >> Richard Henderson wrote:
> > I've got a few comments on this patch:
> > * Do we really need variants for cache line sizes that are never going to be used?
> > I'd say just support 64 and 128, and default higher sizes to no_zva.
> We shouldn't be removing support for the other sizes already supported
> by the existing implementation. If the other sizes were deprecated
> from the architecture then fair game, but that is not the case. From
> offline conversation with Wilco I gather part of the motivation to
> remove is that the non-64 cases cannot be readily tested on HW.
> That particular issue was solved in the original implementation using
> a hacked qemu.
The architecture allows dc zva block sizes of 4..2048 bytes. Most of these are useless
and would not result in a performance gain. Sizes 4-16 cannot be useful, as a single stp
can already write more data. Larger sizes incur an ever-increasing alignment overhead,
and there are fewer memsets where dc zva could be used at all.
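
To make the alignment overhead concrete, here is a minimal sketch in C. It is not
code from the patch: zva_coverage is a hypothetical helper that counts how many
bytes of a memset could be zeroed with whole, aligned dc zva blocks, assuming the
unaligned head and tail are written with ordinary stores.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Given a destination address, a length and a dc zva block size
   (a power of two, in bytes), return how many bytes can be zeroed
   with whole, aligned blocks.  The rest needs ordinary stores.  */
size_t
zva_coverage (uintptr_t dst, size_t len, size_t block)
{
  /* Bytes needed to reach the next block boundary (0 if aligned).  */
  size_t head = (size_t) (-dst & (block - 1));

  if (len <= head)
    return 0;

  /* Round what is left down to a whole number of blocks.  */
  return (len - head) & ~(block - 1);
}
```

For a 256-byte memset starting at address 0x1010, a 64-byte block leaves 192 bytes
that dc zva can cover, while a 2048-byte block leaves none.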
It would certainly be a good idea to deprecate useless small and overly large sizes,
but I don't see the reasoning for supporting every legal size without evidence of a
performance gain on an actual implementation. It's not as if memset will crash on
an implementation with an unsupported size; it just won't use dc zva.
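
A minimal sketch of that fallback in C, assuming the 64/128-only policy suggested
earlier in this thread. The names zva_block_size and use_zva are hypothetical and
not taken from the patch. The DCZID_EL0 value is passed in as a plain integer so
the logic can be exercised on any host; on AArch64 it would be read with mrs.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a DCZID_EL0 value.  Bits [3:0] (BS) hold log2 of the block
   size in 4-byte words; bit 4 (DZP) is set when dc zva is prohibited.
   Returns the block size in bytes, or 0 if dc zva must not be used.  */
size_t
zva_block_size (uint64_t dczid)
{
  if (dczid & (1u << 4))
    return 0;
  return (size_t) 4 << (dczid & 0xf);
}

/* Hypothetical policy: take the dc zva path only for block sizes of
   64 and 128 bytes.  Every other size falls back to plain stores,
   so memset stays correct and simply does not use dc zva.  */
int
use_zva (uint64_t dczid)
{
  size_t block = zva_block_size (dczid);
  return block == 64 || block == 128;
}
```

With this policy a BS field of 4 (64 bytes) or 5 (128 bytes) selects the dc zva
path, while a BS of 9 (2048 bytes) or a set DZP bit selects the fallback.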