This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH][AArch64] Optimized memset
- From: Marcus Shawcroft <marcus dot shawcroft at gmail dot com>
- To: Wilco Dijkstra <wdijkstr at arm dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Fri, 25 Sep 2015 15:55:27 +0100
- Subject: Re: [PATCH][AArch64] Optimized memset
- Authentication-results: sourceware.org; auth=none
- References: <004c01d0cba1$e15ac5a0$a41050e0$ at com>
On 31 July 2015 at 16:02, Wilco Dijkstra <wdijkstr@arm.com> wrote:
> This is an optimized memset for AArch64. Memset is split into 4 main cases: small sets of up to 16
> bytes; medium sets of 16..96 bytes, which are fully unrolled; and large memsets of more than 96 bytes,
> which align the destination and use an unrolled loop processing 64 bytes per iteration. Zeroing
> memsets of more than 256 bytes use the dc zva instruction, with faster versions for the common ZVA
> sizes of 64 and 128. STP of Q registers is used to reduce code size without loss of performance.
>
> Speedup on test-memset is 1% on Cortex-A57 and 8% on Cortex-A53. On a random test with varying
> sizes and alignments the new version is 50% faster.
>
> OK for commit?
>
> ChangeLog:
> 2015-07-31 Wilco Dijkstra <wdijkstr@arm.com>
>
> * sysdeps/aarch64/memset.S (__memset):
> Rewrite of optimized memset.
>
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Please drop this unrelated white space change when you commit.

OK
/Marcus
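[Editorial note: the size-based dispatch described in the quoted patch can be sketched in C as below. This is a rough illustration of the case split only; `memset_path` and its labels are hypothetical names, not glibc code — the real implementation is hand-written AArch64 assembly in sysdeps/aarch64/memset.S.]

```c
#include <stddef.h>

/* Illustrative sketch of the size-based dispatch described in the
 * patch.  Thresholds follow the quoted description: <=16 bytes is the
 * small case, 16..96 bytes is fully unrolled, zeroing sets of more
 * than 256 bytes use dc zva, and everything else falls into the
 * aligned 64-bytes-per-iteration loop. */
static const char *memset_path(size_t n, int value_is_zero)
{
    if (n <= 16)
        return "small";            /* up to 16 bytes */
    if (n <= 96)
        return "medium-unrolled";  /* 16..96 bytes, fully unrolled */
    if (value_is_zero && n > 256)
        return "dc-zva";           /* zeroing >256 bytes; the patch has
                                      faster variants for ZVA sizes 64
                                      and 128 */
    return "large-64B-loop";       /* align destination, then process
                                      64 bytes per iteration */
}
```

Note that the dc zva path applies only when the fill value is zero, since DC ZVA can only zero a cache-line-sized block.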