This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
[PATCH 0/3] memset zva optimization
- From: Siddhesh Poyarekar <siddhesh at sourceware dot org>
- To: libc-alpha at sourceware dot org
- Cc: Wilco dot Dijkstra at arm dot com, szabolcs dot nagy at arm dot com
- Date: Thu, 9 Nov 2017 10:43:25 +0530
- Subject: [PATCH 0/3] memset zva optimization
This patchset first updates the benchmarks to walk uniformly backwards and
then adds a multiarch implementation of memset.
Based on feedback, I have reduced the change to a single separate memset
implementation for ZVA == 64. It roughly doubles performance for sizes of
about 256-512 bytes and gives a net improvement for all sizes larger than
256 bytes, because the ZVA size no longer has to be read on every function
call. The net gain shrinks as sizes increase, since the impact of the ZVA
read is minimal for larger sizes.
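
As a rough illustration of what hoisting the check means, the following is a
minimal C sketch, not the code from the patch. It decodes a raw DCZID_EL0
value (bit 4, DZP, prohibits DC ZVA; bits [3:0], BS, give log2 of the block
size in 4-byte words) and picks an implementation once instead of on every
call. All function names here are hypothetical, and both "implementations"
simply defer to the C library memset so the sketch builds on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Signature shared by all candidate implementations.  */
typedef void *(*memset_fn) (void *, int, size_t);

/* Decode a raw DCZID_EL0 value into a ZVA block size in bytes.
   Returns 0 when the DZP bit says DC ZVA is prohibited.  */
unsigned int
zva_size_from_dczid (unsigned long dczid)
{
  if (dczid & 0x10)
    return 0;
  /* BS is log2 of the block size in words; a word is 4 bytes.  */
  return 4u << (dczid & 0xf);
}

/* Hypothetical stand-in for the generic routine, which would still
   check the ZVA size itself.  Here it just calls memset.  */
void *
memset_generic_sketch (void *s, int c, size_t n)
{
  return memset (s, c, n);
}

/* Hypothetical stand-in for a routine that may assume ZVA == 64
   without reading it.  Here it just calls memset.  */
void *
memset_zva_64_sketch (void *s, int c, size_t n)
{
  return memset (s, c, n);
}

/* Choose the implementation once, from an already-decoded ZVA size.  */
memset_fn
select_memset_sketch (unsigned int zva_size)
{
  return zva_size == 64 ? memset_zva_64_sketch : memset_generic_sketch;
}

/* Small self-check of the decode and selection logic.  */
void
selector_self_check (void)
{
  assert (zva_size_from_dczid (0x4) == 64);
  assert (zva_size_from_dczid (0x14) == 0);
  assert (select_memset_sketch (64) == memset_zva_64_sketch);
  assert (select_memset_sketch (0) == memset_generic_sketch);
}
```

The diffstat below suggests that the real selection lives in
sysdeps/aarch64/multiarch/memset.c, with the ZVA size recorded through
cpu-features; the sketch only mirrors the idea of deciding once up front.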
Siddhesh Poyarekar (3):
  benchtests: Fix walking sizes and directions for *-walk benchmarks
  benchtests: Bump start size since smaller sizes are noisy
  aarch64: Hoist ZVA check out of the memset function
benchtests/bench-memcpy-walk.c | 16 ++++-----
benchtests/bench-memmove-walk.c | 17 ++++-----
benchtests/bench-memset-walk.c | 6 ++--
sysdeps/aarch64/memset-reg.h | 30 ++++++++++++++++
sysdeps/aarch64/memset.S | 27 +++++---------
sysdeps/aarch64/multiarch/Makefile | 2 +-
sysdeps/aarch64/multiarch/ifunc-impl-list.c | 3 ++
sysdeps/aarch64/multiarch/init-arch.h | 8 +++--
sysdeps/aarch64/multiarch/memset.c | 41 +++++++++++++++++++++
sysdeps/aarch64/multiarch/memset_generic.S | 27 ++++++++++++++
sysdeps/aarch64/multiarch/memset_zva_64.S | 49 ++++++++++++++++++++++++++
sysdeps/aarch64/multiarch/rtld-memset.S | 23 ++++++++++++
sysdeps/unix/sysv/linux/aarch64/cpu-features.c | 10 ++++++
sysdeps/unix/sysv/linux/aarch64/cpu-features.h | 1 +
14 files changed, 214 insertions(+), 46 deletions(-)
create mode 100644 sysdeps/aarch64/memset-reg.h
create mode 100644 sysdeps/aarch64/multiarch/memset.c
create mode 100644 sysdeps/aarch64/multiarch/memset_generic.S
create mode 100644 sysdeps/aarch64/multiarch/memset_zva_64.S
create mode 100644 sysdeps/aarch64/multiarch/rtld-memset.S
--
2.7.5