This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



[PATCH 0/2] aarch64,falkor: memcpy/memmove performance improvements


Hi,

Here are a couple of patches to improve the performance of the falkor
memcpy and memmove implementations, based on testing on the latest
hardware.  The theme of the optimization is to avoid trying to train the
hardware prefetcher for smaller sizes and in the loop tail, since that
just mis-trains the prefetcher.  Instead, use multiple registers to aid
reordering wherever possible.  Testing showed that the regressions at
these sizes compared to generic memcpy are resolved with these patches.
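
To illustrate the "multiple registers" idea outside of assembly, here is
a minimal C sketch.  It is not code from the patches: the helper name
copy64_via_temporaries and the chunk16 type are hypothetical, and a fixed
64-byte block is assumed purely for brevity.  All loads go into
independent temporaries before any store is issued, so no load depends
on an earlier store and the copy stays correct when the buffers overlap.

```c
#include <string.h>

/* Hypothetical 16-byte chunk, standing in for one vector register. */
typedef struct { unsigned char b[16]; } chunk16;

/* Hypothetical helper, not taken from the patches: copy exactly 64 bytes
   through four independent temporaries.  */
static void copy64_via_temporaries(void *dst, const void *src)
{
    const unsigned char *s = src;
    unsigned char *d = dst;
    chunk16 a, b, c, e;

    /* Read the whole block first; the four loads are independent of
       each other.  */
    memcpy(&a, s, 16);
    memcpy(&b, s + 16, 16);
    memcpy(&c, s + 32, 16);
    memcpy(&e, s + 48, 16);

    /* Only now write the block out.  Because every byte was read before
       the first store, overlapping src and dst are handled correctly.  */
    memcpy(d, &a, 16);
    memcpy(d + 16, &b, 16);
    memcpy(d + 32, &c, 16);
    memcpy(d + 48, &e, 16);
}
```

Whether a compiler keeps the four temporaries in registers depends on the
target and optimization level, which is one reason the real routines are
written in assembly.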

Siddhesh

Siddhesh Poyarekar (2):
  aarch64,falkor: Ignore prefetcher hints for memmove tail
  Ignore prefetcher tagging for smaller copies

 sysdeps/aarch64/multiarch/memcpy_falkor.S  | 68 ++++++++++++++++++------------
 sysdeps/aarch64/multiarch/memmove_falkor.S | 48 ++++++++++++---------
 2 files changed, 70 insertions(+), 46 deletions(-)

-- 
2.14.3

