This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] powerpc: Use aligned stores in memset




On 10/03/2017 11:59 PM, Adhemerval Zanella wrote:
I think one way to provide a slightly better memcpy implementation for POWER8
and still be able to work around the problem with unaligned accesses to
non-cacheable memory is to use tunables.

The branch azanella/memcpy-power8 [1] has a power8 memcpy optimization which
uses unaligned loads and stores that I created some time ago but never actually
sent upstream.  It shows better performance on both bench-memcpy and
bench-memcpy-random (about 10% on the latter) and mixed results on
bench-memcpy-large (which is mainly dominated by memory throughput, and on the
environment I am using, a shared PowerKVM instance, the results do not seem to
be reliable).

It could use some tuning, especially in the ranges I used for unrolling
the loads/stores, and it also does not handle unaligned accesses that cross a
page boundary (which tend to be quite slow on current hardware, but are also
uncommon with the usual 64k page size).
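
The cross-page condition mentioned above can be sketched as a small check.
This is only an illustration, not code from the branch: the helper name
crosses_page and the 16-byte access width used in the usage note are
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): return true if an access of
   LEN bytes starting at ADDR touches more than one page of PAGE_SIZE bytes.
   PAGE_SIZE must be a power of two.  */
static bool
crosses_page (uintptr_t addr, size_t len, size_t page_size)
{
  assert (page_size != 0 && (page_size & (page_size - 1)) == 0);

  if (len == 0)
    return false;

  /* Offset of ADDR inside its page.  */
  size_t offset = (size_t) (addr & (page_size - 1));

  /* The access spills into the next page when LEN exceeds the bytes
     remaining in the current page.  */
  return len > page_size - offset;
}
```

For an assumed 16-byte access at address 0x1ff8, crosses_page reports a
crossing with 4k pages but not with 64k pages, which is why the case is
less frequent with the larger page size.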

This first patch does not enable this option as a default for POWER8, it just
adds it to the string tests as an option.  The second patch changes the selection to:

   1. If glibc is configured with tunables, set the new implementation as the
      default for ISA 2.07 (power8).

   2. Also, if tunables are enabled, add the parameter glibc.tune.aligned_memopt
      to disable the new implementation selection.

So programs that rely on aligned loads can set:

GLIBC_TUNABLES=glibc.tune.aligned_memopt=1

And then the memcpy ifunc selection would pick the power7 one which uses
only aligned loads and stores.
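
The selection described above can be sketched roughly as follows.  This is a
simplified model and not the code from the branch: the enum, the helper
tunable_is_set and the function select_memcpy are hypothetical names, the
tunables string is parsed by hand instead of through glibc's internal
tunables framework, and hardware older than ISA 2.07 is simply mapped to the
aligned variant, whereas the real resolver chooses among more implementations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the two implementations being selected
   between; the names are illustrative only.  */
enum memcpy_impl { MEMCPY_POWER7_ALIGNED, MEMCPY_POWER8_UNALIGNED };

/* Return true if the colon-separated TUNABLES string (the format used by
   the GLIBC_TUNABLES environment variable) has an entry "NAME=1".  */
static bool
tunable_is_set (const char *tunables, const char *name)
{
  assert (name != NULL);
  if (tunables == NULL)
    return false;

  size_t name_len = strlen (name);
  const char *p = tunables;
  while (*p != '\0')
    {
      const char *end = strchr (p, ':');
      size_t len = end != NULL ? (size_t) (end - p) : strlen (p);

      /* Match exactly "NAME=1" within this entry.  */
      if (len == name_len + 2
          && strncmp (p, name, name_len) == 0
          && p[name_len] == '='
          && p[name_len + 1] == '1')
        return true;

      if (end == NULL)
        break;
      p = end + 1;
    }
  return false;
}

/* Model of the proposed selection: the unaligned power8 variant is the
   default on ISA 2.07 unless glibc.tune.aligned_memopt=1 is set.  */
static enum memcpy_impl
select_memcpy (bool has_isa_2_07, const char *tunables)
{
  if (has_isa_2_07
      && !tunable_is_set (tunables, "glibc.tune.aligned_memopt"))
    return MEMCPY_POWER8_UNALIGNED;
  return MEMCPY_POWER7_ALIGNED;
}
```

In this model, select_memcpy (true, getenv ("GLIBC_TUNABLES")) would pick the
aligned variant only when the tunable is set to 1.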

This is an RFC patch, and if the idea sounds good to the powerpc arch maintainers
I can work on finishing the patch with more comments and send it upstream.  I tried
to apply the same unaligned idea to memset and memmove, but I could not get any
real improvement in either.

[1] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/memcpy-power8

Thanks for sharing the patches. At this point we are also working on
memcpy for power8 with a different approach and we are planning
to post it soon. We can choose the better performing version and
use your tunables patch too.

--
Thanks
Rajalakshmi S

