This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] PowerPC: stpcpy optimization for PPC64/POWER7


On Mon, Sep 16, 2013 at 11:30:06AM -0300, Adhemerval Zanella wrote:
> Hi all,
> 
> Following Alan Modra suggestion, it is a stpcpy optimization patch for PPC64.
> This patch optimizes the default PPC64 by adding doubleword stores/loads
> increasing aligned throughput for large sizes.
>
An obvious question here is why it needs to be kept separate from the strcpy
implementation. The sysdeps/powerpc/powerpc64/st[pr]cpy.S files are quite
similar, and I do not see a sysdeps/powerpc/powerpc64/power7/strcpy.S
file. Would the same optimization apply to strcpy?

Also, I noticed in the implementation that, provided we handle the case of
writing fewer than 8 bytes separately, the best way to finish on x64 would
be to compute the end and do an overlapping store of the last 8 bytes.
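
To make the overlapping-store idea concrete, here is a rough C sketch, not
the patch's code. The names copy_tail_overlap and stpcpy_sketch are
hypothetical, and the sketch calls strlen up front only to keep the tail
handling easy to read; a real implementation would find the terminator
while copying.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy n bytes (n >= 8) in 8-byte chunks.  Any
   leftover bytes are covered by one 8-byte store that ends exactly at
   dst + n and overlaps bytes already written, instead of a byte loop.  */
static void copy_tail_overlap(char *dst, const char *src, size_t n)
{
    size_t i = 0;
    uint64_t w;

    for (; i + 8 <= n; i += 8) {
        memcpy(&w, src + i, 8);      /* 8-byte load (alignment-safe in C) */
        memcpy(dst + i, &w, 8);      /* 8-byte store */
    }
    if (i < n) {
        memcpy(&w, src + n - 8, 8);  /* reload the last 8 bytes */
        memcpy(dst + n - 8, &w, 8);  /* overlapping store of the tail */
    }
}

/* Hypothetical stpcpy-like wrapper; returns a pointer to the copied NUL.  */
char *stpcpy_sketch(char *dst, const char *src)
{
    size_t n = strlen(src) + 1;      /* bytes to write, including the NUL */

    if (n < 8) {                     /* short case handled separately */
        memcpy(dst, src, n);
        return dst + n - 1;
    }
    copy_tail_overlap(dst, src, n);
    return dst + n - 1;
}
```

The point is that after the short case is split off, the tail needs no
size-dependent branches beyond the single "is there a remainder" test.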

There are two things I do not know. The first is how to compute the end; on
Wikipedia I found that this could be handled by running cntlz on the mask and
dividing the result by 8.
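
As a sketch of that computation in C (an illustration under assumptions,
not tested on POWER7): load_be64 below packs the bytes so that the first
byte in memory is the most significant, mimicking a big-endian doubleword
load, and __builtin_clzll is the GCC/Clang builtin standing in for cntlzd.
All function names are hypothetical. The mask uses the exact per-byte
zero test, since the cheaper subtract-based test can flag a 0x01 byte
sitting above a zero byte, which would confuse a leading-zero count.

```c
#include <stdint.h>

/* Pack 8 bytes so that p[0] lands in the most significant byte, as a
   big-endian doubleword load would.  */
static uint64_t load_be64(const unsigned char *p)
{
    uint64_t w = 0;
    for (int i = 0; i < 8; i++)
        w = (w << 8) | p[i];
    return w;
}

/* Exact zero-byte mask: 0x80 in every byte of x that is zero, 0x00
   elsewhere.  No carry crosses a byte boundary in this form.  */
static uint64_t zero_byte_mask(uint64_t x)
{
    const uint64_t m = 0x7f7f7f7f7f7f7f7fULL;
    return ~(((x & m) + m) | x | m);
}

/* Index (0..7) of the first zero byte in p[0..7], or 8 if there is none.  */
static int first_zero_index(const unsigned char *p)
{
    uint64_t mask = zero_byte_mask(load_be64(p));

    if (mask == 0)
        return 8;                     /* clz of 0 is undefined for the builtin */
    return __builtin_clzll(mask) / 8; /* leading zero bits / 8 = byte index */
}
```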

The second is how slow overlapping stores are compared to a branch
misprediction. You need a benchmark that varies sizes to check this; I could
supply one.
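
For illustration only, a varying-size benchmark could look roughly like
the C sketch below. This is not the benchmark offered above; the names
stpcpy_fn, stpcpy_bytewise and bench_varying are hypothetical, the fixed
seed and the use of clock() are assumptions, and stpcpy_bytewise is just a
placeholder candidate. Lengths are drawn pseudo-randomly so that any
size-dependent branch in the candidate is hard to predict.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

typedef char *(*stpcpy_fn)(char *, const char *);

/* Placeholder candidate: plain byte-at-a-time copy.  */
static char *stpcpy_bytewise(char *dst, const char *src)
{
    while ((*dst = *src++) != '\0')
        dst++;
    return dst;
}

/* Time `iters` copies whose lengths vary pseudo-randomly in [0, max_len].
   Returns elapsed CPU seconds, or -1.0 if allocation fails.  */
static double bench_varying(stpcpy_fn fn, size_t max_len, size_t iters)
{
    char *src = malloc(max_len + 1);
    char *dst = malloc(max_len + 1);
    size_t *lens = malloc((iters ? iters : 1) * sizeof *lens);
    volatile size_t sink = 0;        /* keeps the copies from being dropped */

    if (!src || !dst || !lens) {
        free(src); free(dst); free(lens);
        return -1.0;
    }
    memset(src, 'x', max_len);
    src[max_len] = '\0';

    srand(12345);                    /* fixed seed: repeatable size sequence */
    for (size_t i = 0; i < iters; i++)
        lens[i] = (size_t)rand() % (max_len + 1);

    clock_t t0 = clock();
    for (size_t i = 0; i < iters; i++) {
        /* A suffix of src is a string of exactly lens[i] characters;
           note that this also varies the source alignment.  */
        const char *s = src + (max_len - lens[i]);
        sink += (size_t)(fn(dst, s) - dst);
    }
    clock_t t1 = clock();

    free(src); free(dst); free(lens);
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling bench_varying once per candidate with the same max_len and iters
(for example bench_varying(stpcpy_bytewise, 64, 10000000)) gives numbers
that can be compared against each other, though not across machines.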

