[PATCH] powerpc64le: Optimize memset for POWER10

Raoni Fassina Firmino raoni@linux.ibm.com
Thu Apr 29 18:40:49 GMT 2021


Thanks for the review, Lucas. Please let me know if I missed something.

On Wed, Apr 28, 2021 at 03:48:28PM -0300, Lucas A. M. Magalhaes wrote:
> > +       /* After alignment, if there is 127B or less left
> s/127B/64B/

Done.

This is an awkward position/block: the comment is about the whole
block, up until the `beq  L(tail_128)`, but the label 'L(aligned)' in
the middle makes it hard to understand. But it truly is that: after the
alignment, if there are fewer than 128 bytes left it goes to the tail,
and there is this optimization to go straight to the right part of the
tail depending on the amount left.

Anyway, I rephrased it to be a single line (less obtrusive) and added a
new one for the second branch:

> > +          go directly to the tail.  */
> > +       cmpldi  r5,64
> > +       blt     L(tail_64)
> > +
> > +       .balign 16
> > +L(aligned):
> > +       srdi.   r0,r5,7
> > +       beq     L(tail_128)

Here^, added another after L(aligned).
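
To spell out the two branches quoted above, here is a rough C sketch of
the dispatch on the fall-through path. It is only an illustration, not
code from the patch: the enum and function names are hypothetical, and
`len` stands in for the remaining length in r5.

```c
#include <stddef.h>

/* Hypothetical names, used only for illustration. */
enum path { PATH_TAIL_64, PATH_TAIL_128, PATH_MAIN };

/* Mirrors the two quoted branches taken after alignment. */
static enum path dispatch_after_alignment(size_t len)
{
    if (len < 64)            /* cmpldi r5,64 ; blt L(tail_64) */
        return PATH_TAIL_64;
    if ((len >> 7) == 0)     /* srdi. r0,r5,7 ; beq L(tail_128) */
        return PATH_TAIL_128;
    return PATH_MAIN;        /* 128 bytes or more remain */
}
```

So 63 bytes left goes straight to tail_64, 64 to 127 goes to tail_128,
and 128 or more continues past the second branch.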


> > +
> > +       cmpldi  cr5,r5,255
> > +       cmpldi  cr6,r4,0
> > +       crand   27,26,21
> > +       bt      27,L(dcbz)
> Maybe add a comment to explain this branch.

Done.

I was counting on the comment on the label itself, but I guess it makes
sense to add a brief comment here as well, to avoid going back and forth
to understand the condition check.
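
For reference, the condition check in the quoted sequence can be read as
the C sketch below. It assumes r4 holds the fill value and r5 the
remaining length (the usual memset argument registers); the helper name
is hypothetical. CR bit 21 is the "greater than" bit of cr5 and CR bit
26 is the "equal" bit of cr6, so the crand combines the two compares.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when the quoted sequence would branch to
   L(dcbz).  Assumes r4 = fill value and r5 = remaining length.  */
static bool takes_dcbz_path(unsigned long r4_fill, size_t r5_len)
{
    bool len_gt_255 = r5_len > 255;    /* cmpldi cr5,r5,255 -> bit 21 (cr5 gt) */
    bool fill_is_zero = r4_fill == 0;  /* cmpldi cr6,r4,0   -> bit 26 (cr6 eq) */
    return fill_is_zero && len_gt_255; /* crand 27,26,21 ; bt 27,L(dcbz) */
}
```

In other words, the branch is taken only when zeroing more than 255
bytes.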


> > +       .balign 16
> > +L(tail_128):
> The label tail_128 made me think that here would be copied 128 bytes.
> Maybe add a comment here.

Done.

Sorry, yes, all these "tail_*" sections are "up to", the number being
the maximum that they will write. But this one is in fact from 64 up to
128.
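
As a rough illustration of what the quoted L(tail_128) code does, here
is a C sketch. It is an assumption-laden model, not the patch itself:
the function name and the pointer-to-pointer interface are hypothetical,
and a byte loop stands in for the four 16-byte stxv stores.

```c
#include <stddef.h>

/* Hypothetical model of the quoted L(tail_128) block.  Writes 64 bytes
   unconditionally, advances the destination, and returns how many
   bytes are still left for the tail_64 part (0 means return early).  */
static size_t tail_128_sketch(unsigned char **dst, unsigned char fill,
                              size_t len)
{
    for (size_t i = 0; i < 64; i++)  /* stands in for 4 x stxv (16B each) */
        (*dst)[i] = fill;
    *dst += 64;                      /* addi r6,r6,64 */
    return len & 63;                 /* andi. r5,r5,63 ; beqlr when zero */
}
```

With 100 bytes left, for example, this writes the first 64 and leaves
36 for tail_64.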


> 
> > +       stxv    v0+32,0(r6)
> > +       stxv    v0+32,16(r6)
> > +       stxv    v0+32,32(r6)
> > +       stxv    v0+32,48(r6)
> > +       addi    r6,r6,64
> > +       andi.   r5,r5,63
> > +       beqlr
> > +
> > +       .balign 16
> > +L(tail_64):
> Maybe add a comment here to explay this section as well.

Done.


o/
Raoni

