This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH v4] aarch64: thunderx2 memcpy optimizations for ext-based code path


Hi Anton,

> I appreciate your comments very much. Here is the patch
> considering the points you made.
>
> 1. The always-taken conditional branch at the beginning is
> removed.
>
> 2. Epilogue code is placed after the end of the loop to
> reduce the number of branches.
>
> 3. The redundant "mov" instructions inside the loop are
> gone due to the changed order of the registers in the ext
> instructions inside the loop.
>
> 4. Invariant code in the loop epilogue is no longer
> repeated for each ext chunk.

That looks much better indeed! The alignment can still be improved
though:

   819d0:       6e037840        ext     v0.16b, v2.16b, v3.16b, #15
   819d4:       6e047861        ext     v1.16b, v3.16b, v4.16b, #15
   819d8:       6e057887        ext     v7.16b, v4.16b, v5.16b, #15
   819dc:       ac810460        stp     q0, q1, [x3], #32
   819e0:       f9814021        prfm    pldl1strm, [x1, #640]
   819e4:       acc10c22        ldp     q2, q3, [x1], #32
   819e8:       6e0678b0        ext     v16.16b, v5.16b, v6.16b, #15
   819ec:       ac814067        stp     q7, q16, [x3], #32
   819f0:       6e0278c0        ext     v0.16b, v6.16b, v2.16b, #15
   819f4:       6e037841        ext     v1.16b, v2.16b, v3.16b, #15
   819f8:       acc11825        ldp     q5, q6, [x1], #32
   819fc:       6e057867        ext     v7.16b, v3.16b, v5.16b, #15
   81a00:       f1010042        subs    x2, x2, #0x40
   81a04:       54fffeca        b.ge    819dc <__GI___memcpy_thunderx2+0x27c>

So rather than aligning the first instruction as currently done:

#define EXT_CHUNK(shft) \
.p2align 4 ;\

Align the loop instead. If you also add 2 nops after the br instruction then
everything should work out perfectly.
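
The placement in the disassembly above can be checked with a little
arithmetic. The sketch below is only an illustration, not part of the patch:
the helper names are hypothetical, and it assumes 16-byte blocks, which is the
granularity requested by .p2align 4 (whether the core fetches in blocks of
exactly that size is an assumption here).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16-byte blocks, matching the .p2align 4 granularity. */
enum { BLOCK_BYTES = 16, INSN_BYTES = 4 };

/* Byte offset of an address within its 16-byte block. */
unsigned offset_in_block(uint64_t addr)
{
    return (unsigned)(addr % BLOCK_BYTES);
}

/* Number of 16-byte blocks touched by `size` bytes of code at `start`. */
unsigned blocks_spanned(uint64_t start, uint64_t size)
{
    assert(size > 0);
    uint64_t first = start / BLOCK_BYTES;
    uint64_t last = (start + size - 1) / BLOCK_BYTES;
    return (unsigned)(last - first + 1);
}
```

The loop runs from the branch target at 819dc through the b.ge at 81a04,
which is 11 instructions or 44 bytes. offset_in_block(0x819dc) gives 12, so
the loop entry is not 16-byte aligned, and blocks_spanned(0x819dc, 44) gives
4. Starting the same 44 bytes on a 16-byte boundary, for example at 819e0,
brings that down to 3.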

Cheers,
Wilco
    
