This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 2/2] aarch64: Optimized memcpy and memmove for Kunpeng processor


Hi Wilco,

> So is there a way to force write streaming, for example by aligning
> the source rather than the destination or use particular instructions?

The different walk directions of dst produce different alignment offsets, which in turn caused the performance difference between memcpy-walk and memmove-walk.

In the new patch, we align dst rather than src to solve this problem. Both memcpy-walk and memmove-walk now perform as well as before, with the dst_unaligned code removed.
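
To illustrate the idea of aligning dst, here is a rough C sketch (not the patch's AArch64 assembly). It copies a short head until dst reaches a 16-byte boundary, then copies 16-byte blocks to the aligned destination. The names dst_head_bytes and copy_forward_dst_aligned and the 16-byte granule are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: number of head bytes to copy before dst
   reaches a 16-byte boundary (0 if dst is already aligned). */
static size_t dst_head_bytes(const void *dst)
{
    return (size_t)(-(uintptr_t)dst & 15u);
}

/* Forward copy for non-overlapping buffers. After the head, every
   16-byte block is written to a 16-byte-aligned destination address;
   the source may stay unaligned. */
static void *copy_forward_dst_aligned(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: byte copy until d is aligned (or n is exhausted). */
    size_t head = dst_head_bytes(d);
    if (head > n)
        head = n;
    for (size_t i = 0; i < head; i++)
        d[i] = s[i];
    d += head;
    s += head;
    n -= head;

    /* If anything remains, d is now 16-byte aligned. */
    assert(n == 0 || ((uintptr_t)d & 15u) == 0);

    /* Body: fixed-size 16-byte blocks; a compiler typically lowers
       this memcpy to a single load/store pair. */
    while (n >= 16) {
        memcpy(d, s, 16);
        d += 16;
        s += 16;
        n -= 16;
    }

    /* Tail: remaining 0..15 bytes. */
    while (n--)
        *d++ = *s++;

    return dst;
}
```

Because the alignment offset is computed from dst alone, it does not depend on how src is laid out, which is the property the paragraph above relies on.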

> In order to select the right memmove implementation, multiarch/
> memmove.c needs similar changes as multiarch/memcpy.c.
>
> Also since the memmove entry sequence does both check for medium
> and large cases, the full overlap check should be done in both.

As you suggested, the full overlap check is now done in the large cases as well, and memmove.c has been added to the new patch.
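
For readers following the thread, a common way to express an overlap check in C is sketched below. This is only an illustration of the general technique, not the code in the patch; needs_backward_copy and move_bytes are hypothetical names, and the byte-by-byte loops stand in for the optimized medium and large paths.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero when dst lies inside [src, src + n), i.e. a forward copy
   would overwrite source bytes before they are read. The unsigned
   subtraction wraps when dst < src, so one compare covers both sides. */
static int needs_backward_copy(const void *dst, const void *src, size_t n)
{
    return (uintptr_t)dst - (uintptr_t)src < n;
}

/* Illustrative memmove-style copy that applies the same check
   regardless of size. */
static void *move_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (d == s)
        return dst;                 /* nothing to move */

    if (needs_backward_copy(d, s, n)) {
        while (n--)                 /* copy from the end downwards */
            d[n] = s[n];
    } else {
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];            /* forward copy is safe */
    }
    return dst;
}
```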

In addition, are there any review comments on the latest memset_kunpeng patch, linked below?
https://sourceware.org/ml/libc-alpha/2019-11/msg00044.html

Cheers,
Xuelei

