[PATCH] aarch64: revert memcpy optimze for kunpeng to avoid performance degradation
Adhemerval Zanella
adhemerval.zanella@linaro.org
Thu Jan 21 16:41:42 GMT 2021
On 20/01/2021 22:55, Zhangxuelei (Derek) wrote:
> Hi,
>
> They are my colleagues, and we verified these results together. Given the performance regression on this specific product, it would be better to revert the original selection. We will continue to look for a better or more balanced version of memcpy on Kunpeng.
>
> Thank you~
This is ok for 2.33, please commit.
>
> -----Original Message-----
> From: Adhemerval Zanella [mailto:adhemerval.zanella@linaro.org]
> Sent: January 20, 2021 21:09
> To: wangshuo (AF) <wangshuo47@huawei.com>; Zhangxuelei (Derek) <zhangxuelei4@huawei.com>; libc-alpha@sourceware.org
> Cc: Hushiyuan <hushiyuan@huawei.com>; liqingqing (C) <liqingqing3@huawei.com>
> Subject: Re: [PATCH] aarch64: revert memcpy optimze for kunpeng to avoid performance degradation
>
> Hi,
>
> Since I don't have access to this specific hardware, it would be good if the original author of the change, Xuelei Zhang, could confirm that this revert is ok.
>
> It should be ok during the freeze since it is just a selection of an already tested implementation for a specific chip.
>
> On 20/01/2021 04:20, Shuo Wang wrote:
>> In commit 863d775c481704baaa41855fc93e5a1ca2dc6bf6, kunpeng920 was
>> added to the default memcpy version. However, performance degrades when the copy size is large, e.g. 100k.
>> This is the result, tested in glibc-2.28:
>>              before backport  after backport  Performance improvement
>> memcpy_1k              0.005           0.005                    0.00%
>> memcpy_10k             0.032           0.029                   10.34%
>> memcpy_100k            0.356           0.429                  -17.02%
>> memcpy_1m              7.470          11.153                  -33.02%
>>
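A note on the last column: the percentages are consistent with (before - after) / after, although the patch does not state the formula, so treat that as an inference. A minimal sketch of the arithmetic, with a hypothetical helper name:

```c
/* Relative difference between the "before backport" and "after backport"
   times, expressed as a percentage of the "after" time.  The formula is an
   assumption inferred from the table, not something the patch states.
   A positive value means the backport made the copy faster.  */
static double improvement_percent(double before, double after)
{
    return (before - after) / after * 100.0;
}
```

Calling improvement_percent(0.356, 0.429) gives roughly -17.02, which matches the memcpy_100k row.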
>> This is the demo
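>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>>
>> char a[1024*1024] = {12};
>> char b[1024*1024] = {13};
>>
>> /* Usage: ./memcpy <iterations> <size in KiB, at most 1024> */
>> int main(int argc, char *argv[])
>> {
>>   if (argc < 3)
>>     {
>>       fprintf(stderr, "usage: %s <iterations> <size_kib>\n", argv[0]);
>>       return 1;
>>     }
>>   int i = atoi(argv[1]);
>>   int j;
>>   int size = atoi(argv[2]);
>>
>>   /* Reject sizes that would overrun the 1 MiB buffers. */
>>   if (size < 0 || size > 1024)
>>     return 1;
>>
>>   for (j = 0; j < i; j++)
>>     memcpy(b, a, size*1024);
>>   return 0;
>> }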
>>
>> # gcc -g -O0 memcpy.c -o memcpy
>> # time taskset -c 10 ./memcpy 100000 1024
>>
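For anyone who wants to time the copy loop from inside the process instead of with time(1), here is a rough sketch. It is not part of the patch. The helper name time_memcpy and the buffer names are hypothetical. It uses the standard clock() function, so it reports CPU time, and it calls memcpy through a volatile function pointer so that the compiler cannot inline or drop the call when building with optimization.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

#define BUF_SIZE (1024 * 1024)

/* Same buffer sizes and initial bytes as the demo in the patch.  */
static char src_buf[BUF_SIZE] = {12};
static char dst_buf[BUF_SIZE] = {13};

/* Hypothetical helper: returns the CPU seconds spent on `iterations`
   copies of `size` bytes, or -1.0 if the arguments are out of range.  */
static double time_memcpy(size_t size, long iterations)
{
    /* Volatile pointer: the call must go to the library memcpy each time.  */
    void *(*volatile copy)(void *, const void *, size_t) = memcpy;

    if (size > BUF_SIZE || iterations < 0)
        return -1.0;

    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        copy(dst_buf, src_buf, size);
    clock_t end = clock();

    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

For example, time_memcpy(100 * 1024, 100000) would correspond to the memcpy_100k case if the table rows were produced with 100000 iterations, which the patch only shows for the 1024 KiB run.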
>> Co-authored-by: liqingqing <liqingqing3@huawei.com>
>>
>> ---
>> sysdeps/aarch64/multiarch/memcpy.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/sysdeps/aarch64/multiarch/memcpy.c b/sysdeps/aarch64/multiarch/memcpy.c
>> index 27259d3386..0e0a5cbcfb 100644
>> --- a/sysdeps/aarch64/multiarch/memcpy.c
>> +++ b/sysdeps/aarch64/multiarch/memcpy.c
>> @@ -37,7 +37,7 @@ extern __typeof (__redirect_memcpy) __memcpy_falkor attribute_hidden;
>>  libc_ifunc (__libc_memcpy,
>>              (IS_THUNDERX (midr)
>>               ? __memcpy_thunderx
>> -             : (IS_FALKOR (midr) || IS_PHECDA (midr) || IS_KUNPENG920 (midr)
>> +             : (IS_FALKOR (midr) || IS_PHECDA (midr)
>>                  ? __memcpy_falkor
>>                  : (IS_THUNDERX2 (midr) || IS_THUNDERX2PA (midr)
>>                    ? __memcpy_thunderx2
>>
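To illustrate what the one-line change does, here is a simplified model of the selection chain after the patch. This is only a sketch: the enum, the copy_* functions, and select_memcpy are hypothetical stand-ins and not glibc's real interfaces. The rest of the chain, which the hunk cuts off, is collapsed into a single generic fallback.

```c
#include <stddef.h>
#include <string.h>

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Hypothetical CPU identifiers standing in for the IS_* (midr) checks.  */
enum cpu_kind { CPU_OTHER, CPU_THUNDERX, CPU_FALKOR, CPU_PHECDA, CPU_KUNPENG920 };

/* Stand-ins for the per-CPU implementations; all just forward to memcpy.  */
static void *copy_thunderx(void *d, const void *s, size_t n) { return memcpy(d, s, n); }
static void *copy_falkor(void *d, const void *s, size_t n)   { return memcpy(d, s, n); }
static void *copy_fallback(void *d, const void *s, size_t n) { return memcpy(d, s, n); }

/* Models the chain after the patch: Kunpeng920 no longer matches the
   Falkor branch, so it falls through to the later checks, represented
   here by copy_fallback.  */
static memcpy_fn select_memcpy(enum cpu_kind cpu)
{
    return cpu == CPU_THUNDERX
           ? copy_thunderx
           : (cpu == CPU_FALKOR || cpu == CPU_PHECDA
              ? copy_falkor
              : copy_fallback);
}
```

In this model, select_memcpy(CPU_KUNPENG920) returns copy_fallback, whereas before the patch the equivalent check would have returned copy_falkor.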