This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH v2] aarch64: Optimized implementation of strcpy
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: libc-alpha at sourceware dot org
- Date: Thu, 19 Dec 2019 16:44:47 -0300
- Subject: Re: [PATCH v2] aarch64: Optimized implementation of strcpy
- References: <20191022093930.10588-1-zhangxuelei4@huawei.com> <VI1PR0801MB21270831D971D744F8B5EDFD83680@VI1PR0801MB2127.eurprd08.prod.outlook.com>
On 22/10/2019 14:54, Wilco Dijkstra wrote:
> Hi Xuelei,
>
>> Optimize the strcpy implementation by using vector loads and operations
>> in the main loop. Compared to aarch64/strcpy.S, it reduces latency of cases
>> in bench-strlen by 5%~18% when the length of src is greater than 64
>> bytes, with gains throughout the benchmark.
>
> This is OK. I tried it on a few microarchitectures, and it's either as fast or
> faster on long strings.
>
> Wilco
I pushed it upstream.
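
For readers unfamiliar with the technique the quoted commit message describes (testing a whole chunk for the terminating NUL in the main loop instead of one byte at a time), here is a rough portable C sketch. It is not the pushed patch, which is AArch64 assembly using vector loads; the names chunked_strcpy, has_zero_byte and the src_cap parameter are hypothetical, introduced only so that the sketch never reads outside the source buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any byte of w is 0x00 (well-known bit trick). */
static int has_zero_byte(uint64_t w)
{
    return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}

/* Hypothetical illustration, not a glibc function.
   src_cap is the size in bytes of the buffer that holds src; it is an
   assumption of this sketch that keeps the 16-byte reads in bounds.
   src must contain a NUL within src_cap bytes, and dst must have room
   for strlen(src) + 1 bytes.  */
static char *chunked_strcpy(char *dst, const char *src, size_t src_cap)
{
    char *ret = dst;

    /* Main loop: handle 16 bytes per iteration as two 64-bit words.  */
    while (src_cap >= 16) {
        uint64_t lo, hi;
        memcpy(&lo, src, 8);
        memcpy(&hi, src + 8, 8);

        /* Stop before storing a chunk that contains the terminator.  */
        if (has_zero_byte(lo) || has_zero_byte(hi))
            break;

        memcpy(dst, &lo, 8);
        memcpy(dst + 8, &hi, 8);
        src += 16;
        dst += 16;
        src_cap -= 16;
    }

    /* Tail: copy the remaining bytes, including the NUL, one at a time.  */
    while ((*dst++ = *src++) != '\0')
        ;

    return ret;
}
```

The assembly version can avoid the explicit capacity argument because it controls alignment directly; the sketch trades that away for code that stays within defined C behaviour.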