This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH v2 2/2] aarch64: Optimized memcpy and memmove for Kunpeng processor
- From: "Zhangxuelei (Derek)" <zhangxuelei4 at huawei dot com>
- To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>, Yikun Jiang <yikunkero at gmail dot com>
- Cc: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, nd <nd at arm dot com>, Siddhesh Poyarekar <siddhesh at gotplt dot org>, jiangyikun <jiangyikun at huawei dot com>, Szabolcs Nagy <Szabolcs dot Nagy at arm dot com>
- Date: Mon, 21 Oct 2019 14:25:38 +0000
- Subject: Re: [PATCH v2 2/2] aarch64: Optimized memcpy and memmove for Kunpeng processor
Hi Wilco, thanks for your reply and suggestions.
> So this makes it highly desirable to improve the generic versions
> of string functions.
We completely agree. We would also like to contribute our changes to the generic versions, because most of our changes are based on the generic versions.
We also had a misunderstanding: we thought the ifunc variants were the general implementations in glibc. :)
However, there are two types of patches:
1. Improvements based on the generic version. There is no doubt that we should contribute these to the generic version.
2. Kunpeng-specific implementations, such as the memcpy patch. These address issues specific to the Kunpeng CPU, so we hope we can add them through ifunc to enable this kind of patch.
In addition, is there any other work we need to cover if we contribute to the generic version?
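To illustrate the split between the two patch types, here is a minimal sketch of choosing a CPU-specific variant once at startup, which is the idea behind ifunc. It is a plain function-pointer version, not the actual glibc ifunc machinery, and every name in it (cpu_kind, select_memcpy, the *_sketch functions) is hypothetical. Both variants simply forward to memcpy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical CPU identifiers; real code would derive them from the hardware. */
enum cpu_kind { CPU_GENERIC, CPU_KUNPENG };

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Stand-in for the generic implementation (assumption: forwards to memcpy). */
static void *memcpy_generic_sketch(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Stand-in for a Kunpeng-tuned implementation (assumption: same behaviour). */
static void *memcpy_kunpeng_sketch(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Resolver: called once, returns the variant matching the detected CPU. */
static memcpy_fn select_memcpy(enum cpu_kind cpu)
{
    return cpu == CPU_KUNPENG ? memcpy_kunpeng_sketch : memcpy_generic_sketch;
}
```

With this layout, type 1 changes would go into the generic function and benefit every CPU, while type 2 changes stay behind the resolver and are only selected on Kunpeng.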
> Note that memchr_strlen significantly outperforms the fastest strlen
> on sizes larger than 256, so I don't think that using uminv to test
> for zeroes is the fastest approach.
Indeed, but memchr_strlen performs poorly below 256 bytes. If we mix this method into the current version, we may need to keep a length count and check in each loop iteration whether it has exceeded 256 bytes. Is that cheap? We also think small sizes are more important for strlen.
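A rough sketch of the mixed approach in question, to make the extra per-iteration check concrete. This is plain C rather than the AArch64 assembly under discussion, the name hybrid_strlen_sketch is hypothetical, and the library strlen is only a stand-in for a routine that is assumed to be faster on long inputs.

```c
#include <stddef.h>
#include <string.h>

#define SWITCH_THRESHOLD 256  /* crossover size mentioned in this thread */

/* Hypothetical hybrid: simple loop for short strings, hand-off for long ones. */
size_t hybrid_strlen_sketch(const char *s)
{
    size_t i = 0;

    /* Phase 1: each iteration pays for one extra compare against the bound. */
    while (i < SWITCH_THRESHOLD) {
        if (s[i] == '\0')
            return i;
        i++;
    }

    /* Phase 2: the remainder goes to the long-string routine (stand-in: strlen). */
    return i + strlen(s + i);
}
```

Whether the bound check is cheap enough in the real loop would have to be measured; the sketch only shows where the check sits.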
Finally, we will submit the other generic implementations as soon as possible, and it would be good if you could review these two patches first :)
. memrchr: it has already been submitted as a generic version. See link:
. memcpy/memmove: it's the specific Kunpeng