This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] aarch64: Optimized memset for Kunpeng processor.
- From: "Zhangxuelei (Derek)" <zhangxuelei4 at huawei dot com>
- To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>, "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, "siddhesh at gotplt dot org" <siddhesh at gotplt dot org>, Szabolcs Nagy <Szabolcs dot Nagy at arm dot com>, jiangyikun <jiangyikun at huawei dot com>, "yikunkero at gmail dot com" <yikunkero at gmail dot com>
- Cc: nd <nd at arm dot com>
- Date: Fri, 1 Nov 2019 13:17:58 +0000
- Subject: Re: [PATCH] aarch64: Optimized memset for Kunpeng processor.
Hi Wilco,
> For the 64..128 case it is always safe to copy 64 bytes from the
> start and 64 bytes from the end - the tail overlap means you never
> can go outside the bounds.
>
> Generally it's faster that way due to avoiding unnecessary branches
> which may mispredict.
Yes, you are right. I misunderstood this before. In the new patch I now do both 64-byte stores unconditionally rather than risk a branch misprediction.
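To make the 64..128 case concrete, here is a minimal C sketch of the overlapping-store idea. It is not the code from the patch (which is AArch64 assembly). The helper name set64_128 is hypothetical, and memset with a constant size of 64 only stands in for the fixed-width vector stores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch: fill n bytes at dst
   with the byte value c, for 64 <= n <= 128, without branching on n. */
static void set64_128(unsigned char *dst, int c, size_t n)
{
    /* Precondition of this size class. */
    assert(n >= 64 && n <= 128);

    /* First 64 bytes: in bounds because n >= 64. */
    memset(dst, c, 64);

    /* Last 64 bytes: starts at dst + n - 64 >= dst and ends exactly at
       dst + n. It overlaps the first store whenever n < 128, so every
       byte in [dst, dst + n) is written and nothing outside is touched. */
    memset(dst + n - 64, c, 64);
}
```

Since both stores have a fixed size, no branch depends on the exact length inside this range. The overlapping bytes are just written twice with the same value.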
Cheers,
Xuelei