This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 0/2] Multiarch hooks for memcpy variants
- From: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
- To: Siddhesh Poyarekar <siddhesh at gotplt dot org>, Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>, "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>
- Cc: nd at arm dot com
- Date: Fri, 11 Aug 2017 18:58:03 +0100
- Subject: Re: [PATCH 0/2] Multiarch hooks for memcpy variants
- References: <DB6PR0801MB20534ED1010DDF1B033821EE83890@DB6PR0801MB2053.eurprd08.prod.outlook.com> <18d2fdf8-ca55-1ded-fa66-3509b3bcf8fe@gotplt.org>
On 11/08/17 12:22, Siddhesh Poyarekar wrote:
> On Friday 11 August 2017 04:41 PM, Wilco Dijkstra wrote:
>> Siddhesh Poyarekar wrote:
>>> Functions like mempcpy, __mempcpy_chk and __memcpy_chk continue to call the
>>> generic memcpy implementation. These two patches fix this by adding ifunc
>>> entry points for these functions for generic, thunderx and falkor.
>>
>> I don't understand what the goal of this is since on AArch64 we always transform
>> mempcpy to memcpy. Also why use ifuncs on the _chk variants? Are they ever
>> used in cases where the last 1% of performance matters?
>
> I started off by writing this for __memcpy_chk because gcc transforms
> memcpy to __memcpy_chk in some cases with -O3 and then extended it to
> mempcpy for completeness. I'm not going to push too hard for mempcpy if
> you strongly oppose it, but I am definitely interested in getting
> __memcpy_chk ifuncs in.
>
ifuncs and asm implementations have a lot of non-trivial
maintenance costs.
as far as i understand *_chk calls are only generated when
compiling with -D_FORTIFY_SOURCE=* and most of these checks
could be inlined by the compiler (i.e. there is no need
for runtime support in principle).
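
to illustrate the inlining point, here is a minimal sketch assuming a gcc-compatible compiler. the bound check is expanded at the call site using __builtin_object_size, so no separate *_chk entry point is involved. checked_copy and CHECKED_COPY are hypothetical names, and abort() stands in for whatever failure handler a real implementation would use; this is not how the glibc fortify headers are actually written.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical helper, for illustration only.
   dstlen is the destination size as seen by the compiler, or
   (size_t) -1 when the compiler cannot determine it.  */
static inline void *
checked_copy (void *dst, const void *src, size_t n, size_t dstlen)
{
  /* skip the check when the size is unknown; otherwise refuse to overflow.
     abort () is an assumed stand-in for a real failure handler.  */
  if (dstlen != (size_t) -1 && n > dstlen)
    abort ();
  return memcpy (dst, src, n);
}

/* evaluate __builtin_object_size at the call site, where the compiler
   has the best chance of knowing the destination object.  */
#define CHECKED_COPY(dst, src, n) \
  checked_copy ((dst), (src), (n), __builtin_object_size ((dst), 0))
```

when both the size and the length are compile-time constants the compiler can fold the comparison away entirely, which is the case where no runtime support is needed.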
maybe the generic __memcpy_chk should call the ifunced
memcpy so it goes through an extra plt indirection, but
at least less target specific code is needed.
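
the suggestion above could look roughly like this, as a sketch only. generic_memcpy_chk is a hypothetical stand-in for the generic __memcpy_chk, and abort() is an assumed stand-in for the libc failure path. the memcpy call is an ordinary call, so in a build with ifunced memcpy it would be dispatched to whichever variant the resolver selected.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical stand-in for a generic, target-independent __memcpy_chk.  */
void *
generic_memcpy_chk (void *dst, const void *src, size_t len, size_t dstlen)
{
  /* the only fortify-specific work: reject copies larger than the
     destination.  abort () is an assumption, not the real handler.  */
  if (dstlen < len)
    abort ();

  /* plain call to memcpy: one extra indirection compared with a
     per-cpu *_chk variant, but no target specific code here.  */
  return memcpy (dst, src, len);
}
```

the trade-off is one check plus one extra call per copy, against not having to maintain a *_chk entry point for every cpu-specific memcpy.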