This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH 0/2] Multiarch hooks for memcpy variants
On Monday 14 August 2017 06:52 PM, Wilco Dijkstra wrote:
> 66% of memcpy calls are <=16 bytes. Assuming you can even get a 15% gain
> for these small sizes (there is very little you can do different), that's at most 1
> cycle faster, so the PLT indirection is going to be more expensive.
Yeah, I won't argue for copies of that size.
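
For anyone following along, here is a rough sketch of the indirection in
question. It is not glibc's ifunc machinery. It only mimics the shape of it
with an ordinary function pointer: a selector picks a variant once, and every
later call pays for an indirect call, much as a call through the PLT does.
All names here (copy_fn, copy_generic, copy_tuned, select_copy) are made up
for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical signature shared by all copy variants. */
typedef void *(*copy_fn) (void *, const void *, size_t);

/* Hypothetical baseline variant: a plain byte-by-byte copy. */
static void *
copy_generic (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  while (n--)
    *d++ = *s++;
  return dst;
}

/* Hypothetical "tuned" variant; it just defers to memcpy as a stand-in
   for a core-specific routine.  */
static void *
copy_tuned (void *dst, const void *src, size_t n)
{
  return memcpy (dst, src, n);
}

/* Stand-in for a resolver: choose a variant once, based on a flag that
   a real resolver would derive from the CPU it is running on.  */
static copy_fn
select_copy (int cpu_has_tuned_variant)
{
  return cpu_has_tuned_variant ? copy_tuned : copy_generic;
}
```

A caller would store the result of select_copy once and then call through
the pointer. For a 16-byte copy, that extra indirect call is a noticeable
fraction of the total work, which is the point being made above.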
> Note that the falkor version does quite well in memcpy-random across several
> microarchitectures, so I think parts of it could be moved into the generic code.
That's interesting. Not surprising though, since a lot of it was just
issue slot usage and alignments and nothing else. I don't expect those
to be widely different between cores.
> I still can't see any reason to even support these entry points in GLIBC, let
> alone optimize them using ifuncs. The _chk functions should obviously be
> inlined to avoid all the target specific complexity for no benefit. I think this
> could trivially be done via the GLIBC headers already. (That's assuming they
> are in any way performance critical.)
These entry points are part of the ABI, so you don't have a choice about
supporting them. Inlining by default has a different problem - it takes
effect only when a distribution does a full rebuild, and that happens
very infrequently. That would effectively rule out backporting these
routines to any stable distribution.
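
To make the inlining trade-off concrete, header-level inlining of a checked
copy could look roughly like the sketch below. The names my_memcpy_chk and
my_memcpy are hypothetical and this is not how the glibc headers are written.
The only real interface used is __builtin_object_size, the GCC/Clang builtin
that returns (size_t) -1 when the destination size is unknown.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical inline replacement for a checked copy: abort if the
   request is larger than the known destination size, else copy.  */
static inline void *
my_memcpy_chk (void *dst, const void *src, size_t n, size_t dstlen)
{
  if (n > dstlen)
    abort ();
  return memcpy (dst, src, n);
}

/* Hypothetical wrapper macro.  When the compiler cannot determine the
   destination size, __builtin_object_size yields (size_t) -1 and the
   check above can never fire.  */
#define my_memcpy(dst, src, n) \
  my_memcpy_chk ((dst), (src), (n), __builtin_object_size ((dst), 0))
```

Since the check is compiled into every caller, any later change to it reaches
only the binaries that get rebuilt, whereas a fix behind an exported entry
point arrives with a library update.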