This is the mail archive of the libc-alpha at sourceware dot org
mailing list for the glibc project.
Re: [PATCH 3/4] S390: Do not call memcpy, memcmp, memset within libc.so via ifunc-plt.
- From: Stefan Liebler <stli at linux dot vnet dot ibm dot com>
- To: libc-alpha at sourceware dot org
- Date: Wed, 11 May 2016 15:40:54 +0200
- Subject: Re: [PATCH 3/4] S390: Do not call memcpy, memcmp, memset within libc.so via ifunc-plt.
- Authentication-results: sourceware.org; auth=none
- References: <1461672469-2107-1-git-send-email-stli at linux dot vnet dot ibm dot com> <1461672469-2107-3-git-send-email-stli at linux dot vnet dot ibm dot com> <571F6EA0 dot 7080201 at linaro dot org> <nfpsdv$aac$1 at ger dot gmane dot org> <572A139E dot 10302 at linaro dot org>
On 05/04/2016 05:22 PM, Adhemerval Zanella wrote:
> On 27/04/2016 05:14, Stefan Liebler wrote:
>> On 04/26/2016 03:35 PM, Adhemerval Zanella wrote:
>>> On 26/04/2016 09:07, Stefan Liebler wrote:
>>>> On s390, the memcpy, memcmp, memset functions are IFUNC symbols,
>>>> which are created with the s390_libc_ifunc macro.
>>>> This macro creates a __GI_ symbol which is set to the
>>>> ifunced symbol. Thus calls within libc.so to e.g. memcpy
>>>> result in a call to the *ABS*+0x954c0@plt stub and afterwards
>>>> to the resolved memcpy-ifunc-variant.
>>>> This patch sets the __GI_ symbol to the default-ifunc-variant
>>>> to avoid the plt call. The __GI_ symbols are now created at the
>>>> default variant of the ifunced function.
>>>
>>> Is the internal ifunc plt usage leading to a failure in s390/s390x
>>> (as for powerpc32 and i686) or is it an optimization fix?
>>
>> No, it does not lead to a failure on s390.
>> It is an optimization to avoid the extra call to the plt-stub if
>> called within libc.so.
>
> Right, because in commit 19c4bec0f43599eecc2f32de96ae179cd7d64053 I did
> the exact opposite: for POWER, using the optimized version instead of
> the default one shows a good improvement in algorithms that use these
> symbols internally (like regex).
> It might be the case that the default s390 version is good enough
> and shows no performance difference.
I've retested the patchset in the context of regex, with these functions
called with and without the ifunc-plt.
The version without the ifunc-plt is slightly faster.
But you are right: this decision has to be re-evaluated for further
variants or newer machines.
In the case of the string functions, calling them via ifunc within libc.so
could yield an improvement.
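
For readers following the thread, here is a minimal sketch of the two binding styles under discussion. It is not the s390_libc_ifunc macro or any glibc source. All names (demo_copy, demo_copy_default, demo_copy_optimized, demo_copy_internal, resolve_demo_copy, last_variant) are hypothetical, and the resolver always picks the "optimized" variant instead of inspecting hardware capabilities. It assumes GCC or Clang on an ELF target with IFUNC support.

```c
#include <stddef.h>

/* Hypothetical demo: demo_copy stands in for memcpy. */
enum { VARIANT_NONE, VARIANT_DEFAULT, VARIANT_OPTIMIZED };
int last_variant = VARIANT_NONE;   /* records which variant ran last */

/* Default variant: a plain byte loop. */
void *demo_copy_default(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    last_variant = VARIANT_DEFAULT;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Stand-in for a CPU-specific variant; same semantics in this sketch. */
static void *demo_copy_optimized(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    last_variant = VARIANT_OPTIMIZED;
    while (n--)
        *d++ = *s++;
    return dst;
}

typedef void *(*demo_copy_fn)(void *, const void *, size_t);

/* Resolver: run by the dynamic loader when the symbol is relocated.
   A real resolver would check hardware capabilities; this one
   unconditionally selects the optimized variant (an assumption). */
static demo_copy_fn resolve_demo_copy(void)
{
    return demo_copy_optimized;
}

/* Public symbol: callers reach the selected variant through the
   ifunc indirection (a plt stub). */
void *demo_copy(void *dst, const void *src, size_t n)
    __attribute__((ifunc("resolve_demo_copy")));

/* Internal name, similar in spirit to a __GI_ symbol after the patch:
   a hidden alias bound directly to the default variant, so calls to
   it need no plt stub. */
extern __typeof__(demo_copy_default) demo_copy_internal
    __attribute__((alias("demo_copy_default"), visibility("hidden")));
```

Calling demo_copy goes through the resolver's choice, while calling demo_copy_internal always reaches the default variant directly. That is the trade-off discussed above: the direct binding saves the plt stub, but gives up any gain an optimized variant might provide on a given machine.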