This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Remove unnecessary IFUNC dispatch for __memset_chk.


On Sun, Aug 9, 2015 at 9:51 AM, Zack Weinberg <zackw@panix.com> wrote:
> On 08/09/2015 11:39 AM, H.J. Lu wrote:
>> On Sun, Aug 9, 2015 at 8:20 AM, Zack Weinberg <zackw@panix.com> wrote:
>>> On further investigation it appears not to -- specifically, internal
>>> calls using __GI_foo appear to go straight to the default implementation
>>> of 'foo'.
>>>
>>> If so, I am inclined to think that that is a bug -- there are a *lot* of
>>> internal calls to memset and memcpy in libc, they should not miss out on
>>> architectural tuning.  I don't particularly understand how IFUNC works,
>>> but wouldn't it be sufficient to send internal calls to anything with an
>>> IFUNC through the PLT?  (I suppose there would then be a question of
>>> whether the architectural optimizations made up for the PLT overhead.)
>>
>> Here is a description of IFUNC:
>>
>> https://sites.google.com/site/x32abi/documents/ifunc.txt?attredirects=0&d=1
>
> Thanks, that clarifies what IFUNC _does_, but it doesn't help me
> understand how it interacts with the libc_hidden_* optimization.  I see
> in the code that e.g. __GI_memset is pointed directly at __memset_sse2
> (for amd64) but I do not understand whether that is a limitation of the
> current implementation, a deliberate choice to avoid indirection at
> the cost of missing out on AVX2 tuning, or both.  And if it is a
> limitation, I don't know what options we might have for lifting that
> limitation.  I'm sure this was discussed when these patches originally
> landed, but it was long enough ago that I am having trouble finding them
> in the mailing list archive.

Those comments were made when the first IFUNC implementation
was done.  We have improved the IFUNC implementation since then
and those comments may no longer be true today.  But we have to
verify that at least the extra indirection via the PLT doesn't hurt
performance on most current processors.
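
For readers following the thread, here is a minimal sketch of the
situation being discussed, assuming GCC or Clang on an ELF target with
IFUNC support.  All names (my_fill, my_fill_internal, my_fill_baseline,
my_fill_tuned, resolve_my_fill) are hypothetical and are not glibc's;
the "tuned" variant is only a stand-in, not real AVX2 code.  The public
symbol is resolved once at load time, while the internal alias is bound
directly to the baseline variant, much as __GI_memset is described above
as pointing at __memset_sse2.

```c
#include <stddef.h>

typedef void *fill_fn(void *, int, size_t);

/* Baseline variant: plays the role that __memset_sse2 has in the thread. */
static void *my_fill_baseline(void *dst, int c, size_t n)
{
    unsigned char *p = dst;
    for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char)c;
    return dst;
}

/* Stand-in for a CPU-specific variant; same behaviour, no real tuning. */
static void *my_fill_tuned(void *dst, int c, size_t n)
{
    unsigned char *p = dst;
    while (n--)
        *p++ = (unsigned char)c;
    return dst;
}

/* Resolver: run by the dynamic loader while relocating, before main(). */
static fill_fn *resolve_my_fill(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();               /* required before cpu_supports here */
    if (__builtin_cpu_supports("avx2"))
        return my_fill_tuned;
#endif
    return my_fill_baseline;            /* non-x86 or no AVX2 */
}

/* Public entry point: callers reach whatever the resolver selected. */
void *my_fill(void *dst, int c, size_t n)
    __attribute__((ifunc("resolve_my_fill")));

/* Internal alias: bound at link time to the baseline variant, so calls
   through it avoid the indirection but never see the tuned variant. */
void *my_fill_internal(void *dst, int c, size_t n)
    __attribute__((alias("my_fill_baseline")));
```

Calls to my_fill() pay for one indirection and get the selected variant.
Calls to my_fill_internal() are direct but always use the baseline.
Which of the two is faster in practice is exactly what would have to be
measured.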


-- 
H.J.

