This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH] Remove unnecessary IFUNC dispatch for __memset_chk.

From: "H.J. Lu" <hjl dot tools at gmail dot com>
To: Zack Weinberg <zackw at panix dot com>
Cc: Andreas Schwab <schwab at linux-m68k dot org>, GNU C Library <libc-alpha at sourceware dot org>
Date: Sun, 9 Aug 2015 10:56:02 -0700
Subject: Re: [PATCH] Remove unnecessary IFUNC dispatch for __memset_chk.
Authentication-results: sourceware.org; auth=none
References: <20150809013434 dot 0B16814B9A at panix1 dot panix dot com> <m28u9lotfk dot fsf at linux-m68k dot org> <55C76FCD dot 5020607 at panix dot com> <CAMe9rOoAWjRma_mG_FazVh3FGOyiGJ=g82=bsfGqa-COnt5p1g at mail dot gmail dot com> <55C78525 dot 40402 at panix dot com>

On Sun, Aug 9, 2015 at 9:51 AM, Zack Weinberg <zackw@panix.com> wrote:
> On 08/09/2015 11:39 AM, H.J. Lu wrote:
>> On Sun, Aug 9, 2015 at 8:20 AM, Zack Weinberg <zackw@panix.com> wrote:
>>> On further investigation it appears not to -- specifically, internal
>>> calls using __GI_foo appear to go straight to the default implementation
>>> of 'foo'.
>>>
>>> If so, I am inclined to think that that is a bug -- there are a *lot* of
>>> internal calls to memset and memcpy in libc, they should not miss out on
>>> architectural tuning.  I don't particularly understand how IFUNC works,
>>> but wouldn't it be sufficient to send internal calls to anything with an
>>> IFUNC through the PLT?  (I suppose there would then be a question of
>>> whether the architectural optimizations made up for the PLT overhead.)
>>
>> Here is a description of IFUNC:
>>
>> https://sites.google.com/site/x32abi/documents/ifunc.txt?attredirects=0&d=1
>
> Thanks, that clarifies what IFUNC _does_, but it doesn't help me
> understand how it interacts with the libc_hidden_* optimization.  I see
> in the code that e.g. __GI_memset is pointed directly at __memset_sse2
> (for amd64) but I do not understand whether that is a limitation of the
> current implementation, a deliberate choice to avoid indirection at
> the cost of missing out on AVX2 tuning, or both.  And if it is a
> limitation, I don't know what options we might have for lifting that
> limitation.  I'm sure this was discussed when these patches originally
> landed, but it was long enough ago that I am having trouble finding them
> in the mailing list archive.

Those comments were made when the first IFUNC implementation
was done.  We have improved the IFUNC implementation since then,
and those comments may no longer be true today.  But we have to
verify, at the least, that the extra indirection via the PLT doesn't
hurt performance on most current processors.
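
To make the question above concrete, here is a minimal sketch of the
two call paths under discussion.  It is not glibc code: fill,
fill_internal, fill_baseline, fill_tuned, resolve_fill and
last_variant are all hypothetical names, and the resolver picks the
"tuned" variant unconditionally where a real one would inspect CPU
features.  It assumes GCC or Clang on an ELF target whose C library
supports IFUNC.

```c
#include <stddef.h>

/* Records which variant ran last: 1 = baseline, 2 = tuned.
   Present only so the two call paths can be told apart. */
static int last_variant;

/* Hypothetical baseline variant (plays the role of a default
   implementation such as __memset_sse2). */
static void *fill_baseline(void *dst, int c, size_t n)
{
    unsigned char *p = dst;
    for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char)c;
    last_variant = 1;
    return dst;
}

/* Hypothetical tuned variant (plays the role of an AVX2 version).
   The body is identical here; only the marker differs. */
static void *fill_tuned(void *dst, int c, size_t n)
{
    unsigned char *p = dst;
    for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char)c;
    last_variant = 2;
    return dst;
}

typedef void *fill_fn(void *, int, size_t);

/* IFUNC resolver: run by the dynamic loader (or startup code) when the
   symbol is relocated.  Assumption: always choose the tuned variant;
   a real resolver would test CPU features before deciding. */
static fill_fn *resolve_fill(void)
{
    return fill_tuned;
}

/* Public symbol: calls are bound to whatever the resolver returned. */
void *fill(void *dst, int c, size_t n)
    __attribute__((ifunc("resolve_fill")));

/* Hidden internal alias, analogous in spirit to a __GI_ name: it is
   bound directly to the baseline and never consults the resolver. */
extern fill_fn fill_internal
    __attribute__((alias("fill_baseline"), visibility("hidden")));
```

In this sketch a call to fill() reaches fill_tuned through the IFUNC
relocation, while a call to fill_internal() lands on fill_baseline
directly, so internal callers skip the indirection but also skip the
tuned code.  Whether glibc's hidden aliases still behave this way is
exactly what would need to be verified.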


-- 
H.J.
