This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: memcpy performance regressions 2.19 -> 2.24(5)


On Tue, May 23, 2017 at 5:56 PM, Erich Elsen <eriche@google.com> wrote:
> Ok.  Do you have any specific concerns?  It would help make it easier
> for us to do the testing internally to switch to memcpy.c.

We use libc_ifunc to implement IFUNC, as in x86_64/multiarch/strstr.c.  It may
be a good idea to switch to a different format and require all IFUNCs in C for
x86-64 if compilers with the IFUNC attribute are required to build glibc.  But
this is independent of tunables.
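
For readers unfamiliar with the compiler-level IFUNC attribute mentioned above, here is a minimal sketch of what an IFUNC selector written in C can look like. It is not glibc code and does not use libc_ifunc: every name in it (my_copy, resolve_my_copy, copy_baseline, copy_avx2) is hypothetical, the two "implementations" simply forward to memcpy, and it assumes GCC or a compatible compiler targeting x86-64 ELF (e.g. Linux with glibc).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for two tuned variants.  Both just forward to
   memcpy; a real build would provide different code for each.  */
static void *copy_baseline(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

static void *copy_avx2(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Resolver: the dynamic loader calls this while processing relocations
   and binds my_copy to whichever function pointer it returns.
   __builtin_cpu_init must be called explicitly here because a resolver
   can run before ordinary constructors do.  */
static copy_fn resolve_my_copy(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? copy_avx2 : copy_baseline;
}

/* The public symbol is an indirect function bound via the resolver.  */
void *my_copy(void *dst, const void *src, size_t n)
    __attribute__((ifunc("resolve_my_copy")));

/* Usage: callers see an ordinary function.  Returns 0 if the copy matched.  */
int my_copy_demo(void)
{
    const char src[] = "ifunc sketch";
    char dst[sizeof src];
    void *ret = my_copy(dst, src, sizeof src);
    assert(ret == dst);
    return memcmp(dst, src, sizeof src);
}
```

Since the selection logic is plain C in the resolver, it is the one place where a per-function override could be consulted, though how that would interact with tunables is a separate question.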

> Interesting, thanks for the info.  More reason for being able to
> select the implementation!
> On Tue, May 23, 2017 at 3:55 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, May 23, 2017 at 3:12 PM, Erich Elsen <eriche@google.com> wrote:
>>> Sounds good to me.  Even if tunables aren't added, does memcpy.S ->
>>> memcpy.c seem reasonable?
>>
>> I prefer not to do it for now.  We can revisit it later after tunable
>> support is added to cpu_features.
>>
>> BTW, REP MOV is expected to have lower bandwidth on multi-socket
>> systems, but has the benefit of lower cache disruption throughout the
>> cache hierarchy.  This is a trade-off between overall system throughput
>> and single-program performance.
>>
>>
>>> On Tue, May 23, 2017 at 3:07 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Tue, May 23, 2017 at 1:57 PM, Erich Elsen <eriche@google.com> wrote:
>>>>> Maybe there's room for both?
>>>>>
>>>>> Setting the cpu_features would affect everything; it would be useful
>>>>> to be able to target only specific (and very important) routines.
>>>>
>>>> I prefer to do the cpu_features first.  If it turns out not to be
>>>> sufficient, we can then do the IFUNC implementation.
>>>>
>>>>> On Tue, May 23, 2017 at 1:46 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>>> On Tue, May 23, 2017 at 1:39 PM, Erich Elsen <eriche@google.com> wrote:
>>>>>>> I was also thinking that it might be nice to have a TUNABLE that sets
>>>>>>> the implementation of memcpy directly.  It would be easier to do this
>>>>>>> if memcpy.S was memcpy.c.  Attached is a patch that does the
>>>>>>> conversion but doesn't add the tunables.  How would you feel about
>>>>>>> this?  It has no runtime impact, probably increases the size slightly,
>>>>>>> and makes the code easier to read / modify.
>>>>>>>
>>>>>>
>>>>>> It depends on how far you want to go.  We can add TUNABLE support
>>>>>> to each IFUNC implementation or we can add TUNABLE support to
>>>>>> cpu_features to update processor features.  I prefer the latter.
>>>>>>
>>>>>>
>>>>>> --
>>>>>> H.J.
>>>>
>>>>
>>>>
>>>> --
>>>> H.J.
>>
>>
>>
>> --
>> H.J.



-- 
H.J.

