This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: memcpy performance regressions 2.19 -> 2.24(5)


Ok.  Do you have any specific concerns?  It would help make it easier
for us to do the testing internally to switch to memcpy.c.

Interesting, thanks for the info.  More reason for being able to
select the implementation!

On Tue, May 23, 2017 at 3:55 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, May 23, 2017 at 3:12 PM, Erich Elsen <eriche@google.com> wrote:
>> Sounds good to me.  Even if tunables aren't added, does memcpy.S ->
>> memcpy.c seem reasonable?
>
> I prefer not to do it for now.  We can revisit it later, after the tunable
> is added to cpu_features.
>
> BTW, REP MOV is expected to have lower bandwidth on multi-socket
> systems, but has the benefit of lower cache disruption throughout the
> cache hierarchy.  This is a trade-off between overall system throughput
> and single-program performance.
>
>
>> On Tue, May 23, 2017 at 3:07 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Tue, May 23, 2017 at 1:57 PM, Erich Elsen <eriche@google.com> wrote:
>>>> Maybe there's room for both?
>>>>
>>>> Setting the cpu_features would affect everything; it would be useful
>>>> to be able to target only specific (and very important) routines.
>>>
>>> I prefer to do the cpu_features first.  If that turns out not to be
>>> sufficient, we can then do the IFUNC implementation.
>>>
>>>> On Tue, May 23, 2017 at 1:46 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> On Tue, May 23, 2017 at 1:39 PM, Erich Elsen <eriche@google.com> wrote:
>>>>>> I was also thinking that it might be nice to have a TUNABLE that sets
>>>>>> the implementation of memcpy directly.  It would be easier to do this
>>>>>> if memcpy.S was memcpy.c.  Attached is a patch that does the
>>>>>> conversion but doesn't add the tunables.  How would you feel about
>>>>>> this?  It has no runtime impact, probably increases the size slightly,
>>>>>> and makes the code easier to read / modify.
>>>>>>
>>>>>
>>>>> It depends on how far you want to go.  We can add TUNABLE support
>>>>> to each IFUNC implementation, or we can add TUNABLE support to
>>>>> cpu_features to update processor features.  I prefer the latter.
>>>>>
>>>>>
>>>>> --
>>>>> H.J.
>>>
>>>
>>>
>>> --
>>> H.J.
>
>
>
> --
> H.J.
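
For readers following the thread, the idea of selecting one routine's
implementation directly (as opposed to changing cpu_features, which
affects every routine) can be sketched roughly as below.  This is not
glibc code and not the attached patch: the environment variable
DEMO_MEMCPY_IMPL, the function names, and the two stand-in copy routines
are all hypothetical, and a plain function pointer stands in for a real
IFUNC resolver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Signature shared by every candidate implementation. */
typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Hypothetical candidate 1: a plain byte loop. */
static void *copy_bytewise(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Hypothetical candidate 2: defers to the C library's memcpy. */
static void *copy_libc(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Table mapping a user-visible name to an implementation. */
struct impl { const char *name; memcpy_fn fn; };
static const struct impl impls[] = {
    { "bytewise", copy_bytewise },
    { "libc",     copy_libc },
};

/* Return the implementation called `name`; fall back to copy_libc
   when `name` is NULL or does not match any table entry. */
static memcpy_fn select_memcpy(const char *name)
{
    if (name != NULL)
        for (size_t i = 0; i < sizeof impls / sizeof impls[0]; i++)
            if (strcmp(name, impls[i].name) == 0)
                return impls[i].fn;
    return copy_libc;
}

/* Read the choice from a (hypothetical) environment variable. */
static memcpy_fn select_memcpy_from_env(void)
{
    return select_memcpy(getenv("DEMO_MEMCPY_IMPL"));
}
```

In this sketch the override touches only the one routine, which is the
property asked for earlier in the thread; a cpu_features-based tunable
would instead change the inputs that every selector sees.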

