This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH x86_64] Update memcpy, mempcpy and memmove selection order for Excavator CPU BZ #19583


On Fri, Mar 18, 2016 at 6:55 AM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
>
>
> On 18-03-2016 10:51, H.J. Lu wrote:
>> On Fri, Mar 18, 2016 at 6:22 AM, Pawar, Amit <Amit.Pawar@amd.com> wrote:
>>>> No, it isn't fixed.  Avoid_AVX_Fast_Unaligned_Load should disable __memcpy_avx_unaligned and nothing more.  Also you need to fix ALL selections.
>>>
>>> diff --git a/sysdeps/x86_64/multiarch/memcpy.S b/sysdeps/x86_64/multiarch/memcpy.S
>>> index 8882590..a5afaf4 100644
>>> --- a/sysdeps/x86_64/multiarch/memcpy.S
>>> +++ b/sysdeps/x86_64/multiarch/memcpy.S
>>> @@ -39,6 +39,8 @@ ENTRY(__new_memcpy)
>>>         ret
>>>  #endif
>>>  1:     lea     __memcpy_avx_unaligned(%rip), %RAX_LP
>>> +       HAS_ARCH_FEATURE (Avoid_AVX_Fast_Unaligned_Load)
>>> +       jnz     3f
>>>         HAS_ARCH_FEATURE (AVX_Fast_Unaligned_Load)
>>>         jnz     2f
>>>         lea     __memcpy_sse2_unaligned(%rip), %RAX_LP
>>> @@ -52,6 +54,8 @@ ENTRY(__new_memcpy)
>>>         jnz     2f
>>>         lea     __memcpy_ssse3(%rip), %RAX_LP
>>>  2:     ret
>>> +3:     lea     __memcpy_ssse3(%rip), %RAX_LP
>>> +       ret
>>>  END(__new_memcpy)
>>>
>>>  # undef ENTRY
>>>
>>> I will update all the IFUNCs if this is OK; otherwise, please suggest changes.
>>>
>>
>> Better, but not OK.  Try something like
>>
>> diff --git a/sysdeps/x86_64/multiarch/memcpy.S b/sysdeps/x86_64/multiarch/memcpy.S
>> index ab5998c..2abe2fd 100644
>> --- a/sysdeps/x86_64/multiarch/memcpy.S
>> +++ b/sysdeps/x86_64/multiarch/memcpy.S
>> @@ -42,9 +42,11 @@ ENTRY(__new_memcpy)
>>    ret
>>  #endif
>>  1:   lea   __memcpy_avx_unaligned(%rip), %RAX_LP
>> +  HAS_ARCH_FEATURE (Avoid_AVX_Fast_Unaligned_Load)
>> +  jnz   3f
>>    HAS_ARCH_FEATURE (AVX_Fast_Unaligned_Load)
>>    jnz   2f
>> -  lea   __memcpy_sse2_unaligned(%rip), %RAX_LP
>> +3:   lea   __memcpy_sse2_unaligned(%rip), %RAX_LP
>>    HAS_ARCH_FEATURE (Fast_Unaligned_Load)
>>    jnz   2f
>>    lea   __memcpy_sse2(%rip), %RAX_LP
>>
>>
>
> I know this is not related to this patch, but any reason to not code the
> resolver using the libc_ifunc macros?

Did you mean writing them in C?  It can be done.  Someone
needs to write patches.

-- 
H.J.
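
For readers following the selection order, here is a rough C sketch of the
logic in the second hunk quoted above. It is only an illustration, not the
glibc resolver: the struct, the enum and the function name are all
hypothetical, and the boolean fields merely stand in for the bits that
HAS_ARCH_FEATURE tests. The quoted hunk ends after the __memcpy_sse2 line,
so the sketch stops there too and treats __memcpy_sse2 as the fallback,
which is an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the feature bits tested by HAS_ARCH_FEATURE.
   These are not the real glibc data structures. */
struct cpu_features_sketch {
    bool avoid_avx_fast_unaligned_load;
    bool avx_fast_unaligned_load;
    bool fast_unaligned_load;
};

/* Hypothetical labels for the three implementations named in the hunk. */
enum memcpy_impl {
    MEMCPY_AVX_UNALIGNED,
    MEMCPY_SSE2_UNALIGNED,
    MEMCPY_SSE2
};

enum memcpy_impl
select_memcpy_sketch(const struct cpu_features_sketch *f)
{
    /* Label 1: take the AVX variant only if it is fast and not vetoed.
       Avoid_AVX_Fast_Unaligned_Load jumps straight to label 3, so it
       disables __memcpy_avx_unaligned and nothing more. */
    if (!f->avoid_avx_fast_unaligned_load && f->avx_fast_unaligned_load)
        return MEMCPY_AVX_UNALIGNED;

    /* Label 3: the normal fallback chain, shared by both paths. */
    if (f->fast_unaligned_load)
        return MEMCPY_SSE2_UNALIGNED;

    /* The real resolver goes on with further checks that the quoted
       hunk does not show; this sketch stops at __memcpy_sse2. */
    return MEMCPY_SSE2;
}
```

The point of jumping to label 3 rather than to a separate block is that a
CPU with Avoid_AVX_Fast_Unaligned_Load set still walks the same fallback
chain as a CPU without AVX_Fast_Unaligned_Load, instead of being pinned to
one particular implementation.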

