This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] x86: Use AVX2 memcpy/memset on Skylake server [BZ #21396]
On Tue, Apr 25, 2017 at 8:27 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Apr 18, 2017 at 11:37 AM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>> On Skylake server, AVX512 load/store instructions in memcpy/memset may
>> lead to lower CPU turbo frequency in certain situations. Use of AVX2
>> in memcpy/memset has been observed to improve overall performance
>> in many workloads due to the higher frequency.
>>
>> Since AVX512ER is unique to Xeon Phi, this patch sets Prefer_No_AVX512
>> if AVX512ER isn't available so that AVX2 versions of memcpy/memset are
>> used on Skylake server.
>>
>> Any comments?
>>
>>
>> H.J.
>> ---
>> [BZ #21396]
>> * sysdeps/x86/cpu-features.c (init_cpu_features): Set
>> Prefer_No_AVX512 if AVX512ER isn't available.
>> * sysdeps/x86/cpu-features.h (bit_arch_Prefer_No_AVX512): New.
>> (index_arch_Prefer_No_AVX512): Likewise.
>> * sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Don't use
>> AVX512 version if Prefer_No_AVX512 is set.
>> * sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk):
>> Likewise.
>> * sysdeps/x86_64/multiarch/memmove.S (__libc_memmove): Likewise.
>> * sysdeps/x86_64/multiarch/memmove_chk.S (__memmove_chk):
>> Likewise.
>> * sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Likewise.
>> * sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk):
>> Likewise.
>> * sysdeps/x86_64/multiarch/memset.S (memset): Likewise.
>> * sysdeps/x86_64/multiarch/memset_chk.S (__memset_chk):
>> Likewise.
>
> Since this issue has a significant impact on Skylake server, I'd like to
> backport it to the 2.24 and 2.25 branches together with the prerequisite
> patch. Any comments?
>
> Thanks.
I will check them into 2.24 and 2.25 branches shortly.
--
H.J.
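
For readers following the thread, the selection rule described in the quoted
patch (set Prefer_No_AVX512 when AVX512ER isn't available, then skip the
AVX512 memcpy/memset when that preference is set) can be sketched roughly as
below. This is a simplified illustration, not the glibc code: the struct, the
enum and the function names are hypothetical, and the real selectors in
sysdeps/x86_64/multiarch choose among more variants than the three shown here.

```c
#include <stdbool.h>

/* Hypothetical snapshot of CPU capabilities.  glibc keeps the real state
   in its own cpu_features structure; these field names are assumptions
   made for this sketch only.  */
struct cpu_flags
{
  bool avx2_usable;
  bool avx512f_usable;
  bool avx512er_usable;   /* Per the patch description, unique to Xeon Phi.  */
  bool prefer_no_avx512;  /* Preference bit the patch introduces.  */
};

/* Simplified set of implementations a selector could pick from.  */
enum copy_impl
{
  IMPL_BASELINE,
  IMPL_AVX2,
  IMPL_AVX512
};

/* Mirror the rule from the patch description: if AVX512ER isn't
   available, prefer not to use AVX512 in memcpy/memset.  */
void
init_preferences (struct cpu_flags *f)
{
  if (!f->avx512er_usable)
    f->prefer_no_avx512 = true;
}

/* Pick an implementation: AVX512 only when it is usable and not
   disfavoured, otherwise AVX2 if usable, otherwise the baseline.  */
enum copy_impl
select_copy_impl (const struct cpu_flags *f)
{
  if (f->avx512f_usable && !f->prefer_no_avx512)
    return IMPL_AVX512;
  if (f->avx2_usable)
    return IMPL_AVX2;
  return IMPL_BASELINE;
}
```

With these assumptions, a Skylake-server-like configuration (AVX2 and AVX512F
usable, no AVX512ER) ends up on the AVX2 path, while a Xeon-Phi-like
configuration that reports AVX512ER keeps the AVX512 path.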