This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] x86_64: memset optimized with AVX512


On Fri, Dec 11, 2015 at 5:26 AM, Andrew Senkevich
<andrew.n.senkevich@gmail.com> wrote:
> 2015-12-10 22:34 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>:
>> On Thu, Dec 10, 2015 at 10:28 AM, Andrew Senkevich
>> <andrew.n.senkevich@gmail.com> wrote:
>>>>>  END (MEMSET)
>>>>> +libc_hidden_def (__memset_avx2)
>>>>
>>>> Why is this change needed?  If it is needed, please submit
>>>> a separate patch.
>>>
>>> We can avoid this change if we hide the implementation, test and IFUNC
>>> branch under HAVE_AVX512_ASM_SUPPORT.
>>>
>>>> Should __memset_chk_avx512 also be provided?
>>>
>>> It will be the same as the AVX2 version; is it really needed?
>>
>> __memset_chk_avx2 calls __memset_avx2.  Don't you want
>> __memset_chk to call __memset_avx512, instead of __memset_avx2,
>> on KNL?
>
> Oh yes, surely we need it.
>
> Is the patch below OK for trunk?
>
> 2015-12-11  Andrew Senkevich  <andrew.senkevich@intel.com>
>
>         * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Added new file.
>         * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Added new tests.
>         * sysdeps/x86_64/multiarch/memset-avx512.S: New file.
>         * sysdeps/x86_64/multiarch/memset.S: Added new IFUNC branch.
>         * sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
>

> diff --git a/sysdeps/x86_64/multiarch/memset.S b/sysdeps/x86_64/multiarch/memset.S
> index dbc00d2..9f16b7e 100644
> --- a/sysdeps/x86_64/multiarch/memset.S
> +++ b/sysdeps/x86_64/multiarch/memset.S
> @@ -30,6 +30,13 @@ ENTRY(memset)
>         HAS_ARCH_FEATURE (AVX2_Usable)
>         jz      2f
>         leaq    __memset_avx2(%rip), %rax
> +#ifdef HAVE_AVX512_ASM_SUPPORT
> +       HAS_ARCH_FEATURE (AVX512DQ_Usable)
> +       jnz     2f
> +       HAS_ARCH_FEATURE (AVX512F_Usable)
> +       jz      2f
> +       leaq    __memset_avx512(%rip), %rax
> +#endif
>  2:     ret
>  END(memset)
>  #endif
> diff --git a/sysdeps/x86_64/multiarch/memset_chk.S b/sysdeps/x86_64/multiarch/memset_chk.S
> index e2abb15..5115dfb 100644
> --- a/sysdeps/x86_64/multiarch/memset_chk.S
> +++ b/sysdeps/x86_64/multiarch/memset_chk.S
> @@ -30,6 +30,13 @@ ENTRY(__memset_chk)
>         HAS_ARCH_FEATURE (AVX2_Usable)
>         jz      2f
>         leaq    __memset_chk_avx2(%rip), %rax
> +#ifdef HAVE_AVX512_ASM_SUPPORT
> +       HAS_ARCH_FEATURE (AVX512DQ_Usable)
> +       jnz     2f
> +       HAS_ARCH_FEATURE (AVX512F_Usable)
> +       jz      2f
> +       leaq    __memset_chk_avx512(%rip), %rax
> +#endif
>  2:     ret
>  END(__memset_chk)
>

What is the purpose of checking AVX512DQ_Usable? Is it to avoid
using __memset_avx512 on SKX? Is __memset_avx512 slower than
__memset_avx2 on SKX?
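
For reference, the selection order in the quoted memset.S hunk can be
sketched in C roughly as below. This is only an illustration of the
branch logic, not glibc code: struct cpu_features_sketch,
enum memset_impl_sketch and select_memset_sketch are hypothetical
names, and IMPL_FALLBACK stands for whatever address is loaded before
the hunk begins, which the diff does not show.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the bits tested by HAS_ARCH_FEATURE.  */
struct cpu_features_sketch
{
  bool avx2_usable;
  bool avx512f_usable;
  bool avx512dq_usable;
};

/* Hypothetical labels for the candidate implementations.  */
enum memset_impl_sketch
{
  IMPL_FALLBACK,   /* Assumption: chosen before the hunk; not in the diff.  */
  IMPL_AVX2,       /* __memset_avx2  */
  IMPL_AVX512      /* __memset_avx512  */
};

/* Mirror the branch order of the quoted hunk.  */
enum memset_impl_sketch
select_memset_sketch (const struct cpu_features_sketch *f)
{
  /* HAS_ARCH_FEATURE (AVX2_Usable); jz 2f  */
  if (!f->avx2_usable)
    return IMPL_FALLBACK;

  /* HAS_ARCH_FEATURE (AVX512DQ_Usable); jnz 2f
     AVX512DQ present: stay with the AVX2 version.  */
  if (f->avx512dq_usable)
    return IMPL_AVX2;

  /* HAS_ARCH_FEATURE (AVX512F_Usable); jz 2f  */
  if (!f->avx512f_usable)
    return IMPL_AVX2;

  return IMPL_AVX512;
}
```

With this ordering, a CPU reporting AVX512F without AVX512DQ (KNL)
gets __memset_avx512, while a CPU that also reports AVX512DQ (SKX)
keeps __memset_avx2.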


-- 
H.J.

