This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] x86-64: Add sincosf with vector FMA


On Wed, Dec 20, 2017 at 4:31 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
> On Wed, Dec 20, 2017 at 02:00:54PM -0800, H.J. Lu wrote:
>> Since the x86-64 assembly version of sincosf is highly optimized with
>> vector instructions, there isn't much room for improvement.  However,
>> s_sincosf.c written in C with vector math and intrinsics can be
>> optimized by GCC with FMA.
>>
>> On Skylake, bench-sincosf reports performance improvement:
>>
>>            Assembly       FMA         improvement
>> max        104.042       106.614         -2%
>> min        9.426         8.586           10%
>> mean       20.6209       18.803          10%
>>
>> Any comments?
>>
>> H.J.
>>       * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
>>       Add s_sincosf-sse2 and s_sincosf-fma.
>>       (CFLAGS-s_sincosf-fma.c): New.
>>       * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: New file.
>>       * sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.S: Likewise.
>>       * sysdeps/x86_64/fpu/multiarch/s_sincosf.c: Likewise.
>>       * sysdeps/x86_64/fpu/s_sincosf.S: Don't add alias if
>>       __sincosf is defined.
>
> Updated patch without typedef of __v2df and __v4sf which have been
> provided in <x86intrin.h>.  Tested with GCC 4.9/5/6/7 on x86-64.
>
> H.J.
> ----
> Since the x86-64 assembly version of sincosf is highly optimized with
> vector instructions, there isn't much room for improvement.  However,
> s_sincosf.c written in C with vector math and intrinsics can be
> optimized by GCC with FMA.
>
> On Skylake, bench-sincosf reports performance improvement:
>
>            Assembly       FMA         improvement
> max        104.042       106.614         -2%
> min        9.426         8.586           10%
> mean       20.6209       18.803          10%
>
>         * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
>         Add s_sincosf-sse2 and s_sincosf-fma.
>         (CFLAGS-s_sincosf-fma.c): New.
>         * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: New file.
>         * sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.S: Likewise.
>         * sysdeps/x86_64/fpu/multiarch/s_sincosf.c: Likewise.
>         * sysdeps/x86_64/fpu/s_sincosf.S: Don't add alias if
>         __sincosf is defined.

If there are no objections, I am checking it in.

-- 
H.J.
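
For readers following the thread, here is a minimal sketch of why a C
version can benefit from FMA.  It is not the code from the patch: the
function name sin_poly and the Taylor coefficients S1..S3 are assumptions
made up for illustration, and glibc's s_sincosf.c uses its own
coefficients and range reduction.  Each "a * b + c" step below is a
candidate for a single fused multiply-add when GCC is allowed to generate
FMA instructions (for example with -mfma; the exact flags set in
CFLAGS-s_sincosf-fma.c are not shown in this message).

```c
/* Hypothetical Taylor coefficients for sin(x) near 0.
   These are NOT the coefficients used by glibc.  */
static const float S1 = -1.0f / 6.0f;     /* -1/3! */
static const float S2 = 1.0f / 120.0f;    /*  1/5! */
static const float S3 = -1.0f / 5040.0f;  /* -1/7! */

/* Approximates sin(x) for small |x| using Horner's scheme:
   sin(x) ~= x + x^3 * (S1 + x^2 * (S2 + x^2 * S3)).
   Every statement has the shape a * b + c, which a compiler may
   contract into one FMA instruction when the target supports it.  */
float
sin_poly (float x)
{
  float x2 = x * x;
  float p = x2 * S3 + S2;
  p = x2 * p + S1;
  return (x * x2) * p + x;
}
```

Compiling the same source once with and once without FMA enabled gives
two variants of one routine, which is the general idea behind building
s_sincosf-fma.c from the C implementation.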

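The ChangeLog adds an SSE2 and an FMA variant under multiarch/, which
implies that one of them is chosen at run time.  The sketch below shows
that idea only in outline and under stated assumptions: glibc performs
the selection with its own IFUNC machinery and CPU feature checks, not
with the GCC builtins used here, and the names sincosf_fn,
sincosf_sse2_stub, sincosf_fma_stub and select_sincosf are hypothetical.

```c
/* Signature shared by both variants (same as sincosf).  */
typedef void (*sincosf_fn) (float, float *, float *);

/* Hypothetical stand-ins for the two variants.  They use the crude
   small-angle approximation sin(x) ~= x, cos(x) ~= 1 so that the
   example stays self-contained; real variants compute both values.  */
static void
sincosf_sse2_stub (float x, float *s, float *c)
{
  *s = x;
  *c = 1.0f;
}

static void
sincosf_fma_stub (float x, float *s, float *c)
{
  *s = x;
  *c = 1.0f;
}

/* Returns the FMA variant when the running CPU reports FMA support,
   otherwise the SSE2 baseline.  Assumption: GCC or Clang on x86;
   on other targets the baseline is always returned.  */
sincosf_fn
select_sincosf (void)
{
#if defined (__x86_64__) || defined (__i386__)
  __builtin_cpu_init ();
  if (__builtin_cpu_supports ("fma"))
    return sincosf_fma_stub;
#endif
  return sincosf_sse2_stub;
}
```

A caller would obtain the pointer once and then call through it; with
IFUNC the dynamic linker performs the equivalent selection when the
symbol is resolved.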
