This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] x86-64: Add sincosf with vector FMA
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: GNU C Library <libc-alpha at sourceware dot org>
- Date: Mon, 8 Jan 2018 04:43:28 -0800
- Subject: Re: [PATCH] x86-64: Add sincosf with vector FMA
- Authentication-results: sourceware.org; auth=none
- References: <20171220220054.GA16094@intel.com> <20171221003115.GA2338@intel.com>
On Wed, Dec 20, 2017 at 4:31 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
> On Wed, Dec 20, 2017 at 02:00:54PM -0800, H.J. Lu wrote:
>> Since the x86-64 assembly version of sincosf is highly optimized with
>> vector instructions, there isn't much room for improvement. However,
>> s_sincosf.c, written in C with vector math and intrinsics, can be
>> optimized by GCC with FMA.
>>
>> On Skylake, bench-sincosf reports performance improvement:
>>
>>         Assembly   FMA       improvement
>> max     104.042    106.614   -2%
>> min     9.426      8.586     10%
>> mean    20.6209    18.803    10%
>>
>> Any comments?
>>
>> H.J.
>> * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
>> Add s_sincosf-sse2 and s_sincosf-fma.
>> (CFLAGS-s_sincosf-fma.c): New.
>> * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: New file.
>> * sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.S: Likewise.
>> * sysdeps/x86_64/fpu/multiarch/s_sincosf.c: Likewise.
>> * sysdeps/x86_64/fpu/s_sincosf.S: Don't add alias if
>> __sincosf is defined.
>
> Updated patch without the typedefs of __v2df and __v4sf, which are
> already provided by <x86intrin.h>. Tested with GCC 4.9/5/6/7 on x86-64.
>
> H.J.
> ----
> Since the x86-64 assembly version of sincosf is highly optimized with
> vector instructions, there isn't much room for improvement. However,
> s_sincosf.c, written in C with vector math and intrinsics, can be
> optimized by GCC with FMA.
>
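To illustrate the point about C vector math, here is a minimal sketch, not code from the patch. It assumes GCC's generic vector extension; the names v2df_t and poly2 are hypothetical and chosen so as not to clash with the __v2df type from <x86intrin.h>. Each "a * t + b" step in the Horner evaluation is an expression the compiler may contract into a single fused multiply-add when it targets FMA.

```c
/* Sketch only: v2df_t and poly2 are hypothetical names, not taken
   from the patch.  */
typedef double v2df_t __attribute__ ((vector_size (16)));

/* Evaluate c0 + c1*t + c2*t^2 in both lanes using Horner's scheme.
   Each "a * t + b" step is a candidate for one fused multiply-add
   when the compiler targets FMA and floating-point contraction is
   enabled.  */
static v2df_t
poly2 (v2df_t t, double c0, double c1, double c2)
{
  v2df_t k0 = { c0, c0 };
  v2df_t k1 = { c1, c1 };
  v2df_t k2 = { c2, c2 };
  return (k2 * t + k1) * t + k0;
}
```

One way to check for contraction is to compile with something like "gcc -O2 -mfma -S" and look for vfmadd instructions in the output. The flags the patch actually sets in CFLAGS-s_sincosf-fma.c are not shown in this message, so -mfma here is an assumption.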
> On Skylake, bench-sincosf reports performance improvement:
>
>         Assembly   FMA       improvement
> max     104.042    106.614   -2%
> min     9.426      8.586     10%
> mean    20.6209    18.803    10%
>
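The improvement column appears consistent with computing Assembly / FMA - 1 and rounding to a whole percent. That reading is an inference from the numbers, assuming they are timings where lower is better; the message does not state the unit or the formula. A small sketch, with the hypothetical helper improvement_pct:

```c
/* Hypothetical helper, not part of the patch: relative improvement of
   new_time over old_time, in percent.  A positive result means the
   new timing is lower (faster).  */
static double
improvement_pct (double old_time, double new_time)
{
  return (old_time / new_time - 1.0) * 100.0;
}
```

Under that assumption, the min row gives roughly 9.8, the mean row roughly 9.7, and the max row roughly -2.4, which round to the 10%, 10% and -2% shown above.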
> * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
> Add s_sincosf-sse2 and s_sincosf-fma.
> (CFLAGS-s_sincosf-fma.c): New.
> * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: New file.
> * sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.S: Likewise.
> * sysdeps/x86_64/fpu/multiarch/s_sincosf.c: Likewise.
> * sysdeps/x86_64/fpu/s_sincosf.S: Don't add alias if
> __sincosf is defined.
If there are no objections, I am checking it in.
--
H.J.