This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] x86-64: Add sinf with FMA
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Mon, 4 Dec 2017 10:51:06 -0800
- Subject: Re: [PATCH] x86-64: Add sinf with FMA
- Authentication-results: sourceware.org; auth=none
- References: <20171204180905.GA31592@gmail.com> <3c53189f-818f-0473-9ccd-1c0ecf40ab1c@linaro.org>
On Mon, Dec 4, 2017 at 10:38 AM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
>
>
> On 04/12/2017 16:09, H.J. Lu wrote:
>> On Skylake, bench-sinf reports performance improvement:
>>
>>         Before     After      Improvement
>> max     153.996    100.094    54%
>> min     8.546      6.852      25%
>> mean    18.1223    14.4616    25%
>>
>> Any comments?
>>
>> H.J.
>> ---
>> * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
>> Add s_sinf-sse2 and s_sinf-fma.
>> (CFLAGS-s_sinf-fma.c): New.
>> * sysdeps/x86_64/fpu/s_sinf.S (sinf): Add alias only if __sinf
>> is undefined.
>> * sysdeps/x86_64/fpu/multiarch/s_sinf-fma.c: New file.
>> * sysdeps/x86_64/fpu/multiarch/s_sinf-sse2.S: Likewise.
>> * sysdeps/x86_64/fpu/multiarch/s_sinf.c: Likewise.
>> ---
>> sysdeps/x86_64/fpu/multiarch/Makefile | 5 ++++-
>> sysdeps/x86_64/fpu/multiarch/s_sinf-fma.c | 3 +++
>> sysdeps/x86_64/fpu/multiarch/s_sinf-sse2.S | 3 +++
>
> With new s_sinf.c generic implementation, does x86_64 still require an
> assembly one?
They are very close, but the assembly version is a little faster on average.
Assembly:
  "sinf": {
    "": {
      "duration": 3.40466e+10,
      "iterations": 1.88083e+09,
      "max": 137.398,
      "min": 8.546,
      "mean": 18.1019
    }
  }
Generic:
  "sinf": {
    "": {
      "duration": 3.40946e+10,
      "iterations": 1.54362e+09,
      "max": 205.012,
      "min": 7.704,
      "mean": 22.0875
    }
  }
I think we should keep the assembly version.
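
For anyone who wants to recheck the comparison, here is a minimal sketch in Python (not part of the patch) that loads the two result blocks above and compares them. The helper names `stats`, `mean_from_totals` and `relative_change` are made up for this illustration, and the sketch assumes that "mean" is simply "duration" divided by "iterations", which the numbers above are consistent with.

```python
import json

# Benchmark results copied from the two runs above, wrapped in braces
# so that each block parses as a standalone JSON object.
ASSEMBLY = json.loads("""
{"sinf": {"": {"duration": 3.40466e+10, "iterations": 1.88083e+09,
               "max": 137.398, "min": 8.546, "mean": 18.1019}}}
""")
GENERIC = json.loads("""
{"sinf": {"": {"duration": 3.40946e+10, "iterations": 1.54362e+09,
               "max": 205.012, "min": 7.704, "mean": 22.0875}}}
""")


def stats(result, func="sinf", variant=""):
    """Return the timing dictionary for one function/variant pair."""
    return result[func][variant]


def mean_from_totals(s):
    """Recompute the mean as total duration over iteration count
    (an assumption about how the reported mean is derived)."""
    return s["duration"] / s["iterations"]


def relative_change(baseline, other):
    """Percentage by which `other` differs from `baseline`."""
    return (other - baseline) / baseline * 100.0


asm = stats(ASSEMBLY)
gen = stats(GENERIC)

for key in ("max", "min", "mean"):
    change = relative_change(asm[key], gen[key])
    print(f"{key:>4}: assembly {asm[key]:9.4f}  generic {gen[key]:9.4f}  "
          f"generic vs. assembly {change:+6.1f}%")
```

On these numbers the generic version has a mean roughly 22% higher and a max roughly 49% higher than the assembly version, while its min is roughly 10% lower.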
--
H.J.