This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] x86-64: Optimize e_expf with FMA [BZ #21912]


On 16/08/17 15:31, Arjan van de Ven wrote:
> On 8/16/2017 7:04 AM, Carlos O'Donell wrote:
>> On 08/16/2017 09:34 AM, H.J. Lu wrote:
>>> FMA optimized e_expf improves performance by more than 50% on Skylake.
>>>
>>> Any comments?
>>
>> Exactly how much of e_expf-fma.S do you need to achieve that 50% speedup?
> 
> the core "fast path"
> (the bit after    /* Main path: here if 2^(-28)<=|x|<125*log(2) */ )
> 
> 
>>
>> How does this algorithm compare to what is already implemented for e_expf?
> 
> I started with the SSE version of that e_expf, turned it into AVX, used FMA where possible and fixed a few
> glass jaws in the fast path that you hit on skylake.
> 
> the slow path is more a direct 1:1 translation from SSE to AVX (because mixing SSE and AVX
> is generally a bad idea)
> 

based on my benchmarks portable c code can
easily beat the hand written sse asm
(i haven't tested with avx+fma though).

the point is that the x86 asm has overkill
precision (very close to 0.5 ulp error, but
not correctly rounded). we can debate this
later, but i think the polynomial can be
reduced, and then there should not be much
difference between asm and c performance.
only the round/convert to int operation is
tricky: the optimal code differs between
targets, but that can be a target specific
macro hook.
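
to illustrate the shape of such code, here is a minimal sketch in portable c. it is not the code from the optimized-routines repo: the names sketch_expf, generic_round_toint and ROUND_TOINT are hypothetical, the degree 7 taylor polynomial is a stand-in for a properly fitted (and shorter) polynomial, and overflow, underflow, nan and inf handling are left out. it only shows where a target specific round-to-int hook would plug in.

```c
#include <stdint.h>
#include <string.h>

/* Generic fallback (assumption): round to nearest integer by adding and
   subtracting 1.5 * 2^52.  Valid for |x| < 2^51 in the default rounding
   mode, without excess precision and without -ffast-math.  */
static inline int64_t
generic_round_toint (double x)
{
  double shifted = x + 0x1.8p52;
  return (int64_t) (shifted - 0x1.8p52);
}

/* Hypothetical target hook: a target can define ROUND_TOINT to whatever
   instruction sequence is fastest there.  */
#ifndef ROUND_TOINT
# define ROUND_TOINT(x) generic_round_toint (x)
#endif

static const double INV_LN2 = 0x1.71547652b82fep+0; /* 1 / ln(2) */
static const double LN2 = 0x1.62e42fefa39efp-1;     /* ln(2) */

/* Sketch only: assumes x is finite and the result is a normal float.  */
static float
sketch_expf (float x)
{
  double xd = x;

  /* Range reduction: x = k * ln(2) + r, with |r| <= ln(2) / 2 (roughly).  */
  int64_t k = ROUND_TOINT (xd * INV_LN2);
  double r = xd - (double) k * LN2;

  /* exp(r) by a degree 7 Taylor polynomial in Horner form, evaluated in
     double so the rounding to float happens only once at the end.  */
  double p = 1.0 + r * (1.0 + r * (1.0 / 2 + r * (1.0 / 6 + r * (1.0 / 24
             + r * (1.0 / 120 + r * (1.0 / 720 + r * (1.0 / 5040)))))));

  /* Build 2^k directly from the exponent field of a double.  */
  uint64_t bits = (uint64_t) (k + 1023) << 52;
  double scale;
  memcpy (&scale, &bits, sizeof scale);

  return (float) (p * scale);
}
```

calling sketch_expf (1.0f) should give a value close to e. building with -DROUND_TOINT=... swaps in a target specific conversion without touching the generic code.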

anyway i posted my code to the arm
optimized-routines github repo, i'll start
posting the patches to glibc soon.

(one reason posting glibc patches is difficult
is that the nonsensical target specific asm
code and ifunc resolvers break when i update
the generic code in a way that bypasses the
wrapper function, and bypassing the wrapper
is another source of improvement.)

