RFC: Improve hypot performance
Wilco Dijkstra
Wilco.Dijkstra@arm.com
Thu Nov 18 12:37:24 GMT 2021
Hi Paul,
> On Power10, this implementation still has a large delta compared to the
> current implementation:
It's obvious something went wrong here since it is slower than Adhemerval's
version - did you look at the generated code?
The generated code should use fma, with no division or function calls (if you
see calls to fmin/fmax, you need to set FAST_FMINMAX to 0).
Cheers,
Wilco