[PATCH] Speedup tanf range reduction

Wed Aug 22 23:54:00 GMT 2018

Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:

> Joseph Myers wrote:
>
>> +
>> +static inline int32_t
>> +rem_pio2f (float x, float *y)
>
>> Please put a comment on this function documenting its semantics.
>
> Done, see below.
>
>
> Speed up tanf range reduction by using the new sincosf range
> reduction algorithm.  Overall code quality is improved due to
> inlining, so there is a speedup even if no range reduction is
> required.
>
> Passes GLIBC testsuite on AArch64.  Some files are no longer
> required; they are removed in the next patch.
>
> tanf throughput gains on Cortex-A72:
> * |x| < M_PI_4  : 1.1x
> * |x| < M_PI_2  : 1.2x
> * |x| < 2 * M_PI: 1.5x
> * |x| < 120.0   : 1.6x
> * |x| < Inf     : 12.1x

LGTM too.

If we were to add a benchtest for tanf with drand48 inputs, should we group
the entries according to the branches in __kernel_tanf()?  For example:

 * |x|>=0.6744 - fast path for __kernel_tanf
 * |x|<0.6744
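
To make the grouping concrete, here is a rough sketch of how the two input
sets could be generated with drand48.  It is not the glibc benchtests input
format; fill_group, in_upper_group, KERNEL_TANF_SPLIT and PIO4F are
hypothetical names, and capping the upper group at pi/4 assumes we only want
inputs that reach __kernel_tanf without range reduction.

```c
#ifndef _XOPEN_SOURCE
# define _XOPEN_SOURCE 600  /* For drand48 and srand48.  */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Split point from the |x| >= 0.6744 check discussed above.  */
#define KERNEL_TANF_SPLIT 0.6744f
/* Assumed upper bound: pi/4, i.e. inputs that need no range reduction.  */
#define PIO4F 0.78539816f

/* Hypothetical helper: fill BUF with N floats whose magnitude lies in
   [LO, HI), each with a random sign.  */
static void
fill_group (float *buf, size_t n, float lo, float hi)
{
  assert (0.0f <= lo && lo < hi);
  for (size_t i = 0; i < n; i++)
    {
      float x;
      /* Rounding to float can land exactly on HI, so retry in that case.  */
      do
        x = (float) (lo + (double) (hi - lo) * drand48 ());
      while (x < lo || x >= hi);
      buf[i] = drand48 () < 0.5 ? -x : x;
    }
}

/* Hypothetical helper: return 1 if X belongs to the |x| >= 0.6744 group.  */
static int
in_upper_group (float x)
{
  float ax = x < 0.0f ? -x : x;
  return ax >= KERNEL_TANF_SPLIT;
}
```

Calling fill_group once with (0, KERNEL_TANF_SPLIT) and once with
(KERNEL_TANF_SPLIT, PIO4F) gives two disjoint input sets, so each branch
could be timed separately.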

-- 
Tulio Magno