Fix ldbl-128ibm nearbyintl in non-default rounding modes (bug 19790) [committed]
Anton Blanchard
anton@samba.org
Fri Sep 16 00:35:00 GMT 2016
Hi Joseph,
> The ldbl-128ibm implementation of nearbyintl uses logic that only
> works in round-to-nearest mode. This contrasts with rintl, which
> works in all rounding modes.
I see a huge slowdown in a number of maths functions as a result of
this change. A profile of a simple exp2() microbenchmark shows the issue:
21.78% exp2 [kernel.kallsyms] [k] system_call_common
20.93% exp2 [kernel.kallsyms] [k] system_call
11.14% exp2 [kernel.kallsyms] [k] system_call_relon_pSeries
5.29% exp2 [kernel.kallsyms] [k] yama_task_prctl
3.51% exp2 [kernel.kallsyms] [k] sys_prctl
3.10% exp2 [kernel.kallsyms] [k] security_task_prctl
We call prctl(... PR_FP_EXC_DISABLED) for every call to exp2().
> @@ -240,7 +240,7 @@ libc_feholdsetround_ppc_ctx (struct rm_ctx *ctx, int r)
>    fenv_union_t old, new;
>
>    old.fenv = fegetenv_register ();
> -  new.l = (old.l & ~0x3) | r;
> +  new.l = (old.l & ~0x3 & ~_FPU_MASK_ALL) | r;
>    ctx->env = old.fenv;
>    if (__glibc_unlikely (new.l != old.l))
>      {
>
new.l never matches old.l, because we mask out the _FPU_MASK_ALL bits.
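To make the comparison concrete, here is a minimal standalone sketch of the two computations from the hunk. It is not the glibc code: SKETCH_MASK_ALL is a placeholder value I picked for illustration (the real _FPU_MASK_ALL definition is not quoted in this mail), and the helper names are made up. It only shows that once extra bits are cleared in the new word, any of those bits being set in the old word makes the guard true even when the rounding mode is unchanged.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: placeholder for the bits cleared by the patch.  The real
   _FPU_MASK_ALL is defined by glibc and is not shown in this thread.  */
#define SKETCH_MASK_ALL UINT64_C (0xf8)

/* Hypothetical helper: control word as computed before the patch.
   Only the two rounding-mode bits are replaced.  */
static uint64_t
new_word_before (uint64_t old, uint64_t r)
{
  return (old & ~UINT64_C (0x3)) | r;
}

/* Hypothetical helper: control word as computed after the patch.
   The rounding-mode bits and the SKETCH_MASK_ALL bits are cleared.  */
static uint64_t
new_word_after (uint64_t old, uint64_t r)
{
  return (old & ~UINT64_C (0x3) & ~SKETCH_MASK_ALL) | r;
}

/* Mirrors the `new.l != old.l' test that guards the update block.  */
static bool
guard_is_true (uint64_t old, uint64_t new_word)
{
  return new_word != old;
}
```

With an old word of 0x82 (one of the placeholder mask bits set, rounding mode 2) and r = 2, the pre-patch computation leaves the word unchanged, while the post-patch computation yields 0x02, so the guard is taken although the rounding mode did not change.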
Anton