Re: Fix ldbl-128ibm nearbyintl in non-default rounding modes (bug 19790) [committed]

Hi Joseph,

> The ldbl-128ibm implementation of nearbyintl uses logic that only
> works in round-to-nearest mode.  This contrasts with rintl, which
> works in all rounding modes.

I see a huge slowdown in a number of maths functions as a result of
this. A profile of a simple exp2() microbenchmark shows the issue:

  21.78%  exp2     [kernel.kallsyms]  [k] system_call_common
  20.93%  exp2     [kernel.kallsyms]  [k] system_call
  11.14%  exp2     [kernel.kallsyms]  [k] system_call_relon_pSeries
   5.29%  exp2     [kernel.kallsyms]  [k] yama_task_prctl
   3.51%  exp2     [kernel.kallsyms]  [k] sys_prctl
   3.10%  exp2     [kernel.kallsyms]  [k] security_task_prctl

We call prctl(... PR_FP_EXC_DISABLED) for every call to exp2().

> @@ -240,7 +240,7 @@ libc_feholdsetround_ppc_ctx (struct rm_ctx *ctx, int r)
>    fenv_union_t old, new;
>    old.fenv = fegetenv_register ();
> -  new.l = (old.l & ~0x3) | r;
> +  new.l = (old.l & ~0x3 & ~_FPU_MASK_ALL) | r;
>    ctx->env = old.fenv;
>    if (__glibc_unlikely (new.l != old.l))
>      {

new.l never matches old.l, because we mask out the _FPU_MASK_ALL bits.
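
As a rough sketch of that comparison in isolation, here is the test from
the hunk before and after the change, on plain integers. The constants
RN_MASK and ENABLE_MASK and the helper names are assumptions made up for
illustration; they are not copied from the glibc headers and this does
not model the real FPSCR contents, so it only shows how clearing extra
bits in the new value changes the outcome of the != test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants only (assumptions, not taken from glibc). */
#define RN_MASK     UINT64_C(0x3)   /* rounding-mode field, as in the hunk */
#define ENABLE_MASK UINT64_C(0xf8)  /* hypothetical stand-in for _FPU_MASK_ALL */

/* Comparison as written before the change: only the rounding mode is replaced. */
static bool original_takes_slow_path(uint64_t old, uint64_t r)
{
    uint64_t new_val = (old & ~RN_MASK) | r;
    return new_val != old;
}

/* Comparison as written after the change: enable bits are cleared as well. */
static bool patched_takes_slow_path(uint64_t old, uint64_t r)
{
    uint64_t new_val = (old & ~RN_MASK & ~ENABLE_MASK) | r;
    return new_val != old;
}

/* A few hand-picked inputs showing where the two tests disagree. */
static void check_examples(void)
{
    /* Same rounding mode, no enable bits set: neither test fires. */
    assert(!original_takes_slow_path(0x00, 0));
    assert(!patched_takes_slow_path(0x00, 0));

    /* Same rounding mode, one enable bit set: only the patched test fires. */
    assert(!original_takes_slow_path(0x80, 0));
    assert(patched_takes_slow_path(0x80, 0));

    /* Rounding mode actually changes: both tests fire. */
    assert(original_takes_slow_path(0x01, 0));
    assert(patched_takes_slow_path(0x01, 0));
}
```

In this toy model, any bit of old that falls inside the cleared mask is
enough to make the values differ, even when the rounding mode itself is
unchanged.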


