[PATCH v2] fma vs gcc 4.9
Richard Henderson
rth@twiddle.net
Wed Jul 16 22:50:00 GMT 2014
On 07/16/2014 01:56 PM, Joseph S. Myers wrote:
> On Thu, 3 Jul 2014, Richard Henderson wrote:
>
>> It seems to me that there's a typo on that exact zero test: a2 should be used,
>> not m2. Correct, or have I mis-read the code?
>
> The existing exact zero test seems correct to me. The conditions for such
> an exact zero are that the result of the multiplication is exactly
> representable in 53 bits (i.e., m2 == 0, with m1 being the exact result of
> the multiplication), and that the result of the addition (of z to the high
> part of the multipliation result) is an exact zero (i.e. a1 == 0, which
> implies a2 == 0).
>
Thanks. This second version, then, does not s/m2/a2/, but merely adds the
appropriate barriers. It also makes sure that m2 is complete before clearing
inexact, even though I saw no evidence of it being scheduled after the call.
Ok?
r~
-------------- next part --------------
diff --git a/sysdeps/ieee754/dbl-64/s_fma.c b/sysdeps/ieee754/dbl-64/s_fma.c
index 389acd4..77065aa 100644
--- a/sysdeps/ieee754/dbl-64/s_fma.c
+++ b/sysdeps/ieee754/dbl-64/s_fma.c
@@ -198,16 +198,17 @@ __fma (double x, double y, double z)
t1 = m1 - t1;
t2 = z - t2;
double a2 = t1 + t2;
+ /* Ensure the arithmetic is not scheduled after feclearexcept call. */
+ math_force_eval (m2);
+ math_force_eval (a2);
feclearexcept (FE_INEXACT);
- /* If the result is an exact zero, ensure it has the correct
- sign. */
+ /* If the result is an exact zero, ensure it has the correct sign. */
if (a1 == 0 && m2 == 0)
{
libc_feupdateenv (&env);
- /* Ensure that round-to-nearest value of z + m1 is not
- reused. */
- asm volatile ("" : "=m" (z) : "m" (z));
+ /* Ensure that round-to-nearest value of z + m1 is not reused. */
+ z = math_opt_barrier (z);
return z + m1;
}
More information about the Libc-alpha
mailing list