Sources Bugzilla – Bug 13304
fma, fmaf, fmal produce wrong results
Last modified: 2012-11-22 17:39:07 UTC
Created attachment 5990 [details] test suite for fma() The functions fma(), fmaf(), fmal() are defined in POSIX:2008 <http://pubs.opengroup.org/onlinepubs/9699919799/functions/fma.html>: "These functions shall compute (x * y) + z, rounded as one ternary operation: they shall compute the value (as if) to infinite precision and round once to the result format" Find attached a test suite for fma(), easily adapted for fmaf() or fmal() through a couple of #defines. I have checked this test suite by running it against a slow but correct multi-precision implementation of fma(). This test suite gives - 7 failures for fma(), - 1 more failure for fmaf(), - 9 more failures for fmal(), on a glibc 2.11.3 system where - fma() and fmaf() are implemented through a complex floating-point instruction rich code, - fmal() does just a x*y+z computation. Find attached the bugs, each in a separate program, to be compiled and linked with "-lm". Feel free to add the test suite to glibc. I have a copyright assignment for glibc on file, and am giving this file to the FSF.
Created attachment 5991 [details] test case #1 for fma()
Created attachment 5992 [details] test case #2 for fma()
Created attachment 5993 [details] test case #3 for fma()
Created attachment 5994 [details] test case #4 for fma()
Created attachment 5995 [details] test case #5 for fma()
Created attachment 5996 [details] test case #6 for fma()
Created attachment 5997 [details] test case #7 for fmaf()
Created attachment 5998 [details] test case #8 for fmal()
Created attachment 5999 [details] test case #9 for fmal()
Created attachment 6000 [details] test case #10 for fmal()
Created attachment 6001 [details] test case #11 for fmal()
Created attachment 6002 [details] test case #12 for fmal()
Created attachment 6003 [details] test case #13 for fmal()
Created attachment 6004 [details] test case #14 for fmal()
Created attachment 6005 [details] test case #15 for fmal()
Created attachment 6006 [details] test case #16 for fmal()
glibc 2.11 is very old, fma{,f,l} has been rewritten for glibc 2.14+. Please test that instead.
Jakub, the testsuite and the first two test fail for me on x86-64. The testsuite fails on 32-bit x86 as well but the two first tests pass. Test 1 on x86-64: Expected: 0X1.0000000000001P+54 Actual: 0X1P+54 Test 2 Expected: 0X1.FFFFFFFFFFFFFP+53 Actual: 0X1P+54
(In reply to comment #18) > Jakub, the testsuite and the first two test fail for me on x86-64. The > testsuite fails on 32-bit x86 as well but the two first tests pass. > > Test 1 on x86-64: > Expected: 0X1.0000000000001P+54 > Actual: 0X1P+54 > > Test 2 > Expected: 0X1.FFFFFFFFFFFFFP+53 > Actual: 0X1P+54 Neither fma-bug1.c nor fma-bug2.c fails for me (x86_64, glibc 2.14).
Indeed, works for me now as well after removing some buggy routines. Let's close as Fixed for glibc 2.14+
Please add the first attachment to glibc's test suite. I see 6 different implementations of fma(), 4 implementations of fmaf(), and 4 implementations of fmal() in the glibc source code. How can you guarantee that all of them are thoroughly tested? The ones in math/s_fma.c, math/s_fmaf.c, math/s_fmal.c are definitely buggy.
I've done quite a lot of testing on it back when it was added to glibc, see e.g. the http://sources.redhat.com/ml/libc-alpha/2010-10/txt00000.txt testcase (which has been tweaked for the other floating point formats (float, 80-bit long double, 128-bit long double). Yes, math/s_fma*.c won't DTRT, they are just fallback implementations for architectures without sufficient exception support (the real implementations rely on exceptions working properly and various fenv.h functions being available).
(In reply to comment #22) > I've done quite a lot of testing on it back when it was added to glibc, see > e.g. the > http://sources.redhat.com/ml/libc-alpha/2010-10/txt00000.txt > testcase Good. It would be even better - for the quality of future modifications - if a good unit test was in the glibc source tree. My proposed one runs in less than a second and was able to catch 16 different bugs (and does not need to link against mpfr or gmp). > Yes, math/s_fma*.c won't DTRT, they are just fallback implementations for > architectures without sufficient exception support Find attached a code that produces the correct result also regardless of exception support. Will soon be contributed to GNU gnulib. Should we replace math/s_fma{,f,l}.c with this code (after some polishing for glibc coding style)? Bruno
Created attachment 6008 [details] portable implementation of fma(), fmaf(), fmal(), code written for gnulib
I'll leave the decision whether to include the testcase into glibc or not to Ulrich, I'm not involved with glibc these days too much. And I'm not sure if there are any supported targets that do currently use math/s_fma*.c. BTW, your implementation will do the wrong thing if you call say fma (DBL_MAX, DBL_MAX, -__builtin_inf ("")); will I think in your implementation return NaN, while it must be -Inf.
Among libc targets, there is one that fails to build fma* at present for lack of rounding mode support (and not being configured to use the incorrect fallbacks): no-FPU SH.
Has it been tested in the directed rounding modes?
Created attachment 6014 [details] test suite for fma(), version 2
Created attachment 6015 [details] portable implementation of fma(), fmaf(), fmal(), code written for gnulib, version 2
(In reply to comment #25) Jakub, > your implementation will do the wrong thing if you call say > fma (DBL_MAX, DBL_MAX, -__builtin_inf ("")); > will I think in your implementation return NaN, while it must be -Inf. Good point, yes. I've updated the attached test suite to check this, and fixed the attached implementation code. Thanks.
Created attachment 6017 [details] portable implementation of fma(), fmaf(), fmal(), code written for gnulib, version 3 (In reply to comment #27) > Has it been tested in the directed rounding modes? A new version of the implementation code (attached) supports the 4 rounding modes.
Let's reopen to discuss and decide on adding the portable implementation to glibc.
Besides my implementation of a correct fma() on platforms without rounding mode support, there is also the one by Steven Munroe, in bug #3268: http://sourceware.org/bugzilla/attachment.cgi?id=1664 http://sourceware.org/bugzilla/attachment.cgi?id=1665 I cannot judge whether the glibc maintainers will prefer one or the other. I'm ready to convert the implementation code here to glibc style. Just let me know. Bruno
I think we should go with something like this version (but of course under LGPLv2.1+ for glibc) for now - with the source file set up so the same file (apart from license notices and space/tab differences) can be used in both glibc and gnulib. A soft-fp version might be faster - but the one in bug 3268 is not a clean implementation using soft-fp (it relies on soft-fp knowing about a wider "quad" floating-point type, rather than having an internal _FP_FMA operation that uses more precision internally but without requiring a wider type). So I think we should use the portable implementation for math/s_fma{,f,l}.c for now. If in future there is a clean soft-fp implementation, that can then be used instead (the portable version would still be useful for ldbl-128ibm - as I understand, it will work as if that format uses a fixed 106 bit mantissa - unless and until there is an implementation specific to that format).
*** Bug 3268 has been marked as a duplicate of this bug. ***
Created attachment 6657 [details] test case #17 fma (when using sysdeps/ieee754/dbl-64/s_fma.c) computes the wrong sign in some cases, such as (0.0 * -1.0) + -0.0.
On Thu, 27 Sep 2012, law at redhat dot com wrote: > fma (when using sysdeps/ieee754/dbl-64/s_fma.c) computes the wrong sign in some > cases, such as (0.0 * -1.0) + -0.0. Please file a separate bug for this, and keep the present bug just for the need for a better fallback implementation (where we're hoping for Bruno to contribute a version of the gnulib code to glibc at some point).
FWIW, the ldbl-128ibm version of fmal, which I think would better be replaced by a generic version such as that discussed here (which would also be useful for platforms without FE_INEXACT / FE_TOWARDZERO), not only is incorrect in that it doesn't provide a fused operation, but also appears to produce results that are actually invalid long double values, when in directed rounding modes, e.g.: Failure: Test: fma_towardzero (max_value, max_value, min_value) == max_value Result: is: 1.79769313486231590772e+308 0x1.ffffffffffffffffffffp+1023 should be: 1.79769313486231580793e+308 0x1.fffffffffffff7ffffffp+1023 difference: 1.79769313486231550856e+308 0x1.ffffffffffffe0000000p+1023 ulp : 81129638414606663681390495662080.0000 max.ulp : 0.0000 (Such edges cases for ldbl-128ibm might end up as separate issues, but it probably doesn't make sense to file such separate issues for them until fmal for ldbl-128ibm is basically right, most likely through use of a generic implementation.)