The current ./math/s_fma[fl].c implementation simply implements: double __fma (double x, double y, double z) { return (x * y) + z; } This does not meet the requirement of 754r: "fused multiply-add: The operation fma(x,y,z) computes (x × y) + z as if with unbounded range and precision, rounding only once to the destination format; see subclause 5.1." This implies (for example) that, for double, the 106-bit result of the multiply must be passed unrounded to the add, and only the final result is rounded to 53 bits.
Created attachment 1326 [details] Initial patch for discussion.
Created attachment 1327 [details] Initial fma patch for PPC32 soft-fp for discussion
These patches create fmasf4.c and fmadf4.c in soft-fp. These functions have to be built with soft-fp and exported from libc. Otherwise they cannot access the simulated FPU events, masks and rounding modes, which are defined in libc but not exported. The soft-fp implementations of s_fma.c and s_fmaf.c reside in libm and are exported from there. Currently there does not seem to be a good way to override the generic (and incorrect) ./math/s_fma[f].c from the soft-fp directory, so for now I placed the overrides in ports/sysdeps/powerpc/nofpu for testing. I have not implemented s_fmal.c/fmatf4.c for now. That would require an extended (256-bit) soft-fp implementation, which the current soft-fp does not support.
Corrected title per Joseph Myers' comment
Created attachment 1348 [details] updated Ports patch for powerpc32 soft-fp FMA Updated patch that resolves the search path build problem.
Created attachment 1349 [details] Updated FMA soft-fp implementation Cleaned up source and updated comments
Created attachment 1350 [details] New fma_test to verify the intermediate multiply product has full precision. Add a new fma_test that verifies that: "The operation fma(x,y,z) computes (x × y) + z as if with unbounded range and precision, rounding only once to the destination format" This implies that, for double, the 106-bit result of the multiply must be passed unrounded to the add, and only the final result is rounded to 53 bits.
Created attachment 1511 [details] Updated patch to ports to add fma for PPC32 soft-fp
Created attachment 1512 [details] Updated patch to add single and double fma to soft-fp This adds macros to double.h and quad.h to copy internal soft-fp values between the RAW, SEMIRAW, and CANONICAL formats without using any float types. Specifically, this patch allows fmasf4.c and fmadf4.c to be implemented without requiring TF type support.
Created attachment 1513 [details] Updated fma_test to verify that the intermediate multiply product has full precision
Created attachment 1514 [details] Updated patch to ports to add fma for PPC32 soft-fp
Created attachment 1515 [details] Updated fma_test to verify that the intermediate multiply product has full precision This one includes the change log entry.
Why do you need a soft-fp fmaf rather than the generic float fmaf (float x, float y, float z) { return ((double) x * y) + z; }? Since DFmode has a mantissa more than twice as wide as SFmode's, and a bigger exponent range as well, (double) x * (double) y will IMHO always be exact, so no rounding will ever happen there. Similarly for fma and IEEE quad long double. Only fmal needs a soft-fp or gmp implementation, on all architectures that don't have it in hardware, as does fma for IEEE extended long double or IBM 2x double long double.
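The exact-product argument can be sketched as follows. The function names fmaf_via_double and naive_fmaf are hypothetical, chosen only for this illustration. A float has a 24-bit significand, so the product of two floats has at most 48 significant bits and fits in the 53-bit significand of a double. This sketch only illustrates that the multiply step is exact. It does not settle whether rounding the sum first to double and then to float always matches a single rounding.

```c
/* Generic fmaf as suggested above: widen to double so the product
   is computed exactly (48 significant bits <= 53). */
float fmaf_via_double(float x, float y, float z)
{
    double product = (double) x * (double) y;   /* exact */
    return (float) (product + (double) z);      /* rounded to double, then float */
}

/* For comparison: everything in float.  The volatile temporary forces
   the product to be rounded to 24 bits before the add. */
float naive_fmaf(float x, float y, float z)
{
    volatile float product = x * y;
    return product + z;
}
```

For example, with x = 1 + 2^-13, y = 1 - 2^-13 and z = -1, the exact product is 1 - 2^-26. In float it rounds to 1.0f, so naive_fmaf returns 0, while fmaf_via_double returns -2^-26.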
Yes, what you suggest should work for the generic s_fmaf.c, but that is not the current implementation. Also, fmasf4.c should be a little faster, as it avoids some of the more expensive intermediate conversions. s_fma.c would be more problematic because it would require full IEEE quad and TF type support. I believe that you yourself pointed out that not all platforms support the IEEE quad/TF type, but those platforms still need fma() to work correctly. That is why I reworked the patch to avoid internal use of TF types and to support direct RAW/SEMIRAW/CANONICAL conversion for the intermediate steps of the soft-fp fma. Again, fmadf4.c should be faster than the generic version. Yes, in the long run we will need to address soft-fp fmal, but that requires the invention of an "oct" or "long long double" format. Most platforms that need soft-fp don't implement 128-bit long double, so we can stage that function. But we have a fix for soft-fp float and double now. I don't see any reason to delay.
No one else is working on this, so I'll assign it to myself.
I think this bug is fixed. If you want I can open a separate bug for soft-fp fmal support. Anyone disagree?
following comment #15.
Created attachment 1664 [details] Updated ports soft-fp fma patch for PPC32 Verified with CVS from April 3rd.
Created attachment 1665 [details] updated patch to add float/double fma to soft-fp verified with glibc cvs from April 3rd.
Created attachment 1666 [details] Updated fma_test for current CVS Verified on CVS from April 3rd
Created attachment 4569 [details] Example demonstrating fma failure on x86_64 The fma function is still incorrect on x86_64 (with glibc version 2.11) when the multiplication overflows but the final result would not: the result is an infinity instead of the rounded true result. See the attached C program for a demonstration.
glibc 2.14 and later should have correct fma{,f,l} implementations, except on targets which don't support exceptions properly, so that the round-to-odd trick doesn't work there, and except for PowerPC long double (where it is even unclear what fmal should actually do because of the weird properties of the floating-point format).
Bug 13304 is the bug with the most recent discussion of generic fma implementations for systems without exceptions / rounding modes, so marking this one as a duplicate; I don't think we need both open. *** This bug has been marked as a duplicate of bug 13304 ***