Bugs 2541 and 4108 report problems with the accuracy of erfcf for
arguments slightly below 4 and 8.
The problems arise from the code
GET_FLOAT_WORD(ix,x);
SET_FLOAT_WORD(z,ix&0xfffff000);
r = __ieee754_expf(-z*z-(float)0.5625)*
__ieee754_expf((z-x)*(z+x)+R/S);
which splits x into upper and lower parts before squaring, rather than
directly computing expf (-x*x), because a direct computation of x*x
would be inexact. Computing exp of an inexact value magifies rounding
error roughly if the inexact value has positive exponent (so fewer
than 24 bits after the point, for float). The idea is that z*z is
exact; (z-x)*(z+x) may not be exact but the inexactness is in bits low
enough not to matter for the final result.
However, while z*z is exact, the problem is that adding 0.5625
increases the exponent, so (-z*z-(float)0.5625) is not exact - one bit
has been lost - and expf then magnifies the effect of losing that one
bit into the errors reported in those bug reports. As long as,
roughly, z has as many bits after the point as before it, and
z*z+0.5625 is exact, the precise number of bits masked off does not
matter. This patch fixes the problem by putting one fewer bit in z
(so z has 11 of the 24 bits of x).
Examining the other implementations shows that the only one that looks
like it has a similar problem is the ldbl-128ibm implementation. (The
long double test added by this patch had a 23ulp error before the
patch (and 1ulp after) - there are probably cases which had larger
errors.) The code in that implementation zeroed 45 bits of the low
double - but that is not enough for z*z to be exact. Since addition
and multiplication of IBM long doubles are not always exact when the
low double is involved, the safest approach seemed to be to zero
enough bits for z*z+0.5625 to be exact as a double (which still leaves
enough nonzeroed bits for the algorithm to work), which this patch
does.
Tested and ulps updated for x86_64, i386, powerpc (on the basis that
other ulps files will be updated before 2.16 release, whether by
architecture maintainers or by others testing on those architectures).
2012-02-26 Joseph Myers<joseph@codesourcery.com>
[BZ #2541]
[BZ #4108]
* sysdeps/ieee754/flt-32/s_erff.c (__erfcf): Mask out one more bit
before squaring exponent.
* sysdeps/ieee754/ldbl-128ibm/s_erfl.c (__erfcl): Mask out whole
bottom long double and 27 bits of top long double before squaring
exponent.
* math/libm-test.inc (erfc_test): Add more tests.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/powerpc/fpu/libm-test-ulps: Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.