[PATCH] replace tgammaf by the CORE-MATH implementation (correctly rounded)
Joseph Myers
josmyers@redhat.com
Mon Jul 29 22:30:20 GMT 2024
On Mon, 29 Jul 2024, Paul Zimmermann wrote:
> The CORE-MATH implementation is correctly rounded (for any rounding mode).
> This can be checked by exhaustive tests in a few minutes since there are
> less than 2^32 values to check (against GNU MPFR for example).
What platforms have you tested on? (That is, tested *with the glibc
testsuite*, which covers correct exceptions and errno setting in
accordance with glibc expectations.) Those should include at least one
platform without excess precision (e.g. x86_64) and one with excess
precision (32-bit x86).
> diff --git a/math/w_tgammaf_compat.c b/math/w_tgammaf_compat.c
> index 34e0e096e0..b5208c54d5 100644
> --- a/math/w_tgammaf_compat.c
> +++ b/math/w_tgammaf_compat.c
These compat wrapper changes are suspect. The ABI for the compat wrappers
is that they support SVID error handling (calling __kernel_standard* on
error in order to call a user-provided matherr function). (This is only
relevant for old binaries, not new ones, as the symbols required for SVID
error handling are compat symbols so you can't link new binaries to use it
without symbol versioning tricks that shouldn't be used outside the glibc
testsuite.)
We don't systematically test SVID error handling for all functions in the
glibc testsuite, only a few (see test-matherr.c and test-matherr-2.c).
Maybe we *should* add tests of SVID error handling for all functions (and
all the supported error cases for each function) to avoid patches
mistakenly changing the compat wrappers in ways that break compatibility -
if we'd had a test of SVID error handling for tgammaf, I expect it would
have detected breakage from thie change to a compat wrapper.
> + b32u32_u t = {.f = x};
> + uint32_t ax = t.u<<1;
> + if(__builtin_expect(ax>=(0xffu<<24), 0)){
> + if(ax==(0xffu<<24)){
> + if(t.u>>31){
> + errno = EDOM;
> + return __builtin_nanf("12");
> + }
I think this is about the -Inf case?
(a) Please include comments explaining what cases are being handled
whenever the conditions are determined by bit manipulation rather than
straight comparisons of floating-point values. The old code has comments
such as
/* Return value for integer x < 0 is NaN with invalid exception. */
and
/* x == -Inf. According to ISO this is NaN. */
and
/* Positive infinity (return positive infinity) or NaN (return NaN). */
and it's not appropriate to regress this by introducing less-documented
code. More generally, lots more comments throughout the code, explaining
what it's doing (particular cases being handled, overall evaluation
strategy for the function in those cases), would be helpful. Not comments
repeating the obvious meaning of an individual line of C code, but
comments of the level of "handle special case X" or "evaluate tgammaf on
interval [A, B] using a degree-C polynomial approximation" or "handle hard
cases for correct rounding, listed in table X" or similar, so that the
reader can quickly understand the overall intent of each part of the code.
(b) errno is handled by the wrapper (w_tgamma_template.c); you shouldn't
need to handle it here.
(c) On the other hand, the wrapper assumes that the individual function
implementation raised appropriate exceptions. How does this code ensure
that "invalid" is raised? Have you tested with the glibc testsuite, which
verifies the "invalid" exception for negative infinity? I'd expect
something like "return x - x;" to ensure an appropriate exception is
raised.
TEST_f_f (tgamma, 0, plus_infty, DIVIDE_BY_ZERO_EXCEPTION|ERRNO_ERANGE),
> + return x; // nan
And this is for negative NaN. But you need to ensure that a signaling NaN
is converted to quiet with "invalid" raised, so something more like
"return x - x;" would be appropriate here as well. The glibc testsuite
includes tests for this as well.
> + }
> + double z = x;
> + if(__builtin_expect(ax<0x6d000000u, 0)){
Again, please add comments explaining any comparisons based on bit
patterns.
> + volatile double d = (0x1.fa658c23b1578p-1 - 0x1.d0a118f324b63p-1*z)*z - 0x1.2788cfc6fb619p-1;
> + double f = 1.0/z + d;
> + float r = f;
> + if(__builtin_fabs(r)>0x1.fffffep+127f) errno = ERANGE;
Any time you call a double function on a float value is suspect
(__builtin_fabs here), because doing so is usually a bug. If it's
deliberate, it should have a comment. If you're meaning to deal with
excess precision in some way, use of math_narrow_eval is normal in glibc
code.
> + float fx = __builtin_floor(x);
Likewise, if you need to use the double function rather than the float one
then please explain why in a comment. And normally we'd use floorf etc.
directly rather than __builtin_*.
> + int k = fx;
> + if(__builtin_expect(x >= 0x1.18522p+5f, 0)){
> + float r = 0x1p127f * 0x1p127f;
> + if(r>0x1.fffffep+127) errno = ERANGE;
> + return r;
Definitely a case for math_narrow_eval. As everywhere, there should be no
need to set errno because the wrappers handle that.
> + }
> + if(__builtin_expect(fx==x, 0)){
Throughout, in glibc code it's better to use GNU coding style, and glibc
macros such as __glibc_unlikely rather than raw __builtin_expect, unless
you expect this code to be copied verbatim from upstream in future with no
local changes at all.
> + if(x < 0.0f) {
> + errno = EDOM;
> + return __builtin_nanf("12");
As elsewere, you need to compute a NaN in a way that raises an exception.
> + double m = z - 0x1.7p+1, i = __builtin_roundeven(m), step = __builtin_copysign(1.0,i);
Normally the relevant functions would be used directly without __builtin.
When benchmarking, try to include platforms where roundeven is not inlined
and so the out-of-line roundeven function in glibc is called here -
especially where that means the C implementation rather than a
processor-specific IFUNC / assembly implementation - as that may
significantly affect performance. And state explicitly what platform you
benchmarked on (both how the compiler building glibc is configured, and
what processor is in use at runtime as that may affect IFUNCs used).
It's likely this is substantially faster than the existing code even with
the use of the C roundeven implementation, but the choice of roundeven
implementation could be significant enough it's important to consider it
in benchmarking.
> + if(__builtin_expect(((rt.u+2)&0xfffffff) < 8, 0)){
> + for(unsigned i=0;i<sizeof(tb)/sizeof(tb[0]);i++) {
> + if(t.u==tb[i].x.u) return tb[i].f + tb[i].df;
> }
I don't think shadowing variables is very good style (you have a double
variable i in scope from just above this).
> diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
> index 37d8998c71..29e3ca926a 100644
> --- a/sysdeps/x86_64/fpu/libm-test-ulps
> +++ b/sysdeps/x86_64/fpu/libm-test-ulps
> @@ -2263,25 +2263,21 @@ double: 1
>
> Function: "tgamma":
> double: 9
> -float: 8
> float128: 4
> ldouble: 5
Removing ulps entries like this is something purely mechanical it seems
appropriate to apply to *all* libm-test-ulps files, not just one. (Cf.
how when adding the logp1 aliases I copied ulps from log1p for all
libm-test-ulps files.)
--
Joseph S. Myers
josmyers@redhat.com
More information about the Libc-alpha
mailing list