This is the mail archive of the
libc-ports@sources.redhat.com
mailing list for the libc-ports project.
Re: [PATCH 1/2] AArch64 glibc port
Thanks for the review. I've looked into a couple of the FP issues and
would appreciate your thoughts on the following.
> While the optimized fma using fma instructions would no doubt fix those
> failures, they do suggest you have a bug in your sfp-machine.h - either it
> isn't setting INEXACT exceptions correctly from the addition, so
> fetestexcept fails to detect them, or it isn't handling underflow
> correctly on conversion to double, or both. So if you investigate these
> failures before adding optimized fma, they may also solve some of your
> test-ldouble failures as well.
The first couple of failures we have in test-double.out are:
Failure: fma (min_value, min_value, +0) == +0: Exception "Underflow" not set
Failure: fma (min_value, min_value, -0) == +0: Exception "Underflow" not set
The addition cannot result in an UNDERFLOW, which suggests that
truncation is not correctly detecting the UNDERFLOW.
Digging into the behaviour of trunctfdf2 (provided by libgcc) on the
value LDBL_MIN, I expected to see INEXACT and UNDERFLOW but only get
INEXACT.
The FP_TRUNC() macro detects the situation where the result exponent
is too small and zeroes the result's fractional bits, explicitly setting
the least significant working bit:
op-common.h:
    if (D##_e < 1 - _FP_FRACBITS_##dfs)              \
      {                                              \
        _FP_FRAC_SET_##swc(S, _FP_ZEROFRAC_##swc);   \
        _FP_FRAC_LOW_##swc(S) |= 1;                  \
      }
Subsequently, FP_ROUND() detects the nonzero working bits and raises
INEXACT before applying 'round to nearest'. After rounding, the working
bits are 0x5. The logic in _FP_PACKSEMIRAW() then shifts out the working
bits, leaving 0 in the fractional bits. The detection of underflow
fails because the fractional bits are now all zero:
    _FP_FRAC_SRL_##wc(X, _FP_WORKBITS);                              \
    if (!_FP_EXP_NORMAL(fs, wc, X) && !_FP_FRAC_ZEROP_##wc(X))       \
      {                                                              \
        if (X##_e == 0)                                              \
          FP_SET_EXCEPTION(FP_EX_UNDERFLOW);                         \
The logic in FP_TRUNC highlighted above that sets the least
significant working bit makes the detection of underflow dependent on
rounding mode. Is this intentional?
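The rounding-mode dependence can be illustrated with a small standalone model of the sequence described above. This is a sketch, not the soft-fp code. The names are made up, and it assumes three working bits, a round-to-nearest step that adds 4 (consistent with the 0x5 value above), a round-toward-plus-infinity step that adds one unit in the last place when any working bit is set, a positive operand, and a zero exponent at the point of the underflow test.

```c
#include <stdint.h>

/* Assumed layout: 3 working bits below the result's least significant
   fraction bit.  All names here are hypothetical.  */
enum { WORKBITS = 3, WORK_ROUND = 1 << 2, WORK_LSB = 1 << 3 };
enum round_mode { RND_NEAREST, RND_ZERO, RND_PINF };

/* Return nonzero if the "fraction is nonzero" underflow test would
   fire after truncating a positive value that is too small.  */
int
underflow_detected (enum round_mode mode)
{
  uint64_t frac = 0;

  frac |= 1;			/* truncation: zero fraction, set lowest working bit */

  switch (mode)			/* rounding step, positive operand */
    {
    case RND_NEAREST:
      if ((frac & 15) != WORK_ROUND)
	frac += WORK_ROUND;	/* 0x1 becomes 0x5 */
      break;
    case RND_PINF:
      if (frac & 7)
	frac += WORK_LSB;	/* 0x1 becomes 0x9 */
      break;
    case RND_ZERO:
      break;			/* working bits are simply dropped */
    }

  frac >>= WORKBITS;		/* packing: shift out the working bits */

  /* The exponent is assumed to be 0 here, so only the fraction
     decides whether underflow is reported.  */
  return frac != 0;
}
```

Under these assumptions, the model reports no underflow for RND_NEAREST and RND_ZERO, but does report it for RND_PINF, where the rounding step carries into the fraction.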
Any insight into this would be welcome.
Thanks
/Marcus