incorrectly rounded square root
Jeff Johnston
jjohnstn@redhat.com
Fri Jun 4 18:44:24 GMT 2021
Ok, I now know exactly what is happening.
The compiler is optimizing out the rounding check in ef_sqrt.c, probably
due to the operation using two constants.
86 ix += (m <<23);
(gdb) list
81 else
82 q += (q&1);
When I debug, it always takes the else at line 81 without performing the
one-tiny operation. The difference in the mxcsr register is the PE
(precision exception) bit, which I believe gets set when you do the
one-tiny operation. Since we aren't doing it, the bit never gets set, and
the difference of 0x20 in the mxcsr register is explained.
By making the constants volatile, I am able to get the code working as it
should. I have pushed a patch for this.
-- Jeff J.
On Fri, Jun 4, 2021 at 3:14 AM Paul Zimmermann <Paul.Zimmermann@inria.fr>
wrote:
> Hi Jeff,
>
> > I figured the values were off when I had to hard-code them in my own
> > test_sqrt.c but forgot to include that info in my note.
> >
> > Now, that said, using the code I attached earlier, I am seeing the exact
> > values you are quoting above for glibc for the mxcsr register and the
> > rounding is working. Have you tried running that code?
>
> Yes, it works as expected, but it doesn't work with Newlib's fenv.h and
> libm.a (see below).
>
> > The mxcsr values you are seeing that are different are not due to the
> > fesetround code. The code is shifting the round value 13 bits
> > and for 3, that ends up being 0x6000. It is masking mxcsr with 0xffff9fff
> > first so when you start with 0x1fxx and end up with 0x7fxx, the code is
> > doing what it is supposed to do.
> > The difference in values above is 0x20 (e.g. 0x7fa0 vs 0x7f80) which is a
> > bit in the last 2 hex digits which isn't touched by the code logic.
>
> here is how to reproduce the issue:
>
> tar xf newlib-4.1.0.tar.gz
> cd newlib-4.1.0
> mkdir build
> cd build
> ../configure --prefix=/tmp --disable-multilib --target=x86_64
> make -j4
> make install
>
> $ cat test_sqrt_2.c
> #include <stdio.h>
> #include <math.h>
> #include <fenv.h>
>
> #ifdef NEWLIB
> /* RedHat's libm claims:
> undefined reference to `__errno' in j1f/y1f */
> int errno;
> int* __errno () { return &errno; }
> #endif
>
> int main()
> {
>   int rnd[4] = { FE_TONEAREST, FE_TOWARDZERO, FE_UPWARD, FE_DOWNWARD };
>   char Rnd[4] = "NZUD";
>   float x = 0x1.ff07fep+127f;
>   float y;
>   for (int i = 0; i < 4; i++)
>     {
>       unsigned short cw;
>       unsigned int mxcsr = 0;
>       fesetround (rnd[i]);
>       __asm__ volatile ("fnstcw %0" : "=m" (cw) : );
>       __asm__ volatile ("stmxcsr %0" : "=m" (mxcsr) : );
>       y = sqrtf (x);
>       printf ("RND%c: %a cw=%u mxcsr=%u\n", Rnd[i], y, cw, mxcsr);
>     }
> }
>
> With GNU libc:
> $ gcc -fno-builtin test_sqrt_2.c -lm
> $ ./a.out
> RNDN: 0x1.ff83fp+63 cw=895 mxcsr=8064
> RNDZ: 0x1.ff83eep+63 cw=3967 mxcsr=32672
> RNDU: 0x1.ff83fp+63 cw=2943 mxcsr=24480
> RNDD: 0x1.ff83eep+63 cw=1919 mxcsr=16288
>
> With Newlib:
> $ gcc -I/tmp/x86_64/include -DNEWLIB -fno-builtin test_sqrt_2.c /tmp/libm.a
> $ ./a.out
> RNDN: 0x1.ff83fp+63 cw=895 mxcsr=8064
> RNDZ: 0x1.ff83fp+63 cw=3967 mxcsr=32640
> RNDU: 0x1.ff83fp+63 cw=2943 mxcsr=24448
> RNDD: 0x1.ff83fp+63 cw=1919 mxcsr=16256
>
> Can you reproduce that on x86_64 Linux?
>
> Paul
>
>