incorrectly rounded square root

Joel Sherrill joel@rtems.org
Fri Jun 4 18:59:55 GMT 2021


On Fri, Jun 4, 2021, 1:44 PM Jeff Johnston <jjohnstn@redhat.com> wrote:

> Ok, I now know exactly what is happening.
>
> The compiler is optimizing out the rounding check in ef_sqrt.c, probably
> due to the operation using two constants.
>
> 86 ix += (m <<23);
> (gdb) list
> 81 else
> 82    q += (q&1);
>
> When I debug, it always takes the else at line 81 without performing the
> one - tiny operation.  The difference in the mxcsr register is the PE
> bit, which I believe gets set when you do the one - tiny operation.
> Since we aren't doing it, the bit never gets set, which explains the
> difference of 0x20 in the mxcsr register.
>
> By making the constants volatile, I am able to get the code working as it
> should.  I have pushed a patch for this.
>

Awesome catch Paul and great eye to spot the problem Jeff!

--joel

>
> -- Jeff J.
>
> On Fri, Jun 4, 2021 at 3:14 AM Paul Zimmermann <Paul.Zimmermann@inria.fr> wrote:
>
>>        Hi Jeff,
>>
>> > I figured the values were off when I had to hard-code them in my own
>> > test_sqrt.c but forgot to include that info in my note.
>> >
>> > Now, that said, using the code I attached earlier, I am seeing the exact
>> > values you are quoting above for glibc for the mxcsr register and the
>> > rounding is working.  Have you tried running that code?
>>
>> Yes, it works as expected, but it doesn't work with Newlib's fenv.h and
>> libm.a (see below).
>>
>> > The mxcsr values you are seeing that are different are not due to the
>> > fesetround code.  The code shifts the round value left by 13 bits,
>> > and for 3 that gives 0x6000.  It masks mxcsr with 0xffff9fff first,
>> > so when you start with 0x1fxx and end up with 0x7fxx, the code is
>> > doing what it is supposed to do.
>> > The difference in the values above is 0x20 (e.g. 0x7fa0 vs 0x7f80), a
>> > bit in the last 2 hex digits that isn't touched by the code logic.
>>
>> here is how to reproduce the issue:
>>
>> tar xf newlib-4.1.0.tar.gz
>> cd newlib-4.1.0
>> mkdir build
>> cd build
>> ../configure --prefix=/tmp --disable-multilib --target=x86_64
>> make -j4
>> make install
>>
>> $ cat test_sqrt_2.c
>> #include <stdio.h>
>> #include <math.h>
>> #include <fenv.h>
>>
>> #ifdef NEWLIB
>> /* RedHat's libm claims:
>>    undefined reference to `__errno' in j1f/y1f */
>> int errno;
>> int* __errno () { return &errno; }
>> #endif
>>
>> int main()
>> {
>>   int rnd[4] = { FE_TONEAREST, FE_TOWARDZERO, FE_UPWARD, FE_DOWNWARD };
>>   char Rnd[4] = "NZUD";
>>   float x = 0x1.ff07fep+127f;
>>   float y;
>>   for (int i = 0; i < 4; i++)
>>   {
>>     unsigned short cw;
>>     unsigned int mxcsr = 0;
>>     fesetround (rnd[i]);
>>     __asm__ volatile ("fnstcw %0" : "=m" (cw) : );
>>     __asm__ volatile ("stmxcsr %0" : "=m" (mxcsr) : );
>>     y = sqrtf (x);
>>     printf ("RND%c: %a cw=%u mxcsr=%u\n", Rnd[i], y, cw, mxcsr);
>>   }
>> }
>>
>> With GNU libc:
>> $ gcc -fno-builtin test_sqrt_2.c -lm
>> $ ./a.out
>> RNDN: 0x1.ff83fp+63 cw=895 mxcsr=8064
>> RNDZ: 0x1.ff83eep+63 cw=3967 mxcsr=32672
>> RNDU: 0x1.ff83fp+63 cw=2943 mxcsr=24480
>> RNDD: 0x1.ff83eep+63 cw=1919 mxcsr=16288
>>
>> With Newlib:
>> $ gcc -I/tmp/x86_64/include -DNEWLIB -fno-builtin test_sqrt_2.c /tmp/libm.a
>> $ ./a.out
>> RNDN: 0x1.ff83fp+63 cw=895 mxcsr=8064
>> RNDZ: 0x1.ff83fp+63 cw=3967 mxcsr=32640
>> RNDU: 0x1.ff83fp+63 cw=2943 mxcsr=24448
>> RNDD: 0x1.ff83fp+63 cw=1919 mxcsr=16256
>>
>> Can you reproduce that on x86_64 Linux?
>>
>> Paul
>>
>>

