[RFA] lrint(f) bug and code correctness.

Dave Korn dave.korn@artimi.com
Tue Jun 21 12:08:00 GMT 2005

     Morning all!

  This is a bug report / progress report / RFA all in one: I've analysed
this bug and I'm fairly sure I've grokked it, but I could really use a
second (and even third) pair of eyes.  The patch attached is FYI but is not
a formal submission, I have grave doubts that it is correct even though it
solves the reported problem.  For anyone who doesn't want to wade through
the detail but could feed me a bit of information, the main thing I need to
know is "How do I change the x87 fpu rounding mode in newlib/cygwin?"

  Anyway, we got this bug report over on the cygwin list:


the essence of which is that lrint has a bug:

#include <iostream>
#include <cmath>

int main(void) {
   std::cout << "lrintf(0.5f)\t"  << lrintf(0.5f)  << "\n"
                "lrintf(-0.5f)\t" << lrintf(-0.5f) << "\n"
                "lrint(0.5)\t"    << lrint(0.5)    << "\n"
                "lrint(-0.5)\t"   << lrint(0.5)    << std::endl;
   return 0;

The output is

lrintf(0.5f)    0
lrintf(-0.5f)   0
lrint(0.5)      2
lrint(-0.5)     2

Running the same code under Linux (fedora core 1) gives the expected results

lrintf(0.5f)    0
lrintf(-0.5f)   0
lrint(0.5)      0
lrint(-0.5)     0

  I've been investigating and I've found a couple of interesting quirks in
s_lrint.c/sf_lrint.c (newlib/libm/common/...), and here's a patch that I
think fixes the problems  (CORRECTION: Although it fixes the symptoms of the
test case, I have lost confidence in its correctness while I have been doing
this writeup; read on for full details.  The patch is still attached anyway,
just FYI, but is *NOT* a patch submission).  First, off, let me explain why
lrint has a bug, then I'll explain why the same bug exists (but is latent)
in lrintf.  Here's the source:

#ifdef __STDC__
static const double
static double 

/* Adding a double, x, to 2^52 will cause the result to be rounded based on
   the fractional part of x, according to the implementation's current
   mode.  2^52 is the smallest double that can be represented using all 52
   digits. */
  4.50359962737049600000e+15, /* 0x43300000, 0x00000000 */
 -4.50359962737049600000e+15, /* 0xC3300000, 0x00000000 */

#ifdef __STDC__
	long int lrint(double x)
	long int lrint(x)
	double x;
  __int32_t i0,j0,sx;
  __uint32_t i1;
  double t;
  volatile double w;
  long int result;
  sx = (i0>>31)&1;
  j0 = ((i0 & 0x7ff00000) >> 20) - 1023;	**----- (1)
  if(j0 < 20)   			**----- (2)
      if(j0 < -1)			**----- (3)
        return 0;
          w = TWO52[sx] + x;
          t = w - TWO52[sx];
          GET_HIGH_WORD(i0, t);
          j0 = ((i0 & 0x7ff00000) >> 20) - 1023;	**----- (4)
          i0 &= 0x000fffff;
          i0 |= 0x00100000;
          result = i0 >> (20 - j0);			**----- (5)
  else if (j0 < (8 * sizeof (long int)) - 1)
      if (j0 >= 52)
        result = ((long int) i0 << (j0 - 20)) | (i1 << (j0 - 52));
          w = TWO52[sx] + x;
          t = w - TWO52[sx];
          EXTRACT_WORDS (i0, i1, t);
          j0 = ((i0 & 0x7ff00000) >> 20) - 1023;
          i0 &= 0x000fffff;
          i0 |= 0x00100000;
          result = ((long int) i0 << (j0 - 20)) | (i1 >> (52 - j0));
      return (long int) x;
  return sx ? -result : result;

  We enter this code with x == 0.5, which is ($3fe00000, $00000000) in hex.

  At point (1), j0 is set to the exponent extracted from the hex
representation as x, calculated as (0x3fe - 1023), which is -1.

  We enter the if (...) clause at (2), since -1 < 20.  This leg of the code
is meant to deal with the case where we can tell from the exponent that all
the significant bits of the floating point number that are valued > 1.0 (or
do I mean 0.5?  Here's where I lost confidence in the attached patch!) are
contained within the high word of the mantissa (52 - 32 == 20), and all the
bits in the low word of the double correspond to fractional bits, so they
can be safely ignored; we add and then subtract the maximal value in order
to lose all the fractional bits out the bottom end and only be left with
integer valued bits, and then at (4) we attempt to extract the remaining
bits from the high word of the mantissa and right align them into a long

  However this is where we lose for value of |x| < 1.0.  That's because
after adding 2^52 and subtracting it again, the rounded result is zero.
Zero is represented differently from other FP numbers, with all-zeros in
both mantissa and exponent.  So the code at (4) calculates j0 as (-1023) and
then at (5) we do

          result = i0 >> 1043;

which is of course undefined behaviour.  In practice, what happens is that
the shift amount is taken modulo-32 by the x86 "sall %cl, %edx" instruction
used to perform the shift, and since (1043%32) == 19, the resulting value is
((0 & 0x000fffff) | 0x00100000) >> 19 == 0x00100000 >> 19 == 2.  

  To fix this, we could take special account of the fact that zero has an
unusual representation and just return 0 if we detect all-zeros in the
exponent/mantissa of 't' at (4), or we can return zero when we have an
exponent (j0) value that equals -1 as well as one that is less than -1 at
(3).  Both approaches work in the current test case, but I'm not sure if the
second one would be correct in different rounding modes; it was my initial
approach, but I've lost confidence in it while doing this write up, which is
why the attached patch is an RFC rather than a submission.  I need some
advice!  How do I change the x87 fpu rounding mode in newlib?  I'm not an
expert at this, but I began to think while writing this report that an
exponent of -1 means that there's a significant bit in the 0.5's place which
in turn might be rounded to a non-zero value if the rounding mode was other
than round-to-even, and that in turn means it would be wrong to return zero
if the exponent is equal to -1 (but still correct if the exponent is less
than minus one, since then the highest possible significant bit would be the
0.25's bit, which couldn't push any kind of rounding away from zero) but
that we still have to do the add-a-very-large-number-and-subtract-it-again
dance and explicitly test for and handle a zero if that's the rounded result
we get.

  Now, the second point in this bug report is that lrintf does not have this
bug, but it is latent in the lrintf implementation and only hidden by a bug
in the code; let me demonstrate.

#ifdef __STDC__
static const float
static float 
  8.3886080000e+06, /* 0x4b000000 */
 -8.3886080000e+06, /* 0xcb000000 */

#ifdef __STDC__
	long int lrintf(float x)
	long int lrintf(x)
	float x;
  __int32_t j0,sx;
  __uint32_t i0;
  float t;
  volatile float w;
  long int result;


  /* Extract sign bit. */
  sx = (i0 >> 31);

  /* Extract exponent field. */
  j0 = ((i0 & 0x7f800000) >> 23) - 127;		**----- (6)
  if (j0 < (sizeof (long int) * 8) - 1)		**----- (7)
      if (j0 < -1)
        return 0;
      else if (j0 >= 23)
        result = (long int) i0 << (j0 - 23);
      else						**----- (8)
          w = TWO23[sx] + x;
          t = w - TWO23[sx];
          GET_FLOAT_WORD (i0, t);
          j0 = ((i0 >> 23) & 0xff) - 0x7f;
          i0 &= 0x7fffff;
          i0 |= 0x800000;
          result = i0 >> (23 - j0);
      return (long int) x;				**----- (9)
  return sx ? -result : result;

  Because floats are only one word, there's no need to try the same
optimisation of only bit-shifting the high word when all the significant
bits are contained within it.  For that reason, there's no test that is the
equivalent of the "if (j0 < 20)" at (2) in lrint above, and we proceed
directly to testing whether the result will fit in a long int at all.  Now,
when we call lrintf (0.5), j0 still ends up with the value -1 at (6).  But
then, surprisingly, the test "if (-1 < 31)" fails at (7), and we end up at
(8) where we cast the value to a long int, and correctly return zero.

  The test at (7) fails because the compiler emits an unsigned comparison,
which surprised me.  size_t (the return type from sizeof) is of course
unsigned, but j0 is signed, and I would at least have expected a warning.
It turns out newlib is built without any -W options.  So I think the correct
thing to do is to cast the sizeof expression to int, because there's code in
that leg to handle negative values, so it's clearly intended to be a signed
comparison.  However, once we do that, we introduce the same bug as in
lrint, because the shift-and-masking code at (8) also fails to account for
the special representation of zeros!

  Um.  So, I've got to prepare another version of this patch, that replaces
the "<= -1" tests with "if (t == 0.0) return 0;" tests, and I've got to
retest in different rounding modes.  Can anyone tell me how to change the
rounding mode?

Can't think of a witty .sigline today....
-------------- next part --------------
A non-text attachment was scrubbed...
Name: probably-incorrect-lrintf-patch-DO-NOT-APPLY.diff
Type: application/octet-stream
Size: 1415 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/newlib/attachments/20050621/86f3fd53/attachment.obj>

More information about the Newlib mailing list