This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
IBM long double fixes

From: Alan Modra <amodra at gmail dot com>
To: libc-alpha at sourceware dot org
Cc: Steve Munroe <sjmunroe at us dot ibm dot com>, Ryan Arnold <ryanarn at us dot ibm dot com>
Date: Tue, 25 Jun 2013 16:08:40 +0930
Subject: IBM long double fixes
This patch starts the process of supporting powerpc64 little-endian
long double in glibc.  IBM long double is an array of two IEEE
doubles, so IMO the correct way to access the fields of those doubles
is to make union ibm_extended_long_double reflect this fact, rather
than to add a little-endian version of the current union.  That of
course results in a much larger, but mechanical, patch.
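
To illustrate the idea (this is not part of the patch), here is a
minimal sketch that treats an IBM long double as two IEEE doubles and
pulls the sign, biased exponent and explicit mantissa out of each half.
The names struct dd_pair, struct dbl_fields and dbl_fields_of are
hypothetical; glibc itself uses the bitfields of union ieee754_double
rather than the shifts shown here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an IBM long double: two IEEE doubles,
   high part first.  This is not the glibc union itself.  */
struct dd_pair
{
  double d[2];
};

/* The three fields of one IEEE double.  */
struct dbl_fields
{
  unsigned negative;	/* Sign bit.  */
  unsigned exponent;	/* 11-bit biased exponent.  */
  uint64_t mantissa;	/* 52 explicit mantissa bits.  */
};

/* Split X into its fields via a 64-bit integer image of the double.
   Assumes double is the 64-bit IEEE binary64 format.  */
static struct dbl_fields
dbl_fields_of (double x)
{
  uint64_t bits;
  struct dbl_fields f;

  assert (sizeof bits == sizeof x);
  memcpy (&bits, &x, sizeof bits);
  f.negative = (unsigned) (bits >> 63);
  f.exponent = (unsigned) ((bits >> 52) & 0x7ff);
  f.mantissa = bits & ((UINT64_C (1) << 52) - 1);
  return f;
}
```

With a struct dd_pair p, dbl_fields_of (p.d[0]) and
dbl_fields_of (p.d[1]) give the fields of the high and low doubles in
the same way for both halves, which is the point of defining the union
as an array.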

On top of this change, I discovered a number of places where denormals
and other corner cases were being handled wrongly.

Notes on the patch.
- printf_fphex.c: Testing for the low double exponent being zero is
unnecessary.  If the difference in exponents is less than 53 then the
high double exponent must be nearing the low end of its range, and the
low double exponent has hit rock bottom, ie. it is zero.

- ldbl2mpn.c: A denormal value (ie. one with a biased exponent of
zero) is treated as if its exponent were one, so shift the mantissa
left by one.  The code handling normalisation of the low double
mantissa lacked a test for a shift count of at least the width of the
type being shifted, and lacked anything to handle the case where the
difference in exponents is less than 53, as in printf_fphex.c.  (Why
do we have so many copies of this code?)

- math_ldbl.h (ldbl_extract_mantissa): Same as above, but worse: the
code tests for exponent > 1 for some reason, probably a typo for
>= 1.

- math_ldbl.h (ldbl_insert_mantissa): Round the high double as
mpn2ldbl.c does (round up only if hi is odd or the explicit mantissa
bits of lo are non-zero) so that the number we return won't change
when applying ldbl_canonicalize().  Add the missing overflow checks
and normalisation of the high mantissa.  Correct a misleading comment:
"The hidden bit of the lo mantissa is zero" is not always true, as can
be seen from the code rounding the hi mantissa.  Also, by inspection,
lzcount can never be less than zero, so remove that test.  Lastly,
masking bitfields to their widths can be left to the compiler.

- mpn2ldbl.c: The overflow checks here on rounding of the high double
were just plain wrong.  Incrementing the exponent must be accompanied
by a shift right of the mantissa to keep the value unchanged.  The
notes above for ldbl_insert_mantissa also apply.
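
The low mantissa alignment described in the printf_fphex.c and
ldbl2mpn.c notes can be sketched as follows.  This is a simplified
stand-alone version, not the code in the patch:
low_mantissa_aligned and its parameter names are hypothetical, and it
assumes a canonical value (low double no larger than half an ulp of
the high double) so that the left shift cannot lose bits.

```c
#include <stdint.h>

/* MANT is the 52-bit explicit mantissa of the low double; HI_EXP and
   LO_EXP are the biased exponents of the high and low doubles.
   Returns the low mantissa as a 53-bit quantity lined up directly
   below the 53 bits of the high mantissa.  */
static uint64_t
low_mantissa_aligned (uint64_t mant, int hi_exp, int lo_exp)
{
  uint64_t lo = mant;
  int ediff;

  if (lo_exp != 0)
    /* Normal number: set the hidden 53rd bit.  */
    lo |= UINT64_C (1) << 52;
  else
    /* Denormal or zero: behaves as if the biased exponent were 1.  */
    lo <<= 1;

  ediff = hi_exp - lo_exp - 53;
  if (ediff > 0)
    /* A shift by 64 or more is undefined in C, so test first.  */
    return ediff < 64 ? lo >> ediff : 0;
  if (ediff < 0)
    /* Only possible when the low double is denormal.  */
    return lo << -ediff;
  return lo;
}
```

For example, with hi_exp of 2 and a low double that is the smallest
denormal (mant of 1, lo_exp of 0), ediff is -51 and the result is
bit 52, ie. exactly half an ulp of the high double.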

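The rounding rule and the overflow fix from the ldbl_insert_mantissa
and mpn2ldbl.c notes can likewise be sketched in isolation.  The
helper names should_round_up and round_up_high are hypothetical; hi
and lo are assumed to be 53-bit mantissas with the hidden bit at
bit 52, as in the patch.

```c
#include <stdint.h>

/* Round hi up only when lo is at least half an ulp (its hidden bit is
   set) and either lo is more than half an ulp or hi is odd, ie.
   round to nearest with ties going to even.  */
static int
should_round_up (uint64_t hi, uint64_t lo)
{
  uint64_t hidden = UINT64_C (1) << 52;

  return ((lo & hidden) != 0
	  && ((hi & 1) != 0 || (lo & (hidden - 1)) != 0));
}

/* Increment hi.  If the carry runs into bit 53 the mantissa no longer
   fits in 53 bits, so shift it right by one and bump the exponent;
   the two steps together leave the value unchanged.  */
static uint64_t
round_up_high (uint64_t hi, int *exponent)
{
  hi++;
  if ((hi & (UINT64_C (1) << 53)) != 0)
    {
      hi >>= 1;
      (*exponent)++;
    }
  return hi;
}
```

Rounding up an all-ones mantissa is the interesting case: the result
is bit 52 alone with the exponent one larger, rather than a 54-bit
mantissa.
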
After applying this patch you will see changes in the display of long
double values in test-ldouble.out and test-ildoubl.out, eg.

 Failure: Test: pow10 (-305)
 Result:
- is:          9.99999999999999996282e-306   0x1.c16c5c52535750700000p-1014
- should be:   9.99999999999999996282e-306   0x1.c16c5c52535750800000p-1014
+ is:          9.99999999999999999741e-306   0x1.c16c5c52535750700000p-1014
+ should be:   1.00000000000000000023e-305   0x1.c16c5c52535750800000p-1014
  difference:  4.94065645841246544177e-324   0x0.00000000000010000000p-1022
  ulp       :  1.0000
  max.ulp   :  0.0000

This is to be expected and is a result of the denormal fixes.  I
checked every single change in output, and in every case the value
displayed is closer to the ieee854 long double value.


	* sysdeps/ieee754/ldbl-128ibm/ieee754.h
	(union ibm_extended_long_double): Define as an array of ieee754_double.
	(IBM_EXTENDED_LONG_DOUBLE_BIAS): Delete.
	* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c: Update all references
	to ibm_extended_long_double and IBM_EXTENDED_LONG_DOUBLE_BIAS.
	* sysdeps/ieee754/ldbl-128ibm/e_exp10l.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_expl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/strtold_l.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c: Likewise.
	* sysdeps/ieee754/ldbl-128ibm/e_rem_pio2l.c: Comment fix.

	* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c
	(PRINT_FPHEX_LONG_DOUBLE): Tidy code by moving -53 into ediff
	calculation.  Remove unnecessary test for denormal exponent.
	* sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c (__mpn_extract_long_double):
	Correct handling of denormals.  Avoid undefined shift behaviour.
	Correct normalisation of low mantissa when low double is denormal.
	* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h
	(ldbl_extract_mantissa): Likewise.  Comment.
	(ldbl_insert_mantissa): Correct normalisation of low mantissa.
	Test for overflow of high mantissa and normalise.
	* sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c
	(__mpn_construct_long_double): Fix test for overflow of high
	mantissa and correct normalisation.  Avoid undefined shift.

diff --git a/sysdeps/ieee754/ldbl-128ibm/ieee754.h b/sysdeps/ieee754/ldbl-128ibm/ieee754.h
index e5644f5..0778b1f 100644
--- a/sysdeps/ieee754/ldbl-128ibm/ieee754.h
+++ b/sysdeps/ieee754/ldbl-128ibm/ieee754.h
@@ -179,30 +179,10 @@ union ieee854_long_double
 
 union ibm_extended_long_double
   {
-    long double d;
-    double dd[2];
-
-    /* This is the IBM extended format long double.  */
-    struct
-      { /* Big endian.  There is no other.  */
-
-	unsigned int negative:1;
-	unsigned int exponent:11;
-	/* Together Mantissa0-3 comprise the mantissa.  */
-	unsigned int mantissa0:20;
-	unsigned int mantissa1:32;
-
-	unsigned int negative2:1;
-	unsigned int exponent2:11;
-	/* There is an implied 1 here?  */
-	/* Together these comprise the mantissa.  */
-	unsigned int mantissa2:20;
-	unsigned int mantissa3:32;
-      } ieee;
+    long double ld;
+    union ieee754_double d[2];
    };
 
-#define IBM_EXTENDED_LONG_DOUBLE_BIAS 0x3ff /* Added to exponent.  */
-
 __END_DECLS
 
 #endif /* ieee754.h */
diff --git a/sysdeps/ieee754/ldbl-128ibm/printf_fphex.c b/sysdeps/ieee754/ldbl-128ibm/printf_fphex.c
index 247dc20..453c2be 100644
--- a/sysdeps/ieee754/ldbl-128ibm/printf_fphex.c
+++ b/sysdeps/ieee754/ldbl-128ibm/printf_fphex.c
@@ -26,31 +26,31 @@ do {									      \
       unsigned long long int num0, num1;				      \
       unsigned long long hi, lo;					      \
       int ediff;							      \
-      union ibm_extended_long_double eldbl;				      \
-      eldbl.d = fpnum.ldbl.d;						      \
+      union ibm_extended_long_double u;					      \
+      u.ld = fpnum.ldbl.d;						      \
 									      \
       assert (sizeof (long double) == 16);				      \
 									      \
-      lo = ((long long)eldbl.ieee.mantissa2 << 32) | eldbl.ieee.mantissa3;    \
-      hi = ((long long)eldbl.ieee.mantissa0 << 32) | eldbl.ieee.mantissa1;    \
+      lo = ((long long)u.d[1].ieee.mantissa0 << 32) | u.d[1].ieee.mantissa1;  \
+      hi = ((long long)u.d[0].ieee.mantissa0 << 32) | u.d[0].ieee.mantissa1;  \
       lo <<= 7; /* pre-shift lo to match ieee854.  */			      \
       /* If the lower double is not a denomal or zero then set the hidden     \
 	 53rd bit.  */							      \
-      if (eldbl.ieee.exponent2 != 0)					      \
+      if (u.d[1].ieee.exponent != 0)					      \
 	lo |= (1ULL << (52 + 7));					      \
       else								      \
 	lo <<= 1;							      \
       /* The lower double is normalized separately from the upper.  We	      \
 	 may need to adjust the lower manitissa to reflect this.  */	      \
-      ediff = eldbl.ieee.exponent - eldbl.ieee.exponent2;		      \
-      if (ediff > 53 + 63)						      \
+      ediff = u.d[0].ieee.exponent - u.d[1].ieee.exponent - 53;		      \
+      if (ediff > 63)							      \
 	lo = 0;								      \
-      else if (ediff > 53)						      \
-	lo = lo >> (ediff - 53);					      \
-      else if (eldbl.ieee.exponent2 == 0 && ediff < 53)			      \
-	lo = lo << (53 - ediff);					      \
-      if (eldbl.ieee.negative != eldbl.ieee.negative2			      \
-	  && (eldbl.ieee.exponent2 != 0 || lo != 0L))			      \
+      else if (ediff > 0)						      \
+	lo = lo >> ediff;						      \
+      else if (ediff < 0)						      \
+	lo = lo << -ediff;						      \
+      if (u.d[0].ieee.negative != u.d[1].ieee.negative			      \
+	  && lo != 0)							      \
 	{								      \
 	  lo = (1ULL << 60) - lo;					      \
 	  if (hi == 0L)							      \
@@ -58,7 +58,7 @@ do {									      \
 	      /* we have a borrow from the hidden bit, so shift left 1.  */   \
 	      hi = 0xffffffffffffeLL | (lo >> 59);			      \
 	      lo = 0xfffffffffffffffLL & (lo << 1);			      \
-	      eldbl.ieee.exponent--;					      \
+	      u.d[0].ieee.exponent--;					      \
 	    }								      \
 	  else								      \
 	    hi--;							      \
@@ -109,9 +109,9 @@ do {									      \
 	  *--wnumstr = L'0';						      \
 	}								      \
 									      \
-      leading = eldbl.ieee.exponent == 0 ? '0' : '1';			      \
+      leading = u.d[0].ieee.exponent == 0 ? '0' : '1';			      \
 									      \
-      exponent = eldbl.ieee.exponent;					      \
+      exponent = u.d[0].ieee.exponent;					      \
 									      \
       if (exponent == 0)						      \
 	{								      \
@@ -121,18 +121,18 @@ do {									      \
 	    {								      \
 	      /* This is a denormalized number.  */			      \
 	      expnegative = 1;						      \
-	      exponent = IBM_EXTENDED_LONG_DOUBLE_BIAS - 1;		      \
+	      exponent = IEEE754_DOUBLE_BIAS - 1;			      \
 	    }								      \
 	}								      \
-      else if (exponent >= IBM_EXTENDED_LONG_DOUBLE_BIAS)		      \
+      else if (exponent >= IEEE754_DOUBLE_BIAS)				      \
 	{								      \
 	  expnegative = 0;						      \
-	  exponent -= IBM_EXTENDED_LONG_DOUBLE_BIAS;			      \
+	  exponent -= IEEE754_DOUBLE_BIAS;				      \
 	}								      \
       else								      \
 	{								      \
 	  expnegative = 1;						      \
-	  exponent = -(exponent - IBM_EXTENDED_LONG_DOUBLE_BIAS);	      \
+	  exponent = -(exponent - IEEE754_DOUBLE_BIAS);			      \
 	}								      \
 } while (0)
 
diff --git a/sysdeps/ieee754/ldbl-128ibm/e_exp10l.c b/sysdeps/ieee754/ldbl-128ibm/e_exp10l.c
index 1eaf2fe..49121ca 100644
--- a/sysdeps/ieee754/ldbl-128ibm/e_exp10l.c
+++ b/sysdeps/ieee754/ldbl-128ibm/e_exp10l.c
@@ -36,9 +36,9 @@ __ieee754_exp10l (long double arg)
   else if (arg > LDBL_MAX_10_EXP + 1)
     return LDBL_MAX * LDBL_MAX;
 
-  u.d = arg;
-  arg_high = u.dd[0];
-  arg_low = u.dd[1];
+  u.ld = arg;
+  arg_high = u.d[0].d;
+  arg_low = u.d[1].d;
   exp_high = arg_high * log10_high;
   exp_low = arg_high * log10_low + arg_low * M_LN10l;
   return __ieee754_expl (exp_high) * __ieee754_expl (exp_low);
diff --git a/sysdeps/ieee754/ldbl-128ibm/e_expl.c b/sysdeps/ieee754/ldbl-128ibm/e_expl.c
index b599f36..c5fc1e2 100644
--- a/sysdeps/ieee754/ldbl-128ibm/e_expl.c
+++ b/sysdeps/ieee754/ldbl-128ibm/e_expl.c
@@ -162,39 +162,39 @@ __ieee754_expl (long double x)
       x = x + xl;
 
       /* Compute ex2 = 2^n_0 e^(argtable[tval1]) e^(argtable[tval2]).  */
-      ex2_u.d = __expl_table[T_EXPL_RES1 + tval1]
-		* __expl_table[T_EXPL_RES2 + tval2];
+      ex2_u.ld = (__expl_table[T_EXPL_RES1 + tval1]
+		  * __expl_table[T_EXPL_RES2 + tval2]);
       n_i = (int)n;
       /* 'unsafe' is 1 iff n_1 != 0.  */
       unsafe = fabsl(n_i) >= -LDBL_MIN_EXP - 1;
-      ex2_u.ieee.exponent += n_i >> unsafe;
+      ex2_u.d[0].ieee.exponent += n_i >> unsafe;
       /* Fortunately, there are no subnormal lowpart doubles in
 	 __expl_table, only normal values and zeros.
 	 But after scaling it can be subnormal.  */
-      exponent2 = ex2_u.ieee.exponent2 + (n_i >> unsafe);
-      if (ex2_u.ieee.exponent2 == 0)
-	/* assert ((ex2_u.ieee.mantissa2|ex2_u.ieee.mantissa3) == 0) */;
+      exponent2 = ex2_u.d[1].ieee.exponent + (n_i >> unsafe);
+      if (ex2_u.d[1].ieee.exponent == 0)
+	/* assert ((ex2_u.d[1].ieee.mantissa0|ex2_u.d[1].ieee.mantissa1) == 0) */;
       else if (exponent2 > 0)
-	ex2_u.ieee.exponent2 = exponent2;
+	ex2_u.d[1].ieee.exponent = exponent2;
       else if (exponent2 <= -54)
 	{
-	  ex2_u.ieee.exponent2 = 0;
-	  ex2_u.ieee.mantissa2 = 0;
-	  ex2_u.ieee.mantissa3 = 0;
+	  ex2_u.d[1].ieee.exponent = 0;
+	  ex2_u.d[1].ieee.mantissa0 = 0;
+	  ex2_u.d[1].ieee.mantissa1 = 0;
 	}
       else
 	{
 	  static const double
 	    two54 = 1.80143985094819840000e+16, /* 4350000000000000 */
 	    twom54 = 5.55111512312578270212e-17; /* 3C90000000000000 */
-	  ex2_u.dd[1] *= two54;
-	  ex2_u.ieee.exponent2 += n_i >> unsafe;
-	  ex2_u.dd[1] *= twom54;
+	  ex2_u.d[1].d *= two54;
+	  ex2_u.d[1].ieee.exponent += n_i >> unsafe;
+	  ex2_u.d[1].d *= twom54;
 	}
 
       /* Compute scale = 2^n_1.  */
-      scale_u.d = 1.0L;
-      scale_u.ieee.exponent += n_i - (n_i >> unsafe);
+      scale_u.ld = 1.0L;
+      scale_u.d[0].ieee.exponent += n_i - (n_i >> unsafe);
 
       /* Approximate e^x2 - 1, using a seventh-degree polynomial,
 	 with maximum error in [-2^-16-2^-53,2^-16+2^-53]
@@ -204,7 +204,7 @@ __ieee754_expl (long double x)
       /* Return result.  */
       fesetenv (&oldenv);
 
-      result = x22 * ex2_u.d + ex2_u.d;
+      result = x22 * ex2_u.ld + ex2_u.ld;
 
       /* Now we can test whether the result is ultimate or if we are unsure.
 	 In the later case we should probably call a mpn based routine to give
@@ -238,7 +238,7 @@ __ieee754_expl (long double x)
       if (!unsafe)
 	return result;
       else
-	return result * scale_u.d;
+	return result * scale_u.ld;
     }
   /* Exceptional cases:  */
   else if (isless (x, himark))
diff --git a/sysdeps/ieee754/ldbl-128ibm/e_rem_pio2l.c b/sysdeps/ieee754/ldbl-128ibm/e_rem_pio2l.c
index 6a72d6a..8885def 100644
--- a/sysdeps/ieee754/ldbl-128ibm/e_rem_pio2l.c
+++ b/sysdeps/ieee754/ldbl-128ibm/e_rem_pio2l.c
@@ -243,7 +243,7 @@ int32_t __ieee754_rem_pio2l(long double x, long double *y)
      We split the 113 bits of the mantissa into 5 24bit integers
      stored in a double array.  */
   /* Make the IBM extended format 105 bit mantissa look like the ieee854 112
-     bit mantissa so the next operatation will give the correct result.  */
+     bit mantissa so the next operation will give the correct result.  */
   ldbl_extract_mantissa (&ixd, &lxd, &exp, x);
   exp = exp - 23;
   /* This is faster than doing this in floating point, because we
diff --git a/sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c b/sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c
index 00e44b8..e46fde7 100644
--- a/sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c
+++ b/sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c
@@ -36,34 +36,44 @@ __mpn_extract_long_double (mp_ptr res_ptr, mp_size_t size,
   union ibm_extended_long_double u;
   unsigned long long hi, lo;
   int ediff;
-  u.d = value;
 
-  *is_neg = u.ieee.negative;
-  *expt = (int) u.ieee.exponent - IBM_EXTENDED_LONG_DOUBLE_BIAS;
+  u.ld = value;
 
-  lo = ((long long) u.ieee.mantissa2 << 32) | u.ieee.mantissa3;
-  hi = ((long long) u.ieee.mantissa0 << 32) | u.ieee.mantissa1;
-  /* If the lower double is not a denomal or zero then set the hidden
+  *is_neg = u.d[0].ieee.negative;
+  *expt = (int) u.d[0].ieee.exponent - IEEE754_DOUBLE_BIAS;
+
+  lo = ((long long) u.d[1].ieee.mantissa0 << 32) | u.d[1].ieee.mantissa1;
+  hi = ((long long) u.d[0].ieee.mantissa0 << 32) | u.d[0].ieee.mantissa1;
+
+  /* If the lower double is not a denormal or zero then set the hidden
      53rd bit.  */
-  if (u.ieee.exponent2 > 0)
-    {
-      lo |= 1LL << 52;
+  if (u.d[1].ieee.exponent != 0)
+    lo |= 1ULL << 52;
+  else
+    lo = lo << 1;
 
-      /* The lower double is normalized separately from the upper.  We may
-	 need to adjust the lower manitissa to reflect this.  */
-      ediff = u.ieee.exponent - u.ieee.exponent2;
-      if (ediff > 53)
-	lo = lo >> (ediff-53);
+  /* The lower double is normalized separately from the upper.  We may
+     need to adjust the lower mantissa to reflect this.  */
+  ediff = u.d[0].ieee.exponent - u.d[1].ieee.exponent - 53;
+  if (ediff > 0)
+    {
+      if (ediff < 64)
+	lo = lo >> ediff;
+      else
+	lo = 0;
     }
+  else if (ediff < 0)
+    lo = lo << -ediff;
+
   /* The high double may be rounded and the low double reflects the
      difference between the long double and the rounded high double
      value.  This is indicated by a differnce between the signs of the
      high and low doubles.  */
-  if ((u.ieee.negative != u.ieee.negative2)
-      && ((u.ieee.exponent2 != 0) && (lo != 0L)))
+  if (u.d[0].ieee.negative != u.d[1].ieee.negative
+      && lo != 0)
     {
       lo = (1ULL << 53) - lo;
-      if (hi == 0LL)
+      if (hi == 0)
 	{
 	  /* we have a borrow from the hidden bit, so shift left 1.  */
 	  hi = 0x0ffffffffffffeLL | (lo >> 51);
@@ -92,7 +102,7 @@ __mpn_extract_long_double (mp_ptr res_ptr, mp_size_t size,
 #define NUM_LEADING_ZEROS (BITS_PER_MP_LIMB \
 			   - (LDBL_MANT_DIG - ((N - 1) * BITS_PER_MP_LIMB)))
 
-  if (u.ieee.exponent == 0)
+  if (u.d[0].ieee.exponent == 0)
     {
       /* A biased exponent of zero is a special case.
 	 Either it is a zero or it is a denormal number.  */
diff --git a/sysdeps/ieee754/ldbl-128ibm/math_ldbl.h b/sysdeps/ieee754/ldbl-128ibm/math_ldbl.h
index 046293e..69bb2e6 100644
--- a/sysdeps/ieee754/ldbl-128ibm/math_ldbl.h
+++ b/sysdeps/ieee754/ldbl-128ibm/math_ldbl.h
@@ -6,85 +6,131 @@
 #include <ieee754.h>
 #include <stdint.h>
 
+/* To suit our callers we return *hi64 and *lo64 as if they came from
+   an ieee854 112 bit mantissa, that is, 48 bits in *hi64 (plus one
+   implicit bit) and 64 bits in *lo64.  */
+
 static inline void
-ldbl_extract_mantissa (int64_t *hi64, uint64_t *lo64, int *exp, long double x)
+ldbl_extract_mantissa (uint64_t *hi64, uint64_t *lo64, int *exp, long double x)
 {
   /* We have 105 bits of mantissa plus one implicit digit.  Since
      106 bits are representable we use the first implicit digit for
      the number before the decimal point and the second implicit bit
      as bit 53 of the mantissa.  */
   uint64_t hi, lo;
-  int ediff;
-  union ibm_extended_long_double eldbl;
-  eldbl.d = x;
-  *exp = eldbl.ieee.exponent - IBM_EXTENDED_LONG_DOUBLE_BIAS;
-
-  lo = ((int64_t)eldbl.ieee.mantissa2 << 32) | eldbl.ieee.mantissa3;
-  hi = ((int64_t)eldbl.ieee.mantissa0 << 32) | eldbl.ieee.mantissa1;
-  /* If the lower double is not a denomal or zero then set the hidden
-     53rd bit.  */
-  if (eldbl.ieee.exponent2 > 0x001)
-    {
-      lo |= (1ULL << 52);
-      lo = lo << 7; /* pre-shift lo to match ieee854.  */
-      /* The lower double is normalized separately from the upper.  We
-	 may need to adjust the lower manitissa to reflect this.  */
-      ediff = eldbl.ieee.exponent - eldbl.ieee.exponent2;
-      if (ediff > 53)
-	lo = lo >> (ediff-53);
-      hi |= (1ULL << 52);
-    }
+  union ibm_extended_long_double u;
 
-  if ((eldbl.ieee.negative != eldbl.ieee.negative2)
-      && ((eldbl.ieee.exponent2 != 0) && (lo != 0LL)))
+  u.ld = x;
+  *exp = u.d[0].ieee.exponent - IEEE754_DOUBLE_BIAS;
+
+  lo = ((uint64_t) u.d[1].ieee.mantissa0 << 32) | u.d[1].ieee.mantissa1;
+  hi = ((uint64_t) u.d[0].ieee.mantissa0 << 32) | u.d[0].ieee.mantissa1;
+
+  if (u.d[0].ieee.exponent != 0)
     {
-      hi--;
-      lo = (1ULL << 60) - lo;
-      if (hi < (1ULL << 52))
+      int ediff;
+
+      /* If not a denormal or zero then we have an implicit 53rd bit.  */
+      hi |= (uint64_t) 1 << 52;
+
+      if (u.d[1].ieee.exponent != 0)
+	lo |= (uint64_t) 1 << 52;
+      else
+	/* A denormal is to be interpreted as having a biased exponent
+	   of 1.  */
+	lo = lo << 1;
+
+      /* We are going to shift 4 bits out of hi later, because we only
+	 want 48 bits in *hi64.  That means we want 60 bits in lo, but
+	 we currently only have 53.  Shift the value up.  */
+      lo = lo << 7;
+
+      /* The lower double is normalized separately from the upper.
+	 We may need to adjust the lower mantissa to reflect this.
+	 The difference between the exponents can be larger than 53
+	 when the low double is much less than 1ULP of the upper
+	 (in which case there are significant bits, all 0's or all
+	 1's, between the two significands).  The difference between
+	 the exponents can be less than 53 when the upper double
+	 exponent is nearing its minimum value (in which case the low
+	 double is denormal ie. has an exponent of zero).  */
+      ediff = u.d[0].ieee.exponent - u.d[1].ieee.exponent - 53;
+      if (ediff > 0)
+	{
+	  if (ediff < 64)
+	    lo = lo >> ediff;
+	  else
+	    lo = 0;
+	}
+      else if (ediff < 0)
+	lo = lo << -ediff;
+
+      if (u.d[0].ieee.negative != u.d[1].ieee.negative
+	  && lo != 0)
 	{
-	  /* we have a borrow from the hidden bit, so shift left 1.  */
-	  hi = (hi << 1) | (lo >> 59);
-	  lo = 0xfffffffffffffffLL & (lo << 1);
-	  *exp = *exp - 1;
+	  hi--;
+	  lo = ((uint64_t) 1 << 60) - lo;
+	  if (hi < (uint64_t) 1 << 52)
+	    {
+	      /* We have a borrow from the hidden bit, so shift left 1.  */
+	      hi = (hi << 1) | (lo >> 59);
+	      lo = (((uint64_t) 1 << 60) - 1) & (lo << 1);
+	      *exp = *exp - 1;
+	    }
 	}
     }
+  else
+    /* If the larger magnitude double is denormal then the smaller
+       one must be zero.  */
+    hi = hi << 1;
+
   *lo64 = (hi << 60) | lo;
   *hi64 = hi >> 4;
 }
 
 static inline long double
-ldbl_insert_mantissa (int sign, int exp, int64_t hi64, u_int64_t lo64)
+ldbl_insert_mantissa (int sign, int exp, uint64_t hi64, uint64_t lo64)
 {
   union ibm_extended_long_double u;
-  unsigned long hidden2, lzcount;
-  unsigned long long hi, lo;
+  int expnt2;
+  uint64_t hi, lo;
+
+  u.d[0].ieee.negative = sign;
+  u.d[1].ieee.negative = sign;
+  u.d[0].ieee.exponent = exp + IEEE754_DOUBLE_BIAS;
+  u.d[1].ieee.exponent = 0;
+  expnt2 = exp - 53 + IEEE754_DOUBLE_BIAS;
 
-  u.ieee.negative = sign;
-  u.ieee.negative2 = sign;
-  u.ieee.exponent = exp + IBM_EXTENDED_LONG_DOUBLE_BIAS;
-  u.ieee.exponent2 = exp-53 + IBM_EXTENDED_LONG_DOUBLE_BIAS;
   /* Expect 113 bits (112 bits + hidden) right justified in two longs.
      The low order 53 bits (52 + hidden) go into the lower double */
-  lo = (lo64 >> 7)& ((1ULL << 53) - 1);
-  hidden2 = (lo64 >> 59) &  1ULL;
+  lo = (lo64 >> 7) & (((uint64_t) 1 << 53) - 1);
   /* The high order 53 bits (52 + hidden) go into the upper double */
-  hi = (lo64 >> 60) & ((1ULL << 11) - 1);
-  hi |= (hi64 << 4);
+  hi = lo64 >> 60;
+  hi |= hi64 << 4;
 
-  if (lo != 0LL)
+  if (lo != 0)
     {
-      /* hidden2 bit of low double controls rounding of the high double.
-	 If hidden2 is '1' then round up hi and adjust lo (2nd mantissa)
+      int lzcount;
+
+      /* hidden bit of low double controls rounding of the high double.
+	 If hidden is '1' and either the explicit mantissa is non-zero
+	 or hi is odd, then round up hi and adjust lo (2nd mantissa)
 	 plus change the sign of the low double to compensate.  */
-      if (hidden2)
+      if ((lo & ((uint64_t) 1 << 52)) != 0
+	  && ((hi & 1) != 0 || (lo & (((uint64_t) 1 << 52) - 1)) != 0))
 	{
 	  hi++;
-	  u.ieee.negative2 = !sign;
-	  lo = (1ULL << 53) - lo;
+	  if ((hi & ((uint64_t) 1 << 53)) != 0)
+	    {
+	      hi = hi >> 1;
+	      u.d[0].ieee.exponent++;
+	    }
+	  u.d[1].ieee.negative = !sign;
+	  lo = ((uint64_t) 1 << 53) - lo;
 	}
-      /* The hidden bit of the lo mantissa is zero so we need to
-	 normalize the it for the low double.  Shift it left until the
-	 hidden bit is '1' then adjust the 2nd exponent accordingly.  */
+
+      /* Normalize the low double.  Shift the mantissa left until
+	 the hidden bit is '1' and adjust the exponent accordingly.  */
 
       if (sizeof (lo) == sizeof (long))
 	lzcount = __builtin_clzl (lo);
@@ -92,35 +138,31 @@ ldbl_insert_mantissa (int sign, int exp, int64_t hi64, u_int64_t lo64)
 	lzcount = __builtin_clzl ((long) (lo >> 32));
       else
 	lzcount = __builtin_clzl ((long) lo) + 32;
-      lzcount = lzcount - 11;
-      if (lzcount > 0)
+      lzcount = lzcount - (64 - 53);
+      lo <<= lzcount;
+      expnt2 -= lzcount;
+
+      if (expnt2 >= 1)
+	/* Not denormal.  */
+	u.d[1].ieee.exponent = expnt2;
+      else
 	{
-	  int expnt2 = u.ieee.exponent2 - lzcount;
-	  if (expnt2 >= 1)
-	    {
-	      /* Not denormal.  Normalize and set low exponent.  */
-	      lo = lo << lzcount;
-	      u.ieee.exponent2 = expnt2;
-	    }
+	  /* Is denormal.  Note that biased exponent of 0 is treated
+	     as if it was 1, hence the extra shift.  */
+	  if (expnt2 > -53)
+	    lo >>= 1 - expnt2;
 	  else
-	    {
-	      /* Is denormal.  */
-	      lo = lo << (lzcount + expnt2);
-	      u.ieee.exponent2 = 0;
-	    }
+	    lo = 0;
 	}
     }
   else
-    {
-      u.ieee.negative2 = 0;
-      u.ieee.exponent2 = 0;
-    }
+    u.d[1].ieee.negative = 0;
 
-  u.ieee.mantissa3 = lo & ((1ULL << 32) - 1);
-  u.ieee.mantissa2 = (lo >> 32) & ((1ULL << 20) - 1);
-  u.ieee.mantissa1 = hi & ((1ULL << 32) - 1);
-  u.ieee.mantissa0 = (hi >> 32) & ((1ULL << 20) - 1);
-  return u.d;
+  u.d[1].ieee.mantissa1 = lo;
+  u.d[1].ieee.mantissa0 = lo >> 32;
+  u.d[0].ieee.mantissa1 = hi;
+  u.d[0].ieee.mantissa0 = hi >> 32;
+  return u.ld;
 }
 
 /* Handy utility functions to pack/unpack/cononicalize and find the nearbyint
@@ -129,18 +171,18 @@ static inline long double
 default_ldbl_pack (double a, double aa)
 {
   union ibm_extended_long_double u;
-  u.dd[0] = a;
-  u.dd[1] = aa;
-  return u.d;
+  u.d[0].d = a;
+  u.d[1].d = aa;
+  return u.ld;
 }
 
 static inline void
 default_ldbl_unpack (long double l, double *a, double *aa)
 {
   union ibm_extended_long_double u;
-  u.d = l;
-  *a = u.dd[0];
-  *aa = u.dd[1];
+  u.ld = l;
+  *a = u.d[0].d;
+  *aa = u.d[1].d;
 }
 
 #ifndef ldbl_pack
@@ -163,13 +205,13 @@ ldbl_canonicalize (double *a, double *aa)
   *aa = xl;
 }
 
-/* Simple inline nearbyint (double) function .
+/* Simple inline nearbyint (double) function.
    Only works in the default rounding mode
    but is useful in long double rounding functions.  */
 static inline double
 ldbl_nearbyint (double a)
 {
-  double two52 = 0x10000000000000LL;
+  double two52 = (uint64_t) 1 << 52;
 
   if (__builtin_expect ((__builtin_fabs (a) < two52), 1))
     {
diff --git a/sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c b/sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c
index 3df42c5..c96852d 100644
--- a/sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c
+++ b/sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c
@@ -33,11 +33,11 @@ __mpn_construct_long_double (mp_srcptr frac_ptr, int expt, int sign)
   unsigned long long hi, lo;
   int exponent2;
 
-  u.ieee.negative = sign;
-  u.ieee.negative2 = sign;
-  u.ieee.exponent = expt + IBM_EXTENDED_LONG_DOUBLE_BIAS;
-  u.ieee.exponent2 = 0;
-  exponent2 = expt - 53 + IBM_EXTENDED_LONG_DOUBLE_BIAS;
+  u.d[0].ieee.negative = sign;
+  u.d[1].ieee.negative = sign;
+  u.d[0].ieee.exponent = expt + IEEE754_DOUBLE_BIAS;
+  u.d[1].ieee.exponent = 0;
+  exponent2 = expt - 53 + IEEE754_DOUBLE_BIAS;
 
 #if BITS_PER_MP_LIMB == 32
   /* The low order 53 bits (52 + hidden) go into the lower double */
@@ -69,19 +69,19 @@ __mpn_construct_long_double (mp_srcptr frac_ptr, int expt, int sign)
       else
 	lzcount = __builtin_clzl ((long) val) + 32;
       if (hi)
-	lzcount = lzcount - 11;
+	lzcount = lzcount - (64 - 53);
       else
-	lzcount = lzcount + 42;
+	lzcount = lzcount + 53 - (64 - 53);
 
-      if (lzcount > u.ieee.exponent)
+      if (lzcount > u.d[0].ieee.exponent)
 	{
-	  lzcount = u.ieee.exponent;
-	  u.ieee.exponent = 0;
+	  lzcount = u.d[0].ieee.exponent;
+	  u.d[0].ieee.exponent = 0;
 	  exponent2 -= lzcount;
 	}
       else
 	{
-	  u.ieee.exponent -= (lzcount - 1);
+	  u.d[0].ieee.exponent -= (lzcount - 1);
 	  exponent2 -= (lzcount - 1);
 	}
 
@@ -97,29 +97,27 @@ __mpn_construct_long_double (mp_srcptr frac_ptr, int expt, int sign)
 	}
     }
 
-  if (lo != 0L)
+  if (lo != 0)
     {
-      /* hidden2 bit of low double controls rounding of the high double.
-	 If hidden2 is '1' and either the explicit mantissa is non-zero
+      /* hidden bit of low double controls rounding of the high double.
+	 If hidden is '1' and either the explicit mantissa is non-zero
 	 or hi is odd, then round up hi and adjust lo (2nd mantissa)
 	 plus change the sign of the low double to compensate.  */
       if ((lo & (1LL << 52)) != 0
-	  && ((hi & 1) != 0 || (lo & ((1LL << 52) - 1))))
+	  && ((hi & 1) != 0 || (lo & ((1LL << 52) - 1)) != 0))
 	{
 	  hi++;
-	  if ((hi & ((1LL << 52) - 1)) == 0)
+	  if ((hi & (1LL << 53)) != 0)
 	    {
-	      if ((hi & (1LL << 53)) != 0)
-		hi -= 1LL << 52;
-	      u.ieee.exponent++;
+	      hi >>= 1;
+	      u.d[0].ieee.exponent++;
 	    }
-	  u.ieee.negative2 = !sign;
+	  u.d[1].ieee.negative = !sign;
 	  lo = (1LL << 53) - lo;
 	}
 
-      /* The hidden bit of the lo mantissa is zero so we need to normalize
-	 it for the low double.  Shift it left until the hidden bit is '1'
-	 then adjust the 2nd exponent accordingly.  */
+      /* Normalize the low double.  Shift the mantissa left until
+	 the hidden bit is '1' and adjust the exponent accordingly.  */
 
       if (sizeof (lo) == sizeof (long))
 	lzcount = __builtin_clzl (lo);
@@ -127,24 +125,24 @@ __mpn_construct_long_double (mp_srcptr frac_ptr, int expt, int sign)
 	lzcount = __builtin_clzl ((long) (lo >> 32));
       else
 	lzcount = __builtin_clzl ((long) lo) + 32;
-      lzcount = lzcount - 11;
-      if (lzcount > 0)
-	{
-	  lo = lo << lzcount;
-	  exponent2 = exponent2 - lzcount;
-	}
+      lzcount = lzcount - (64 - 53);
+      lo <<= lzcount;
+      exponent2 -= lzcount;
+
       if (exponent2 > 0)
-	u.ieee.exponent2 = exponent2;
-      else
+	u.d[1].ieee.exponent = exponent2;
+      else if (exponent2 > -53)
 	lo >>= 1 - exponent2;
+      else
+	lo = 0;
     }
   else
-    u.ieee.negative2 = 0;
+    u.d[1].ieee.negative = 0;
 
-  u.ieee.mantissa3 = lo & 0xffffffffLL;
-  u.ieee.mantissa2 = (lo >> 32) & 0xfffff;
-  u.ieee.mantissa1 = hi & 0xffffffffLL;
-  u.ieee.mantissa0 = (hi >> 32) & ((1LL << (LDBL_MANT_DIG - 86)) - 1);
+  u.d[1].ieee.mantissa1 = lo;
+  u.d[1].ieee.mantissa0 = lo >> 32;
+  u.d[0].ieee.mantissa1 = hi;
+  u.d[0].ieee.mantissa0 = hi >> 32;
 
-  return u.d;
+  return u.ld;
 }
diff --git a/sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c b/sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c
index bfcd110..82f0197 100644
--- a/sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c
+++ b/sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c
@@ -34,11 +34,11 @@ __nearbyintl (long double x)
   fenv_t env;
   static const long double TWO52 = 4503599627370496.0L;
   union ibm_extended_long_double u;
-  u.d = x;
+  u.ld = x;
 
-  if (fabs (u.dd[0]) < TWO52)
+  if (fabs (u.d[0].d) < TWO52)
     {
-      double high = u.dd[0];
+      double high = u.d[0].d;
       feholdexcept (&env);
       if (high > 0.0)
 	{
@@ -52,13 +52,13 @@ __nearbyintl (long double x)
 	  high += TWO52;
           if (high == 0.0) high = -0.0;
 	}
-      u.dd[0] = high;
-      u.dd[1] = 0.0;
+      u.d[0].d = high;
+      u.d[1].d = 0.0;
-      math_force_eval (u.dd[0]);
-      math_force_eval (u.dd[1]);
+      math_force_eval (u.d[0].d);
+      math_force_eval (u.d[1].d);
       fesetenv (&env);
     }
-  else if (fabs (u.dd[1]) < TWO52 && u.dd[1] != 0.0)
+  else if (fabs (u.d[1].d) < TWO52 && u.d[1].d != 0.0)
     {
       double high, low, tau;
       /* In this case we have to round the low double and handle any
@@ -67,51 +67,51 @@ __nearbyintl (long double x)
          may already be rounded and the low double may have the
          opposite sign to compensate.  */
       feholdexcept (&env);
-      if (u.dd[0] > 0.0)
+      if (u.d[0].d > 0.0)
 	{
-	  if (u.dd[1] > 0.0)
+	  if (u.d[1].d > 0.0)
 	    {
 	      /* If the high/low doubles are the same sign then simply
 	         round the low double.  */
-	      high = u.dd[0];
-	      low = u.dd[1];
+	      high = u.d[0].d;
+	      low = u.d[1].d;
 	    }
-	  else if (u.dd[1] < 0.0)
+	  else if (u.d[1].d < 0.0)
 	    {
 	      /* Else the high double is pre rounded and we need to
 	         adjust for that.  */
 
-	      tau = __nextafter (u.dd[0], 0.0);
-	      tau = (u.dd[0] - tau) * 2.0;
-	      high = u.dd[0] - tau;
-	      low = u.dd[1] + tau;
+	      tau = __nextafter (u.d[0].d, 0.0);
+	      tau = (u.d[0].d - tau) * 2.0;
+	      high = u.d[0].d - tau;
+	      low = u.d[1].d + tau;
 	    }
 	  low += TWO52;
 	  low -= TWO52;
 	}
-      else if (u.dd[0] < 0.0)
+      else if (u.d[0].d < 0.0)
 	{
-	  if (u.dd[1] < 0.0)
+	  if (u.d[1].d < 0.0)
 	    {
 	      /* If the high/low doubles are the same sign then simply
 	         round the low double.  */
-	      high = u.dd[0];
-	      low = u.dd[1];
+	      high = u.d[0].d;
+	      low = u.d[1].d;
 	    }
-	  else if (u.dd[1] > 0.0)
+	  else if (u.d[1].d > 0.0)
 	    {
 	      /* Else the high double is pre rounded and we need to
 	         adjust for that.  */
-	      tau = __nextafter (u.dd[0], 0.0);
-	      tau = (u.dd[0] - tau) * 2.0;
-	      high = u.dd[0] - tau;
-	      low = u.dd[1] + tau;
+	      tau = __nextafter (u.d[0].d, 0.0);
+	      tau = (u.d[0].d - tau) * 2.0;
+	      high = u.d[0].d - tau;
+	      low = u.d[1].d + tau;
 	    }
 	  low = TWO52 - low;
 	  low = -(low - TWO52);
 	}
-      u.dd[0] = high + low;
-      u.dd[1] = high - u.dd[0] + low;
+      u.d[0].d = high + low;
+      u.d[1].d = high - u.d[0].d + low;
-      math_force_eval (u.dd[0]);
-      math_force_eval (u.dd[1]);
+      math_force_eval (u.d[0].d);
+      math_force_eval (u.d[1].d);
       fesetenv (&env);
diff --git a/sysdeps/ieee754/ldbl-128ibm/strtold_l.c b/sysdeps/ieee754/ldbl-128ibm/strtold_l.c
index 93415f0..bccc0ce 100644
--- a/sysdeps/ieee754/ldbl-128ibm/strtold_l.c
+++ b/sysdeps/ieee754/ldbl-128ibm/strtold_l.c
@@ -43,12 +43,12 @@ libc_hidden_proto (STRTOF)
 #define FLOAT_HUGE_VAL	HUGE_VALL
 # define SET_MANTISSA(flt, mant) \
   do { union ibm_extended_long_double u;				      \
-       u.d = (flt);							      \
+       u.ld = (flt);							      \
        if ((mant & 0xfffffffffffffULL) == 0)				      \
 	 mant = 0x8000000000000ULL;					      \
-       u.ieee.mantissa0 = ((mant) >> 32) & 0xfffff;			      \
-       u.ieee.mantissa1 = (mant) & 0xffffffff;				      \
-       (flt) = u.d;							      \
+       u.d[0].ieee.mantissa0 = ((mant) >> 32) & 0xfffff;		      \
+       u.d[0].ieee.mantissa1 = (mant) & 0xffffffff;			      \
+       (flt) = u.ld;							      \
   } while (0)
 
 #include <strtod_l.c>
diff --git a/sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c b/sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c
index ed0d4a5..06dcf02 100644
--- a/sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c
+++ b/sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c
@@ -89,23 +89,23 @@ __x2y2m1l (long double x, long double y)
   double vals[12];
   SET_RESTORE_ROUND (FE_TONEAREST);
   union ibm_extended_long_double xu, yu;
-  xu.d = x;
-  yu.d = y;
-  if (fabs (xu.dd[1]) < 0x1p-500)
-    xu.dd[1] = 0.0;
-  if (fabs (yu.dd[1]) < 0x1p-500)
-    yu.dd[1] = 0.0;
-  mul_split (&vals[1], &vals[0], xu.dd[0], xu.dd[0]);
-  mul_split (&vals[3], &vals[2], xu.dd[0], xu.dd[1]);
+  xu.ld = x;
+  yu.ld = y;
+  if (fabs (xu.d[1].d) < 0x1p-500)
+    xu.d[1].d = 0.0;
+  if (fabs (yu.d[1].d) < 0x1p-500)
+    yu.d[1].d = 0.0;
+  mul_split (&vals[1], &vals[0], xu.d[0].d, xu.d[0].d);
+  mul_split (&vals[3], &vals[2], xu.d[0].d, xu.d[1].d);
   vals[2] *= 2.0;
   vals[3] *= 2.0;
-  mul_split (&vals[5], &vals[4], xu.dd[1], xu.dd[1]);
-  mul_split (&vals[7], &vals[6], yu.dd[0], yu.dd[0]);
-  mul_split (&vals[9], &vals[8], yu.dd[0], yu.dd[1]);
+  mul_split (&vals[5], &vals[4], xu.d[1].d, xu.d[1].d);
+  mul_split (&vals[7], &vals[6], yu.d[0].d, yu.d[0].d);
+  mul_split (&vals[9], &vals[8], yu.d[0].d, yu.d[1].d);
   vals[8] *= 2.0;
   vals[9] *= 2.0;
-  mul_split (&vals[11], &vals[10], yu.dd[1], yu.dd[1]);
-  if (xu.dd[0] >= 0.75)
+  mul_split (&vals[11], &vals[10], yu.d[1].d, yu.d[1].d);
+  if (xu.d[0].d >= 0.75)
     vals[1] -= 1.0;
   else
     {

-- 
Alan Modra
Australia Development Lab, IBM
Follow-Ups:
- Re: IBM long double fixes
  - From: Andreas Schwab