RFC: .sleb128 and bignums

Richard Sandiford rsandifo@redhat.com
Tue Jan 18 13:27:00 GMT 2005


Nick Clifton <nickc@redhat.com> writes:
> Hi Richard,
>> SO... to finally get to the point ;)  I guess there are three options:
>>   (1) Declare the treatment of 0xffffffff to be correct but fix the
>>       bignum problems (i.e. apply something like the first patch but
>>       not the second).
>>   (2) Treat values as unsigned if they were written that way
>> (i.e. apply
>>       both patches, or variations of them).
>>   (3) Get rid of sleb128 bignum support altogether.  Treat everything
>>       as 32-bit if !BFD64.
>> (1) seems better than the status quo but the signed/unsigned thing is a
>> little odd.  I'm uneasy about (2) because of the arithmetic problem
>> described above.
>> According to the Cygnus repository, the bugs fixed by the first patch
>> have been around since the code was first added in Aug 1997.  That suggests
>> that bignum .sleb128s have never worked, yet as far as I know, no-one has
>> ever complained before.  Perhaps (3) really is a viable option?
>
> Well we definitely ought to do (1).  Just because a problem has not been
> reported before there is no reason not to fix it now.
>
> Personally I think that we ought to go with (2), but possibly arrange
> for a warning message to be generated and/or allow a command line switch
> or pseudo op to alter the new behaviour of GAS.

Thanks for the feedback.  Since posting the message, I've been coming
around to the idea that (2) is best as well.  Not sure about the warning
or command-line flag though.  Would it be OK to leave that for now, and
maybe add it later if anyone feels the need?

Alan's reply allays some of my concerns about the inconsistent treatment
of 32-bit vs. 64-bit numbers in general arithmetic, but I think we do
need to treat positive and negative constants consistently.  For example,
if we treat:

        .sleb128        0xffffffff

as a >32-bit number, then I think we need to do the same for:

        .sleb128        -0xffffffff

Handling both <literal> and -<literal> should be enough to keep gcc happy.

If we take that approach for .sleb128, I think we should be consistent
with other directives too.  At the moment:

        .quad           0x89abcdef

produces "efcdab89 00000000" on x86 (as expected), but:

        .quad           -0x89abcdef

produces "11325476 00000000", i.e. a positive 32-bit number.

The patch below implements (b) with the changes described above.
Tested on i686-pc-linux-gnu and mips64-linux-gnu (to check the two
endiannesses in the new .quad test).  OK to install?

Richard


gas/
	* read.c (convert_to_bignum): New function, split out from...
	(emit_expr): ...here.  Handle the case where X_add_number is
	positive and the input value is negative.
	(output_big_sleb128): Fix setting of continuation bit.  Check whether
	the final byte needs to be sign-extended.  Fix size-shrinking loop.
	(emit_leb128_expr): When generating a signed leb128, see whether the
	sign of an O_constant's X_add_number matches the sign of the input
	value.  Use a bignum if not.

gas/testsuite/
	* gas/all/sleb128.[sd]: New test.
	* gas/all/quad.[sd]: New test.
	* gas/all/gas.exp: Run them.

Index: gas/read.c
===================================================================
RCS file: /cvs/src/src/gas/read.c,v
retrieving revision 1.84
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.84 read.c
*** gas/read.c	29 Dec 2004 10:21:58 -0000	1.84
--- gas/read.c	18 Jan 2005 13:22:12 -0000
*************** read_a_source_file (char *name)
*** 1082,1087 ****
--- 1082,1110 ----
  #endif
  }
  
+ /* Convert O_constant expression EXP into the equivalent O_big representation.
+    Take the sign of the number from X_unsigned rather than X_add_number.  */
+ 
+ static void
+ convert_to_bignum (expressionS *exp)
+ {
+   valueT value;
+   unsigned int i;
+ 
+   value = exp->X_add_number;
+   for (i = 0; i < sizeof (exp->X_add_number) / CHARS_PER_LITTLENUM; i++)
+     {
+       generic_bignum[i] = value & LITTLENUM_MASK;
+       value >>= LITTLENUM_NUMBER_OF_BITS;
+     }
+   /* Add a sequence of sign bits if the top bit of X_add_number is not
+      the sign of the original value.  */
+   if ((exp->X_add_number < 0) != !exp->X_unsigned)
+     generic_bignum[i++] = exp->X_unsigned ? 0 : LITTLENUM_MASK;
+   exp->X_op = O_big;
+   exp->X_add_number = i;
+ }
+ 
  /* For most MRI pseudo-ops, the line actually ends at the first
     nonquoted space.  This function looks for that point, stuffs a null
     in, and sets *STOPCP to the character that used to be there, and
*************** emit_expr (expressionS *exp, unsigned in
*** 3541,3562 ****
       pass to md_number_to_chars, handle it as a bignum.  */
    if (op == O_constant && nbytes > sizeof (valueT))
      {
!       valueT val;
!       int gencnt;
! 
!       if (!exp->X_unsigned && exp->X_add_number < 0)
! 	extra_digit = (valueT) -1;
!       val = (valueT) exp->X_add_number;
!       gencnt = 0;
!       do
! 	{
! 	  generic_bignum[gencnt] = val & LITTLENUM_MASK;
! 	  val >>= LITTLENUM_NUMBER_OF_BITS;
! 	  ++gencnt;
! 	}
!       while (val != 0);
!       op = exp->X_op = O_big;
!       exp->X_add_number = gencnt;
      }
  
    if (op == O_constant)
--- 3564,3572 ----
       pass to md_number_to_chars, handle it as a bignum.  */
    if (op == O_constant && nbytes > sizeof (valueT))
      {
!       extra_digit = exp->X_unsigned ? 0 : -1;
!       convert_to_bignum (exp);
!       op = O_big;
      }
  
    if (op == O_constant)
*************** output_big_sleb128 (char *p, LITTLENUM_T
*** 4276,4311 ****
    unsigned byte;
  
    /* Strip leading sign extensions off the bignum.  */
!   while (size > 0 && bignum[size - 1] == (LITTLENUM_TYPE) -1)
      size--;
  
    do
      {
!       if (loaded < 7 && size > 0)
! 	{
! 	  val |= (*bignum << loaded);
! 	  loaded += 8 * CHARS_PER_LITTLENUM;
! 	  size--;
! 	  bignum++;
! 	}
! 
!       byte = val & 0x7f;
!       loaded -= 7;
!       val >>= 7;
  
!       if (size == 0)
  	{
! 	  if ((val == 0 && (byte & 0x40) == 0)
! 	      || (~(val | ~(((valueT) 1 << loaded) - 1)) == 0
! 		  && (byte & 0x40) != 0))
  	    byte |= 0x80;
  	}
  
        if (orig)
! 	*p = byte;
        p++;
      }
-   while (byte & 0x80);
  
    return p - orig;
  }
--- 4286,4333 ----
    unsigned byte;
  
    /* Strip leading sign extensions off the bignum.  */
!   while (size > 1
! 	 && bignum[size - 1] == LITTLENUM_MASK
! 	 && bignum[size - 2] > LITTLENUM_MASK / 2)
      size--;
  
    do
      {
!       /* OR in the next part of the littlenum.  */
!       val |= (*bignum << loaded);
!       loaded += LITTLENUM_NUMBER_OF_BITS;
!       size--;
!       bignum++;
  
!       /* Add bytes until there are less than 7 bits left in VAL
! 	 or until every non-sign bit has been written.  */
!       do
  	{
! 	  byte = val & 0x7f;
! 	  loaded -= 7;
! 	  val >>= 7;
! 	  if (size > 0
! 	      || val != ((byte & 0x40) == 0 ? 0 : ((valueT) 1 << loaded) - 1))
  	    byte |= 0x80;
+ 
+ 	  if (orig)
+ 	    *p = byte;
+ 	  p++;
  	}
+       while ((byte & 0x80) != 0 && loaded >= 7);
+     }
+   while (size > 0);
  
+   /* Mop up any left-over bits (of which there will be less than 7).  */
+   if ((byte & 0x80) != 0)
+     {
+       /* Sign-extend VAL.  */
+       if (val & (1 << (loaded - 1)))
+ 	val |= ~0 << loaded;
        if (orig)
! 	*p = val & 0x7f;
        p++;
      }
  
    return p - orig;
  }
*************** emit_leb128_expr (expressionS *exp, int 
*** 4384,4389 ****
--- 4406,4421 ----
        as_warn (_("register value used as expression"));
        op = O_constant;
      }
+   else if (op == O_constant
+ 	   && sign
+ 	   && (exp->X_add_number < 0) != !exp->X_unsigned)
+     {
+       /* We're outputting a signed leb128 and the sign of X_add_number
+ 	 doesn't reflect the sign of the original value.  Convert EXP
+ 	 to a correctly-extended bignum instead.  */
+       convert_to_bignum (exp);
+       op = O_big;
+     }
  
    /* Let check_eh_frame know that data is being emitted.  nbytes == -1 is
       a signal that this is leb128 data.  It shouldn't optimize this away.  */
Index: gas/testsuite/gas/all/gas.exp
===================================================================
RCS file: /cvs/src/src/gas/testsuite/gas/all/gas.exp,v
retrieving revision 1.23
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.23 gas.exp
*** gas/testsuite/gas/all/gas.exp	22 Nov 2004 13:00:23 -0000	1.23
--- gas/testsuite/gas/all/gas.exp	18 Jan 2005 13:22:12 -0000
*************** if {   [istarget "i*86-*-*pe*"] \
*** 195,200 ****
--- 195,203 ----
    gas_test "fastcall.s" ""   "" "fastcall labels"
  }
  
+ run_dump_test sleb128
+ run_dump_test quad
+ 
  load_lib gas-dg.exp
  dg-init
  dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/err-*.s $srcdir/$subdir/warn-*.s]] "" ""
diff -c /dev/null gas/testsuite/gas/all/quad.d
*** /dev/null	2004-12-03 11:50:38.225654048 +0000
--- gas/testsuite/gas/all/quad.d	2005-01-18 13:21:29.561772137 +0000
***************
*** 0 ****
--- 1,12 ----
+ #objdump : -s -j .data
+ #name : .quad tests
+ 
+ .*: .*
+ 
+ Contents of section .data:
+  0000 (00000000 76543210 00000000 80000000|10325476 00000000 00000080 00000000) .*
+  0010 (00000000 87654321 00000000 ffffffff|21436587 00000000 ffffffff 00000000) .*
+  0020 (ffffffff 89abcdf0 ffffffff 80000000|f0cdab89 ffffffff 00000080 ffffffff) .*
+  0030 (ffffffff 789abcdf ffffffff 00000001|dfbc9a78 ffffffff 01000000 ffffffff) .*
+  0040 (01234567 89abcdef fedcba98 76543211|efcdab89 67452301 11325476 98badcfe) .*
+ #pass
diff -c /dev/null gas/testsuite/gas/all/quad.s
*** /dev/null	2004-12-03 11:50:38.225654048 +0000
--- gas/testsuite/gas/all/quad.s	2005-01-18 12:45:03.210086611 +0000
***************
*** 0 ****
--- 1,12 ----
+ 	.data
+ 	.quad	0x76543210
+ 	.quad	0x80000000
+ 	.quad	0x87654321
+ 	.quad	0xffffffff
+ 	.quad	-0x76543210
+ 	.quad	-0x80000000
+ 	.quad	-0x87654321
+ 	.quad	-0xffffffff
+ 
+ 	.quad	0x123456789abcdef
+ 	.quad	-0x123456789abcdef
diff -c /dev/null gas/testsuite/gas/all/sleb128.d
*** /dev/null	2004-12-03 11:50:38.225654048 +0000
--- gas/testsuite/gas/all/sleb128.d	2005-01-18 12:45:30.506962483 +0000
***************
*** 0 ****
--- 1,57 ----
+ #objdump : -s -j .data
+ #name : .sleb128 tests
+ 
+ .*: .*
+ 
+ Contents of section .data:
+ #
+ # 0x76543210   :           000_0111 0110_010 1_0100_00 11_0010_0 001_0000
+ # 0x80000000   :           000_1000 0000_000 0_0000_00 00_0000_0 000_0000
+ # 0x87654321   :           000_1000 0111_011 0_0101_01 00_0011_0 010_0001
+ # 0xffffffff   :           ..................................... 111_1111
+ #
+  0000 90e4d0b2 07808080 8008a186 95bb08ff .*
+ #
+ # 0xffffffff   :           000_1111 1111_111 1_1111_11 11_1111_1 ........
+ # -0x76543210  :           111_1000 1001_101 0_1011_11 00_1101_1 111_0000
+ # -0x80000000  :           111_1000 0000_000 0_0000_00 00_0000_0 000_0000
+ # -0x87654321  :           ........................... 11_1100_1 101_1111
+ #
+  0010 ffffff0f f09bafcd 78808080 8078dff9 .*
+ #
+ # -0x87654321  :           111_0111 1000_100 1_1010_10 ..................
+ # -0xffffffff  :           111_0000 0000_000 0_0000_00 00_0000_0 000_0001
+ #    789abcdef :           111_1000 1001_101 0_1011_11 00_1101_1 110_1111
+ # 0x123456     :           ........ 0010_001 1_0100_01 01_0110_0
+ #
+  0020 eac47781 80808070 ef9bafcd f8acd191 .*
+ #
+ # 0x123456     :           000_0001 ............................
+ #    789abcdef :           000_0111 0110_010 1_0100_00 11_0010_0 001_0001
+ # -0x123456    :           111_1110 1101_110 0_1011_10 01_1001_1
+ #    fffffffff :           000_0000 0000_000 0_0000_00 00_0000_0 000_0001
+ # -0x7ff       :                             ......... 00_0000_0
+ #
+  0030 0191e4d0 b287d3ae ee7e8180 80808080 .*
+ #
+ # -0x7ff       :                             1_1000_00 .........
+ #    000000000 :           000_0000 0000_000 0_0000_00 00_0000_0 000_0000
+ # -0x800       :                             1_1000_00 00_0000_0
+ #    fffffffff :           000_0000 0000_000 0_0000_00 00_0000_0 000_0001
+ # -0x7ffffff   : .................. 0000_000 0_0000_00 00_0000_0
+ #
+  0040 60808080 80808060 81808080 80808080 .*
+ #
+ # -0x7ffffff   : 11_1111_1 000_0000 ............................
+ #    000000000 :           000_0000 0000_000 0_0000_00 00_0000_0 000_0000
+ # -0x8000000   : 11_1111_1 000_0000 0000_000 0_0000_00 00_0000_0
+ # -0x100000000 :           ........ 0000_000 0_0000_00 00_0000_0 000_0000
+ #
+  0050 807f8080 80808080 8080807f 80808080 .*
+ #
+ # -0x100000000 :           111_0000 .....................................
+ #    000000000 :           000_0000 0000_000 0_0000_00 00_0000_0 000_0000
+ # -0x1000      :                             1_0000_00 00_0000_0
+ #
+  0060 70808080 80808040 00000000 00000000 .*
+ #pass
diff -c /dev/null gas/testsuite/gas/all/sleb128.s
*** /dev/null	2004-12-03 11:50:38.225654048 +0000
--- gas/testsuite/gas/all/sleb128.s	2005-01-18 10:50:21.000000000 +0000
***************
*** 0 ****
--- 1,22 ----
+ 	.data
+ 	.sleb128	0x76543210
+ 	.sleb128	0x80000000
+ 	.sleb128	0x87654321
+ 	.sleb128	0xffffffff
+ 	.sleb128	-0x76543210
+ 	.sleb128	-0x80000000
+ 	.sleb128	-0x87654321
+ 	.sleb128	-0xffffffff
+ 
+ 	.sleb128	0x123456789abcdef
+ 	.sleb128	-0x123456789abcdef
+ 
+ 	.sleb128	-0x7fffffffffff
+ 	.sleb128	-0x800000000000
+ 	.sleb128	-0x7fffffffffffffff
+ 	.sleb128	-0x8000000000000000
+ 
+ 	.sleb128	-0x100000000
+ 	.sleb128	-0x1000000000000
+ 
+ 	.fill		32



More information about the Binutils mailing list