[PATCH] Fix sleb128 encoding for difference expressions

Thu Nov 8 14:02:00 GMT 2012

>>> On 08.11.12 at 14:05, Julian Brown <julian@codesourcery.com> wrote:
> Hi,
> 
> We encountered a situation where .sleb128 directives would behave in an
> unexpected way. For a source file such as:
> 
>     .text
>     .globl foo
> foo:
> .L1:
>     .byte 0
>     .byte 0
>     .byte 0
> .L2:
> 
>     .data
> bar:
>     .sleb128 .L1 - .L2
>     .byte 42
> 
> assembled and objdump'd, we get:
> 
> Contents of section .data:
>  0000 fdffffff ffffffff ff012a00 00000000  ..........*.....
> 
> the value of "-3" has been interpreted as a (64-bit) unsigned quantity
> 0xfffffffffffffffd, and henceforth encoded as a sleb128 value, whereas
> what we wanted was simply the encoding "7d", i.e. -3.
> 
> The clause in read.c:emit_leb128_expr is a clue as to why this is
> happening:
> 
>   else if (op == O_constant
> 	   && sign
> 	   && (exp->X_add_number < 0) != !exp->X_unsigned)
>     {
>       /* We're outputting a signed leb128 and the sign of X_add_number
> 	 doesn't reflect the sign of the original value.  Convert EXP
> 	 to a correctly-extended bignum instead.  */
>       convert_to_bignum (exp);
>       op = O_big;
>     }
> 
> so in this case we have exp->X_add_number less than zero, but
> exp->X_unsigned is true: the operands (.L1, .L2) are unsigned, so the
> result of the expression .L1 - .L2 is also unsigned. Hence,
> convert_to_bignum converts "exp" to a bignum as an unsigned expression.
> 
> My proposed fix is a change to the ad-hoc type system for expressions.
> The basic idea is simply to see if the result of a subtraction
> operation will fall below zero, and if so force the "unsigned" flag for
> the result value to be false (normally all of an expression's operands,
> and its result, are unsigned). No other operators are altered.

I don't think this will go without surprises here and there: Consider
that expressions like

	a - b + c

and

	a + c - b

could now evaluate differently (for appropriately chosen values).

I could see your change being correct for the <symbol> -
<symbol> case, but certainly not for <constant> - <constant>
(<symbol> - <constant> being questionable).

> This still works for expressions such as,
> 
>     .sleb128 .L2 - .L1 + (1 << 31)
> (or .sleb128 .L2 - .L1 + (1 << 63))
> 
> where L2 is greater than L1, i.e. where the sign bit "looks like" it is
> set, since the result of these will still be regarded as unsigned.
> There may be other corner cases which behave unexpectedly, however.
> 
> OK to apply, or any comments? Maybe suggestions for other ways of
> tackling the problem? (Tested with cross to mips-linux.)

I think that the .sleb128 handler itself might instead need to be
changed to make the result explicitly signed (as that's what
directive says) at least in some specific cases.

Jan