[PATCH] Fix sleb128 encoding for difference expressions

Mon Nov 12 12:19:00 GMT 2012

Hi,

This is a more conservative version of the patch, which *only* affects
assembly of .sleb128 directives.

On Mon, 12 Nov 2012 08:01:48 +0000
Jan Beulich <JBeulich@suse.com> wrote:

> >>> On 09.11.12 at 18:57, Julian Brown <julian@codesourcery.com>
> >>> wrote:
> > On Thu, 8 Nov 2012 14:02:19 +0000 Jan Beulich <JBeulich@suse.com>
> > wrote:
> >> >>> On 08.11.12 at 14:05, Julian Brown <julian@codesourcery.com>
> >> I could see your change being correct for the <symbol> -
> >> <symbol> case, but certainly not for <constant> - <constant>
> >> (<symbol> - <constant> being questionable).
> > 
> > I'm not sure I understand this comment. Note that <constant> -
> > <constant> is just as broken at the moment for .sleb128...
> 
> It may be as broken, but that's not my point. The point is that,
> without type information, there's no way for you to guess whether
> a difference between constants in particular (but possibly not
> limited to this) is meant to be signed or unsigned, as you can't
> guess whether or not the programmer intended to make use of
> wraparound (I assume you realize that one can detect the
> width the assembler uses for calculations, and adjust code - e.g.
> through macros - accordingly).

I see. I understand your concern, and agree that breaking wraparound
semantics for most assembled constants would be a bad thing -- but I'd
strongly contest that the current semantics for overflowed numbers
with .sleb128 are useful for anything at all, and (at least this new
version of) my patch doesn't affect anything else.

> So before proposing any solutions, I think you first need to
> propose a scheme how things are intended to work. That's
> particularly being complicated by the fact that .sleb128 can
> encode arbitrarily large numbers, whereas calculations are
> being done at fixed widths (i.e. you specifically can't use the top
> bit of a number to tell its signed-ness - that's why I believe the
> X_unsigned flag got introduced in the first place).

The semantics introduced by my patch are quite simple in fact:

* Most assembled values (apart from the arguments of .sleb128
  directives) are left alone.

* Expressions involving + and - are evaluated using an extra bit:
  signedness or unsignedness is irrelevant at this stage, since we're
  using two's-complement binary, so only the interpretation of the bits
  matters, not the operations themselves [*].

* Operands (symbols, constants) are positive word-sized values (i.e.
  they have the "extra bit" clear).

* Unary negation operates on all bits of the value, including the extra
  bit.

* Arguments of .sleb128 directives are always sign-extended, using the
  new extra bit (i.e. the "word size + 1 bit" value is sign-extended).

Ideally .sleb128 expressions would be evaluated in arbitrary precision,
but that's just not practical as the code is currently written, and
would probably be overkill in general anyway. My patch, providing just a
little more precision where necessary, ought to cover all useful cases.

I understand that the X_unsigned flag was also intended to solve this
problem, but it clearly doesn't -- I'm sure you've seen the comment:

     FIXME: This field is not set very reliably.  */
  unsigned int X_unsigned : 1;

in fact the only thing which sets (clears it, rather) it is unary
negation. That works (for .sleb128) in the simple case,

  .sleb128 -(1<<63)

but not for any of the cases previously discussed, nor for e.g.:

  .sleb128 --(1<<63)

Nevertheless, any current code which relies on the current strange
behaviour of X_unsigned will continue to do the same thing (though not
for .sleb128 directives, of course).

> And if this scheme involves _any_ changes to the generic
> expression evaluation code, you would also have to make clear
> why that would be correct (specifically not causing regressions)
> for all other contexts using that code.

The behaviour of the X_unsigned flag and all of the bits of
X_add_number are unchanged -- quite deliberately, so that the patch is
as conservative as possible. The calculation of X_extrabit can be done
independently, without affecting any of the lower-order bits.

Thanks,

Julian

[*] Other operators (multiplication, bitwise operations, etc.) ignore
the X_extrabit field, as they ignore the X_unsigned field at present --
so it's probably still possible to construct expressions which behave
strangely with .sleb128 with my patch applied, using those operators.
The patch covers the actual real-world case we encountered, though.

ChangeLog

    gas/
    * read.c (convert_to_bignum): Add sign parameter. Use it
    instead of X_unsigned to determine sign of resulting bignum.
    (emit_expr): Pass extra argument to convert_to_bignum.
    (emit_leb128_expr): Use X_extrabit instead of X_unsigned. Pass
    X_extrabit to convert_to_bignum.
    (parse_bitfield_cons): Set X_extrabit.
    * expr.c (make_expr_symbol, expr_build_uconstant, operand):
    Initialise X_extrabit field as appropriate.
    (add_to_result): New.
    (subtract_from_result): New.
    (expr): Use above.
    * expr.h (expressionS): Add X_extrabit field.

    gas/testsuite/
    * gas/all/sleb128-2.s: New test.
    * gas/all/sleb128-3.s: Likewise.
    * gas/all/sleb128-4.s: Likewise.
    * gas/all/sleb128-5.s: Likewise.
    * gas/all/sleb128-6.s: Likewise.
    * gas/all/sleb128-7.s: Likewise.
    * gas/all/sleb128-2.d: New.
    * gas/all/sleb128-3.d: New.
    * gas/all/sleb123-4.d: New.
    * gas/all/sleb123-5.d: New.
    * gas/all/sleb123-6.d: New.
    * gas/all/sleb123-7.d: New.
    * gas/all/gas.exp (sleb128-2, sleb128-3, sleb128-4, sleb128-5)
    (sleb128-6, sleb128-7): Run new tests.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sleb128-fix-7.diff
Type: text/x-patch
Size: 13521 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/binutils/attachments/20121112/e7e25daa/attachment.bin>