x86: AT&T syntax operand size defaults

Mon Oct 16 11:24:00 GMT 2017

On Mon, Oct 16, 2017 at 3:09 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 13.10.17 at 23:51, <hjl.tools@gmail.com> wrote:
>> On 10/13/17, Jan Beulich <JBeulich@suse.com> wrote:
>>> All,
>>>
>>> according to the only reasonable document about AT&T assembler
>>> syntax (Solaris's / Oracle's "x86 Assembly Language Reference
>>> Manual"), operand size is supposed to default to "long".
>>>
>>> However, of these two
>>>
>>>      add     $1, (%eax)
>>>      add     $0x1234, (%eax)
>>>
>>> the first indeed defaults to "long" (except in 16-bit mode, but I think
>>> that's fine despite what that doc says) while the second causes an
>>> error. That's because of
>>>
>>>        if (i.tm.opcode_modifier.w)
>>>          {
>>>            as_bad (_("no instruction mnemonic suffix given and "
>>>                      "no register operands; can't size instruction"));
>>>            return 0;
>>>          }
>>>
>>> in process_suffix(): The pattern for the 8-bit sign extended
>>> immediate does not have W set, while most other instructions
>>> allowing for no register operands at all have it set. I'm of the
>>> strong opinion that the behavior of the assembler should at least
>>> be consistent, i.e. in particular it should not depend on the value
>>> of an immediate.
>>>
>>> Which way to make it consistent, though, I'm not sure about:
>>> It could be made to match Intel syntax behavior, where an error is
>>> being flagged whenever multiple operand sizes are permitted for
>>> a mnemonic (that's imo the model most helpful to the programmer),
>>> or it could be made to match that doc by simply removing the as_bad()
>>> invocation above (which is the model accepting the widest set of
>>> originally non-gas sources). Of course it would be possible to have
>>> the user select between the two by command line option and/or
>>> directive, but even then we would need to settle on what default
>>> behavior should be.
>>
>> I agree that AT&T syntax is poorly documented. As for this specific
>> case, I am OK with either option as long as it doesn't break existing
>> codes.
>
> Interesting requirement: How do you define "existing codes"? I
> certainly realize that regressions aren't nice, but in particular
> when going the error-on-ambiguity route this at least wouldn't
> go silent. Obviously changing the (silently selected) default for
> certain insns / operand combinations would be more dangerous
> in this regard. I don't, however, think such a change can come
> without the risk of people having written bogus code now
> needing to fix it.
>

It depends on how extensive this bogus code is.  If the Linux kernel
needs to be changed, you should fix it in the kernel 4.xx branches first,
before committing assembler changes.
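
For reference, the inconsistency described in the quoted report can be
modelled with a small toy sketch. This is not the gas implementation:
struct toy_template, add_templates, match_template() and
toy_process_suffix() are hypothetical names made up for illustration, and
the two-entry table only assumes that a sign-extended 8-bit immediate
pattern without W is tried before the general immediate pattern with W.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for an opcode template.
   This is NOT the real gas data structure. */
struct toy_template {
    const char *desc;
    bool w;      /* models opcode_modifier.w from the quoted snippet */
    bool imm8s;  /* pattern only accepts a sign-extended 8-bit immediate */
};

/* Assumed ordering: the imm8s pattern (no W) is tried first. */
static const struct toy_template add_templates[] = {
    { "add imm8s, r/m", false, true  },
    { "add imm, r/m",   true,  false },
};

static bool fits_imm8s(long v)
{
    return v >= -128 && v <= 127;
}

/* Return the first pattern whose immediate constraint is satisfied. */
static const struct toy_template *match_template(long imm)
{
    size_t n = sizeof add_templates / sizeof add_templates[0];
    for (size_t i = 0; i < n; i++) {
        const struct toy_template *t = &add_templates[i];
        if (!t->imm8s || fits_imm8s(imm))
            return t;
    }
    return NULL;
}

/* Models a suffix-less "add $imm, (%eax)": returns 1 if the toy model
   silently defaults the operand size, 0 if it reports "can't size
   instruction" because the matched pattern has W set. */
int toy_process_suffix(long imm)
{
    const struct toy_template *t = match_template(imm);
    if (t == NULL || t->w)
        return 0;
    return 1;
}
```

In this model toy_process_suffix(1) accepts the instruction while
toy_process_suffix(0x1234) rejects it, so the outcome depends only on
the value of the immediate.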

-- 
H.J.