x86 operand size overriding prefixes vs suffixes vs ambiguity warnings

H.J. Lu hjl.tools@gmail.com
Tue Jun 16 12:44:02 GMT 2020


On Tue, Jun 16, 2020 at 12:33 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 09.06.2020 17:45, H.J. Lu wrote:
> > On Tue, Jun 9, 2020 at 8:13 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 09.06.2020 16:42, H.J. Lu wrote:
> >>> On Tue, Jun 9, 2020 at 7:27 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 09.06.2020 16:05, H.J. Lu wrote:
> >>>>> On Tue, Jun 9, 2020 at 6:56 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>
> >>>>>> On 09.06.2020 15:04, H.J. Lu wrote:
> >>>>>>> On Tue, Jun 9, 2020 at 5:59 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>>>
> >>>>>>>> On 08.06.2020 18:25, H.J. Lu wrote:
> >>>>>>>>> On Mon, Jun 8, 2020 at 6:07 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> On 08.06.2020 14:44, H.J. Lu wrote:
> >>>>>>>>>>> On Mon, Jun 8, 2020 at 5:28 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> H.J.,
> >>>>>>>>>>>>
> >>>>>>>>>>>> the documentation of gas isn't really helpful in regards to what
> >>>>>>>>>>>> to expect when using explicit prefixes. For example I consider
> >>>>>>>>>>>>
> >>>>>>>>>>>>         rex64 movl %eax, %eax
> >>>>>>>>>>>>         data16 movl %eax, %eax
> >>>>>>>>>>>>
> >>>>>>>>>>>> sufficiently bogus that I would see at least a warning warranted.
> >>>>>>>>>>>> Otoh Andrew validly points out that for e.g. lret the possible
> >>>>>>>>>>>>
> >>>>>>>>>>>>         rex64 lret
> >>>>>>>>>>>>         data16 lret
> >>>>>>>>>>>>
> >>>>>>>>>>>> are sufficiently meaningful to perhaps even suppress the
> >>>>>>>>>>>> ambiguity warning the suffix-less lret-s currently cause in
> >>>>>>>>>>>> 64-bit mode. The question really is whether we could settle on
> >>>>>>>>>>>> an abstract model from which the behavior becomes predictable
> >>>>>>>>>>>> for a programmer. Of course a fundamental requirement I would
> >>>>>>>>>>>> have of any such model is that it be consistent for all current
> >>>>>>>>>>>> and future insns and that it preferably have as few quirks or
> >>>>>>>>>>>> special cases as possible.
> >>>>>>>>>>>>
> >>>>>>>>>>>> If you (or anyone else on the list) have any thoughts or
> >>>>>>>>>>>> opinions here, I'd appreciate it if you could share them.
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> My takes on this are
> >>>>>>>>>>>
> >>>>>>>>>>> 1. If a prefix is totally ignored, assembler should allow it and
> >>>>>>>>>>> disassembler should display it.  Such prefixes can be used for
> >>>>>>>>>>> padding.
> >>>>>>>>>>> 2. If a prefix changes the instruction behavior and there is no
> >>>>>>>>>>> other way to encode such instruction behavior, assembler
> >>>>>>>>>>> should allow it and disassembler should display it.  Such
> >>>>>>>>>>> prefixes can be used for valid operation.
> >>>>>>>>>>> 3. Otherwise, if a prefix changes the instruction behavior,
> >>>>>>>>>>> there should be an error, or at least a warning.
> >>>>>>>>>>
> >>>>>>>>>> All of these are difficult when considering the evolving ISA:
> >>>>>>>>>> What is "ignored" or "changing behavior" varies over time. To take
> >>>>>>>>>> a concrete example, according to what you say WBNOINVD being a
> >>>>>>>>>> prefixed version of WBINVD should have been accepted in the latter
> >>>>>>>>>> form by gas prior to the addition of support for the insn, but not
> >>>>>>>>>> anymore. That's not very nice for people using gas. And to preempt
> >>>>>>>>>
> >>>>>>>>> We will change opcode map.  The meaning of prefixes will change
> >>>>>>>>> over time.  Programmers need to find a way to deal with it.  Assembler
> >>>>>>>>> can help within reasonable limits.
> >>>>>>>>
> >>>>>>>> "Programmers need to find a way to deal with it" is particularly
> >>>>>>>> relevant: Once they've found a means to encode an insn an older
> >>>>>>>> version of gas doesn't support yet, newer versions of gas
> >>>>>>>> shouldn't reject (nor even warn) about the encoding. Many (most?)
> >>>>>>>
> >>>>>>> I used the following options to deal with such cases:
> >>>>>>>
> >>>>>>> 1. Use .byte to encode the whole instruction.
> >>>>>>> 2. Use .byte to only encode prefix.
> >>>>>>
> >>>>>> If that's the model to use, what use are the prefix mnemonics then?
> >>>>>
> >>>>> 1. Some prefixes won't change operations.
> >>>>> 2. It is easier to write notrack than .byte.
> >>>>
> >>>> notrack is a bad example, as it's tied to (and only permitted for)
> >>>> a small set of insns. The rep forms would be of more interest, but
> >>>> as said earlier they're also restricted in their use. The main
> >>>> question here was about the data size overrides (66 and REX.W) as
> >>>> well as the other rex forms, though.
> >>>
> >>> We need to address them case by case and we will add new instructions
> >>> by repurposing existing prefixes.
> >>
> >> Addressing on a case by case basis is exactly what makes things
> >> unpredictable for the programmer. May I ask that you go back and
> >> look at the two examples given (still visible in context above)
> >> and suggest your "case by case" resolution there? Once you did,
> >> maybe you can generalize your line of thinking?
> >>
> >
> > For
> >
> >         rex64 movl %eax, %eax
> >         data16 movl %eax, %eax
> >
> > you should use .byte.
>
> IOW you suggest the use of prefixes here should be rejected? If so,
> what about the mentioned LRET cases then? I'm still trying to
> understand what the criteria in your judgement are ...

We make no guarantee on arbitrary prefixes before any instructions.
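
For reference, the two .byte options mentioned earlier in the thread can be
sketched for the movl example as follows. This is only an illustration, not
behavior gas guarantees: it assumes 64-bit mode, and it assumes gas picks
the 89 /r form (bytes 89 c0) for a plain "movl %eax, %eax", which is worth
confirming with objdump -d.

```asm
        .code64                         # assumption: assembling for 64-bit mode
        .text

        # Option 1: encode the whole instruction with .byte.
        .byte 0x48, 0x89, 0xc0          # REX.W + 89 c0: executes as movq %rax, %rax
        .byte 0x66, 0x89, 0xc0          # 66 + 89 c0: executes as movw %ax, %ax

        # Option 2: encode only the prefix with .byte and let gas
        # assemble the instruction itself.
        .byte 0x48                      # REX.W
        movl    %eax, %eax
        .byte 0x66                      # operand-size override
        movl    %eax, %eax
```

Under those assumptions both options produce the same bytes, and a
disassembler should show a 64-bit and a 16-bit move respectively rather
than a movl.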

-- 
H.J.

