RFC: [PATCH] X86: Add pseudo prefixes to control encoding

Jan Beulich JBeulich@suse.com
Thu Nov 10 08:14:00 GMT 2016


>>> On 10.11.16 at 00:47, <hjl.tools@gmail.com> wrote:
> On Tue, Nov 8, 2016 at 12:11 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 07.11.16 at 23:26, <hjl.tools@gmail.com> wrote:
>>> On Mon, Nov 7, 2016 at 12:31 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>> On 04.11.16 at 19:24, <hongjiu.lu@intel.com> wrote:
>>>>> Many x86 instructions have more than one encoding.  The assembler picks
>>>>> the default one, usually the shortest one.  Although the ".s", ".d8"
>>>>> and ".d32" suffixes can be used to swap register operands or specify
>>>>> displacement size, they aren't very flexible.  This patch adds pseudo
>>>>> prefixes, {xxx}, to control instruction encoding.  The available
>>>>> pseudo prefixes are {disp8}, {disp32}, {swap}, {vex2}, {vex3} and
>>>>> {evex}.  Pseudo prefixes are preferred over the ".s", ".d8" and ".d32"
>>>>> suffixes, which are deprecated.
>>>>>
>>>>> Any comments?
>>>>
>>>> Looking at ...
>>>>
>>>>> --- /dev/null
>>>>> +++ b/gas/testsuite/gas/i386/pseudos.s
>>>>> @@ -0,0 +1,52 @@
>>>>> +# Check instructions with pseudo prefixes for encoding
>>>>> +
>>>>> +     .text
>>>>> +_start:
>>>>> +     {vex3} vmovaps %xmm7,%xmm2
>>>>> +     {vex3} {swap} vmovaps %xmm7,%xmm2
>>>>> +     vmovaps %xmm7,%xmm2
>>>>> +     {vex2} vmovaps %xmm7,%xmm2
>>>>> +     {vex2} {swap} vmovaps %xmm7,%xmm2
>>>>>[...]
>>>>> +     mov %ecx, %eax
>>>>> +     {swap} mov %ecx, %eax
>>>>
>>>> ... all of these, I think {swap} is a bad name, as it implies that there
>>>> is a particular encoding that the assembler chooses now and
>>>> forever. That, however, should be an implementation detail, and
>>>> hence the prefix should express exactly which encoding is wanted.
>>>> Sadly I can't think of a really concise short name for them; {rm-reg}
>>>> and {reg-rm} look a little ugly to me, namely due to the embedded
>>>> dash.
>>>
>>> For the most part, the programmer isn't aware how operands are encoded.
>>> {swap} tells assembler to swap register operand order in encoding.
>>> Since Intel and AT&T have different operand order, {rm-reg} may be
>>> confusing.
>>
>> I can accept the "confusing" part, but I'm missing some alternative
>> suggestion. If you want the programmer to be able to control the
>> encoding, then it only makes sense if you give him/her full control.
>> That is either there's no prefix (in which case the assembler gets to
>> choose) or there is a prefix (in which case the encoding is fully
>> controlled by the prefix). IOW a single prefix will not do.
> 
> We can use {load} and {store} to encode in load or store form.

Ah, yes, these look fine.
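
For reference, the two register-to-register forms that {load} and
{store} would select between can be sketched as below. This is only an
illustration of the ModRM layout, written in Python rather than taken
from gas; the helper names (modrm, encode_mov_r32) and the register
table are made up for the example. The opcodes 89 /r (MOV r/m32, r32)
and 8B /r (MOV r32, r/m32) are the documented ones.

```python
# Illustrative sketch only, not gas code: encode the AT&T-syntax
# instruction "mov %src, %dst" for 32-bit registers in either form.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def modrm(mod, reg, rm):
    """Pack the three ModRM fields into one byte."""
    return (mod << 6) | (reg << 3) | rm


def encode_mov_r32(src, dst, form):
    """Return the bytes for 'mov %src, %dst' in the requested form.

    form is "store" or "load" (hypothetical names mirroring the
    proposed pseudo prefixes).
    """
    s, d = REG32[src], REG32[dst]
    if form == "store":
        # 89 /r, MOV r/m32, r32: source in ModRM.reg, destination in ModRM.rm
        return bytes([0x89, modrm(3, s, d)])
    if form == "load":
        # 8B /r, MOV r32, r/m32: destination in ModRM.reg, source in ModRM.rm
        return bytes([0x8B, modrm(3, d, s)])
    raise ValueError("form must be 'store' or 'load'")


if __name__ == "__main__":
    for form in ("store", "load"):
        print(form, encode_mov_r32("ecx", "eax", form).hex(" "))
```

For "mov %ecx, %eax" this yields 89 c8 in store form and 8b c1 in load
form, i.e. the same operation with two distinct byte sequences, which
is why naming the form is less ambiguous than "swap".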

>>>> And then there are mnemonics which have two alternate base
>>>> opcodes (e.g. {,v}movq and {,v}pextrw). I think this prefix model
>>>
>>> vmovq %rax, %xmm
>>>
>>> isn't equivalent to
>>>
>>> movq %rax, %xmm
>>>
>>> since only the VEX encoding clears the upper bits.
>>
>> And I didn't say they would be. Instead I did refer to the two
>> different base opcodes (660FD6 and F30F7E vs 660F6E and
>> 660F7E when talking about transfer between XMM register
>> and memory [different opcodes apply for MM registers] and
>> 660FC5 vs 660F3A15 when talking about PEXTRW with a
>> GPR destination). Equivalent considerations apply for the
>> VEX and EVEX encoded forms. I didn't do a very careful check
>> whether there are any further redundant encodings.
> 
> We can add {regmem} pseudo prefix to encode a memory
> or integer register operand in instruction which takes both
> types of operand. If necessary, we can add {mmx} for MMX
> encoding.

I don't see how either of these would fit: What would "regmem"
be meant to identify? After all we have

	MOVQ	xmm, r/m64
	MOVQ	xmm, xmm/m64

as one example pair. Would you mean it to identify the GPR
case? In that case {gpr} and {xmm} might be more appropriate
(and obviously then {mmx} for the MMX counterparts).

Plus the ambiguous forms of PEXTRW are only about insns
operating on XMM registers. Or did you mean {regmem} to
apply here, and then with a {reg} counterpart?

In any event there would need to be pairs of prefixes, to be
able to force both variants regardless of what the assembler
defaults to.
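
To make the MOVQ ambiguity concrete, here is a rough sketch (Python,
purely illustrative) of the two byte sequences that can both express
"movq (%rax), %xmmN". The form names "gpr" and "xmm" simply mirror the
prefix names floated above and are not an existing gas interface; the
function name is likewise invented for the example, and only XMM0-7
with a plain (%rax) operand are handled to keep it short.

```python
# Illustrative sketch only: two redundant encodings of a 64-bit load
# from (%rax) into an XMM register.
def modrm(mod, reg, rm):
    """Pack the three ModRM fields into one byte."""
    return (mod << 6) | (reg << 3) | rm


def encode_movq_from_rax_mem(xmm, form):
    """Return the bytes for 'movq (%rax), %xmm<xmm>' in the given form.

    form "gpr": 66 REX.W 0F 6E /r  (MOVQ xmm, r/m64)
    form "xmm": F3 0F 7E /r        (MOVQ xmm, xmm/m64)
    """
    if not 0 <= xmm <= 7:
        raise ValueError("sketch only covers xmm0..xmm7")
    m = modrm(0, xmm, 0)  # mod=00, rm=000: (%rax) with no displacement
    if form == "gpr":
        return bytes([0x66, 0x48, 0x0F, 0x6E, m])
    if form == "xmm":
        return bytes([0xF3, 0x0F, 0x7E, m])
    raise ValueError("form must be 'gpr' or 'xmm'")


if __name__ == "__main__":
    for form in ("gpr", "xmm"):
        print(form, encode_movq_from_rax_mem(0, form).hex(" "))
```

Whichever of the two the assembler defaults to, a single prefix could
only ever request the other one, hence the need for a pair.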

Jan


