Bug 20046 - x86 feature request: build an instruction including rex and modr/m
Status: WAITING
Alias: None
Product: binutils
Classification: Unclassified
Component: gas
Version: 2.27
Importance: P2 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2016-05-04 21:19 UTC by H. Peter Anvin
Modified: 2016-05-05 22:43 UTC
Description H. Peter Anvin 2016-05-04 21:19:50 UTC
In the Linux kernel, we constantly bump into the problem that we need to use new instructions, but the older versions of gas that we still need to support don't handle them.  For plain instructions, it is fine to use .byte, but that gets very suboptimal when the instruction takes register or memory operands.

I'm wondering if something like this might make sense:

    .rex[/w] <r>,<m>
    .modrm <r>,<m>

... which would generate the REX and modr/m (+SIB, displacement etc needed for the addressing mode) bytes for an instruction with the given r and m operands.  r might be an integer immediate for cases where the register operand is used as an opcode extension.
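As a rough illustration of what the two proposed directives would have to emit in the simplest case (both operands are registers, 64-bit mode), the following C sketch computes the REX and ModR/M bytes from hardware register numbers 0-15. The function names are hypothetical and not part of gas, and memory operands (SIB, displacement) are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not part of gas.  Register numbers are the
 * hardware encodings 0-15 (rax = 0, rcx = 1, ..., r15 = 15). */

/* REX prefix layout: 0100WRXB.  Register-direct form only, so REX.X = 0.
 * REX.R extends the ModR/M reg field, REX.B extends the r/m field. */
uint8_t rex_reg_reg(int w, unsigned r, unsigned m)
{
    assert(r < 16 && m < 16);
    return (uint8_t)(0x40 | (w ? 0x08 : 0) | ((r >> 3) << 2) | (m >> 3));
}

/* ModR/M byte with mod = 11 (register-direct): 11 rrr mmm.
 * r may also be an opcode extension (/0 to /7) instead of a register. */
uint8_t modrm_reg_reg(unsigned r, unsigned m)
{
    assert(r < 16 && m < 16);
    return (uint8_t)(0xC0 | ((r & 7) << 3) | (m & 7));
}
```

A real implementation would presumably omit the REX byte when it comes out as a bare 0x40 and no operand requires it; this sketch always returns one.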

This would allow a gcc inline of the form:

asm(".byte 0xf3 ; .rex/w %0,%1 ; .byte 0x0f, 0x99 ; .modrm %0,%1"
    : "=rm" (...) : "r" (...));

... for some random new instruction with the SDM description F3 0F 99 /r.
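To make the expansion concrete, here is a small C sketch of the byte sequence the asm statement above would be expected to produce when both operands end up in registers. The function name is made up for illustration, and the register-only restriction is an assumption of the sketch, not of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: bytes for "F3 REX.W 0F 99 /r" with register operands
 * r and m given as hardware register numbers 0-15.  Returns the length. */
size_t encode_f3_0f_99_rr(uint8_t out[5], unsigned r, unsigned m)
{
    size_t n = 0;

    assert(r < 16 && m < 16);
    out[n++] = 0xF3;                                          /* .byte 0xf3     */
    out[n++] = (uint8_t)(0x48 | ((r >> 3) << 2) | (m >> 3));  /* .rex/w r,m     */
    out[n++] = 0x0F;                                          /* .byte 0x0f,    */
    out[n++] = 0x99;                                          /*       0x99     */
    out[n++] = (uint8_t)(0xC0 | ((r & 7) << 3) | (m & 7));    /* .modrm r,m     */
    return n;
}
```

With r = rax (0) and m = rcx (1) this yields F3 48 0F 99 C1. Note that the F3 byte has to come before the REX prefix, since REX must immediately precede the opcode bytes.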
Comment 1 H.J. Lu 2016-05-05 11:28:36 UTC
Do we need to encode VEX and EVEX prefixes?
Comment 2 H.J. Lu 2016-05-05 13:25:19 UTC
Those are just special pseudo instructions.  We need the list of all
possible operand combinations.
Comment 3 H. Peter Anvin 2016-05-05 21:56:30 UTC
I thought about it some more, and a better choice than .rex[/w] would be something like .inspfxs[bwlq] <r>,<m> that would be able to generate 66, 67, or REX prefixes as appropriate.

The possible combinations would be:

<r> can be a register or an immediate in the range 0-7.  If it is a register, it sets the default operand size and controls REX.R for .inspfxs.

<m> can be a register or a memory operand.  If it is a register, it sets the default operand size.  For .inspfxs, a register controls REX.B, while a memory operand controls REX.B and REX.X.
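For the memory-operand case, a sketch of how REX.R, REX.X and REX.B would fall out of a base+index addressing mode, together with the matching SIB byte, might look like the following. The helper names are hypothetical, and the sketch assumes a plain base + index*scale operand with 64-bit addressing (no RIP-relative or displacement-only forms).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers.  r, base and index are hardware register
 * numbers 0-15; r may also be an opcode extension 0-7. */

/* REX = 0100WRXB: R from <r>, X from the index, B from the base. */
uint8_t rex_reg_mem(int w, unsigned r, unsigned base, unsigned index)
{
    assert(r < 16 && base < 16 && index < 16);
    assert(index != 4);                  /* rsp cannot be an index register */
    return (uint8_t)(0x40 | (w ? 0x08 : 0) | ((r >> 3) << 2)
                     | ((index >> 3) << 1) | (base >> 3));
}

/* SIB = ss iii bbb, where ss is log2 of the scale factor (1, 2, 4, 8). */
uint8_t sib_byte(unsigned scale_log2, unsigned index, unsigned base)
{
    assert(scale_log2 < 4 && base < 16 && index < 16);
    return (uint8_t)((scale_log2 << 6) | ((index & 7) << 3) | (base & 7));
}
```

For <r> = rax and <m> = (%r8,%r9,4) with a 64-bit operand size, this gives a REX byte of 0x4B and a SIB byte of 0x88.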

If neither <r> nor <m> is a register, and the memory operand isn't size-specified with the PTR argument (in Intel syntax), then [bwlq] has to be provided to the .inspfxs operation (as with any other instruction).  It doesn't matter for .modrm, of course.

b - no 66 prefix nor REX.W in any mode
w - 66 prefix in 32- or 64-bit mode
l - 66 prefix in 16-bit mode
q - REX.W (only valid in 64-bit mode)

A 67 prefix would be generated if appropriate for the memory operand.
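The suffix rules listed above can be written down as a small decision function. This is only a sketch of the table as described in this comment; the type and function names are invented for illustration, and the 67 address-size prefix is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

enum cpu_mode { MODE_16 = 16, MODE_32 = 32, MODE_64 = 64 };

struct size_prefixes {
    bool valid;      /* false if the suffix cannot be encoded in this mode */
    bool opsize_66;  /* emit the 66 operand-size prefix */
    bool rex_w;      /* set REX.W */
};

/* Hypothetical: map a [bwlq] suffix and the current mode to prefixes. */
struct size_prefixes inspfxs_size(char suffix, enum cpu_mode mode)
{
    struct size_prefixes p = { true, false, false };

    assert(mode == MODE_16 || mode == MODE_32 || mode == MODE_64);
    switch (suffix) {
    case 'b':                                   /* no 66, no REX.W */
        break;
    case 'w':                                   /* 66 in 32- or 64-bit mode */
        p.opsize_66 = (mode != MODE_16);
        break;
    case 'l':                                   /* 66 in 16-bit mode */
        p.opsize_66 = (mode == MODE_16);
        break;
    case 'q':                                   /* REX.W, 64-bit mode only */
        if (mode == MODE_64)
            p.rex_w = true;
        else
            p.valid = false;
        break;
    default:
        p.valid = false;
        break;
    }
    return p;
}
```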

.vex and .evex pseudo-ops could be added as well.  However, scaled SIB and all the various EVEX modes would make that much more complicated.  It is unlikely we would use those in the Linux kernel, but there might be user-space programs that might be interested.  I would suggest treating that as a potential future extension if it turns out to be desirable.
Comment 4 H. Peter Anvin 2016-05-05 22:17:35 UTC
"Register" at least for .inspfxs would mean any of GPR, segment, BND, CR, DR, MM or XMM registers.  Since YMM, ZMM, and K registers are not encodable without VEX/XOP/EVEX prefixes, those presumably should be supported if and when .vex/.xop/.evex are added.
Comment 5 H. Peter Anvin 2016-05-05 22:43:49 UTC
x87 registers use a different encoding than modr/m; the upper two bits can be something other than 11 for a register operation.  However, it seems very unlikely that new x87 instructions will be added at this point, so I don't see any significant reason to add pseudo-ops to synthesize x87 instructions.