[PATCH 1/2] Support Intel MOVRS
Jan Beulich
jbeulich@suse.com
Fri Dec 27 12:26:34 GMT 2024
On 24.12.2024 10:24, Haochen Jiang wrote:
> From: "Hu, Lin1" <lin1.hu@intel.com>
>
> This patch focus on supporting MOVRS ISA. We could take this full ISA
> as four part: PREFETCHRST2, MOVRS, MOVRS APX_F extension and MOVRS AVX10.2
> extension.
>
> The APX_F extension for MOVRS will be:
> - EVEX.LLZ.NP.MAP4.SCALABLE 8A !(11):rrr:bbb for r8/m8 with NF=0 and
> ND=0
This got to be IGNORED, not SCALABLE, to match other byte insns in APX?
> - EVEX.LLZ.NP/66.MAP4.SCALABLE 8B !(11):rrr:bbb for rv/mv with NF=0
> and ND=0
>
> We did not merge the table together for APX_F since there is an explicit
> x64 for movrs insn. The current APX_F() did not support the combination
> between CPUIDs.
This is an odd remark to make. APX_F() makes sense to use only for dual
VEX/EVEX templates, yet the non-APX forms of MOVRS are legacy-encoded.
See also the ENQCMD{,S} cleanup patch just sent.
> gas/ChangeLog:
>
> * NEWS: Support Intel MOVRS.
> * config/tc-i386.c: Add MOVRS.
> * doc/c-i386.texi: Document .movrs.
> * testsuite/gas/i386/i386.exp: Run MOVRS tests.
> * testsuite/gas/i386/x86-64.exp: Ditto.
> * testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d: Add MOVRS
> tests.
> * testsuite/gas/i386/x86-64-apx-evex-promoted-wig.d: Ditto.
> * testsuite/gas/i386/x86-64-apx-evex-promoted.d: Ditto.
> * testsuite/gas/i386/x86-64-apx-evex-promoted.s: Ditto.
Nit: Indentation anomaly.
> * testsuite/gas/i386/lfence-load.d: Add prefetchrst2.
> * testsuite/gas/i386/lfence-load.s: Ditto.
> * testsuite/gas/i386/nops-8.d: Ditto.
> * testsuite/gas/i386/prefetch-intel.d: Ditto.
> * testsuite/gas/i386/prefetch.d: Ditto.
> * testsuite/gas/i386/x86-64-lfence-load.d: Ditto.
> * testsuite/gas/i386/x86-64-lfence-load.s: Ditto.
> * testsuite/gas/i386/x86-64-prefetch-intel.d: Ditto.
> * testsuite/gas/i386/x86-64-prefetch.d: Ditto.
> * testsuite/gas/i386/movrs-intel.d: New test.
> * testsuite/gas/i386/movrs-inval.l: Ditto.
> * testsuite/gas/i386/movrs-inval.s: Ditto.
> * testsuite/gas/i386/movrs.d: Ditto.
> * testsuite/gas/i386/movrs.s: Ditto.
> * testsuite/gas/i386/x86-64-movrs-avx10_2-256-intel.d: Ditto.
> * testsuite/gas/i386/x86-64-movrs-avx10_2-256.d: Ditto.
> * testsuite/gas/i386/x86-64-movrs-avx10_2-256.s: Ditto.
> * testsuite/gas/i386/x86-64-movrs-avx10_2-512-intel.d: Ditto.
> * testsuite/gas/i386/x86-64-movrs-avx10_2-512.d: Ditto.
> * testsuite/gas/i386/x86-64-movrs-avx10_2-512.s: Ditto.
> * testsuite/gas/i386/x86-64-movrs-intel.d: Ditto.
> * testsuite/gas/i386/x86-64-movrs.d: Ditto.
> * testsuite/gas/i386/x86-64-movrs.s: Ditto.
> * testsuite/gas/i386/x86-64-movrs-intel-suffix.d: Ditto.
> * testsuite/gas/i386/x86-64-movrs-suffix.d: Ditto.
> * testsuite/gas/i386/x86-64-movrs-suffix.s: Ditto.
>
> opcodes/ChangeLog:
>
> * i386-dis-evex-mod.h: Add MOD_EVEX_MAP4_8A_W_0, MOD_EVEX_MAP4_8B.
> and MOD_EVEX_MAP5_6F.
> * i386-dis-evex-prefix.h: Add PREFIX_EVEX_MAP5_6F_M_0_X86_64.
> * i386-dis-evex-w.h: Add EVEX_W_MAP4_8A.
> * i386-dis-evex-x86.h: Add X86_64_EVEX_MAP5_6F_M_0.
Again.
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/movrs-inval.s
> @@ -0,0 +1,12 @@
> +# Check Illegal 32bit MOVRS instructions
> +
> + .text
> +_start:
> + movrs 0x10000000(%esp, %esi, 8), %dx
> + movrs 0x10000000(%esp, %esi, 8), %edx
> + movrs 0x10000000(%esp, %esi, 8), %r12
> + movrs 0x10000000(%esp, %esi, 8), %bl
> + vmovrsb 0x10000000(%esp, %esi, 8), %zmm6{%k7}
> + vmovrsd 0x10000000(%esp, %esi, 8), %zmm6{%k7}
> + vmovrsq 0x10000000(%esp, %esi, 8), %zmm6{%k7}
> + vmovrsw 0x10000000(%esp, %esi, 8), %zmm6{%k7}
Once again I'm uncertain of the usefulness of this test.
> --- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-wig.d
> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-wig.d
> @@ -118,6 +118,10 @@ Disassembly of section \.text:
> [ ]*[a-f0-9]+:[ ]*62 4c .d 08 f8 bc 87 23 01 00 00[ ]+movdir64b[ ]+0x123\(%r31,%rax,4\),%r31
> [ ]*[a-f0-9]+:[ ]*62 4c 7c 08 f9 8c 87 23 01 00 00[ ]+movdiri[ ]+%r25d,0x123\(%r31,%rax,4\)
> [ ]*[a-f0-9]+:[ ]*62 4c fc 08 f9 bc 87 23 01 00 00[ ]+movdiri[ ]+%r31,0x123\(%r31,%rax,4\)
> +[ ]*[a-f0-9]+:[ ]*62 cc 7c 08 8a 84 87 23 01 00 00[ ]+movrs[ ]+0x123\(%r31,%rax,4\),%r16b
This one's WIG, but doesn't end up with W=1. I wonder whether that's
related to the very last comment (on the opcode table entries).
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-movrs-suffix.d
> @@ -0,0 +1,13 @@
> +#objdump: -dwMsuffix
> +#name: x86_64 MOVRS insns
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*66 0f 38 8b 92 00 ff ff ff\s+movrsw -0x100\(%rdx\),%dx
> +\s*[a-f0-9]+:\s*0f 38 8b 92 00 fe ff ff\s+movrsl -0x200\(%rdx\),%edx
> +\s*[a-f0-9]+:\s*4c 0f 38 8b a2 00 fc ff ff\s+movrsq -0x400\(%rdx\),%r12
> +\s*[a-f0-9]+:\s*0f 38 8a 5a 80\s+movrsb -0x80\(%rdx\),%bl
> +#pass
While I agree with having a separate -Msuffix test, ...
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-movrs-suffix.s
> @@ -0,0 +1,15 @@
> +# Check 64bit MOVRS instructions
> +
> + .text
> +_start:
> + movrsw -256(%rdx), %dx
> + movrsl -512(%rdx), %edx
> + movrsq -1024(%rdx), %r12
> + movrsb -128(%rdx), %bl
... the source imo wants folding into x86-64-movrs.s and ...
> +_intel:
> + .intel_syntax noprefix
> + movrsw dx, WORD PTR [rdx-256]
> + movrsl edx, DWORD PTR [rdx-512]
> + movrsq r12, QWORD PTR [rdx-1024]
> + movrsb bl, BYTE PTR [rdx-128]
... further bogus use of suffixes in Intel syntax shouldn't be added
in testcases.
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -942,6 +942,7 @@ enum
> MOD_0F18_REG_1,
> MOD_0F18_REG_2,
> MOD_0F18_REG_3,
> + MOD_0F18_REG_4,
> MOD_0F18_REG_6,
> MOD_0F18_REG_7,
> MOD_0F1A_PREFIX_0,
> @@ -959,6 +960,8 @@ enum
> MOD_0FAE_REG_7,
> MOD_0FC7_REG_6,
> MOD_0FC7_REG_7,
> + MOD_0F388A,
> + MOD_0F388B,
> MOD_0F38DC_PREFIX_1,
> MOD_0F38F8,
>
> @@ -968,8 +971,12 @@ enum
>
> MOD_EVEX_MAP4_60,
> MOD_EVEX_MAP4_61,
> + MOD_EVEX_MAP4_8A_W_0,
This should be just MOD_EVEX_MAP4_8A, unless MOVRS differs from other APX
byte ops (who all ignore EVEX.W).
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3554,3 +3554,17 @@ vcomxs<sdh>, 0x<sdh:spfx>2f, AVX10_2, Modrm|EVexLIG|<sdh:spc1>|<sdh:vexw>|Disp8M
> vucomxs<sdh>, 0x<sdh:spfx>2e, AVX10_2, Modrm|EVexLIG|<sdh:spc1>|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM }
>
> // AVX10.2 instructions end.
> +
> +// MOVRS instructions.
> +
> +prefetchrst2, 0xf18/4, MOVRS, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
> +
> +movrs, 0x8a, MOVRS&x64, Modrm|Space0F38|No_wSuf|No_lSuf|No_sSuf|No_qSuf, { Byte|Unspecified|BaseIndex, Reg8 }
> +movrs, 0x8b, MOVRS&x64, Modrm|Anysize|Space0F38|CheckOperandSize|No_bSuf|No_sSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> +movrs, 0x8a, MOVRS&APX_F, Modrm|EVex128|EVexMap4|VexW0|No_wSuf|No_lSuf|No_sSuf|No_qSuf, { Byte|Unspecified|BaseIndex, Reg8 }
> +movrs, 0x8b, MOVRS&APX_F, Modrm|CheckOperandSize|EVex128|EVexMap4|No_bSuf|No_sSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
These want to leverage W if at all possible, allowing to go from 4 to 2 templates.
Jan
More information about the Binutils
mailing list