x86: Add support for Intel AMX instructions
Cui, Lili
lili.cui@intel.com
Wed Jul 8 08:49:29 GMT 2020
> What about the high bit of VEX.VVVV being zero?
>
> What about the case of there not being a SIB byte?
>
> What about the case of any two operands being the same, which I think the
> assembler also still doesn't error on, as one can see ...
>
I added them this time.
> tmm3,tmm4,tmm5
> > +[ ]*[a-f0-9]+:[ ]*c4 e2 63 5e ca[ ]*tdpbssd tmm1,tmm2,tmm3
> > +[ ]*[a-f0-9]+:[ ]*c4 e2 73 5e c1[ ]*tdpbssd tmm0,tmm1,tmm1
> > +[ ]*[a-f0-9]+:[ ]*c4 e2 73 5e c8[ ]*tdpbssd tmm1,tmm0,tmm1
> > +[ ]*[a-f0-9]+:[ ]*c4 e2 7b 5e c9[ ]*tdpbssd tmm1,tmm1,tmm0
>
> ... here (in my earlier reply I had specifically given the comment in the
> context of the "inval" test).
Sorry, I misunderstood it before. Below is my updated patch, thanks.
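For reference, the new operand check in check_VecOperands comes down to a pairwise comparison of the three register numbers. Here is a minimal standalone sketch, assuming plain register numbers as inputs. The helper name is hypothetical and does not appear in the patch:

```c
#include <stdbool.h>

/* Hypothetical helper, for illustration only (not binutils code):
   return true when the three TMM register numbers are pairwise
   distinct, which is what the three-operand AMX insns require.  */
bool
tmm_operands_distinct (unsigned int r0, unsigned int r1, unsigned int r2)
{
  return r0 != r1 && r0 != r2 && r1 != r2;
}
```

With this, "tdpbssd %tmm1, %tmm1, %tmm0" fails the check because two operands share register 1, while "tdpbssd %tmm3, %tmm2, %tmm1" passes.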
Subject: [PATCH] x86: Add support for Intel AMX instructions
gas/
* doc/c-i386.texi: Document amx_int8, amx_bf16 and amx_tile.
* config/tc-i386.c (i386_error): Add invalid_sib_address and
invalid_tmm_register_set.
(_i386_insn): Add has_regtmm.
(cpu_arch): Add .amx_int8, .amx_bf16 and .amx_tile.
(cpu_noarch): Add noamx_int8, noamx_bf16 and noamx_tile.
(match_simd_size): Add tmmword check.
(operand_type_match): Add tmmword.
(type_names): Add rTMM.
(check_VecOperands): Handle invalid_sib_address and
invalid_tmm_register_set.
(match_template): Handle invalid_sib_address and
invalid_tmm_register_set.
(build_modrm_byte): Handle non-vector SIB and tmmword.
(i386_index_check): Disallow RegIP for non-vector SIB.
(check_register): Handle tmmword.
* testsuite/gas/i386/i386.exp: Add new AMX tests.
* testsuite/gas/i386/intel-regs.d: Add tmm.
* testsuite/gas/i386/intel-regs.s: Add tmm.
* testsuite/gas/i386/x86-64-amx-intel.d: New.
* testsuite/gas/i386/x86-64-amx-inval.l: New.
* testsuite/gas/i386/x86-64-amx-inval.s: New.
* testsuite/gas/i386/x86-64-amx.d: New.
* testsuite/gas/i386/x86-64-amx.s: New.
* testsuite/gas/i386/x86-64-amx-bad.d: New.
* testsuite/gas/i386/x86-64-amx-bad.s: New.
opcodes/
* i386-dis.c (TMM): New.
(EXtmm): Likewise.
(VexTmm): Likewise.
(MVexSIBMEM): Likewise.
(vex_sibmem_mode): Likewise.
(tmm_mode): Likewise.
(REG_VEX_0F3849_P_0_W_0_M_1): Likewise.
(MOD_VEX_0F3849_P_0_W_0): Likewise.
(MOD_VEX_0F3849_P_2_W_0): Likewise.
(MOD_VEX_0F3849_P_3_W_0): Likewise.
(MOD_VEX_0F384B_P_1_W_0): Likewise.
(MOD_VEX_0F384B_P_2_W_0): Likewise.
(MOD_VEX_0F384B_P_3_W_0): Likewise.
(MOD_VEX_0F385C_P_1_W_0): Likewise.
(MOD_VEX_0F385E_P_0_W_0): Likewise.
(MOD_VEX_0F385E_P_1_W_0): Likewise.
(MOD_VEX_0F385E_P_2_W_0): Likewise.
(MOD_VEX_0F385E_P_3_W_0): Likewise.
(RM_VEX_0F3849_P_0_W_0_M_1_R_0): Likewise.
(PREFIX_VEX_0F3849): Likewise.
(PREFIX_VEX_0F384B): Likewise.
(PREFIX_VEX_0F385C): Likewise.
(PREFIX_VEX_0F385E): Likewise.
(X86_64_0F01_REG_3): Likewise.
(X86_64_VEX_0F3849_P_0_W_0_M_0_L_0): Likewise.
(X86_64_VEX_0F3849_P_0_W_0_M_1_REG_0_RM_0_L_0): Likewise.
(X86_64_VEX_0F3849_P_2_W_0_M_0_L_0): Likewise.
(X86_64_VEX_0F3849_P_3_W_0_M_0_L_0): Likewise.
(X86_64_VEX_0F384B_P_1_W_0_M_0_L_0): Likewise.
(X86_64_VEX_0F384B_P_2_W_0_M_0_L_0): Likewise.
(X86_64_VEX_0F384B_P_3_W_0_M_0_L_0): Likewise.
(X86_64_VEX_0F385C_P_1_W_0_M_0_L_0): Likewise.
(X86_64_VEX_0F385E_P_0_W_0_M_0_L_0): Likewise.
(X86_64_VEX_0F385E_P_1_W_0_M_0_L_0): Likewise.
(X86_64_VEX_0F385E_P_2_W_0_M_0_L_0): Likewise.
(X86_64_VEX_0F385E_P_3_W_0_M_0_L_0): Likewise.
(VEX_W_0F3849_P_0): Likewise.
(VEX_W_0F3849_P_2): Likewise.
(VEX_W_0F3849_P_3): Likewise.
(VEX_W_0F384B_P_1): Likewise.
(VEX_W_0F384B_P_2): Likewise.
(VEX_W_0F384B_P_3): Likewise.
(VEX_W_0F385C_P_1): Likewise.
(VEX_W_0F385E_P_0): Likewise.
(VEX_W_0F385E_P_1): Likewise.
(VEX_W_0F385E_P_2): Likewise.
(VEX_W_0F385E_P_3): Likewise.
(VEX_LEN_0F3849_P_0_W_0_M_0): Likewise.
(VEX_LEN_0F3849_P_0_W_0_M_1_REG_0_RM_0): Likewise.
(VEX_LEN_0F3849_P_2_W_0_M_0): Likewise.
(VEX_LEN_0F3849_P_3_W_0_M_0): Likewise.
(VEX_LEN_0F384B_P_1_W_0_M_0): Likewise.
(VEX_LEN_0F384B_P_2_W_0_M_0): Likewise.
(VEX_LEN_0F384B_P_3_W_0_M_0): Likewise.
(VEX_LEN_0F385C_P_1_W_0_M_0): Likewise.
(VEX_LEN_0F385E_P_0_W_0_M_0): Likewise.
(VEX_LEN_0F385E_P_1_W_0_M_0): Likewise.
(VEX_LEN_0F385E_P_2_W_0_M_0): Likewise.
(VEX_LEN_0F385E_P_3_W_0_M_0): Likewise.
(names_tmm): Likewise.
(att_names_tmm): Likewise.
(intel_operand_size): Handle void_mode.
(OP_XMM): Handle tmm_mode.
(OP_EX): Likewise.
(OP_VEX): Likewise.
* i386-gen.c (cpu_flag_init): Add entries for
CpuAMX_INT8, CpuAMX_BF16 and CpuAMX_TILE.
(operand_type_shorthands): Add RegTMM.
(operand_type_init): Likewise.
(operand_types): Add Tmmword.
(cpu_flag_init): Add CPU_AMX_INT8, CpuAMX_BF16 and CpuAMX_TILE.
(cpu_flags): Add CpuAMX_INT8, CpuAMX_BF16 and CpuAMX_TILE.
* i386-opc.h (CpuAMX_INT8): New.
(CpuAMX_BF16): Likewise.
(CpuAMX_TILE): Likewise.
(SIBMEM): Likewise.
(Tmmword): Likewise.
(i386_cpu_flags): Add cpuamx_int8, cpuamx_bf16 and cpuamx_tile.
(i386_opcode_modifier): Extend width of fields vexvvvv and sib.
(i386_operand_type): Add tmmword.
* i386-opc.tbl: Add AMX instructions.
* i386-reg.tbl: Add AMX registers.
* i386-init.h: Regenerated.
* i386-tbl.h: Likewise.
---
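Note for reviewers (not part of the commit message): each hand-encoded sequence in x86-64-amx-bad.s changes one field of the three-byte VEX prefix of "tdpbf16ps %tmm5,%tmm4,%tmm3" (c4 e2 52 5c dc). The C sketch below splits the two payload bytes into their raw fields and applies only the constraints that the test comments call illegal. The struct and function names are made up for illustration; this is not the disassembler code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Raw (as-encoded) fields of a three-byte 0xc4 VEX prefix.
   Illustrative type, not taken from binutils.  */
struct vex3_fields
{
  unsigned int r, x, b;	/* Extension bits, stored inverted.  */
  unsigned int map;	/* mmmmm; 2 selects the 0F38 opcode map.  */
  unsigned int w;	/* VEX.W.  */
  unsigned int vvvv;	/* Register specifier, stored inverted.  */
  unsigned int l;	/* VEX.L.  */
  unsigned int pp;	/* 0 = none, 1 = 66, 2 = F3, 3 = F2.  */
};

/* Split the two bytes that follow 0xc4 into their fields.  */
struct vex3_fields
decode_vex3 (uint8_t byte1, uint8_t byte2)
{
  struct vex3_fields f;

  f.r = (byte1 >> 7) & 1;
  f.x = (byte1 >> 6) & 1;
  f.b = (byte1 >> 5) & 1;
  f.map = byte1 & 0x1f;
  f.w = (byte2 >> 7) & 1;
  f.vvvv = (byte2 >> 3) & 0xf;
  f.l = (byte2 >> 2) & 1;
  f.pp = byte2 & 3;
  return f;
}

/* Register number selected by VEX.vvvv (undo the inversion).  */
unsigned int
vex3_vvvv_reg (struct vex3_fields f)
{
  return f.vvvv ^ 0xf;
}

/* Assumed validity rule for the register-only AMX forms, derived from
   the comments in x86-64-amx-bad.s: W and L must be 0, the raw R and
   B bits must be 1, and vvvv must name one of tmm0..tmm7.  */
bool
amx_reg_form_ok (struct vex3_fields f)
{
  return f.w == 0 && f.l == 0 && f.r == 1 && f.b == 1
	 && vex3_vvvv_reg (f) < 8;
}
```

For the valid encoding, decode_vex3 (0xe2, 0x52) gives map 2, pp 2 and a vvvv register number of 5, i.e. %tmm5. Each of the five tdpbf16ps variants in the test changes one field so that the rule above fails.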
gas/config/tc-i386.c | 97 +++++-
gas/doc/c-i386.texi | 7 +
gas/testsuite/gas/i386/i386.exp | 4 +
gas/testsuite/gas/i386/intel-regs.d | 4 +
gas/testsuite/gas/i386/intel-regs.s | 4 +
gas/testsuite/gas/i386/x86-64-amx-bad.d | 20 ++
gas/testsuite/gas/i386/x86-64-amx-bad.s | 40 +++
gas/testsuite/gas/i386/x86-64-amx-intel.d | 70 ++++
gas/testsuite/gas/i386/x86-64-amx-inval.l | 17 +
gas/testsuite/gas/i386/x86-64-amx-inval.s | 22 ++
gas/testsuite/gas/i386/x86-64-amx.d | 70 ++++
gas/testsuite/gas/i386/x86-64-amx.s | 61 ++++
opcodes/i386-dis.c | 390 +++++++++++++++++++++-
opcodes/i386-gen.c | 18 +
opcodes/i386-opc.h | 16 +-
opcodes/i386-opc.tbl | 23 ++
opcodes/i386-reg.tbl | 9 +
17 files changed, 853 insertions(+), 19 deletions(-)
create mode 100644 gas/testsuite/gas/i386/x86-64-amx-bad.d
create mode 100644 gas/testsuite/gas/i386/x86-64-amx-bad.s
create mode 100644 gas/testsuite/gas/i386/x86-64-amx-intel.d
create mode 100644 gas/testsuite/gas/i386/x86-64-amx-inval.l
create mode 100644 gas/testsuite/gas/i386/x86-64-amx-inval.s
create mode 100644 gas/testsuite/gas/i386/x86-64-amx.d
create mode 100644 gas/testsuite/gas/i386/x86-64-amx.s
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 2e0eb24753..96f9d2a926 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -290,8 +290,10 @@ enum i386_error
unsupported_with_intel_mnemonic,
unsupported_syntax,
unsupported,
+ invalid_sib_address,
invalid_vsib_address,
invalid_vector_register_set,
+ invalid_tmm_register_set,
unsupported_vector_index_register,
unsupported_broadcast,
broadcast_needed,
@@ -372,6 +374,9 @@ struct _i386_insn
/* Has ZMM register operands. */
bfd_boolean has_regzmm;
+ /* Has TMM register operands. */
+ bfd_boolean has_regtmm;
+
/* Has GOTPC or TLS relocation. */
bfd_boolean has_gotpc_tls_reloc;
@@ -1201,6 +1206,12 @@ static const arch_entry cpu_arch[] =
CPU_WAITPKG_FLAGS, 0 },
{ STRING_COMMA_LEN (".cldemote"), PROCESSOR_UNKNOWN,
CPU_CLDEMOTE_FLAGS, 0 },
+ { STRING_COMMA_LEN (".amx_int8"), PROCESSOR_UNKNOWN,
+ CPU_AMX_INT8_FLAGS, 0 },
+ { STRING_COMMA_LEN (".amx_bf16"), PROCESSOR_UNKNOWN,
+ CPU_AMX_BF16_FLAGS, 0 },
+ { STRING_COMMA_LEN (".amx_tile"), PROCESSOR_UNKNOWN,
+ CPU_AMX_TILE_FLAGS, 0 },
{ STRING_COMMA_LEN (".movdiri"), PROCESSOR_UNKNOWN,
CPU_MOVDIRI_FLAGS, 0 },
{ STRING_COMMA_LEN (".movdir64b"), PROCESSOR_UNKNOWN,
@@ -1259,6 +1270,9 @@ static const noarch_entry cpu_noarch[] =
{ STRING_COMMA_LEN ("noavx512_bitalg"), CPU_ANY_AVX512_BITALG_FLAGS },
{ STRING_COMMA_LEN ("noibt"), CPU_ANY_IBT_FLAGS },
{ STRING_COMMA_LEN ("noshstk"), CPU_ANY_SHSTK_FLAGS },
+ { STRING_COMMA_LEN ("noamx_int8"), CPU_ANY_AMX_INT8_FLAGS },
+ { STRING_COMMA_LEN ("noamx_bf16"), CPU_ANY_AMX_BF16_FLAGS },
+ { STRING_COMMA_LEN ("noamx_tile"), CPU_ANY_AMX_TILE_FLAGS },
{ STRING_COMMA_LEN ("nomovdiri"), CPU_ANY_MOVDIRI_FLAGS },
{ STRING_COMMA_LEN ("nomovdir64b"), CPU_ANY_MOVDIR64B_FLAGS },
{ STRING_COMMA_LEN ("noavx512_bf16"), CPU_ANY_AVX512_BF16_FLAGS },
@@ -2159,7 +2173,9 @@ match_simd_size (const insn_template *t, unsigned int wanted,
|| (i.types[given].bitfield.ymmword
&& !t->operand_types[wanted].bitfield.ymmword)
|| (i.types[given].bitfield.zmmword
- && !t->operand_types[wanted].bitfield.zmmword));
+ && !t->operand_types[wanted].bitfield.zmmword)
+ || (i.types[given].bitfield.tmmword
+ && !t->operand_types[wanted].bitfield.tmmword));
}
/* Return 1 if there is no conflict in any size between operand GIVEN
@@ -2296,6 +2312,7 @@ operand_type_match (i386_operand_type overlap,
temp.bitfield.xmmword = 0;
temp.bitfield.ymmword = 0;
temp.bitfield.zmmword = 0;
+ temp.bitfield.tmmword = 0;
if (operand_type_all_zero (&temp))
goto mismatch;
@@ -3304,6 +3321,7 @@ const type_names[] =
{ OPERAND_TYPE_REGXMM, "rXMM" },
{ OPERAND_TYPE_REGYMM, "rYMM" },
{ OPERAND_TYPE_REGZMM, "rZMM" },
+ { OPERAND_TYPE_REGTMM, "rTMM" },
{ OPERAND_TYPE_REGMASK, "Mask reg" },
};
@@ -5790,7 +5808,7 @@ check_VecOperands (const insn_template *t)
/* For VSIB byte, we need a vector register for index, and all vector
registers must be distinct. */
- if (t->opcode_modifier.sib)
+ if (t->opcode_modifier.sib && t->opcode_modifier.sib != SIBMEM)
{
if (!i.index_reg
|| !((t->opcode_modifier.sib == VECSIB128
@@ -5849,6 +5867,23 @@ check_VecOperands (const insn_template *t)
}
}
+ /* For AMX instructions with three tmmword operands, all tmmword operands must be
+ distinct. */
+ if (t->operand_types[0].bitfield.tmmword
+ && i.reg_operands == 3)
+ {
+ if (register_number (i.op[0].regs)
+ == register_number (i.op[1].regs)
+ || register_number (i.op[0].regs)
+ == register_number (i.op[2].regs)
+ || register_number (i.op[1].regs)
+ == register_number (i.op[2].regs))
+ {
+ i.error = invalid_tmm_register_set;
+ return 1;
+ }
+ }
+
/* Check if broadcast is supported by the instruction and is applied
to the memory operand. */
if (i.broadcast)
@@ -6584,12 +6619,18 @@ match_template (char mnem_suffix)
as_bad (_("unsupported instruction `%s'"),
current_templates->start->name);
return NULL;
+ case invalid_sib_address:
+ err_msg = _("invalid SIB address");
+ break;
case invalid_vsib_address:
err_msg = _("invalid VSIB address");
break;
case invalid_vector_register_set:
err_msg = _("mask, index, and destination registers must be distinct");
break;
+ case invalid_tmm_register_set:
+ err_msg = _("tmm register must be distinct");
+ break;
case unsupported_vector_index_register:
err_msg = _("unsupported vector index register");
break;
@@ -7923,8 +7964,11 @@ build_modrm_byte (void)
else if (i.op[dest].regs->reg_type.bitfield.class == RegSIMD
|| i.op[source].regs->reg_type.bitfield.class == RegSIMD)
{
- if (i.types[dest].bitfield.zmmword
- || i.types[source].bitfield.zmmword)
+ if (i.types[dest].bitfield.tmmword
+ || i.types[source].bitfield.tmmword)
+ i.has_regtmm = TRUE;
+ else if (i.types[dest].bitfield.zmmword
+ || i.types[source].bitfield.zmmword)
i.has_regzmm = TRUE;
else if (i.types[dest].bitfield.ymmword
|| i.types[source].bitfield.ymmword)
@@ -7966,7 +8010,9 @@ build_modrm_byte (void)
if (i.tm.opcode_modifier.sib)
{
- if (i.index_reg->reg_num == RegIZ)
+ /* The index register of VSIB shouldn't be RegIZ. */
+ if (i.tm.opcode_modifier.sib != SIBMEM
+ && i.index_reg->reg_num == RegIZ)
abort ();
i.rm.regmem = ESCAPE_TO_TWO_BYTE_ADDRESSING;
@@ -7989,8 +8035,19 @@ build_modrm_byte (void)
i.types[op].bitfield.disp32s = 1;
}
}
- i.sib.index = i.index_reg->reg_num;
- set_rex_vrex (i.index_reg, REX_X, FALSE);
+
+ /* The mandatory SIB always has an index register, so
+ the code logic remains unchanged. The non-mandatory SIB
+ without an index register is allowed and will be handled
+ later. */
+ if (i.index_reg)
+ {
+ if (i.index_reg->reg_num == RegIZ)
+ i.sib.index = NO_INDEX_REGISTER;
+ else
+ i.sib.index = i.index_reg->reg_num;
+ set_rex_vrex (i.index_reg, REX_X, FALSE);
+ }
}
default_seg = &ds;
@@ -8004,7 +8061,9 @@ build_modrm_byte (void)
{
i386_operand_type newdisp;
- gas_assert (!i.tm.opcode_modifier.sib);
+ /* Check for both VSIB and mandatory non-vector SIB. */
+ gas_assert (!i.tm.opcode_modifier.sib
+ || i.tm.opcode_modifier.sib == SIBMEM);
/* Operand is just <disp> */
if (flag_code == CODE_64BIT)
{
@@ -8142,7 +8201,11 @@ build_modrm_byte (void)
i.sib.scale = i.log2_scale_factor;
if (i.index_reg == 0)
{
- gas_assert (!i.tm.opcode_modifier.sib);
+ /* Only check for VSIB. */
+ gas_assert (i.tm.opcode_modifier.sib != VECSIB128
+ && i.tm.opcode_modifier.sib != VECSIB256
+ && i.tm.opcode_modifier.sib != VECSIB512);
+
/* <disp>(%esp) becomes two byte modrm with no index
register. We've already stored the code for esp
in i.rm.regmem ie. ESCAPE_TO_TWO_BYTE_ADDRESSING.
@@ -8267,7 +8330,9 @@ build_modrm_byte (void)
break;
if (i.types[op].bitfield.class == RegSIMD)
{
- if (i.types[op].bitfield.zmmword)
+ if (i.types[op].bitfield.tmmword)
+ i.has_regtmm = TRUE;
+ else if (i.types[op].bitfield.zmmword)
i.has_regzmm = TRUE;
else if (i.types[op].bitfield.ymmword)
i.has_regymm = TRUE;
@@ -10926,9 +10991,10 @@ i386_index_check (const char *operand_string)
|| !i.index_reg->reg_type.bitfield.baseindex)))
goto bad_address;
- /* bndmk, bndldx, and bndstx have special restrictions. */
+ /* bndmk, bndldx, bndstx and mandatory non-vector SIB have special restrictions. */
if (current_templates->start->base_opcode == 0xf30f1b
- || (current_templates->start->base_opcode & ~1) == 0x0f1a)
+ || (current_templates->start->base_opcode & ~1) == 0x0f1a
+ || current_templates->start->opcode_modifier.sib == SIBMEM)
{
/* They cannot use RIP-relative addressing. */
if (i.base_reg && i.base_reg->reg_num == RegIP)
@@ -10938,7 +11004,7 @@ i386_index_check (const char *operand_string)
}
/* bndldx and bndstx ignore their scale factor. */
- if (current_templates->start->base_opcode != 0xf30f1b
+ if ((current_templates->start->base_opcode & ~1) == 0x0f1a
&& i.log2_scale_factor)
as_warn (_("register scaling is being ignored here"));
}
@@ -12440,6 +12506,11 @@ static bfd_boolean check_register (const reg_entry *r)
}
}
+ if (r->reg_type.bitfield.tmmword
+ && (!cpu_arch_flags.bitfield.cpuamx_tile
+ || flag_code != CODE_64BIT))
+ return FALSE;
+
if (r->reg_type.bitfield.class == RegBND && !cpu_arch_flags.bitfield.cpumpx)
return FALSE;
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index d4e6fcb698..cb86cc7968 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -226,6 +226,12 @@ accept various extension mnemonics. For example,
@code{noenqcmd},
@code{noserialize},
@code{notsxldtrk},
+@code{amx_int8},
+@code{noamx_int8},
+@code{amx_bf16},
+@code{noamx_bf16},
+@code{amx_tile},
+@code{noamx_tile},
@code{vmx},
@code{vmfunc},
@code{smx},
@@ -1504,6 +1510,7 @@ supported on the CPU specified. The choices for @var{cpu_type} are:
@item @samp{.wbnoinvd} @tab @samp{.pconfig} @tab @samp{.waitpkg} @tab @samp{.cldemote}
@item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq}
@item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk}
+@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_tile}
@item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5}
@item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme}
@item @samp{.lwp} @tab @samp{.fma4} @tab @samp{.xop} @tab @samp{.cx16}
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index 55929d3acb..bd4adb07ef 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -1137,6 +1137,10 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_64_check]] t
run_dump_test "x86-64-lfence-ret-d"
run_dump_test "x86-64-lfence-ret-e"
run_dump_test "x86-64-lfence-byte"
+ run_list_test "x86-64-amx-inval"
+ run_dump_test "x86-64-amx"
+ run_dump_test "x86-64-amx-intel"
+ run_dump_test "x86-64-amx-bad"
if { ![istarget "*-*-aix*"]
&& ![istarget "*-*-beos*"]
diff --git a/gas/testsuite/gas/i386/intel-regs.d b/gas/testsuite/gas/i386/intel-regs.d
index 65bcb6ca7d..480b291c91 100644
--- a/gas/testsuite/gas/i386/intel-regs.d
+++ b/gas/testsuite/gas/i386/intel-regs.d
@@ -6,6 +6,7 @@
Disassembly of section \.text:
0+0 <.*>:
+.*[ ]+R_386_32[ ]+tmm1
.*[ ]+R_386_16[ ]+eax
.*[ ]+R_386_16[ ]+rax
.*[ ]+R_386_16[ ]+axl
@@ -53,4 +54,7 @@ Disassembly of section \.text:
.* <ymm8>:
.*[ ]+<ymm8>
+
+.* <tmm0>:
+.*[ ]+<tmm0>
#pass
diff --git a/gas/testsuite/gas/i386/intel-regs.s b/gas/testsuite/gas/i386/intel-regs.s
index 66ab16dfc5..44e369bb0f 100644
--- a/gas/testsuite/gas/i386/intel-regs.s
+++ b/gas/testsuite/gas/i386/intel-regs.s
@@ -1,6 +1,8 @@
.text
.intel_syntax noprefix
+ mov eax, tmm1
+
.arch i286
.code16
mov ax, eax ; add [bx+si], al
@@ -59,3 +61,5 @@
mov rax, r8
ymm8:
jmp ymm8
+tmm0:
+ jmp tmm0
diff --git a/gas/testsuite/gas/i386/x86-64-amx-bad.d b/gas/testsuite/gas/i386/x86-64-amx-bad.d
new file mode 100644
index 0000000000..2957b6a15b
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-bad.d
@@ -0,0 +1,20 @@
+#as:
+#objdump: -drw
+#name: x86_64 AMX insns (bad encodings)
+#source: x86-64-amx-bad.s
+
+.*: +file format .*
+
+
+Disassembly of section \.text:
+
+0+ <\.text>:
+[ ]*[a-f0-9]+:[ ]*c4 e2 d2 5c[ ]*\(bad\)[ ]*
+[ ]*[a-f0-9]+:[ ]*dc 90 90 90 90 90[ ]*fcoml.*
+[ ]*[a-f0-9]+:[ ]*c4 e2 56 5c[ ]*\(bad\)[ ]*
+[ ]*[a-f0-9]+:[ ]*dc 90 90 90 90 90[ ]*fcoml.*
+[ ]*[a-f0-9]+:[ ]*c4 62 52 5c dc[ ]*tdpbf16ps %tmm5,%tmm4,\(bad\)
+[ ]*[a-f0-9]+:[ ]*c4 c2 52 5c dc[ ]*tdpbf16ps %tmm5,\(bad\),%tmm3
+[ ]*[a-f0-9]+:[ ]*c4 e2 32 5c dc[ ]*tdpbf16ps \(bad\),%tmm4,%tmm3
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 4b 09[ ]*tileloadd \(bad\),%tmm1
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-amx-bad.s b/gas/testsuite/gas/i386/x86-64-amx-bad.s
new file mode 100644
index 0000000000..f0db1a9493
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-bad.s
@@ -0,0 +1,40 @@
+.text
+ #tdpbf16ps %tmm5,%tmm4,%tmm3 set VEX.W = 1 (illegal value).
+ .byte 0xc4
+ .byte 0xe2
+ .byte 0xd2
+ .byte 0x5c
+ .byte 0xdc
+ .fill 0x05, 0x01, 0x90
+ #tdpbf16ps %tmm5,%tmm4,%tmm3 set VEX.L = 1 (illegal value).
+ .byte 0xc4
+ .byte 0xe2
+ .byte 0x56
+ .byte 0x5c
+ .byte 0xdc
+ .fill 0x05, 0x01, 0x90
+ #tdpbf16ps %tmm5,%tmm4,%tmm3 set VEX.R = 0 (illegal value).
+ .byte 0xc4
+ .byte 0x62
+ .byte 0x52
+ .byte 0x5c
+ .byte 0xdc
+ #tdpbf16ps %tmm5,%tmm4,%tmm3 set VEX.B = 0 (illegal value).
+ .byte 0xc4
+ .byte 0xc2
+ .byte 0x52
+ .byte 0x5c
+ .byte 0xdc
+ #tdpbf16ps %tmm5,%tmm4,%tmm3 set VEX.VVVV = 0110 (illegal value).
+ .byte 0xc4
+ .byte 0xe2
+ .byte 0x32
+ .byte 0x5c
+ .byte 0xdc
+ #tileloadd (%rax),%tmm1 set R/M= 001 (illegal value) without SIB.
+ .byte 0xc4
+ .byte 0xe2
+ .byte 0x7b
+ .byte 0x4b
+ .byte 0x09
+
diff --git a/gas/testsuite/gas/i386/x86-64-amx-intel.d b/gas/testsuite/gas/i386/x86-64-amx-intel.d
new file mode 100644
index 0000000000..fc5e0745ea
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-intel.d
@@ -0,0 +1,70 @@
+#as:
+#objdump: -d -Mintel
+#name: x86_64 AMX insns in Intel syntax
+#source: x86-64-amx.s
+
+.*: +file format .*
+
+
+Disassembly of section \.text:
+
+0+ <_start>:
+[ ]*[a-f0-9]+:[ ]*c4 e2 78 49 04 51[ ]*ldtilecfg \[rcx\+rdx\*2\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 49 04 51[ ]*sttilecfg \[rcx\+rdx\*2\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 52 5c dc[ ]*tdpbf16ps tmm3,tmm4,tmm5
+[ ]*[a-f0-9]+:[ ]*c4 e2 63 5e ca[ ]*tdpbssd tmm1,tmm2,tmm3
+[ ]*[a-f0-9]+:[ ]*c4 e2 62 5e ca[ ]*tdpbsud tmm1,tmm2,tmm3
+[ ]*[a-f0-9]+:[ ]*c4 e2 61 5e ca[ ]*tdpbusd tmm1,tmm2,tmm3
+[ ]*[a-f0-9]+:[ ]*c4 e2 60 5e ca[ ]*tdpbuud tmm1,tmm2,tmm3
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 4b 2c 25 00[ ]*tileloadd tmm5,ds:0x0
+[ ]*[a-f0-9]+:[ ]*00 00 00[ ]*
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 4b 2c 21[ ]*tileloadd tmm5,\[rcx\+riz\*1\]
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7b 4b 2c 21[ ]*tileloadd tmm5,\[ecx\+eiz\*1\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 4b 2c 11[ ]*tileloadd tmm5,\[rcx\+rdx\*1\]
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7b 4b 0c 51[ ]*tileloadd tmm1,\[ecx\+edx\*2\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 2c 25 00[ ]*tileloaddt1 tmm5,ds:0x0
+[ ]*[a-f0-9]+:[ ]*00 00 00[ ]*
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 2c 21[ ]*tileloaddt1 tmm5,\[rcx\+riz\*1\]
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 79 4b 2c 21[ ]*tileloaddt1 tmm5,\[ecx\+eiz\*1\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 2c 11[ ]*tileloaddt1 tmm5,\[rcx\+rdx\*1\]
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 79 4b 0c 51[ ]*tileloaddt1 tmm1,\[ecx\+edx\*2\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 0c 61[ ]*tileloaddt1 tmm1,\[rcx\+riz\*2\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 78 49 c0[ ]*tilerelease *
+[ ]*[a-f0-9]+:[ ]*c4 e2 7a 4b 2c 21[ ]*tilestored \[rcx\+riz\*1\],tmm5
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7a 4b 2c 21[ ]*tilestored \[ecx\+eiz\*1\],tmm5
+[ ]*[a-f0-9]+:[ ]*c4 e2 7a 4b 2c 11[ ]*tilestored \[rcx\+rdx\*1\],tmm5
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7a 4b 0c 51[ ]*tilestored \[ecx\+edx\*2\],tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 49 c0[ ]*tilezero tmm0
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 49 e8[ ]*tilezero tmm5
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 49 f8[ ]*tilezero tmm7
+[ ]*[a-f0-9]+:[ ]*c4 e2 78 49 01[ ]*ldtilecfg \[rcx\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 78 49 03[ ]*ldtilecfg \[rbx\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 49 01[ ]*sttilecfg \[rcx\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 49 03[ ]*sttilecfg \[rbx\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 52 5c dc[ ]*tdpbf16ps tmm3,tmm4,tmm5
+[ ]*[a-f0-9]+:[ ]*c4 e2 63 5e ca[ ]*tdpbssd tmm1,tmm2,tmm3
+[ ]*[a-f0-9]+:[ ]*c4 e2 62 5e ca[ ]*tdpbsud tmm1,tmm2,tmm3
+[ ]*[a-f0-9]+:[ ]*c4 e2 61 5e ca[ ]*tdpbusd tmm1,tmm2,tmm3
+[ ]*[a-f0-9]+:[ ]*c4 e2 60 5e ca[ ]*tdpbuud tmm1,tmm2,tmm3
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 4b 2c 25 00[ ]*tileloadd tmm5,ds:0x0
+[ ]*[a-f0-9]+:[ ]*00 00 00[ ]*
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 4b 2c 21[ ]*tileloadd tmm5,\[rcx\+riz\*1\]
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7b 4b 2c 21[ ]*tileloadd tmm5,\[ecx\+eiz\*1\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 4b 2c 11[ ]*tileloadd tmm5,\[rcx\+rdx\*1\]
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7b 4b 0c 51[ ]*tileloadd tmm1,\[ecx\+edx\*2\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 2c 25 00[ ]*tileloaddt1 tmm5,ds:0x0
+[ ]*[a-f0-9]+:[ ]*00 00 00[ ]*
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 2c 21[ ]*tileloaddt1 tmm5,\[rcx\+riz\*1\]
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 79 4b 2c 21[ ]*tileloaddt1 tmm5,\[ecx\+eiz\*1\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 2c 11[ ]*tileloaddt1 tmm5,\[rcx\+rdx\*1\]
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 79 4b 0c 51[ ]*tileloaddt1 tmm1,\[ecx\+edx\*2\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 0c 61[ ]*tileloaddt1 tmm1,\[rcx\+riz\*2\]
+[ ]*[a-f0-9]+:[ ]*c4 e2 78 49 c0[ ]*tilerelease *
+[ ]*[a-f0-9]+:[ ]*c4 e2 7a 4b 2c 21[ ]*tilestored \[rcx\+riz\*1\],tmm5
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7a 4b 2c 21[ ]*tilestored \[ecx\+eiz\*1\],tmm5
+[ ]*[a-f0-9]+:[ ]*c4 e2 7a 4b 2c 11[ ]*tilestored \[rcx\+rdx\*1\],tmm5
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7a 4b 0c 51[ ]*tilestored \[ecx\+edx\*2\],tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 49 c0[ ]*tilezero tmm0
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 49 e8[ ]*tilezero tmm5
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 49 f8[ ]*tilezero tmm7
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-amx-inval.l b/gas/testsuite/gas/i386/x86-64-amx-inval.l
new file mode 100644
index 0000000000..e7c284fd71
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-inval.l
@@ -0,0 +1,17 @@
+.* Assembler messages:
+.*:5: Error: `\(%rip\)' cannot be used here
+.*:6: Error: `\(%rip\)' cannot be used here
+.*:7: Error: `\(%rip\)' cannot be used here
+.*:8: Error: operand size mismatch for `tdpbssd'
+.*:9: Error: operand size mismatch for `vaddps'
+.*:10: Error: tmm register must be distinct for `tdpbssd'
+.*:11: Error: tmm register must be distinct for `tdpbssd'
+.*:12: Error: tmm register must be distinct for `tdpbssd'
+.*:15: Error: `\[rip\]' cannot be used here
+.*:16: Error: `\[rip\]' cannot be used here
+.*:17: Error: `\[rip\]' cannot be used here
+.*:18: Error: operand size mismatch for `tdpbssd'
+.*:19: Error: operand size mismatch for `vaddps'
+.*:20: Error: tmm register must be distinct for `tdpbssd'
+.*:21: Error: tmm register must be distinct for `tdpbssd'
+.*:22: Error: tmm register must be distinct for `tdpbssd'
diff --git a/gas/testsuite/gas/i386/x86-64-amx-inval.s b/gas/testsuite/gas/i386/x86-64-amx-inval.s
new file mode 100644
index 0000000000..6e29453669
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx-inval.s
@@ -0,0 +1,22 @@
+# Check illegal SIBMEM and register size used in AMX instructions
+
+ .text
+_start:
+ tileloadd (%rip), %tmm1
+ tileloaddt1 (%rip), %tmm1
+ tilestored %tmm1, (%rip)
+ tdpbssd %xmm1, %xmm2, %xmm3
+ vaddps %tmm1, %tmm2, %tmm3
+ tdpbssd %tmm1, %tmm1, %tmm0
+ tdpbssd %tmm1, %tmm0, %tmm1
+ tdpbssd %tmm0, %tmm1, %tmm1
+
+ .intel_syntax noprefix
+ tileloadd tmm1, [rip]
+ tileloaddt1 tmm1, [rip]
+ tilestored [rip], tmm1
+ tdpbssd xmm3, xmm2, xmm1
+ vaddps %tmm1, %tmm2, %tmm3
+ tdpbssd tmm0, tmm1, tmm1
+ tdpbssd tmm1, tmm0, tmm1
+ tdpbssd tmm1, tmm1, tmm0
diff --git a/gas/testsuite/gas/i386/x86-64-amx.d b/gas/testsuite/gas/i386/x86-64-amx.d
new file mode 100644
index 0000000000..ad6f42240b
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx.d
@@ -0,0 +1,70 @@
+#as:
+#objdump: -d
+#name: x86_64 AMX insns
+#source: x86-64-amx.s
+
+.*: +file format .*
+
+
+Disassembly of section \.text:
+
+0+ <_start>:
+[ ]*[a-f0-9]+:[ ]*c4 e2 78 49 04 51[ ]*ldtilecfg \(%rcx,%rdx,2\)
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 49 04 51[ ]*sttilecfg \(%rcx,%rdx,2\)
+[ ]*[a-f0-9]+:[ ]*c4 e2 52 5c dc[ ]*tdpbf16ps %tmm5,%tmm4,%tmm3
+[ ]*[a-f0-9]+:[ ]*c4 e2 63 5e ca[ ]*tdpbssd %tmm3,%tmm2,%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 62 5e ca[ ]*tdpbsud %tmm3,%tmm2,%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 61 5e ca[ ]*tdpbusd %tmm3,%tmm2,%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 60 5e ca[ ]*tdpbuud %tmm3,%tmm2,%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 4b 2c 25 00[ ]*tileloadd 0x0,%tmm5
+[ ]*[a-f0-9]+:[ ]*00 00 00[ ]*
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 4b 2c 21[ ]*tileloadd \(%rcx,%riz,1\),%tmm5
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7b 4b 2c 21[ ]*tileloadd \(%ecx,%eiz,1\),%tmm5
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 4b 2c 11[ ]*tileloadd \(%rcx,%rdx,1\),%tmm5
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7b 4b 0c 51[ ]*tileloadd \(%ecx,%edx,2\),%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 2c 25 00[ ]*tileloaddt1 0x0,%tmm5
+[ ]*[a-f0-9]+:[ ]*00 00 00[ ]*
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 2c 21[ ]*tileloaddt1 \(%rcx,%riz,1\),%tmm5
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 79 4b 2c 21[ ]*tileloaddt1 \(%ecx,%eiz,1\),%tmm5
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 2c 11[ ]*tileloaddt1 \(%rcx,%rdx,1\),%tmm5
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 79 4b 0c 51[ ]*tileloaddt1 \(%ecx,%edx,2\),%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 0c 61[ ]*tileloaddt1 \(%rcx,%riz,2\),%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 78 49 c0[ ]*tilerelease *
+[ ]*[a-f0-9]+:[ ]*c4 e2 7a 4b 2c 21[ ]*tilestored %tmm5,\(%rcx,%riz,1\)
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7a 4b 2c 21[ ]*tilestored %tmm5,\(%ecx,%eiz,1\)
+[ ]*[a-f0-9]+:[ ]*c4 e2 7a 4b 2c 11[ ]*tilestored %tmm5,\(%rcx,%rdx,1\)
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7a 4b 0c 51[ ]*tilestored %tmm1,\(%ecx,%edx,2\)
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 49 c0[ ]*tilezero %tmm0
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 49 e8[ ]*tilezero %tmm5
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 49 f8[ ]*tilezero %tmm7
+[ ]*[a-f0-9]+:[ ]*c4 e2 78 49 01[ ]*ldtilecfg \(%rcx\)
+[ ]*[a-f0-9]+:[ ]*c4 e2 78 49 03[ ]*ldtilecfg \(%rbx\)
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 49 01[ ]*sttilecfg \(%rcx\)
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 49 03[ ]*sttilecfg \(%rbx\)
+[ ]*[a-f0-9]+:[ ]*c4 e2 52 5c dc[ ]*tdpbf16ps %tmm5,%tmm4,%tmm3
+[ ]*[a-f0-9]+:[ ]*c4 e2 63 5e ca[ ]*tdpbssd %tmm3,%tmm2,%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 62 5e ca[ ]*tdpbsud %tmm3,%tmm2,%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 61 5e ca[ ]*tdpbusd %tmm3,%tmm2,%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 60 5e ca[ ]*tdpbuud %tmm3,%tmm2,%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 4b 2c 25 00[ ]*tileloadd 0x0,%tmm5
+[ ]*[a-f0-9]+:[ ]*00 00 00[ ]*
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 4b 2c 21[ ]*tileloadd \(%rcx,%riz,1\),%tmm5
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7b 4b 2c 21[ ]*tileloadd \(%ecx,%eiz,1\),%tmm5
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 4b 2c 11[ ]*tileloadd \(%rcx,%rdx,1\),%tmm5
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7b 4b 0c 51[ ]*tileloadd \(%ecx,%edx,2\),%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 2c 25 00[ ]*tileloaddt1 0x0,%tmm5
+[ ]*[a-f0-9]+:[ ]*00 00 00[ ]*
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 2c 21[ ]*tileloaddt1 \(%rcx,%riz,1\),%tmm5
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 79 4b 2c 21[ ]*tileloaddt1 \(%ecx,%eiz,1\),%tmm5
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 2c 11[ ]*tileloaddt1 \(%rcx,%rdx,1\),%tmm5
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 79 4b 0c 51[ ]*tileloaddt1 \(%ecx,%edx,2\),%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 79 4b 0c 61[ ]*tileloaddt1 \(%rcx,%riz,2\),%tmm1
+[ ]*[a-f0-9]+:[ ]*c4 e2 78 49 c0[ ]*tilerelease *
+[ ]*[a-f0-9]+:[ ]*c4 e2 7a 4b 2c 21[ ]*tilestored %tmm5,\(%rcx,%riz,1\)
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7a 4b 2c 21[ ]*tilestored %tmm5,\(%ecx,%eiz,1\)
+[ ]*[a-f0-9]+:[ ]*c4 e2 7a 4b 2c 11[ ]*tilestored %tmm5,\(%rcx,%rdx,1\)
+[ ]*[a-f0-9]+:[ ]*67 c4 e2 7a 4b 0c 51[ ]*tilestored %tmm1,\(%ecx,%edx,2\)
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 49 c0[ ]*tilezero %tmm0
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 49 e8[ ]*tilezero %tmm5
+[ ]*[a-f0-9]+:[ ]*c4 e2 7b 49 f8[ ]*tilezero %tmm7
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-amx.s b/gas/testsuite/gas/i386/x86-64-amx.s
new file mode 100644
index 0000000000..c70543152b
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-amx.s
@@ -0,0 +1,61 @@
+
+ .allow_index_reg
+ .text
+_start:
+ ldtilecfg (%rcx,%rdx,2)
+ sttilecfg (%rcx,%rdx,2)
+ tdpbf16ps %tmm5, %tmm4, %tmm3
+ tdpbssd %tmm3, %tmm2, %tmm1
+ tdpbsud %tmm3, %tmm2, %tmm1
+ tdpbusd %tmm3, %tmm2, %tmm1
+ tdpbuud %tmm3, %tmm2, %tmm1
+ tileloadd foo, %tmm5
+ tileloadd (%rcx), %tmm5
+ tileloadd (%ecx), %tmm5
+ tileloadd (%rcx,%rdx,1), %tmm5
+ tileloadd (%ecx,%edx,2), %tmm1
+ tileloaddt1 foo, %tmm5
+ tileloaddt1 (%rcx), %tmm5
+ tileloaddt1 (%ecx), %tmm5
+ tileloaddt1 (%rcx,%rdx,1), %tmm5
+ tileloaddt1 (%ecx,%edx,2), %tmm1
+ tileloaddt1 (%rcx,%riz,2), %tmm1
+ tilerelease
+ tilestored %tmm5, (%rcx)
+ tilestored %tmm5, (%ecx)
+ tilestored %tmm5, (%rcx,%rdx,1)
+ tilestored %tmm1, (%ecx,%edx,2)
+ tilezero %tmm0
+ tilezero %tmm5
+ tilezero %tmm7
+
+
+ .intel_syntax noprefix
+ ldtilecfg [rcx]
+ ldtilecfg [rbx]
+ sttilecfg [rcx]
+ sttilecfg [rbx]
+ tdpbf16ps tmm3, tmm4, tmm5
+ tdpbssd tmm1, tmm2, tmm3
+ tdpbsud tmm1, tmm2, tmm3
+ tdpbusd tmm1, tmm2, tmm3
+ tdpbuud tmm1, tmm2, tmm3
+ tileloadd tmm5, foo
+ tileloadd tmm5, [rcx]
+ tileloadd tmm5, [ecx]
+ tileloadd tmm5, [rcx+rdx]
+ tileloadd tmm1, [ecx+edx*2]
+ tileloaddt1 tmm5, foo
+ tileloaddt1 tmm5, [rcx]
+ tileloaddt1 tmm5, [ecx]
+ tileloaddt1 tmm5, [rcx+rdx]
+ tileloaddt1 tmm1, [ecx+edx*2]
+ tileloaddt1 tmm1, [rcx+riz*2]
+ tilerelease
+ tilestored [rcx], tmm5
+ tilestored [ecx], tmm5
+ tilestored [rcx+rdx], tmm5
+ tilestored [ecx+edx*2], tmm1
+ tilezero tmm0
+ tilezero tmm5
+ tilezero tmm7
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index 956e2c3539..2b4ad3cd4e 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -375,6 +375,7 @@ fetch_data (struct disassemble_info *info, bfd_byte *addr)
#define XMScalar { OP_XMM, scalar_mode }
#define XMGatherQ { OP_XMM, vex_vsib_q_w_dq_mode }
#define XMM { OP_XMM, xmm_mode }
+#define TMM { OP_XMM, tmm_mode }
#define XMxmmq { OP_XMM, xmmq_mode }
#define EM { OP_EM, v_mode }
#define EMS { OP_EM, v_swap_mode }
@@ -391,6 +392,7 @@ fetch_data (struct disassemble_info *info, bfd_byte *addr)
#define EXxS { OP_EX, x_swap_mode }
#define EXxmm { OP_EX, xmm_mode }
#define EXymm { OP_EX, ymm_mode }
+#define EXtmm { OP_EX, tmm_mode }
#define EXxmmq { OP_EX, xmmq_mode }
#define EXEvexHalfBcstXmmq { OP_EX, evex_half_bcst_xmmq_mode }
#define EXxmm_mb { OP_EX, xmm_mb_mode }
@@ -421,6 +423,7 @@ fetch_data (struct disassemble_info *info, bfd_byte *addr)
#define Vex128 { OP_VEX, vex128_mode }
#define Vex256 { OP_VEX, vex256_mode }
#define VexGdq { OP_VEX, dq_mode }
+#define VexTmm { OP_VEX, tmm_mode }
#define EXdVexScalarS { OP_EX_Vex, d_scalar_swap_mode }
#define EXqVexScalarS { OP_EX_Vex, q_scalar_swap_mode }
#define EXVexW { OP_EX_VexW, x_mode }
@@ -451,6 +454,8 @@ fetch_data (struct disassemble_info *info, bfd_byte *addr)
#define MVexVSIBQWpX { OP_M, vex_vsib_q_w_dq_mode }
#define MVexVSIBQDWpX { OP_M, vex_vsib_q_w_d_mode }
+#define MVexSIBMEM { OP_M, vex_sibmem_mode }
+
/* Used handle "rep" prefix for string instructions. */
#define Xbr { REP_Fixup, eSI_reg }
#define Xvr { REP_Fixup, eSI_reg }
@@ -542,6 +547,8 @@ enum
ymmq_mode,
/* 32-byte YMM or 16-byte word operand */
ymmxmm_mode,
+ /* TMM operand */
+ tmm_mode,
/* d_mode in 32bit, q_mode in 64bit mode. */
m_mode,
/* pair of v_mode operands */
@@ -595,6 +602,8 @@ enum
vex_vsib_q_w_dq_mode,
/* Similar to vex_vsib_q_w_dq_mode, with smaller memory. */
vex_vsib_q_w_d_mode,
+ /* mandatory non-vector SIB. */
+ vex_sibmem_mode,
/* scalar, ignore vector length. */
scalar_mode,
@@ -743,6 +752,7 @@ enum
REG_VEX_0F72,
REG_VEX_0F73,
REG_VEX_0FAE,
+ REG_VEX_0F3849_P_0_W_0_M_1,
REG_VEX_0F38F3,
REG_XOP_LWPCB,
REG_XOP_LWP,
@@ -826,6 +836,17 @@ enum
MOD_0FE7_PREFIX_2,
MOD_0FF0_PREFIX_3,
MOD_0F382A_PREFIX_2,
+ MOD_VEX_0F3849_P_0_W_0,
+ MOD_VEX_0F3849_P_2_W_0,
+ MOD_VEX_0F3849_P_3_W_0,
+ MOD_VEX_0F384B_P_1_W_0,
+ MOD_VEX_0F384B_P_2_W_0,
+ MOD_VEX_0F384B_P_3_W_0,
+ MOD_VEX_0F385C_P_1_W_0,
+ MOD_VEX_0F385E_P_0_W_0,
+ MOD_VEX_0F385E_P_1_W_0,
+ MOD_VEX_0F385E_P_2_W_0,
+ MOD_VEX_0F385E_P_3_W_0,
MOD_0F38F5_PREFIX_2,
MOD_0F38F6_PREFIX_0,
MOD_0F38F8_PREFIX_1,
@@ -963,6 +984,7 @@ enum
RM_0F1E_P_1_MOD_3_REG_7,
RM_0FAE_REG_6_MOD_3_P_0,
RM_0FAE_REG_7_MOD_3,
+ RM_VEX_0F3849_P_0_W_0_M_1_R_0
};
enum
@@ -1298,9 +1320,13 @@ enum
PREFIX_VEX_0F3845,
PREFIX_VEX_0F3846,
PREFIX_VEX_0F3847,
+ PREFIX_VEX_0F3849,
+ PREFIX_VEX_0F384B,
PREFIX_VEX_0F3858,
PREFIX_VEX_0F3859,
PREFIX_VEX_0F385A,
+ PREFIX_VEX_0F385C,
+ PREFIX_VEX_0F385E,
PREFIX_VEX_0F3878,
PREFIX_VEX_0F3879,
PREFIX_VEX_0F388C,
@@ -1673,7 +1699,19 @@ enum
X86_64_0F01_REG_0,
X86_64_0F01_REG_1,
X86_64_0F01_REG_2,
- X86_64_0F01_REG_3
+ X86_64_0F01_REG_3,
+ X86_64_VEX_0F3849_P_0_W_0_M_0_L_0,
+ X86_64_VEX_0F3849_P_0_W_0_M_1_REG_0_RM_0_L_0,
+ X86_64_VEX_0F3849_P_2_W_0_M_0_L_0,
+ X86_64_VEX_0F3849_P_3_W_0_M_0_L_0,
+ X86_64_VEX_0F384B_P_1_W_0_M_0_L_0,
+ X86_64_VEX_0F384B_P_2_W_0_M_0_L_0,
+ X86_64_VEX_0F384B_P_3_W_0_M_0_L_0,
+ X86_64_VEX_0F385C_P_1_W_0_M_0_L_0,
+ X86_64_VEX_0F385E_P_0_W_0_M_0_L_0,
+ X86_64_VEX_0F385E_P_1_W_0_M_0_L_0,
+ X86_64_VEX_0F385E_P_2_W_0_M_0_L_0,
+ X86_64_VEX_0F385E_P_3_W_0_M_0_L_0
};
enum
@@ -1758,7 +1796,19 @@ enum
VEX_LEN_0F381A_P_2_M_0,
VEX_LEN_0F3836_P_2,
VEX_LEN_0F3841_P_2,
+ VEX_LEN_0F3849_P_0_W_0_M_0,
+ VEX_LEN_0F3849_P_0_W_0_M_1_REG_0_RM_0,
+ VEX_LEN_0F3849_P_2_W_0_M_0,
+ VEX_LEN_0F3849_P_3_W_0_M_0,
+ VEX_LEN_0F384B_P_1_W_0_M_0,
+ VEX_LEN_0F384B_P_2_W_0_M_0,
+ VEX_LEN_0F384B_P_3_W_0_M_0,
VEX_LEN_0F385A_P_2_M_0,
+ VEX_LEN_0F385C_P_1_W_0_M_0,
+ VEX_LEN_0F385E_P_0_W_0_M_0,
+ VEX_LEN_0F385E_P_1_W_0_M_0,
+ VEX_LEN_0F385E_P_2_W_0_M_0,
+ VEX_LEN_0F385E_P_3_W_0_M_0,
VEX_LEN_0F38DB_P_2,
VEX_LEN_0F38F2_P_0,
VEX_LEN_0F38F3_R_1_P_0,
@@ -1926,9 +1976,20 @@ enum
VEX_W_0F382F_P_2_M_0,
VEX_W_0F3836_P_2,
VEX_W_0F3846_P_2,
+ VEX_W_0F3849_P_0,
+ VEX_W_0F3849_P_2,
+ VEX_W_0F3849_P_3,
+ VEX_W_0F384B_P_1,
+ VEX_W_0F384B_P_2,
+ VEX_W_0F384B_P_3,
VEX_W_0F3858_P_2,
VEX_W_0F3859_P_2,
VEX_W_0F385A_P_2_M_0,
+ VEX_W_0F385C_P_1,
+ VEX_W_0F385E_P_0,
+ VEX_W_0F385E_P_1,
+ VEX_W_0F385E_P_2,
+ VEX_W_0F385E_P_3,
VEX_W_0F3878_P_2,
VEX_W_0F3879_P_2,
VEX_W_0F38CF_P_2,
@@ -3045,6 +3106,16 @@ static const char *att_names_zmm[] = {
"%zmm28", "%zmm29", "%zmm30", "%zmm31"
};
+static const char **names_tmm;
+static const char *intel_names_tmm[] = {
+ "tmm0", "tmm1", "tmm2", "tmm3",
+ "tmm4", "tmm5", "tmm6", "tmm7"
+};
+static const char *att_names_tmm[] = {
+ "%tmm0", "%tmm1", "%tmm2", "%tmm3",
+ "%tmm4", "%tmm5", "%tmm6", "%tmm7"
+};
+
static const char **names_mask;
static const char *intel_names_mask[] = {
"k0", "k1", "k2", "k3", "k4", "k5", "k6", "k7"
@@ -3413,6 +3484,10 @@ static const struct dis386 reg_table[][8] = {
{ MOD_TABLE (MOD_VEX_0FAE_REG_2) },
{ MOD_TABLE (MOD_VEX_0FAE_REG_3) },
},
+ /* REG_VEX_0F3849_P_0_W_0_M_1 */
+ {
+ { RM_TABLE (RM_VEX_0F3849_P_0_W_0_M_1_R_0) },
+ },
/* REG_VEX_0F38F3 */
{
{ Bad_Opcode },
@@ -5794,6 +5869,22 @@ static const struct dis386 prefix_table[][4] = {
{ "vpsllv%LW", { XM, Vex, EXx }, 0 },
},
+ /* PREFIX_VEX_0F3849 */
+ {
+ { VEX_W_TABLE (VEX_W_0F3849_P_0) },
+ { Bad_Opcode },
+ { VEX_W_TABLE (VEX_W_0F3849_P_2) },
+ { VEX_W_TABLE (VEX_W_0F3849_P_3) },
+ },
+
+ /* PREFIX_VEX_0F384B */
+ {
+ { Bad_Opcode },
+ { VEX_W_TABLE (VEX_W_0F384B_P_1) },
+ { VEX_W_TABLE (VEX_W_0F384B_P_2) },
+ { VEX_W_TABLE (VEX_W_0F384B_P_3) },
+ },
+
/* PREFIX_VEX_0F3858 */
{
{ Bad_Opcode },
@@ -5815,6 +5906,21 @@ static const struct dis386 prefix_table[][4] = {
{ MOD_TABLE (MOD_VEX_0F385A_PREFIX_2) },
},
+ /* PREFIX_VEX_0F385C */
+ {
+ { Bad_Opcode },
+ { VEX_W_TABLE (VEX_W_0F385C_P_1) },
+ { Bad_Opcode },
+ },
+
+ /* PREFIX_VEX_0F385E */
+ {
+ { VEX_W_TABLE (VEX_W_0F385E_P_0) },
+ { VEX_W_TABLE (VEX_W_0F385E_P_1) },
+ { VEX_W_TABLE (VEX_W_0F385E_P_2) },
+ { VEX_W_TABLE (VEX_W_0F385E_P_3) },
+ },
+
/* PREFIX_VEX_0F3878 */
{
{ Bad_Opcode },
@@ -6830,6 +6936,78 @@ static const struct dis386 x86_64_table[][2] = {
{ "lidt{Q|Q}", { M }, 0 },
{ "lidt", { M }, 0 },
},
+
+ /* X86_64_VEX_0F3849_P_0_W_0_M_0_L_0 */
+ {
+ { Bad_Opcode },
+ { "ldtilecfg", { M }, 0 },
+ },
+
+ /* X86_64_VEX_0F3849_P_0_W_0_M_1_REG_0_RM_0_L_0 */
+ {
+ { Bad_Opcode },
+ { "tilerelease", { Skip_MODRM }, 0 },
+ },
+
+ /* X86_64_VEX_0F3849_P_2_W_0_M_0_L_0 */
+ {
+ { Bad_Opcode },
+ { "sttilecfg", { M }, 0 },
+ },
+
+ /* X86_64_VEX_0F3849_P_3_W_0_M_0_L_0 */
+ {
+ { Bad_Opcode },
+ { "tilezero", { TMM, Skip_MODRM }, 0 },
+ },
+
+ /* X86_64_VEX_0F384B_P_1_W_0_M_0_L_0 */
+ {
+ { Bad_Opcode },
+ { "tilestored", { MVexSIBMEM, TMM }, 0 },
+ },
+
+ /* X86_64_VEX_0F384B_P_2_W_0_M_0_L_0 */
+ {
+ { Bad_Opcode },
+ { "tileloaddt1", { TMM, MVexSIBMEM }, 0 },
+ },
+
+ /* X86_64_VEX_0F384B_P_3_W_0_M_0_L_0 */
+ {
+ { Bad_Opcode },
+ { "tileloadd", { TMM, MVexSIBMEM }, 0 },
+ },
+
+ /* X86_64_VEX_0F385C_P_1_W_0_M_0_L_0 */
+ {
+ { Bad_Opcode },
+ { "tdpbf16ps", { TMM, EXtmm, VexTmm }, 0 },
+ },
+
+ /* X86_64_VEX_0F385E_P_0_W_0_M_0_L_0 */
+ {
+ { Bad_Opcode },
+ { "tdpbuud", { TMM, EXtmm, VexTmm }, 0 },
+ },
+
+ /* X86_64_VEX_0F385E_P_1_W_0_M_0_L_0 */
+ {
+ { Bad_Opcode },
+ { "tdpbsud", { TMM, EXtmm, VexTmm }, 0 },
+ },
+
+ /* X86_64_VEX_0F385E_P_2_W_0_M_0_L_0 */
+ {
+ { Bad_Opcode },
+ { "tdpbusd", { TMM, EXtmm, VexTmm }, 0 },
+ },
+
+ /* X86_64_VEX_0F385E_P_3_W_0_M_0_L_0 */
+ {
+ { Bad_Opcode },
+ { "tdpbssd", { TMM, EXtmm, VexTmm }, 0 },
+ },
};
static const struct dis386 three_byte_table[][256] = {
@@ -8671,9 +8849,9 @@ static const struct dis386 vex_table[][256] = {
{ PREFIX_TABLE (PREFIX_VEX_0F3847) },
/* 48 */
{ Bad_Opcode },
+ { PREFIX_TABLE (PREFIX_VEX_0F3849) },
{ Bad_Opcode },
- { Bad_Opcode },
- { Bad_Opcode },
+ { PREFIX_TABLE (PREFIX_VEX_0F384B) },
{ Bad_Opcode },
{ Bad_Opcode },
{ Bad_Opcode },
@@ -8692,9 +8870,9 @@ static const struct dis386 vex_table[][256] = {
{ PREFIX_TABLE (PREFIX_VEX_0F3859) },
{ PREFIX_TABLE (PREFIX_VEX_0F385A) },
{ Bad_Opcode },
+ { PREFIX_TABLE (PREFIX_VEX_0F385C) },
{ Bad_Opcode },
- { Bad_Opcode },
- { Bad_Opcode },
+ { PREFIX_TABLE (PREFIX_VEX_0F385E) },
{ Bad_Opcode },
/* 60 */
{ Bad_Opcode },
@@ -9432,12 +9610,72 @@ static const struct dis386 vex_len_table[][2] = {
{ "vphminposuw", { XM, EXx }, 0 },
},
+ /* VEX_LEN_0F3849_P_0_W_0_M_0 */
+ {
+ { X86_64_TABLE (X86_64_VEX_0F3849_P_0_W_0_M_0_L_0) },
+ },
+
+ /* VEX_LEN_0F3849_P_0_W_0_M_1_REG_0_RM_0 */
+ {
+ { X86_64_TABLE (X86_64_VEX_0F3849_P_0_W_0_M_1_REG_0_RM_0_L_0) },
+ },
+
+ /* VEX_LEN_0F3849_P_2_W_0_M_0 */
+ {
+ { X86_64_TABLE (X86_64_VEX_0F3849_P_2_W_0_M_0_L_0) },
+ },
+
+ /* VEX_LEN_0F3849_P_3_W_0_M_0 */
+ {
+ { X86_64_TABLE (X86_64_VEX_0F3849_P_3_W_0_M_0_L_0) },
+ },
+
+ /* VEX_LEN_0F384B_P_1_W_0_M_0 */
+ {
+ { X86_64_TABLE (X86_64_VEX_0F384B_P_1_W_0_M_0_L_0) },
+ },
+
+ /* VEX_LEN_0F384B_P_2_W_0_M_0 */
+ {
+ { X86_64_TABLE (X86_64_VEX_0F384B_P_2_W_0_M_0_L_0) },
+ },
+
+ /* VEX_LEN_0F384B_P_3_W_0_M_0 */
+ {
+ { X86_64_TABLE (X86_64_VEX_0F384B_P_3_W_0_M_0_L_0) },
+ },
+
/* VEX_LEN_0F385A_P_2_M_0 */
{
{ Bad_Opcode },
{ VEX_W_TABLE (VEX_W_0F385A_P_2_M_0) },
},
+ /* VEX_LEN_0F385C_P_1_W_0_M_0 */
+ {
+ { X86_64_TABLE (X86_64_VEX_0F385C_P_1_W_0_M_0_L_0) },
+ },
+
+ /* VEX_LEN_0F385E_P_0_W_0_M_0 */
+ {
+ { X86_64_TABLE (X86_64_VEX_0F385E_P_0_W_0_M_0_L_0) },
+ },
+
+ /* VEX_LEN_0F385E_P_1_W_0_M_0 */
+ {
+ { X86_64_TABLE (X86_64_VEX_0F385E_P_1_W_0_M_0_L_0) },
+ },
+
+ /* VEX_LEN_0F385E_P_2_W_0_M_0 */
+ {
+ { X86_64_TABLE (X86_64_VEX_0F385E_P_2_W_0_M_0_L_0) },
+ },
+
+ /* VEX_LEN_0F385E_P_3_W_0_M_0 */
+ {
+ { X86_64_TABLE (X86_64_VEX_0F385E_P_3_W_0_M_0_L_0) },
+ },
+
/* VEX_LEN_0F38DB_P_2 */
{
{ "vaesimc", { XM, EXx }, 0 },
@@ -9930,6 +10168,30 @@ static const struct dis386 vex_w_table[][2] = {
/* VEX_W_0F3846_P_2 */
{ "vpsravd", { XM, Vex, EXx }, 0 },
},
+ {
+ /* VEX_W_0F3849_P_0 */
+ { MOD_TABLE (MOD_VEX_0F3849_P_0_W_0) },
+ },
+ {
+ /* VEX_W_0F3849_P_2 */
+ { MOD_TABLE (MOD_VEX_0F3849_P_2_W_0) },
+ },
+ {
+ /* VEX_W_0F3849_P_3 */
+ { MOD_TABLE (MOD_VEX_0F3849_P_3_W_0) },
+ },
+ {
+ /* VEX_W_0F384B_P_1 */
+ { MOD_TABLE (MOD_VEX_0F384B_P_1_W_0) },
+ },
+ {
+ /* VEX_W_0F384B_P_2 */
+ { MOD_TABLE (MOD_VEX_0F384B_P_2_W_0) },
+ },
+ {
+ /* VEX_W_0F384B_P_3 */
+ { MOD_TABLE (MOD_VEX_0F384B_P_3_W_0) },
+ },
{
/* VEX_W_0F3858_P_2 */
{ "vpbroadcastd", { XM, EXxmm_md }, 0 },
@@ -9942,6 +10204,26 @@ static const struct dis386 vex_w_table[][2] = {
/* VEX_W_0F385A_P_2_M_0 */
{ "vbroadcasti128", { XM, Mxmm }, 0 },
},
+ {
+ /* VEX_W_0F385C_P_1 */
+ { MOD_TABLE (MOD_VEX_0F385C_P_1_W_0) },
+ },
+ {
+ /* VEX_W_0F385E_P_0 */
+ { MOD_TABLE (MOD_VEX_0F385E_P_0_W_0) },
+ },
+ {
+ /* VEX_W_0F385E_P_1 */
+ { MOD_TABLE (MOD_VEX_0F385E_P_1_W_0) },
+ },
+ {
+ /* VEX_W_0F385E_P_2 */
+ { MOD_TABLE (MOD_VEX_0F385E_P_2_W_0) },
+ },
+ {
+ /* VEX_W_0F385E_P_3 */
+ { MOD_TABLE (MOD_VEX_0F385E_P_3_W_0) },
+ },
{
/* VEX_W_0F3878_P_2 */
{ "vpbroadcastb", { XM, EXxmm_mb }, 0 },
@@ -10388,6 +10670,57 @@ static const struct dis386 mod_table[][2] = {
/* MOD_0F382A_PREFIX_2 */
{ "movntdqa", { XM, Mx }, 0 },
},
+ {
+ /* MOD_VEX_0F3849_P_0_W_0 */
+ { VEX_LEN_TABLE (VEX_LEN_0F3849_P_0_W_0_M_0) },
+ { REG_TABLE (REG_VEX_0F3849_P_0_W_0_M_1) },
+ },
+ {
+ /* MOD_VEX_0F3849_P_2_W_0 */
+ { VEX_LEN_TABLE (VEX_LEN_0F3849_P_2_W_0_M_0) },
+ },
+ {
+ /* MOD_VEX_0F3849_P_3_W_0 */
+ { Bad_Opcode },
+ { VEX_LEN_TABLE (VEX_LEN_0F3849_P_3_W_0_M_0) },
+ },
+ {
+ /* MOD_VEX_0F384B_P_1_W_0 */
+ { VEX_LEN_TABLE (VEX_LEN_0F384B_P_1_W_0_M_0) },
+ },
+ {
+ /* MOD_VEX_0F384B_P_2_W_0 */
+ { VEX_LEN_TABLE (VEX_LEN_0F384B_P_2_W_0_M_0) },
+ },
+ {
+ /* MOD_VEX_0F384B_P_3_W_0 */
+ { VEX_LEN_TABLE (VEX_LEN_0F384B_P_3_W_0_M_0) },
+ },
+ {
+ /* MOD_VEX_0F385C_P_1_W_0 */
+ { Bad_Opcode },
+ { VEX_LEN_TABLE (VEX_LEN_0F385C_P_1_W_0_M_0) },
+ },
+ {
+ /* MOD_VEX_0F385E_P_0_W_0 */
+ { Bad_Opcode },
+ { VEX_LEN_TABLE (VEX_LEN_0F385E_P_0_W_0_M_0) },
+ },
+ {
+ /* MOD_VEX_0F385E_P_1_W_0 */
+ { Bad_Opcode },
+ { VEX_LEN_TABLE (VEX_LEN_0F385E_P_1_W_0_M_0) },
+ },
+ {
+ /* MOD_VEX_0F385E_P_2_W_0 */
+ { Bad_Opcode },
+ { VEX_LEN_TABLE (VEX_LEN_0F385E_P_2_W_0_M_0) },
+ },
+ {
+ /* MOD_VEX_0F385E_P_3_W_0 */
+ { Bad_Opcode },
+ { VEX_LEN_TABLE (VEX_LEN_0F385E_P_3_W_0_M_0) },
+ },
{
/* MOD_0F38F5_PREFIX_2 */
{ "wrussK", { M, Gdq }, PREFIX_OPCODE },
@@ -10949,6 +11282,10 @@ static const struct dis386 rm_table[][8] = {
{ "sfence", { Skip_MODRM }, 0 },
},
+ {
+ /* RM_VEX_0F3849_P_0_W_0_M_1_R_0 */
+ { VEX_LEN_TABLE (VEX_LEN_0F3849_P_0_W_0_M_1_REG_0_RM_0) },
+ },
};
#define INTERNAL_DISASSEMBLER_ERROR _("<internal disassembler error>")
@@ -11845,6 +12182,7 @@ print_insn (bfd_vma pc, disassemble_info *info)
names_xmm = intel_names_xmm;
names_ymm = intel_names_ymm;
names_zmm = intel_names_zmm;
+ names_tmm = intel_names_tmm;
index64 = intel_index64;
index32 = intel_index32;
names_mask = intel_names_mask;
@@ -11867,6 +12205,7 @@ print_insn (bfd_vma pc, disassemble_info *info)
names_xmm = att_names_xmm;
names_ymm = att_names_ymm;
names_zmm = att_names_zmm;
+ names_tmm = att_names_tmm;
index64 = att_index64;
index32 = att_index32;
names_mask = att_names_mask;
@@ -14023,6 +14362,15 @@ OP_E_memory (int bytemode, int sizeflag)
base = sib.base;
codep++;
}
+ else
+ {
+ /* A mandatory non-vector SIB operand requires a SIB byte. */
+ if (bytemode == vex_sibmem_mode)
+ {
+ oappend ("(bad)");
+ return;
+ }
+ }
rbase = base + add;
switch (modrm.mod)
@@ -15050,6 +15398,7 @@ OP_XMM (int bytemode, int sizeflag ATTRIBUTE_UNUSED)
&& bytemode != xmmq_mode
&& bytemode != evex_half_bcst_xmmq_mode
&& bytemode != ymm_mode
+ && bytemode != tmm_mode
&& bytemode != scalar_mode)
{
switch (vex.length)
@@ -15088,6 +15437,16 @@ OP_XMM (int bytemode, int sizeflag ATTRIBUTE_UNUSED)
abort ();
}
}
+ else if (bytemode == tmm_mode)
+ {
+ if (reg >= 8)
+ {
+ oappend ("(bad)");
+ return;
+ }
+ names = names_tmm;
+ }
+
else if (bytemode == ymm_mode)
names = names_ymm;
else
@@ -15212,6 +15571,7 @@ OP_EX (int bytemode, int sizeflag)
&& bytemode != xmmq_mode
&& bytemode != evex_half_bcst_xmmq_mode
&& bytemode != ymm_mode
+ && bytemode != tmm_mode
&& bytemode != d_scalar_swap_mode
&& bytemode != q_scalar_swap_mode
&& bytemode != vex_scalar_w_dq_mode)
@@ -15247,6 +15607,15 @@ OP_EX (int bytemode, int sizeflag)
abort ();
}
}
+ else if (bytemode == tmm_mode)
+ {
+ if (reg >= 8)
+ {
+ oappend ("(bad)");
+ return;
+ }
+ names = names_tmm;
+ }
else if (bytemode == ymm_mode)
names = names_ymm;
else
@@ -15802,6 +16171,17 @@ OP_VEX (int bytemode, int sizeflag ATTRIBUTE_UNUSED)
return;
}
+ if (bytemode == tmm_mode)
+ {
+ if (reg >= 8)
+ {
+ oappend ("(bad)");
+ return;
+ }
+ oappend (names_tmm[reg]);
+ return;
+ }
+
switch (vex.length)
{
case 128:
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 7230f87344..3334155071 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -297,6 +297,12 @@ static initializer cpu_flag_init[] =
"CpuWAITPKG" },
{ "CPU_CLDEMOTE_FLAGS",
"CpuCLDEMOTE" },
+ { "CPU_AMX_INT8_FLAGS",
+ "CpuAMX_INT8" },
+ { "CPU_AMX_BF16_FLAGS",
+ "CpuAMX_BF16" },
+ { "CPU_AMX_TILE_FLAGS",
+ "CpuAMX_TILE" },
{ "CPU_MOVDIRI_FLAGS",
"CpuMOVDIRI" },
{ "CPU_MOVDIR64B_FLAGS",
@@ -383,6 +389,12 @@ static initializer cpu_flag_init[] =
"CpuAVX512_BITALG" },
{ "CPU_ANY_AVX512_BF16_FLAGS",
"CpuAVX512_BF16" },
+ { "CPU_ANY_AMX_INT8_FLAGS",
+ "CpuAMX_INT8" },
+ { "CPU_ANY_AMX_BF16_FLAGS",
+ "CpuAMX_BF16" },
+ { "CPU_ANY_AMX_TILE_FLAGS",
+ "CpuAMX_TILE|CpuAMX_INT8|CpuAMX_BF16" },
{ "CPU_ANY_MOVDIRI_FLAGS",
"CpuMOVDIRI" },
{ "CPU_ANY_MOVDIR64B_FLAGS",
@@ -459,6 +471,8 @@ static initializer operand_type_init[] =
"Class=RegSIMD|Ymmword" },
{ "OPERAND_TYPE_REGZMM",
"Class=RegSIMD|Zmmword" },
+ { "OPERAND_TYPE_REGTMM",
+ "Class=RegSIMD|Tmmword" },
{ "OPERAND_TYPE_REGMASK",
"Class=RegMask" },
{ "OPERAND_TYPE_REGBND",
@@ -611,6 +625,9 @@ static bitfield cpu_flags[] =
BITFIELD (CpuPCONFIG),
BITFIELD (CpuWAITPKG),
BITFIELD (CpuCLDEMOTE),
+ BITFIELD (CpuAMX_INT8),
+ BITFIELD (CpuAMX_BF16),
+ BITFIELD (CpuAMX_TILE),
BITFIELD (CpuMOVDIRI),
BITFIELD (CpuMOVDIR64B),
BITFIELD (CpuENQCMD),
@@ -741,6 +758,7 @@ static bitfield operand_types[] =
BITFIELD (Xmmword),
BITFIELD (Ymmword),
BITFIELD (Zmmword),
+ BITFIELD (Tmmword),
BITFIELD (Unspecified),
#ifdef OTUnused
BITFIELD (OTUnused),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index c65febbe81..b8a6dfc25c 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -223,6 +223,12 @@ enum
/* CET instructions support required */
CpuIBT,
CpuSHSTK,
+ /* AMX-INT8 instructions required */
+ CpuAMX_INT8,
+ /* AMX-BF16 instructions required */
+ CpuAMX_BF16,
+ /* AMX-TILE instructions required */
+ CpuAMX_TILE,
/* GFNI instructions required */
CpuGFNI,
/* VAES instructions required */
@@ -372,6 +378,9 @@ typedef union i386_cpu_flags
unsigned int cpuptwrite:1;
unsigned int cpuibt:1;
unsigned int cpushstk:1;
+ unsigned int cpuamx_int8:1;
+ unsigned int cpuamx_bf16:1;
+ unsigned int cpuamx_tile:1;
unsigned int cpugfni:1;
unsigned int cpuvaes:1;
unsigned int cpuvpclmulqdq:1;
@@ -574,7 +583,9 @@ enum
#define VECSIB128 1
#define VECSIB256 2
#define VECSIB512 3
+#define SIBMEM 4
SIB,
+
/* SSE to AVX support required */
SSE2AVX,
/* No AVX equivalent */
@@ -702,7 +713,7 @@ typedef struct i386_opcode_modifier
unsigned int vexw:2;
unsigned int vexopcode:3;
unsigned int vexsources:2;
- unsigned int sib:2;
+ unsigned int sib:3;
unsigned int sse2avx:1;
unsigned int noavx:1;
unsigned int evex:3;
@@ -807,6 +818,8 @@ enum
Ymmword,
/* ZMMWORD size. */
Zmmword,
+ /* TMMWORD size. */
+ Tmmword,
/* Unspecified memory size. */
Unspecified,
@@ -851,6 +864,7 @@ typedef union i386_operand_type
unsigned int xmmword:1;
unsigned int ymmword:1;
unsigned int zmmword:1;
+ unsigned int tmmword:1;
unsigned int unspecified:1;
#ifdef OTUnused
unsigned int unused:(OTNumOfBits - OTUnused);
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index cd6833c5ae..2a8ec52b41 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -52,6 +52,7 @@
#define RegXMM Class=RegSIMD|Xmmword
#define RegYMM Class=RegSIMD|Ymmword
#define RegZMM Class=RegSIMD|Zmmword
+#define RegTMM Class=RegSIMD|Tmmword
#define RegMask Class=RegMask
@@ -88,6 +89,7 @@
#define VecSIB128 SIB=VECSIB128
#define VecSIB256 SIB=VECSIB256
#define VecSIB512 SIB=VECSIB512
+#define Sibmem SIB=SIBMEM|Modrm
#define EVex128 EVex=EVEX128
#define EVex256 EVex=EVEX256
@@ -4093,3 +4095,24 @@ xsusldtrk, 0, 0xf20f01e8, None, 3, CpuTSXLDTRK, No_bSuf|No_wSuf|No_lSuf|No_sSuf|
xresldtrk, 0, 0xf20f01e9, None, 3, CpuTSXLDTRK, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
// TSXLDTRK instructions end.
+
+// AMX instructions.
+
+ldtilecfg, 1, 0x49, None, 1, CpuAMX_TILE|Cpu64, Modrm|Vex128|VexOpcode=1|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex }
+sttilecfg, 1, 0x6649, None, 1, CpuAMX_TILE|Cpu64, Modrm|Vex128|VexOpcode=1|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex }
+
+tdpbf16ps, 3, 0xf35c, None, 1, CpuAMX_BF16|Cpu64, Modrm|Vex128|VexOpcode=1|VexVVVV=1|VexW0|SwapSources|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegTMM, RegTMM, RegTMM }
+tdpbssd, 3, 0xf25e, None, 1, CpuAMX_INT8|Cpu64, Modrm|Vex128|VexOpcode=1|VexVVVV=1|VexW0|SwapSources|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegTMM, RegTMM, RegTMM }
+tdpbuud, 3, 0x5e, None, 1, CpuAMX_INT8|Cpu64, Modrm|Vex128|VexOpcode=1|VexVVVV=1|VexW0|SwapSources|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegTMM, RegTMM, RegTMM }
+tdpbusd, 3, 0x665e, None, 1, CpuAMX_INT8|Cpu64, Modrm|Vex128|VexOpcode=1|VexVVVV=1|VexW0|SwapSources|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegTMM, RegTMM, RegTMM }
+tdpbsud, 3, 0xf35e, None, 1, CpuAMX_INT8|Cpu64, Modrm|Vex128|VexOpcode=1|VexVVVV=1|VexW0|SwapSources|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegTMM, RegTMM, RegTMM }
+
+tileloadd, 2, 0xf24b, None, 1, CpuAMX_TILE|Cpu64, Sibmem|Vex128|VexOpcode=1|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex, RegTMM }
+tileloaddt1, 2, 0x664b, None, 1, CpuAMX_TILE|Cpu64, Sibmem|Vex128|VexOpcode=1|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex, RegTMM }
+tilestored, 2, 0xf34b, None, 1, CpuAMX_TILE|Cpu64, Sibmem|Vex128|VexOpcode=1|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegTMM, Unspecified|BaseIndex }
+
+tilerelease, 0, 0x49c0, None, 2, CpuAMX_TILE|Cpu64, Vex128|VexOpcode=1|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
+
+tilezero, 1, 0xf249, None, 1, CpuAMX_TILE|Cpu64, Modrm|Vex128|VexOpcode=1|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegTMM }
+
+// AMX instructions end.
diff --git a/opcodes/i386-reg.tbl b/opcodes/i386-reg.tbl
index cdff763ca7..ca7eeba488 100644
--- a/opcodes/i386-reg.tbl
+++ b/opcodes/i386-reg.tbl
@@ -278,6 +278,15 @@ zmm28, Class=RegSIMD|Zmmword, RegVRex|RegRex, 4, Dw2Inval, Dw2Inval
zmm29, Class=RegSIMD|Zmmword, RegVRex|RegRex, 5, Dw2Inval, Dw2Inval
zmm30, Class=RegSIMD|Zmmword, RegVRex|RegRex, 6, Dw2Inval, Dw2Inval
zmm31, Class=RegSIMD|Zmmword, RegVRex|RegRex, 7, Dw2Inval, Dw2Inval
+// TMM registers for AMX
+tmm0, Class=RegSIMD|Tmmword, 0, 0, Dw2Inval, Dw2Inval
+tmm1, Class=RegSIMD|Tmmword, 0, 1, Dw2Inval, Dw2Inval
+tmm2, Class=RegSIMD|Tmmword, 0, 2, Dw2Inval, Dw2Inval
+tmm3, Class=RegSIMD|Tmmword, 0, 3, Dw2Inval, Dw2Inval
+tmm4, Class=RegSIMD|Tmmword, 0, 4, Dw2Inval, Dw2Inval
+tmm5, Class=RegSIMD|Tmmword, 0, 5, Dw2Inval, Dw2Inval
+tmm6, Class=RegSIMD|Tmmword, 0, 6, Dw2Inval, Dw2Inval
+tmm7, Class=RegSIMD|Tmmword, 0, 7, Dw2Inval, Dw2Inval
// Bound registers for MPX
bnd0, Class=RegBND, 0, 0, Dw2Inval, Dw2Inval
bnd1, Class=RegBND, 0, 1, Dw2Inval, Dw2Inval
--
2.17.1
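As a reading aid for the i386-dis.c tables in the patch, here is a minimal
standalone sketch of how the register form of the tdpb* instructions decodes.
It is not binutils code: decode_tdpb and pp_to_index are hypothetical names,
and the function only handles the 5-byte C4-form encoding with opcode 5e. The
mnemonic order follows the PREFIX_VEX_0F385E table (none, F3, 66, F2), and the
failure cases mirror the "reg >= 8" checks added to OP_XMM, OP_EX and OP_VEX.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of binutils: decode a 5-byte,
   register-only AMX dot-product instruction (C4-form VEX, map 0F38,
   opcode 5E) into Intel syntax.  Returns 0 on success and -1 for an
   encoding this sketch rejects.  */
static int
decode_tdpb (const unsigned char *b, char *out, size_t outlen)
{
  /* Indexed like prefix_table: none, F3, 66, F2.  */
  static const char *const names[4] =
    { "tdpbuud", "tdpbsud", "tdpbusd", "tdpbssd" };
  /* VEX.pp encodes none, 66, F3, F2; map it to the table order.  */
  static const int pp_to_index[4] = { 0, 2, 1, 3 };
  unsigned int reg, rm, vvvv;

  assert (b != NULL && out != NULL);

  if (b[0] != 0xc4 || b[3] != 0x5e)
    return -1;
  if ((b[1] & 0x1f) != 0x02)	/* mmmmm must select the 0F38 map.  */
    return -1;
  if ((b[2] & 0x80) != 0)	/* VEX.W must be 0.  */
    return -1;
  if ((b[2] & 0x04) != 0)	/* VEX.L must be 0.  */
    return -1;
  if ((b[4] >> 6) != 3)		/* ModRM.mod must be 3 (register form).  */
    return -1;

  /* VEX.R, VEX.B and VEX.vvvv are stored inverted.  */
  reg = ((b[4] >> 3) & 7) | ((b[1] & 0x80) ? 0 : 8);
  rm = (b[4] & 7) | ((b[1] & 0x20) ? 0 : 8);
  vvvv = ((b[2] >> 3) & 0xf) ^ 0xf;

  /* Only tmm0..tmm7 exist, so any extended register number is invalid.  */
  if (reg >= 8 || rm >= 8 || vvvv >= 8)
    return -1;

  snprintf (out, outlen, "%s tmm%u,tmm%u,tmm%u",
	    names[pp_to_index[b[2] & 3]], reg, rm, vvvv);
  return 0;
}
```

With the bytes c4 e2 63 5e ca from the test expectations this produces
"tdpbssd tmm1,tmm2,tmm3". Clearing the high bit of VEX.vvvv (third byte 23
instead of 63) makes it return -1, which is the case the disassembler prints
as "(bad)".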
Thanks,
Lili.
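
A note on the i386-opc.h hunk that widens "sib" from 2 to 3 bits: the new
SIBMEM value is 4, which does not fit in a 2-bit field. The sketch below
illustrates this with two stand-in structs. The struct and function names are
hypothetical; only the four SIB constants are copied from the patch.

```c
#include <assert.h>

/* Values copied from the i386-opc.h hunk.  */
#define VECSIB128 1
#define VECSIB256 2
#define VECSIB512 3
#define SIBMEM    4

/* SIBMEM needs three bits; checked at compile time (C11).  */
static_assert (SIBMEM < (1u << 3), "SIBMEM must fit in a 3-bit field");

/* Illustrative stand-ins for the one field that changes; the real
   i386_opcode_modifier has many more members.  */
struct modifier_before { unsigned int sib:2; };
struct modifier_after  { unsigned int sib:3; };

/* Store VALUE in the old 2-bit field and return what reads back.  */
static unsigned int
roundtrip_before (unsigned int value)
{
  struct modifier_before m;
  m.sib = value;		/* Reduced modulo 4 by the 2-bit field.  */
  return m.sib;
}

/* Store VALUE in the new 3-bit field and return what reads back.  */
static unsigned int
roundtrip_after (unsigned int value)
{
  struct modifier_after m;
  m.sib = value;		/* Reduced modulo 8 by the 3-bit field.  */
  return m.sib;
}
```

Here roundtrip_before (SIBMEM) reads back 0, which would make a Sibmem
template look like it has no SIB requirement at all, while
roundtrip_after (SIBMEM) reads back 4.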
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-x86-Add-support-for-Intel-AMX-instructions.patch
Type: application/octet-stream
Size: 55075 bytes
Desc: 0001-x86-Add-support-for-Intel-AMX-instructions.patch
URL: <https://sourceware.org/pipermail/binutils/attachments/20200708/9036bfc3/attachment-0001.obj>