x86: Add support for Intel AMX instructions

Jan Beulich jbeulich@suse.com
Mon Jul 6 12:35:06 GMT 2020


On 06.07.2020 05:36, Cui, Lili wrote:
> @@ -10926,9 +10970,10 @@ i386_index_check (const char *operand_string)
>  		      || !i.index_reg->reg_type.bitfield.baseindex)))
>  	    goto bad_address;
>  
> -	  /* bndmk, bndldx, and bndstx have special restrictions. */
> +	  /* bndmk, bndldx, bndstx and mandatory non-vector SIB have special restrictions. */
>  	  if (current_templates->start->base_opcode == 0xf30f1b
> -	      || (current_templates->start->base_opcode & ~1) == 0x0f1a)
> +	      || (current_templates->start->base_opcode & ~1) == 0x0f1a
> +	      || current_templates->start->opcode_modifier.sib == SIBMEM)

With this in view, isn't it possible to ...

> @@ -10939,6 +10984,7 @@ i386_index_check (const char *operand_string)
>  
>  	      /* bndldx and bndstx ignore their scale factor. */
>  	      if (current_templates->start->base_opcode != 0xf30f1b
> +		  && current_templates->start->opcode_modifier.sib != SIBMEM
>  		  && i.log2_scale_factor)
>  		as_warn (_("register scaling is being ignored here"));

... check "... & ~1) == 0x0f1a" here, instead of adding a second
condition (resulting in overall more simple code)?

> --- a/gas/testsuite/gas/i386/i386.exp
> +++ b/gas/testsuite/gas/i386/i386.exp
> @@ -1137,6 +1137,9 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_64_check]] t
>      run_dump_test "x86-64-lfence-ret-d"
>      run_dump_test "x86-64-lfence-ret-e"
>      run_dump_test "x86-64-lfence-byte"
> +    run_list_test "x86-64-amx-inval"
> +    run_dump_test "x86-64-amx"
> +    run_dump_test "x86-64-amx-intel"

There still look to be tests missing here that were previously
requested (proving correct [failing] decode of AMX insns with
VEX.W, .L, .R, .B set or the high bit of VEX.VVVV clear; the
latter few of course only if you and H.J. continue to object to
the addition of %tmm8...%tmm15, which I've pointed out before
the specification doesn't exclude).

> @@ -6938,6 +7043,78 @@ static const struct dis386 x86_64_table[][2] = {
>      { "lidt{Q|Q}", { M }, 0 },
>      { "lidt", { M }, 0 },
>    },

With this even in context, ...

> +  /* X86_64_VEX_0F3849_P_0_W_0_M_0_L_0 */
> +  {
> +    { Bad_Opcode },
> +    { "ldtilecfg", { EV }, 0 },

... is there anything wrong to use M instead of the new EV here
and ...

> +  /* X86_64_VEX_0F3849_P_2_W_0_M_0_L_0 */
> +  {
> +    { Bad_Opcode },
> +    { "sttilecfg", { EV }, 0 },

... here (and again twice below)? New things should imo be
introduced only when there are no existing suitable items. If
you look at some of the cleanup I've been doing recently (and
more in the works), you'll notice how several pieces had been
introduced over time when there already was suitable logic
available.

> +  /* X86_64_VEX_0F3849_P_3_W_0_M_0_L_0 */
> +  {
> +    { Bad_Opcode },
> +    { "tilezero", { XMT, Skip_MODRM }, 0 },

At the example of this - where does the "XMT" name come from?
Following existing XMM naming, I'd expect this to be either
TM (along the lines of XM) or TMM (along the lines of XMM).
(And again I can't help the impression that the two actually
perform the same thing.)

> +  /* X86_64_VEX_0F385C_P_1_W_0_M_0_L_0 */
> +  {
> +    { Bad_Opcode },
> +    { "tdpbf16ps", { XMT, EXtmm, Vextmm }, 0 },

Along the lines of the above, EXt then (paralleling EXx)? For
the last operand here I'd suggest VexTmm or VexTMM.

> @@ -10474,6 +10755,57 @@ static const struct dis386 mod_table[][2] = {
>      /* MOD_0F382A_PREFIX_2 */
>      { "movntdqa",	{ XM, Mx }, 0 },
>    },
> +  {
> +    /* MOD_VEX_0F3849_P_0_W_0 */
> +    { VEX_LEN_TABLE (VEX_LEN_0F3849_P_0_W_0_M_0) },
> +    { REG_TABLE (REG_VEX_0F3849_P_0_W_0_M_3) },

If you look at existing suffixes you'll find that we have _M_1
or _MOD_3, but not _M_3. Please try to conform to this. (_MOD_
should probably get replaced altogether.)

> @@ -13451,6 +13789,8 @@ intel_operand_size (int bytemode, int sizeflag)
>      }
>    switch (bytemode)
>      {
> +    case generic_mode:
> +      break;
>      case b_mode:

I don't see the need for this addition - the default case covers
it fine afaics.

Jan


More information about the Binutils mailing list