[PATCH 01/10] Support Intel AVX-IFMA

Mon Oct 24 05:53:36 GMT 2022

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Friday, October 14, 2022 5:53 PM
> To: Jiang, Haochen <haochen.jiang@intel.com>
> Cc: hjl.tools@gmail.com; Wang, Hongyu <hongyu.wang@intel.com>;
> binutils@sourceware.org
> Subject: Re: [PATCH 01/10] Support Intel AVX-IFMA
> 
> On 14.10.2022 11:12, Haochen Jiang wrote:
> > From: wwwhhhyyy <hongyu.wang@intel.com>
> >
> > x86: Support Intel AVX-IFMA
> >
> > Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
> > cleared by default.  Without {vex} pseudo prefix, Intel IFMA instructions
> > are encoded with EVEX prefix.  {vex} pseudo prefix will turn on VEX
> > encoding for Intel IFMA instructions.
> 
> I firmly object to the proliferation of this mis-feature. As expressed
> before for AVX-VNNI, as long as the user has disabled AVX512 (or
> respective sub-features thereof), there should be no need to use {vex} in
> the source code. There's also no reason at all to make the disassembler
> print {vex} prefixes - we don't do so for any other insns (apart from
> AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings
> (when none of the EVEX-specific features is used).
> 
> I actually have a patch queued to undo the odd behavior for AVX-VNNI, at
> least on the assembler side (which also drops the PseudoVexPrefix
> attribute).

Has rebased the patch to latest trunk and removed PseudoVexPrefix in table.
Also added some testcases just like how your patch did.

> 
> > --- a/opcodes/i386-dis.c
> > +++ b/opcodes/i386-dis.c
> > @@ -1526,6 +1526,8 @@ enum
> >    VEX_W_0F385E_X86_64_P_3,
> >    VEX_W_0F3878,
> >    VEX_W_0F3879,
> > +  VEX_W_0F38B4,
> > +  VEX_W_0F38B5,
> >    VEX_W_0F38CF,
> >    VEX_W_0F3A00_L_1,
> >    VEX_W_0F3A01_L_1,
> > @@ -6293,8 +6295,8 @@ static const struct dis386 vex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { VEX_W_TABLE (VEX_W_0F38B4) },
> > +    { VEX_W_TABLE (VEX_W_0F38B5) },
> >      { "vfmaddsub231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
> >      { "vfmsubadd231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
> >      /* b8 */
> > @@ -7599,6 +7601,16 @@ static const struct dis386 vex_w_table[][2] = {
> >      /* VEX_W_0F3879 */
> >      { "vpbroadcastw",	{ XM, EXw }, PREFIX_DATA },
> >    },
> > +  {
> > +    /* VEX_W_0F38B4 */
> > +    { Bad_Opcode },
> > +    { "%XV vpmadd52luq",	{ XM, Vex, EXx }, PREFIX_DATA },
> > +  },
> > +  {
> > +    /* VEX_W_0F38B5 */
> > +    { Bad_Opcode },
> > +    { "%XV vpmadd52huq",	{ XM, Vex, EXx }, PREFIX_DATA },
> > +  },
> 
> Irrespective of the aspect mentioned at the top I think this is yet
> another case where VEX and EVEX table entries can be shared. This would
> (if the {vex} printing really needs retaining for whatever obscure
> reason) merely require the processing of %XV to do nothing for EVEX-
> encoded insns, plus of course the separating blank would then also need
> to be included in the processing of %XV.
> 
> I guess I'll make a patch to fold the AVX-VNNI and AVX512-VNNI entries,
> which you could then re-base on top of.

Folded the table of AVX512IFMA and AVX-IFMA.

> 
> > --- a/opcodes/i386-gen.c
> > +++ b/opcodes/i386-gen.c
> > @@ -245,6 +245,8 @@ static initializer cpu_flag_init[] =
> >      "CPU_AVX512F_FLAGS|CpuAVX512_BF16" },
> >    { "CPU_AVX512_FP16_FLAGS",
> >      "CPU_AVX512BW_FLAGS|CpuAVX512_FP16" },
> > +  { "CPU_AVX_IFMA_FLAGS",
> > +    "CPU_AVX2_FLAGS|CpuAVX_IFMA" },
> >    { "CPU_IAMCU_FLAGS",
> >      "Cpu186|Cpu286|Cpu386|Cpu486|Cpu586|CpuIAMCU" },
> >    { "CPU_ADX_FLAGS",
> > @@ -439,6 +441,8 @@ static initializer cpu_flag_init[] =
> >      "CpuHRESET" },
> >    { "CPU_ANY_AVX512_FP16_FLAGS",
> >      "CpuAVX512_FP16" },
> > +  { "CPU_ANY_AVX_IFMA_FLAGS",
> > +    "CpuAVX_IFMA" },
> 
> If AVX2 is taken as a prereq feature, then CPU_ANY_AVX2_FLAGS also needs
> adjustment, such that disabling of AVX2 also results in disabling of
> AVX-IFMA. (The same issue actually exists for AVX-VNNI afaics.)
> 

Added AVX-IFMA to CPU_ANY_AVX2_FLAGS.

> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -3263,3 +3263,10 @@ vrsqrtph, 0x664e, None, CpuAVX512_FP16,
> Modrm|Masking=3|EVexMap6|VexW0|Broadcast
> >  vrsqrtsh, 0x664f, None, CpuAVX512_FP16,
> Modrm|EVexLIG|Masking=3|EVexMap6|VexVVVV|VexW0|Disp8MemShift=1|N
> o_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf,
> { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
> >
> >  // FP16 (HFNI) instructions end.
> > +
> > +// AVX_IFMA instructions.
> 
> Nit: Perhaps better use AVX-IFMA here, but I see we're having many examples
> of the (needless) use of underscores like this.
> 
> > +vpmadd52huq, 0x66B5, None, CpuAVX_IFMA,
> Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|N
> o_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf,
> { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM,
> RegXMM|RegYMM }
> > +vpmadd52luq, 0x66B4, None, CpuAVX_IFMA,
> Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|N
> o_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf,
> { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM,
> RegXMM|RegYMM }
> 
> Please use plain VexVVVV (without =1) - we want to have as little clutter as
> possible on these usually already overlong lines.

Changed to VexVVVV.

Thx for your review and see if there is still something need to be changed.

Haochen

> 
> Jan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Support-Intel-AVX-IFMA.patch
Type: application/octet-stream
Size: 26325 bytes
Desc: 0001-Support-Intel-AVX-IFMA.patch
URL: <https://sourceware.org/pipermail/binutils/attachments/20221024/e5cd80a4/attachment-0001.obj>