x86: Support Intel AVX VNNI

Thu Oct 15 12:38:08 GMT 2020

On Thu, Oct 15, 2020 at 5:28 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.10.2020 13:15, H.J. Lu wrote:
> > On Thu, Oct 15, 2020 at 12:24 AM Jan Beulich <jbeulich@suse.com> wrote:
> >> On 15.10.2020 09:10, Cui, Lili wrote:
> >>>>>> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
> >>>>>>        cpu = cpu_flags_and (x, cpu);
> >>>>>>        if (!cpu_flags_all_zero (&cpu))
> >>>>>>       {
> >>>>>> -       if (x.bitfield.cpuavx)
> >>>>>> +       if (x.bitfield.cpuvex_prefix)
> >>>>>> +         {
> >>>>>> +           /* We need to check a few extra flags with VEX_PREFIX.  */
> >>>>>> +           if (i.vec_encoding == vex_encoding_vex
> >>>>>> +               || i.vec_encoding == vex_encoding_vex3)
> >>>>>> +             match |= CPU_FLAGS_ARCH_MATCH;
> >>>>>> +         }
> >>>>>> +       else if (x.bitfield.cpuavx)
> >>>>>
> >>>>> Is this (including the new cpuvex_prefix attribute, which imo
> >>>>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
> >>>>> by placing the templates _after_ the AVX512 counterparts? Iirc
> >>>>> templates get tried in order, and the first match wins. The {vex3}
> >>>>> prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
> >>>>> templates.
> >>>>
> >>>> Lili, please look into it.
> >>>>
> >>>
> >>> I added an invalid test for it; we need the cpuvex_prefix attribute for the scenario below.
> >>>
> >>> .arch .noavx512_vnni
> >>> vpdpbusd %xmm2,%xmm4,%xmm2
> >>>
> >>> Without the pseudo {vex} prefix, this instruction should be encoded with the EVEX prefix,
> >>> so we should report an error for it. I renamed CpuVEX_PREFIX to PseudoVexPrefix
> >>> and moved it into an opcode_modifier bit, thanks.
> >>
> >> I disagree, unless AVX-VNNI was specified to have a dependency on
> >> AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
> >> that another reason for introducing these encodings may be to allow
> >> their use on AVX512-incapable hardware). The above very much should
> >> result in the VEX encoding despite the absence of a {vex} prefix.
> >> It's really only the default case of everything being enabled where
> >> the pseudo-prefix should be mandated. This particularly implies
> >> that an explicit ".arch .avx_vnni" ought to _also_ eliminate the
> >> need for the pseudo prefix.
> >
> > AVX VNNI always requires the {vex} prefix.  It isn't optional.
>
> That's said or written where? These are new insns with - afaict - no
> specification beyond the ISA extensions doc. There's nothing like

This is true.  When we implemented AVX VNNI, we decided that
the {vex} prefix is mandatory so that

vpdpbusd %xmm2,%xmm4,%xmm2

always means EVEX encoding.
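
As a rough sketch of that rule (not the actual gas code): the enum, the
function name and its parameters below are made up for illustration, and
the AVX512VL requirement for xmm/ymm operands is deliberately ignored.
It only models the decision that the {vex} pseudo prefix, and nothing
else, selects the VEX form.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; these are not gas identifiers.  */
enum encoding { ENC_ERROR, ENC_VEX, ENC_EVEX };

/* Pick the encoding for a VNNI mnemonic such as vpdpbusd.
   has_vex_prefix: the source line carried the {vex} (or {vex3}) pseudo
   prefix.  avx_vnni / avx512_vnni: whether each extension is currently
   enabled via .arch.  */
enum encoding
select_vnni_encoding (bool has_vex_prefix, bool avx_vnni, bool avx512_vnni)
{
  /* With the pseudo prefix, only the AVX VNNI (VEX) template may match.  */
  if (has_vex_prefix)
    return avx_vnni ? ENC_VEX : ENC_ERROR;

  /* Without it, the mnemonic always means EVEX, so disabling
     AVX512_VNNI gives an error rather than a silent VEX fallback.  */
  return avx512_vnni ? ENC_EVEX : ENC_ERROR;
}
```

Under this model, ".arch .noavx512_vnni" followed by a bare vpdpbusd
corresponds to select_vnni_encoding (false, true, false), which is an
error and not a VEX encoding.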

> that said there afaics.
>
> > It is similar to
> >
> > vmovdqu32 %xmm5, %xmm6
> >
> > vs
> >
> > vmovdqu %xmm5, %xmm6
> >
> > It is the 32 suffix vs the {vex} prefix.
>
> I don't see the similarity. The 32 / 64 suffix in the EVEX encoding
> controls EVEX.W. There's nothing similar here.
>

There are no EVEX vmovdqu instructions, just like there are no
AVX VNNI instructions without {vex}.

-- 
H.J.