x86: Support Intel AVX VNNI

Jan Beulich jbeulich@suse.com
Thu Oct 15 15:28:11 GMT 2020


On 15.10.2020 17:23, H.J. Lu wrote:
> On Thu, Oct 15, 2020 at 8:22 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 15.10.2020 14:38, H.J. Lu wrote:
>>> On Thu, Oct 15, 2020 at 5:28 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 15.10.2020 13:15, H.J. Lu wrote:
>>>>> On Thu, Oct 15, 2020 at 12:24 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> On 15.10.2020 09:10, Cui, Lili wrote:
>>>>>>>>>> @@ -1964,7 +1967,14 @@ cpu_flags_match (const insn_template *t)
>>>>>>>>>>        cpu = cpu_flags_and (x, cpu);
>>>>>>>>>>        if (!cpu_flags_all_zero (&cpu))
>>>>>>>>>>       {
>>>>>>>>>> -       if (x.bitfield.cpuavx)
>>>>>>>>>> +       if (x.bitfield.cpuvex_prefix)
>>>>>>>>>> +         {
>>>>>>>>>> +           /* We need to check a few extra flags with VEX_PREFIX.  */
>>>>>>>>>> +           if (i.vec_encoding == vex_encoding_vex
>>>>>>>>>> +               || i.vec_encoding == vex_encoding_vex3)
>>>>>>>>>> +             match |= CPU_FLAGS_ARCH_MATCH;
>>>>>>>>>> +         }
>>>>>>>>>> +       else if (x.bitfield.cpuavx)
>>>>>>>>>
>>>>>>>>> Is this (including the new cpuvex_prefix attribute, which imo
>>>>>>>>> shouldn't be a Cpu* bit) really needed? Couldn't you achieve the same
>>>>>>>>> by placing the templates _after_ the AVX512 counterparts? Iirc
>>>>>>>>> templates get tried in order, and the first match wins. The {vex3}
>>>>>>>>> prefix would then prevent a match on the EVEX-encoded AVX512_VNNI
>>>>>>>> templates.
>>>>>>>>
>>>>>>>> Lili, please look into it.
>>>>>>>>
>>>>>>>
>>>>>>> I add an invalid test for it, we need cpuvex_prefix attribute for under scenario.
>>>>>>>
>>>>>>> .arch .noavx512_vnni
>>>>>>> vpdpbusd %xmm2,%xmm4,%xmm2
>>>>>>>
>>>>>>> As without the pseudo {vex} prefix, this instruction should be encoded with EVEX prefix.
>>>>>>> we should report error for it, I rename CpuVEX_PREFIX to PseudoVexPrefix
>>>>>>> and move it into opcode_modifier bit, thanks.
>>>>>>
>>>>>> I disagree, unless AVX-VNNI was specified to have a dependency on
>>>>>> AVX512-VNNI (which would seem pretty odd, as meanwhile I've noticed
>>>>>> that another reason for introducing these encodings may be to allow
>>>>>> their use on AVX512-incapable hardware). The above very much should
>>>>>> result in the VEX encoding despite the absence of a {vex} prefix.
>>>>>> It's really only the default case of everything being enabled where
>>>>>> the pseudo-prefix should be mandated. This particularly implies
>>>>>> that an explicit ".arch .avx_vnni" ought to _also_ eliminate the
>>>>>> need for the pseudo prefix.
>>>>>
>>>>> AVX VNNI always requires the {vex} prefix.  It isn't optional.
>>>>
>>>> That's said or written where? These are new insns with - afaict - no
>>>> specification beyond the ISA extensions doc. There's nothing like
>>>
>>> This is true.  When we implemented AVX VNNI, we decided that
>>> the {vex} prefix is mandatory so that
>>>
>>> vpdpbusd %xmm2,%xmm4,%xmm2
>>>
>>> always mean EVEX encoding.
>>
>> And this decision was discussed internally at Intel, and other
> 
> Internally.
> 
>> community members get no say at all?

Please can such discussions be had in the public for open source projects?
I continue to think that the behavior as implemented is not the best
possible choice. Therefore I'd like to at least hear the arguments that
led to this decision.

Thanks, Jan


More information about the Binutils mailing list