[PATCH] x86: Support Intel AVX512 BF16

H.J. Lu hjl.tools@gmail.com
Mon Apr 8 18:18:00 GMT 2019


On Mon, Apr 8, 2019 at 1:56 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 05.04.19 at 20:01, <hongjiu.lu@intel.com> wrote:
> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -4710,3 +4710,33 @@ movdir64b, 2, 0x660f38f8, None, 3, CpuMOVDIR64B|CpuNo64, Modrm|IgnoreSize|No_bSu
> >  movdir64b, 2, 0x660f38f8, None, 3, CpuMOVDIR64B|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|AddrPrefixOpReg, { Unspecified|BaseIndex, Reg32|Reg64 }
> >
> >  // MOVEDIR instructions end.
> > +
> > +// AVX512_BF16 instructions.
> > +
> > +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex128|VexVVVV=1|Masking=3|VexW0|Broadcast|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Xmmword|Unspecified|BaseIndex, RegXMM, RegXMM }
> > +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex256|VexVVVV=1|Masking=3|VexW0|Broadcast|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Ymmword|Unspecified|BaseIndex, RegYMM, RegYMM }
> > +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16, Modrm|VexOpcode=1|EVex=1|VexVVVV=1|Masking=3|VexW0|Broadcast|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Zmmword|Unspecified|BaseIndex, RegZMM, RegZMM }
>
> Here and below, can you please avoid introducing anew inefficient
> table entries with partly irrelevant / redundant attributes:
> - use Disp8ShiftVL (and no explicit EVex<NNN> / EVex=<N> nor
>   CpuAVX512VL), thus folding the above three entries into one
> - don't explicitly use [XYZ]mmWord (redundant with Reg[XYZ]MM,
>   once folding with the non-broadcast forms is also done - see
>   further down)
> - presumably IgnoreSize is not needed
> I hope I didn't forget further ones.

Done.

> For brevity / readability I'd also like to recommend omitting"=1" in
> insn attribute specifications. i386-gen assumes 1 whether there's
> no explicit value given.

Done.

> > +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex128|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM }
> > +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex256|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, RegYMM, RegYMM }
> > +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16, Modrm|VexOpcode=1|EVex512|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM, RegZMM, RegZMM }
>
> Why this second set of three entries? Broadcast is an optional
> attribute, i.e. the first entries (really: entry) ought to cover
> everything.

Done.

> > +vdpbf16ps, 3, 0xf352, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex128|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM }
> > +vdpbf16ps, 3, 0xf352, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex256|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, RegYMM, RegYMM }
> > +vdpbf16ps, 3, 0xf352, None, 1, CpuAVX512_BF16, Modrm|VexOpcode=1|EVex512|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM, RegZMM, RegZMM }
> > +// AVX512_BF16 instructions end.
>
> Can we gain a blank line above this sentinel please?
>

Done.

I am checking in this patch to update BF16 which was implemented by
Xuepeng before he left Intel.

Thanks.

-- 
H.J.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-x86-Consolidate-AVX512-BF16-entries-in-i386-opc.tbl.patch
Type: text/x-patch
Size: 7358 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/binutils/attachments/20190408/f03c6af3/attachment.bin>


More information about the Binutils mailing list