[PATCH] x86: Support Intel AVX512 BF16
H.J. Lu
hjl.tools@gmail.com
Mon Apr 8 18:18:00 GMT 2019
On Mon, Apr 8, 2019 at 1:56 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 05.04.19 at 20:01, <hongjiu.lu@intel.com> wrote:
> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -4710,3 +4710,33 @@ movdir64b, 2, 0x660f38f8, None, 3, CpuMOVDIR64B|CpuNo64, Modrm|IgnoreSize|No_bSu
> > movdir64b, 2, 0x660f38f8, None, 3, CpuMOVDIR64B|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|AddrPrefixOpReg, { Unspecified|BaseIndex, Reg32|Reg64 }
> >
> > // MOVEDIR instructions end.
> > +
> > +// AVX512_BF16 instructions.
> > +
> > +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex128|VexVVVV=1|Masking=3|VexW0|Broadcast|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Xmmword|Unspecified|BaseIndex, RegXMM, RegXMM }
> > +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex256|VexVVVV=1|Masking=3|VexW0|Broadcast|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Ymmword|Unspecified|BaseIndex, RegYMM, RegYMM }
> > +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16, Modrm|VexOpcode=1|EVex=1|VexVVVV=1|Masking=3|VexW0|Broadcast|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Zmmword|Unspecified|BaseIndex, RegZMM, RegZMM }
>
> Here and below, can you please avoid introducing anew inefficient
> table entries with partly irrelevant / redundant attributes:
> - use Disp8ShiftVL (and no explicit EVex<NNN> / EVex=<N> nor
> CpuAVX512VL), thus folding the above three entries into one
> - don't explicitly use [XYZ]mmWord (redundant with Reg[XYZ]MM,
> once folding with the non-broadcast forms is also done - see
> further down)
> - presumably IgnoreSize is not needed
> I hope I didn't forget further ones.
Done.
> For brevity / readability I'd also like to recommend omitting"=1" in
> insn attribute specifications. i386-gen assumes 1 whether there's
> no explicit value given.
Done.
> > +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex128|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM }
> > +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex256|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, RegYMM, RegYMM }
> > +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16, Modrm|VexOpcode=1|EVex512|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM, RegZMM, RegZMM }
>
> Why this second set of three entries? Broadcast is an optional
> attribute, i.e. the first entries (really: entry) ought to cover
> everything.
Done.
> > +vdpbf16ps, 3, 0xf352, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex128|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM }
> > +vdpbf16ps, 3, 0xf352, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex256|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, RegYMM, RegYMM }
> > +vdpbf16ps, 3, 0xf352, None, 1, CpuAVX512_BF16, Modrm|VexOpcode=1|EVex512|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM, RegZMM, RegZMM }
> > +// AVX512_BF16 instructions end.
>
> Can we gain a blank line above this sentinel please?
>
Done.
I am checking in this patch to update BF16 which was implemented by
Xuepeng before he left Intel.
Thanks.
--
H.J.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-x86-Consolidate-AVX512-BF16-entries-in-i386-opc.tbl.patch
Type: text/x-patch
Size: 7358 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/binutils/attachments/20190408/f03c6af3/attachment.bin>
More information about the Binutils
mailing list