This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] x86: Support Intel AVX512 BF16


>>> On 05.04.19 at 20:01, <hongjiu.lu@intel.com> wrote:
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -4710,3 +4710,33 @@ movdir64b, 2, 0x660f38f8, None, 3, CpuMOVDIR64B|CpuNo64, Modrm|IgnoreSize|No_bSu
>  movdir64b, 2, 0x660f38f8, None, 3, CpuMOVDIR64B|Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|AddrPrefixOpReg, { Unspecified|BaseIndex, Reg32|Reg64 }
>  
>  // MOVEDIR instructions end.
> +
> +// AVX512_BF16 instructions.
> +
> +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex128|VexVVVV=1|Masking=3|VexW0|Broadcast|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Xmmword|Unspecified|BaseIndex, RegXMM, RegXMM }
> +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex256|VexVVVV=1|Masking=3|VexW0|Broadcast|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Ymmword|Unspecified|BaseIndex, RegYMM, RegYMM }
> +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16, Modrm|VexOpcode=1|EVex=1|VexVVVV=1|Masking=3|VexW0|Broadcast|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Zmmword|Unspecified|BaseIndex, RegZMM, RegZMM }

Here and below, can you please avoid introducing anew inefficient
table entries with partly irrelevant / redundant attributes:
- use Disp8ShiftVL (and no explicit EVex<NNN> / EVex=<N> nor
  CpuAVX512VL), thus folding the above three entries into one
- don't explicitly use [XYZ]mmWord (redundant with Reg[XYZ]MM,
  once folding with the non-broadcast forms is also done - see
  further down)
- presumably IgnoreSize is not needed
I hope I didn't forget further ones.

For brevity / readability I'd also like to recommend omitting"=1" in
insn attribute specifications. i386-gen assumes 1 whether there's
no explicit value given.

> +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex128|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM }
> +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex256|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, RegYMM, RegYMM }
> +vcvtne2ps2bf16, 3, 0xf272, None, 1, CpuAVX512_BF16, Modrm|VexOpcode=1|EVex512|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM, RegZMM, RegZMM }

Why this second set of three entries? Broadcast is an optional
attribute, i.e. the first entries (really: entry) ought to cover
everything.

> +vdpbf16ps, 3, 0xf352, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex128|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM }
> +vdpbf16ps, 3, 0xf352, None, 1, CpuAVX512_BF16|CpuAVX512VL, Modrm|VexOpcode=1|EVex256|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM, RegYMM, RegYMM }
> +vdpbf16ps, 3, 0xf352, None, 1, CpuAVX512_BF16, Modrm|VexOpcode=1|EVex512|VexVVVV=1|Masking=3|VexW0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM, RegZMM, RegZMM }
> +// AVX512_BF16 instructions end.

Can we gain a blank line above this sentinel please?

Thanks, Jan



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]