[PATCH 1/2] [PATCH 1/2] Enable Intel AVX512_FP16 instructions

Cui, Lili lili.cui@intel.com
Tue Jul 6 12:42:50 GMT 2021



> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Friday, July 2, 2021 11:47 PM
> To: Cui, Lili <lili.cui@intel.com>
> Cc: binutils@sourceware.org; hjl.tools@gmail.com
> Subject: Re: [PATCH 1/2] [PATCH 1/2] Enable Intel AVX512_FP16 instructions
> 
> On 02.07.2021 15:42, Jan Beulich via Binutils wrote:
> > vcvtph2pd may need to have their AVX512VL templates split: Having e.g.
> > Word|RegXMM|Dword means Word is the broadcast form and both XmmWord
> > and Dword can be used with non-broadcast memory operands.
> > But that's not true - in Intel syntax only "dword ptr" ought to be
> > valid there, afaict. Same for vcvtph2qq (where the group also has a
> > stray blank line in the middle), vcvtph2uqq, vcvttph2qq, and
> > vcvttph2uqq.
> 
> Hmm, looking at your testcase addition to xmmword.s I'm guessing now that
> I was wrong with the above, albeit I couldn't point at the code in tc-i386.c
> that's making this work.
> 

Hi Jan,
Thank you very much for all the good suggestions; I will review and address them one by one.

I found a similar instruction, "vcvtph2ps", in AVX512F, and it also works correctly because there is a
special case in the function "match_mem_size", which may not be a good way to handle this case.

vcvtph2ps, 0x6613, None, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|Space0F38|VexW0|Disp8MemShift=3|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }

In the match_mem_size function:

               /* For scalar opcode templates to allow register and memory
                  operands at the same time, some special casing is needed
                  here.  Also for v{,p}broadcast*, {,v}pmov{s,z}*, and
                  down-conversion vpmov*.  */
               || ((t->operand_types[wanted].bitfield.class == RegSIMD
                    && t->operand_types[wanted].bitfield.byte
                       + t->operand_types[wanted].bitfield.word
                       + t->operand_types[wanted].bitfield.dword
                       + t->operand_types[wanted].bitfield.qword
                       > !!t->opcode_modifier.broadcast)
                   ? (i.types[given].bitfield.xmmword
                      || i.types[given].bitfield.ymmword
                      || i.types[given].bitfield.zmmword)
                   : !match_simd_size(t, wanted, given))));
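
To make the effect of that expression easier to follow, here is a minimal standalone sketch that models only the quoted ?: term. The struct names, the helper name special_case_term, and the simd_size_ok parameter (standing in for the result of match_simd_size) are hypothetical simplifications, not the real gas types. As I read the surrounding code, this term sits inside a negated group, so a true result means the memory operand size is rejected; the excerpt alone does not show that, so please treat it as an assumption.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a template operand type.  */
struct tmpl_operand
{
  bool is_reg_simd;                    /* models bitfield.class == RegSIMD */
  unsigned byte, word, dword, qword;   /* element-size bits, each 0 or 1 */
};

/* Hypothetical stand-in for the size bits of the operand as written.  */
struct given_operand
{
  bool xmmword, ymmword, zmmword;
};

/* Models only the quoted ?: term.  SIMD_SIZE_OK stands in for the
   result of match_simd_size (t, wanted, given).  Assumption: a true
   return value means "reject this memory operand size".  */
static bool
special_case_term (const struct tmpl_operand *t, bool broadcast,
                   const struct given_operand *g, bool simd_size_ok)
{
  unsigned sizes = t->byte + t->word + t->dword + t->qword;

  /* More element sizes than the broadcast form accounts for: the
     template also allows a narrower, non-broadcast memory operand, so
     a full-vector size on the memory operand is what gets flagged.  */
  if (t->is_reg_simd && sizes > (unsigned) !!broadcast)
    return g->xmmword || g->ymmword || g->zmmword;

  /* Otherwise fall back to the ordinary SIMD size check.  */
  return !simd_size_ok;
}
```

For the vcvtph2ps template above (RegXMM|Qword, no broadcast), sizes is 1 and 1 > 0 holds, so the first branch is taken: a memory operand written as "qword ptr" gives false, while "xmmword ptr" gives true. Under the same model, an operand carrying both Word and Dword with broadcast enabled also takes the first branch (2 > 1).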


