[PATCH] Support APX zero-upper
Cui, Lili
lili.cui@intel.com
Thu May 9 07:56:39 GMT 2024
> On 28.04.2024 12:54, Cui, Lili wrote:
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -1920,7 +1920,7 @@ static INLINE bool need_evex_encoding (const
> insn_template *t)
> > return i.encoding == encoding_evex
> > || i.encoding == encoding_evex512
> > || (t->opcode_modifier.vex && i.encoding == encoding_egpr)
> > @@ -4285,8 +4286,9 @@ build_apx_evex_prefix (void)
> > i.vex.bytes[3] &= ~0x08;
> >
> > /* Encode the NDD bit of the instruction promoted from the legacy
> > - space. */
> > - if (i.vex.register_specifier && i.tm.opcode_space ==
> > SPACE_EVEXMAP4)
> > + space. ZU shares the same bit with NDD. */
> > + if ((i.vex.register_specifier && i.tm.opcode_space == SPACE_EVEXMAP4)
> > + || i.tm.opcode_modifier.zu)
> > i.vex.bytes[3] |= 0x10;
> >
> > /* Encode the NF bit. */
> > @@ -9204,7 +9206,7 @@ match_template (char mnem_suffix)
> > /* APX insns acting on byte operands are WIG, yet that can't be expressed
> > in the templates (they're also covering word/dword/qword operands).
> */
> > if (t->opcode_space == SPACE_EVEXMAP4 && !t->opcode_modifier.vexw
> &&
> > - i.types[i.operands - 1].bitfield.byte)
> > + i.types[i.operands - 1].bitfield.byte &&
> > + !t->opcode_modifier.zu)
>
> With a change request at the bottom this won't be needed anymore either, I
> think.
>
That's a good idea.
> > @@ -14060,3 +14077,15 @@ JMPABS_Fixup (instr_info *ins, int bytemode,
> int sizeflag)
> > return OP_IMREG (ins, bytemode, sizeflag);
> > return OP_OFF64 (ins, bytemode, sizeflag); }
> > +
> > +static bool
> > +IMUL_Fixup (instr_info *ins, int bytemode, int sizeflag) {
> > + /* Although imul do not support NDD, the EVEX.ND bit is used to control
> > + whether its destination register has its upper bits zeroed when OSIZE
> > + is 16b. */
> > + if (ins->vex.nd)
> > + ins->mnemonicendp = stpcpy (ins->obuf, "imulzu");
>
> Despite the comment this handling isn't restricted to 16-bit operand size.
>
> > + return OP_G (ins, bytemode, sizeflag); }
>
> Further for SETZUcc I can't even spot how you check that EVEX.NDD=1. With
> EVEX.NDD=0 aiui this is ordinary SETcc, just EVEX-encoded.
>
Good point. I also found other issues with "{nf} imulzu" ({nf} was flushed), so I added a macro %ZU for them and dropped IMUL_Fixup. I also added more test cases for them.
+ case 'U':
+ if (l == 1 && (last[0] == 'Z'))
+ {
+ /* Although IMUL/SETcc does not support NDD, the EVEX.ND bit is
+ used to control whether its destination register has its upper
+ bits zeroed when OSIZE is 16b/8b. */
+ if (ins->vex.nd)
+ {
+ oappend (ins, "zu");
+ /* When we print zu for the EVEX instruction, we no longer
+ need prefix {evex}. */
+ if (evex_printed == true && startswith (ins->obufp, "{evex}"))
+ ins->obufp += 6;
+ }
+ }
+ else
+ abort ();
+ break;
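
To illustrate the intended effect of %ZU on the printed mnemonic, here is a minimal standalone sketch. It is not binutils code: zu_mnemonic and its parameters are hypothetical names used only for illustration, and it only models the rule that EVEX.ND=1 selects the "zu" form while EVEX.ND=0 leaves the ordinary (EVEX-encoded) mnemonic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of binutils: build the mnemonic that
   would be printed for an EVEX-encoded IMUL or SETcc.
   BASE is "imul" or "set", CC is the condition suffix ("" for imul).  */
static const char *
zu_mnemonic (const char *base, const char *cc, bool evex_nd,
             char *buf, size_t size)
{
  assert (buf != NULL && size > 0);
  /* EVEX.ND set: insert "zu" after the base name (imulzu, setzub).
     EVEX.ND clear: plain mnemonic, merely EVEX-encoded.  */
  snprintf (buf, size, "%s%s%s", base, evex_nd ? "zu" : "", cc);
  return buf;
}
```

With this model, IMUL with ND=1 yields "imulzu", SETB with ND=1 yields "setzub", and SETB with ND=0 stays "setb".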
> > @@ -528,6 +530,7 @@ loopne, 0xe0, x64,
> > JumpByte|No_bSuf|No_wSuf|No_sSuf|NoRex64, { Disp8 }
> >
> > // Set byte on flag instructions.
> > set<cc>, 0xf9<cc:opc>/0, i386,
> Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf,
> > { Reg8|Unspecified|BaseIndex }
> > +setzu<cc>, 0xf24<cc:opc>/0, APX_F,
> > +Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|EVexMap4|ZU, { Reg8 }
>
> Didn't we kind of agree to also permit
>
> set<cc>, 0xf24<cc:opc>/0, APX_F, Modrm|No_bSuf|No_sSuf|EVexMap4|ZU,
> { Reg32|Reg64 }
>
We discussed this internally. The spec folks thought that adding two SETZU formats to the spec would be redundant and might confuse users, so the spec will not be updated. Given that, it would be a bit strange for binutils to add a separate format.
Thanks,
Lili.