[PATCH] Support ymm rounding control for Intel AVX10.2

Fri Aug 2 06:36:07 GMT 2024

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Friday, August 2, 2024 2:16 PM
> To: Jiang, Haochen <haochen.jiang@intel.com>
> Cc: hjl.tools@gmail.com; binutils@sourceware.org
> Subject: Re: [PATCH] Support ymm rounding control for Intel AVX10.2
> 
> On 02.08.2024 04:39, Jiang, Haochen wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: Thursday, August 1, 2024 6:08 PM
> >>
> >> On 01.08.2024 09:33, Haochen Jiang wrote:
> >>> @@ -1732,6 +1735,7 @@ _is_cpu (const i386_cpu_attr *a, enum i386_cpu cpu)
> >>>      case CpuAVX512F:  return a->bitfield.cpuavx512f;
> >>>      case CpuAVX512VL: return a->bitfield.cpuavx512vl;
> >>>      case CpuAPX_F:    return a->bitfield.cpuapx_f;
> >>> +    case CpuAVX10_2:  return a->bitfield.cpuavx10_2;
> >>>      case Cpu64:       return a->bitfield.cpu64;
> >>>      case CpuNo64:     return a->bitfield.cpuno64;
> >>>      default:
> >>
> >> This shouldn't be needed; see the comment on i386-opc.h.
> >
> > It seems unneeded so far, and possibly even through the end of this patch
> > series. But it will be needed once AVX10.2 interacts with other ISAs in
> > future processors, which should also happen in the Binutils 2.44 timeframe.
> 
> We can move it when such interactions require it. It is only at that point
> when it can be decided which of the features to move here.

OK, I will leave this part for the future.

> 
> >>> --- /dev/null
> >>> +++ b/gas/testsuite/gas/i386/avx10_2-rounding.d
> >>> @@ -0,0 +1,451 @@
> >>
> >> This file is only half the size of avx10_2-rounding-intel.d - why?
> >
> > This is because, in those asm files, the AT&T syntax testcases are in the
> > top half and the Intel syntax ones are in the second half. There is no need
> > to check the second half again for AT&T syntax.
> >
> > For Intel syntax, since we could not just skip the first part, the size is
> > doubled.
> 
> Of course you can, using "#..." on a line on its own. You'll find examples
> in existing testcases.

I see. I will change them to reduce the testcase size.

> >>> --- a/opcodes/i386-dis.c
> >>> +++ b/opcodes/i386-dis.c
> >>> @@ -229,6 +229,7 @@ struct instr_info
> >>>      bool b;
> >>>      bool no_broadcast;
> >>>      bool nf;
> >>> +    bool u;
> >>>    }
> >>>    vex;
> >>>
> >>> @@ -9030,6 +9031,8 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
> >>>
> >>>        if (!(*ins->codep & 0x4))
> >>>  	ins->rex2 |= REX_X;
> >>> +
> >>> +      ins->vex.u = *ins->codep & 0x4;
> >>>
> >>>        switch ((*ins->codep & 0x3))
> >>>  	{
> >>> @@ -9066,7 +9069,7 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
> >>>  	  /* Report bad for !evex_default and when two fixed values of evex
> >>>  	     change..  */
> >>>  	  if (ins->evex_type != evex_default
> >>> -	      || (ins->rex2 & (REX_B | REX_X)))
> >>> +	      && (ins->rex2 & (REX_B | REX_X)))
> >>
> >> I can see why you may need to change this for REX_X, but hardly for REX_B
> >> at the same time?
> >
> > I suppose it is a typo, judging by the comment. It is potentially buggy
> > and was found while working on AVX10.2.
> 
> If it was a pre-existing bug, it would want fixing separately (so it can
> be backported), or at the very least it would want calling out explicitly
> in the description.

I checked the encoding and found that I had misunderstood something. It is not
buggy; I just need to bypass REX_X. I will change the logic here.
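
To make the difference concrete, here is a minimal sketch, not the revised
patch. The enum, the SK_REX_* constants and the function names are local
stand-ins invented for illustration; only the shape of the two conditions is
taken from the hunk quoted above. The third function is just one possible way
to stop testing REX_X and is an assumption about the eventual change.

```c
#include <stdbool.h>

/* Local stand-ins for illustration only; the real definitions live in the
   binutils sources and are not reproduced here.  */
enum evex_type_sketch { EVEX_DEFAULT, EVEX_OTHER };
#define SK_REX_B 1u
#define SK_REX_X 2u

/* Condition before the patch: reject when either test fires.  */
bool
bad_before (enum evex_type_sketch t, unsigned int rex2)
{
  return t != EVEX_DEFAULT || (rex2 & (SK_REX_B | SK_REX_X)) != 0;
}

/* Condition as posted: reject only when both tests fire.  This also lets
   REX_B through, and no longer rejects a non-default type on its own.  */
bool
bad_as_posted (enum evex_type_sketch t, unsigned int rex2)
{
  return t != EVEX_DEFAULT && (rex2 & (SK_REX_B | SK_REX_X)) != 0;
}

/* One possible narrower change (an assumption, not the final code):
   keep "||" and drop only REX_X from the mask.  */
bool
bad_ignoring_x (enum evex_type_sketch t, unsigned int rex2)
{
  return t != EVEX_DEFAULT || (rex2 & SK_REX_B) != 0;
}
```

With "&&", a default-type insn with REX_B set is accepted, which is the side
effect Jan pointed out; the narrower variant accepts REX_X but still rejects
REX_B and any non-default type.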

Thanks,
Haochen

> 
> Jan