[PATCH 1/3] Support Intel AMX-TRANSPOSE

Jiang, Haochen haochen.jiang@intel.com
Tue Dec 24 08:36:59 GMT 2024


> From: Jan Beulich <jbeulich@suse.com>
> Sent: Tuesday, December 24, 2024 4:33 PM
> 
> On 24.12.2024 04:10, Jiang, Haochen wrote:
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: Thursday, December 19, 2024 7:21 PM
> >>
> >> On 18.12.2024 07:32, Haochen Jiang wrote:
> >>> @@ -10750,25 +10752,43 @@ process_operands (void)
> >>>        unsigned int op, extra;
> >>>        const reg_entry *first;
> >>>
> >>> -      /* The second operand must be {x,y,z}mmN. */
> >>> -      gas_assert (i.operands == 3 && i.types[1].bitfield.class == RegSIMD);
> >>> +      /* The second operand must be {x,y,z,t}mmN */
> >>> +      gas_assert ((i.operands == 2 || i.operands == 3)
> >>> +		  && i.types[1].bitfield.class == RegSIMD);
> >>>
> >>> -      switch (i.types[2].bitfield.class)
> >>> +      if (i.operands == 3)
> >>>  	{
> >>> -	case RegSIMD:
> >>> -	  /* AVX512-{4FMAPS,4VNNIW} operand 2: N must be a multiple of 4. */
> >>> -	  op = 1;
> >>> -	  extra = 3;
> >>> -	  break;
> >>> +	  switch (i.types[2].bitfield.class)
> >>> +	    {
> >>> +	    case RegSIMD:
> >>> +	      /* AVX512-{4FMAPS,4VNNIW} operand 2: N must be a multiple of 4. */
> >>> +	      op = 1;
> >>> +	      extra = 3;
> >>> +	      break;
> >>>
> >>> -	case RegMask:
> >>> -	  /* AVX512-VP2INTERSECT operand 3: N must be a multiple of 2. */
> >>> -	  op = 2;
> >>> -	  extra = 1;
> >>> -	  break;
> >>> +	    case RegMask:
> >>> +	      /* AVX512-VP2INTERSECT operand 3: N must be a multiple of 2. */
> >>> +	      op = 2;
> >>> +	      extra = 1;
> >>> +	      break;
> >>>
> >>> -	default:
> >>> -	  abort ();
> >>> +	    default:
> >>> +	      abort ();
> >>> +	    }
> >>> +	}
> >>> +      else
> >>> +	{
> >>> +	  switch (i.types[1].bitfield.class)
> >>> +	    {
> >>> +	    case RegSIMD:
> >>> +	      /* AMX-TRANSPOSE operand 2: N must be a multiple of 2. */
> >>> +	      op = 1;
> >>> +	      extra = 1;
> >>> +	      break;
> >>> +
> >>> +	    default:
> >>> +	      abort ();
> >>> +	    }
> >>>  	}
> >>
> >> This could have been done with less churn, also making it easier to review.
> >> There's imo no need to wrap an operand count check around the switch().
> >> Instead in the RegSIMD case you can check the register type
> >> (Tmmword), thus likely making the new code a simple insertion. That's
> >> what I had in mind when originally laying out the code that you're now fully
> >> re-indenting.
> >>
> >
> > Let me have a try. I am not sure if it could be done.
> >
> > I need to mention here the AMX-TRANSPOSE related inst only got two
> > operands.
> > There are no types[2] here. Then you have to use types[1] in switch.
> > However, we could not distinguish the original AVX512-VP2INTERSECT and
> > AVX512_4FMAPS with types[1].
> 
> No, that wasn't the (implied) intention. Instead I was assuming you would
> simply check the _last_ operand uniformly (i.e. i.types[i.operands - 1]).
> 

Aha, that is a much better approach, and one I had missed. I will send out
the patch with that change soon.
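
For reference, a rough sketch of the shape this could take, keyed on the last
operand only. It uses mock types rather than the real gas structures: struct
insn, pick_group_operand, and the cls/tmmword fields are hypothetical stand-ins,
not the actual tc-i386.c code, and the op/extra values are simply the ones from
the hunk quoted above.

```c
#include <stdbool.h>

/* Mock register classes; hypothetical stand-ins for the gas enums.  */
enum reg_class { CLASS_NONE, CLASS_REG_SIMD, CLASS_REG_MASK };

/* Mock operand type: register class plus a "this is tmmN" flag.  */
struct operand_type
{
  enum reg_class cls;
  bool tmmword;
};

/* Mock instruction: operand count and per-operand types.  */
struct insn
{
  unsigned int operands;
  struct operand_type types[3];
};

/* Pick the register-group operand by looking only at the last operand.
   Returns false where the real code would abort ().  */
static bool
pick_group_operand (const struct insn *i, unsigned int *op,
		    unsigned int *extra)
{
  const struct operand_type *last;

  if (i->operands < 2 || i->operands > 3)
    return false;

  last = &i->types[i->operands - 1];

  switch (last->cls)
    {
    case CLASS_REG_SIMD:
      if (last->tmmword)
	{
	  /* AMX-TRANSPOSE: the tmm operand itself, N a multiple of 2.  */
	  *op = i->operands - 1;
	  *extra = 1;
	  return true;
	}
      if (i->operands != 3)
	return false;
      /* AVX512-{4FMAPS,4VNNIW} operand 2: N a multiple of 4.  */
      *op = 1;
      *extra = 3;
      return true;

    case CLASS_REG_MASK:
      if (i->operands != 3)
	return false;
      /* AVX512-VP2INTERSECT operand 3: N a multiple of 2.  */
      *op = 2;
      *extra = 1;
      return true;

    default:
      return false;
    }
}
```

With this layout the existing two cases stay where they are and the tmm check
becomes a plain insertion in the RegSIMD case, so no re-indentation is needed.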

Thx,
Haochen

