[PATCH 1/3] Support Intel AMX-TRANSPOSE
Jiang, Haochen
haochen.jiang@intel.com
Tue Dec 24 08:36:59 GMT 2024
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Tuesday, December 24, 2024 4:33 PM
>
> On 24.12.2024 04:10, Jiang, Haochen wrote:
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: Thursday, December 19, 2024 7:21 PM
> >>
> >> On 18.12.2024 07:32, Haochen Jiang wrote:
> >>> @@ -10750,25 +10752,43 @@ process_operands (void)
> >>> unsigned int op, extra;
> >>> const reg_entry *first;
> >>>
> >>> - /* The second operand must be {x,y,z}mmN. */
> >>> - gas_assert (i.operands == 3 && i.types[1].bitfield.class == RegSIMD);
> >>> + /* The second operand must be {x,y,z,t}mmN */
> >>> + gas_assert ((i.operands == 2 || i.operands == 3)
> >>> + && i.types[1].bitfield.class == RegSIMD);
> >>>
> >>> - switch (i.types[2].bitfield.class)
> >>> + if (i.operands == 3)
> >>> {
> >>> - case RegSIMD:
> >>> - /* AVX512-{4FMAPS,4VNNIW} operand 2: N must be a multiple of 4. */
> >>> - op = 1;
> >>> - extra = 3;
> >>> - break;
> >>> + switch (i.types[2].bitfield.class)
> >>> + {
> >>> + case RegSIMD:
> >>> + /* AVX512-{4FMAPS,4VNNIW} operand 2: N must be a multiple of 4. */
> >>> + op = 1;
> >>> + extra = 3;
> >>> + break;
> >>>
> >>> - case RegMask:
> >>> - /* AVX512-VP2INTERSECT operand 3: N must be a multiple of 2. */
> >>> - op = 2;
> >>> - extra = 1;
> >>> - break;
> >>> + case RegMask:
> >>> + /* AVX512-VP2INTERSECT operand 3: N must be a multiple of 2. */
> >>> + op = 2;
> >>> + extra = 1;
> >>> + break;
> >>>
> >>> - default:
> >>> - abort ();
> >>> + default:
> >>> + abort ();
> >>> + }
> >>> + }
> >>> + else
> >>> + {
> >>> + switch (i.types[1].bitfield.class)
> >>> + {
> >>> + case RegSIMD:
> >>> + /* AMX-TRANSPOSE operand 2: N must be a multiple of 2. */
> >>> + op = 1;
> >>> + extra = 1;
> >>> + break;
> >>> +
> >>> + default:
> >>> + abort ();
> >>> + }
> >>> }
> >>
> >> This could have been done with less churn, also making it easier to review.
> >> There's imo no need to wrap an operand count check around the switch().
> >> Instead in the RegSIMD case you can check the register type
> >> (Tmmword), thus likely making the new code a simple insertion. That's
> >> what I had in mind when originally laying out the code that you're now fully
> >> re-indenting.
> >>
> >
> > Let me have a try. I am not sure if it could be done.
> >
> > I need to mention here the AMX-TRANSPOSE related inst only got two
> > operands.
> > There are no types[2] here. Then you have to use types[1] in switch.
> > However, we could not distinguish the original AVX512-VP2INTERSECT and
> > AVX512_4FMAPS with types[1].
>
> No, that wasn't the (implied) intention. Instead I was assuming you would
> simply check the _last_ operand uniformly (i.e. i.types[i.operands - 1]).
>
Aha, that is a much better approach, which I had missed. I will send out the
patch with that change soon.
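
A minimal standalone sketch of checking the last operand uniformly might look
like the following. It is not the actual tc-i386.c code: the enum, the structs,
the tmmword flag and the helper name select_group_operand are hypothetical
stand-ins chosen so the example compiles on its own. Only the op/extra values
and the comments are taken from the quoted patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for gas's operand bookkeeping; these are NOT the
   real i386 structures.  */
enum reg_class { REG_NONE, REG_SIMD, REG_MASK };

struct operand_type
{
  enum reg_class cls;
  int tmmword;			/* Nonzero for tile (tmmN) registers.  */
};

struct insn
{
  unsigned int operands;
  struct operand_type types[3];
};

/* Choose which operand names the register group (*op) and the mask of
   low register-number bits that must be clear (*extra), looking only at
   the last operand instead of branching on the operand count.  */
static void
select_group_operand (const struct insn *i, unsigned int *op,
		      unsigned int *extra)
{
  const struct operand_type *last;

  /* The second operand must be {x,y,z,t}mmN.  */
  assert ((i->operands == 2 || i->operands == 3)
	  && i->types[1].cls == REG_SIMD);

  last = &i->types[i->operands - 1];

  switch (last->cls)
    {
    case REG_SIMD:
      op[0] = 1;
      if (last->tmmword)
	/* AMX-TRANSPOSE operand 2: N must be a multiple of 2.  */
	extra[0] = 1;
      else
	/* AVX512-{4FMAPS,4VNNIW} operand 2: N must be a multiple of 4.  */
	extra[0] = 3;
      break;

    case REG_MASK:
      /* AVX512-VP2INTERSECT operand 3: N must be a multiple of 2.  */
      op[0] = 2;
      extra[0] = 1;
      break;

    default:
      abort ();
    }
}
```

With this shape the existing RegSIMD and RegMask cases keep their indentation,
and the AMX-TRANSPOSE handling becomes a small insertion inside the RegSIMD
case rather than a second switch.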
Thx,
Haochen