[PATCH] x86: VCVTNEPS2BF16{X,Y} should permit broadcasting
H.J. Lu
hjl.tools@gmail.com
Mon Jan 20 14:42:00 GMT 2020
On Mon, Jan 20, 2020 at 6:35 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 20.01.2020 14:58, H.J. Lu wrote:
> > On Mon, Jan 20, 2020 at 4:00 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> Just like other VCVT*{X,Y} templates do, and to allow the programmer
> >> flexibility (might be relevant in particular when heavily macro-izing
> >> code), the two templates should also have Broadcast set, just like their
> >> X/Y-suffix-less counterparts. This in turn requires them to also have
> >> * Dword set on their memory operands, to cover the logic added to i386gen
> >> by 4a1b91eabbe7 ("x86: Expand Broadcast to 3 bits"),
> >> * Xmmword/Ymmword set on their memory operands, to satisfy broadcast
> >> sizing logic in gas itself.
> >> Otherwise ATTSyntax templates wouldn't need such operand size attributes.
> >>
> >> While extending the test cases, also add Intel syntax broadcast forms
> >> without explicit size specifiers.
> >>
> >> gas/
> >> 2020-01-XX Jan Beulich <jbeulich@suse.com>
> >>
> >> * testsuite/gas/i386/avx512_bf16_vl.s,
> >> testsuite/gas/i386/x86-64-avx512_bf16_vl.s: Add broadcast forms
> >> of VCVTNEPS2BF16{X,Y}. Add operand-size less Intel syntax
> >> broadcast forms of VCVTNEPS2BF16.
> >> * testsuite/gas/i386/avx512_bf16_vl.d,
> >> testsuite/gas/i386/x86-64-avx512_bf16_vl.d: Adjust expectations.
> >>
> >> opcodes/
> >> 2020-01-XX Jan Beulich <jbeulich@suse.com>
> >>
> >> * i386-opc.tbl (vcvtneps2bf16x): Add Broadcast, Xmmword, and
> >> Dword.
> >> (vcvtneps2bf16y): Add Broadcast, Ymmword, and Dword.
> >> * i386-tbl.h: Re-generate.
> >>
> >
> > OK.
>
> Thanks. What about this remark
>
> >>---
> >>Arguably, just like other VCVT*{X,Y}, the ones here could then also be
> >>made permit RegXMM/RegYMM as source operand. Personally I'd prefer this,
> >>but there was resistance to such in similar earlier cases.
>
> that the original mail had? I think I've meanwhile found a case where
> it would actually help if all such templates were consistent in this
> regard.
Are you referring to use AVX512VL for suffix check? I don't we should
do that.
--
H.J.
More information about the Binutils
mailing list