[PATCH 1/2] [PATCH 1/2] Enable Intel AVX512_FP16 instructions

Wed Jul 21 14:29:03 GMT 2021

On 21.07.2021 15:14, Cui, Lili wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Wednesday, July 14, 2021 12:25 AM
>> On 13.07.2021 08:58, Cui, Lili wrote:
>>
>> I don't think you've applied my comments regarding the cpu_flag_init[]
>> additions
> 
> Sorry, I forgot this one, I changed it in this patch.
> 
> { "CPU_AVX512_FP16_FLAGS",
>     "CPU_AVX512BW_FLAGS|CpuAVX512_FP16" },
> ...
>   { "CPU_ANY_AVX512BW_FLAGS",
>     "CpuAVX512BW|CpuAVX512_FP16" },
> ...

In the latter case you will want to use CPU_ANY_AVX512_FP16_FLAGS,
not CpuAVX512_FP16. This not originally having got done properly
for CPU_ANY_AVX512F_FLAGS is why you now also need to change that
one. There preferably you'd replace CpuAVX512BW by
CPU_ANY_AVX512BW_FLAGS, instead of explicitly adding CpuAVX512_FP16.

>> I can't seem to find any pseudos of VCMP{P,S}H, despite the prior comment.
>>
> I have a misunderstanding about this last time, but this time I got it. 
> I added it to assembler and disassembler, also added test cases for it. 

Ah yes, albeit there's some confusion about the numbering of the
attached patches. I wanted to ask you anyway to send new versions of
the patches as new mail threads, instead of in reply to the prior
discussion. In fact, while the patch with the new tests is probably
indeed too large to send inline (but as said before, it's not
really reviewable anyway), I would much appreciate if you could
send the first patch inline instead of as attachment. This makes
commenting a lot easier.

As a purely cosmetic request - may I ask that you flip the order
of the two each vcmp<avx_frel>ph and vcmpph entries, such that - like
everywhere else - the rounding/SAE forms come second? As mentioned
previously, keeping things as consistent as possible helps readers
as well as making future changes or spotting possible problems.

As to the testsuite additions, I'm afraid you've misunderstood
what H.J. and I are requesting for xmmword.{s,d}: You don't need to
add _all_ new insns there - that would be making an unreasonably
large test case going forward. Only insns with irregular operand
combinations permitting %xmmN but not "xmmword ptr ..." for a given
operand position need putting there. This is to prove (now and going
forward) that despite allowing %xmmN the (wrong size) memory form
gets rejected. Regular scalar insns don't need putting there, and
even less so VDIVPH; if at all, a single random example (not from
the FP16 set, but from more basic ones) would be sufficient. So what
I was after with my request was that all the non-scalar VCVT* forms
would be represented there, when their vector element sizes vary.

I'm sorry if the earlier request wasn't explicit enough.

The other remark on the testsuite addition patch is that _if_ you
maintain a ChangeLog entry (which you aren't required to anymore),
then you will want to keep it up-to-date with patch contents.

I'll try to get to look at the assembler and disassembler parts in
more detail later this week, as the following week I'll be OoO.

Jan