[PATCH] x86: Optimize with EVEX128 encoding for AVX512VL

Thu Mar 8 15:29:00 GMT 2018

>>> On 08.03.18 at 15:34, <hjl.tools@gmail.com> wrote:
> On Thu, Mar 8, 2018 at 6:03 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 08.03.18 at 14:54, <hjl.tools@gmail.com> wrote:
>>> On Thu, Mar 8, 2018 at 4:56 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Thu, Mar 8, 2018 at 12:23 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> H.J.,
>>>>>
>>>>> having taken another look at the optimizations you've added
>>>>> recently, I have a couple of remarks to make:
>>>>>
>>>>> 1) I don't think optimizations should raise the ISA requirements.
>>>>> The conversions you do from AVX512F to AVX512VL insns are in
>>>>> direct contradiction to the Disp32 -> Disp8 conversion I had
>>>>> suggested a couple of weeks ago, and that you objected to even if
>>>>> done very carefully (I still intend to produce a patch to that effect,
>>>>> to see whether you would want to reconsider). Since changing the
>>>>> vector length doesn't alter the encoding length, and doesn't - afaict -
>>>>> provide any other benefits, I don't think those conversions are
>>>>> useful at all. All that is useful imo are conversions from EVEX to VEX.
>>>>
>>>> It does reduce the vector size which may reduce CPU power and boost
>>>> CPU frequency.  I am checking this patch to use AVX512VL only if it is
>>>> enabled.
>>>>
>>>>> 2) Considering what the ORM states, I wonder whether it wouldn't
>>>>> be beneficial to uniformly convert all zeroing insns to VXORP*/VPXOR*.
>>>>
>>>> I will check.
>>>>
>>>>> 3) While merge masking indeed precludes the optimization, zeroing
>>>>> masking doesn't - after all it doesn't matter for what reason the
>>>>> respective part of the destination gets zeroed.
>>>>
>>>> Would you mind creating a patch to do that?
>>>>
>>>>> 4) I don't think {evex} prefixes should be ignored, i.e. I think the
>>>>> conversion to VEX encoding should be suppressed if that prefix
>>>>> was given.
>>>>
>>>> Yes.  I will fix it.
>>>>
>>>
>>> I am checking in this updated patch to cover {evex} prefix.
>>
>> Do you really need that extra pseudo_evex_prefix field, i.e.
>> why can't you just check i.vec_encoding?
>>
> 
> Yes, it is needed since  i.vec_encoding will be changed to
> vex_encoding_evex by:
> 
>  /* Upper 16 vector register is only available with VREX in 64bit
>      mode.  */
>   if ((r->reg_flags & RegVRex))
>     {
>       if (i.vec_encoding == vex_encoding_default)
>         i.vec_encoding = vex_encoding_evex;

But in that case you can't lower to VEX encoding anyway.

Jan