[PATCH] x86: Determine vector length from the last vector operand

H.J. Lu hjl.tools@gmail.com
Tue Jul 24 13:51:00 GMT 2018


On Tue, Jul 24, 2018 at 5:34 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 24.07.18 at 14:08, <hjl.tools@gmail.com> wrote:
>> On Tue, Jul 24, 2018 at 4:55 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 24.07.18 at 13:38, <hjl.tools@gmail.com> wrote:
>>>> On Tue, Jul 24, 2018 at 12:41 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>> On 22.07.18 at 21:02, <hjl.tools@gmail.com> wrote:
>>>>>> Determine VEX/EVEXE vector length from the last multi-length vector
>>>>>> operand.
>>>>>
>>>>> I see you've committed this already: It would have been really
>>>>> helpful to say _why_ you're doing the change in the commit message.
>>>>> For posterity as well as my understanding - could you at least do so
>>>>> here please? That's even more so that VEX and EVEX processing
>>>>> differed in that regard before your change.
>>>>
>>>> The encoding can be determined by the last  multi-length vector
>>>> operand.  There is no need to scan from the start.
>>>
>>> But why is scanning from the rear better? Immediates usually sit at the
>>> end. In the EVEX case, considering the break-s you add, this may
>>> indeed be more efficient, but the same then wouldn't hold for the VEX
>>
>> break can be used only when scanning from the last operand.
>
> Yes - that's why I understand (even without the patch description
> saying so) why this way it's more efficient in the EVEX case. The
> VEX case, otoh, is still unclear to me.
>
>>>>>> --- a/gas/config/tc-i386.c
>>>>>> +++ b/gas/config/tc-i386.c
>>>>>> @@ -3363,10 +3363,12 @@ build_vex_prefix (const insn_template *t)
>>>>>>      vector_length = 1;
>>>>>>    else
>>>>>>      {
>>>>>> -      unsigned int op;
>>>>>> +      int op;
>>>>>
>>>>> This is the sort of change I would have objected to, btw. Variables
>>>>> used to index arrays should be unsigned whenever possible. And
>>>>> doing so would have been easy enough here:
>>>>
>>>> I want
>>>>
>>>> if (op < 0)
>>>>   abort ();
>>>>
>>>> after the loop.
>>>
>>>   if (op >= MAX_OPERANDS)
>>>     abort ()
>>>
>>> would be equivalent afaict.
>>
>> It won't work when scanning from the last operand.
>
> Why would it not? The value will wrap through zero to UINT_MAX.

I don't see there are significant differences here.
It is perfectly OK to use "int" as index.

>>>>>>           if (i.tm.operand_types[op].bitfield.xmmword
>>>>>>               + i.tm.operand_types[op].bitfield.ymmword
>>>>>>               + i.tm.operand_types[op].bitfield.zmmword > 1)
>>>>>>             {
>>>>>>               if (i.types[op].bitfield.zmmword)
>>>>>> -               i.tm.opcode_modifier.evex = EVEX512;
>>>>>> +               {
>>>>>> +                 i.tm.opcode_modifier.evex = EVEX512;
>>>>>> +                 break;
>>>>>> +               }
>>>>>>               else if (i.types[op].bitfield.ymmword)
>>>>>> -               i.tm.opcode_modifier.evex = EVEX256;
>>>>>> +               {
>>>>>> +                 i.tm.opcode_modifier.evex = EVEX256;
>>>>>> +                 break;
>>>>>> +               }
>>>>>>               else if (i.types[op].bitfield.xmmword)
>>>>>> -               i.tm.opcode_modifier.evex = EVEX128;
>>>>>> +               {
>>>>>> +                 i.tm.opcode_modifier.evex = EVEX128;
>>>>>> +                 break;
>>>>>> +               }
>>>>>>               else if (i.broadcast && (int) op == i.broadcast->operand)
>>>>>
>>>>> This has rendered all the "else" pointless.
>>>>
>>>> What do you mean? Isn't it used in
>>>>
>>>> vcvttpd2uqq, 2, 0x6678, None, 1, CpuAVX512DQ,
>>>> Modrm|Masking=3|VexOpcode=0|VexW=2|Broadcast|Disp8ShiftVL|CheckRegSize|No_bS
>>>> uf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf,
>>>> { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex,
>>>> RegXMM|RegYMM|RegZMM }
>>>
>>> I didn't mean to drop the "else if (...)", just the "else" (with the
>>> if () to stay).
>>>
>>
>> Scanning from the last operand is more efficient for both EVEX and VEX.
>
> Well, as per above - I can see why in the EVEX case, but not (yet) in
> the VEX one. But I also can't associate your response here to the
> point made in immediate context.

For

vgf2p8mulb, 3, 0x66cf, None, 1, CpuAVX|CpuGFNI,
Modrm|Vex|VexOpcode=1|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf,
{ Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }

we only need to check the last operand, instead of the first and
second operands.

vgf2p8mulb (%ecx),%ymm4,%ymm6


-- 
H.J.



More information about the Binutils mailing list