[PATCH] x86: Correct EVEX vector load/store optimization

Jan Beulich JBeulich@suse.com
Tue Mar 19 08:52:00 GMT 2019


>>> On 19.03.19 at 09:48, <hjl.tools@gmail.com> wrote:
> On Tue, Mar 19, 2019 at 4:30 PM Jan Beulich <JBeulich@suse.com> wrote:
>>
>> >>> On 19.03.19 at 07:20, <hjl.tools@gmail.com> wrote:
>> > On Mon, Mar 18, 2019 at 9:49 PM Jan Beulich <JBeulich@suse.com> wrote:
>> >>
>> >> >>> On 17.03.19 at 21:47, <hjl.tools@gmail.com> wrote:
>> >> > --- a/gas/config/tc-i386.c
>> >> > +++ b/gas/config/tc-i386.c
>> >> > @@ -4075,6 +4075,56 @@ optimize_encoding (void)
>> >> >           i.types[j].bitfield.ymmword = 0;
>> >> >         }
>> >> >      }
>> >> > +  else if ((cpu_arch_flags.bitfield.cpuavx
>> >> > +         || cpu_arch_isa_flags.bitfield.cpuavx)
>> >>
>> >> Once again a questionable condition, as per earlier replies to
>> >> other patches of yours.
>> >
>> > Fixed.
>> >
>> >> > +        && i.vec_encoding != vex_encoding_evex
>> >> > +        && !i.types[0].bitfield.zmmword
>> >> > +        && !i.mask
>> >> > +        && is_evex_encoding (&i.tm)
>> >> > +        && (i.tm.base_opcode == 0x666f
>> >> > +            || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0x666f
>> >> > +            || i.tm.base_opcode == 0xf36f
>> >> > +            || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf36f
>> >> > +            || i.tm.base_opcode == 0xf26f
>> >> > +            || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf26f)
>> >>
>> >> All three of these can be expressed with just a single comparison,
>> >> using & or | instead of ^ and (if necessary) adjusting the literal
>> >> value compared against.
>> >
>> > Fixed.
>> >
>> >> > +        && i.tm.extension_opcode == None)
>> >> > +    {
>> >> > +      /* Optimize: -O1:
>> >> > +        VOP, one of vmovdqa32, vmovdqa64, vmovdqu8, vmovdqu16,
>> >> > +        vmovdqu32 and vmovdqu64:
>> >> > +          EVEX VOP %xmmM, %xmmN
>> >> > +            -> VEX vmovdqa|vmovdqu %xmmM, %xmmN (M and N < 16)
>> >> > +          EVEX VOP %ymmM, %ymmN
>> >> > +            -> VEX vmovdqa|vmovdqu %ymmM, %ymmN (M and N < 16)
>> >> > +          EVEX VOP %xmmM, mem
>> >> > +            -> VEX vmovdqa|vmovdqu %xmmM, mem (M < 16)
>> >> > +          EVEX VOP %ymmM, mem
>> >> > +            -> VEX vmovdqa|vmovdqu %ymmM, mem (M < 16)
>> >> > +          EVEX VOP mem, %xmmN
>> >> > +            -> VEX mvmovdqa|vmovdquem, %xmmN (N < 16)
>> >>
>> >> There's some confusion on this line.
>> >>
>> >> > +          EVEX VOP mem, %ymmN
>> >> > +            -> VEX vmovdqa|vmovdqu mem, %ymmN (N < 16)
>> >> > +       */
>> >>
>> >> For the variants with a memory operand I doubt the conversion
>> >> is always a win, and it may be against the user request in case of
>> >> -Os. This is because of the Disp8 scaling the EVEX encoding permits.
>> >
>> > Fixed.
>> >
>> >> > +      if (i.tm.base_opcode == 0xf26f)
>> >> > +     i.tm.base_opcode = 0xf36f;
>> >> > +      else if ((i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf26f)
>> >> > +     i.tm.base_opcode = 0xf36f ^ Opcode_SIMD_IntD;
>> >>
>> >> This again can be expressed without "else if()" afaict.
>> >>
>> >
>> > Fixed.
>> >
>> > Here is the patch.
>>
>> Thanks.
>>
>> >--- a/gas/config/tc-i386.c
>> >+++ b/gas/config/tc-i386.c
>> >@@ -4068,18 +4068,14 @@ optimize_encoding (void)
>> >           i.types[j].bitfield.ymmword = 0;
>> >         }
>> >     }
>> >-  else if ((cpu_arch_flags.bitfield.cpuavx
>> >-          || cpu_arch_isa_flags.bitfield.cpuavx)
>> >-         && i.vec_encoding != vex_encoding_evex
>> >+  else if (i.vec_encoding != vex_encoding_evex
>> >          && !i.types[0].bitfield.zmmword
>>
>> Ah, here the remaining cpuavx goes away as well.
>>
>> >+      if ((i.tm.base_opcode & ~Opcode_SIMD_IntD) == 0xf26f)
>> >+      {
>> >+        i.tm.base_opcode &= Opcode_SIMD_IntD;
>> >+        i.tm.base_opcode |= 0xf36f;
>> >+      }
>>
>> How about the even simpler
>>
>>       if ((i.tm.base_opcode & ~Opcode_SIMD_IntD) == 0xf26f)
>>         i.tm.base_opcode ^= 0xf36f ^ 0xf26f;
>>
> 
> It works.
> 
> I am going to check in this patch together with other 2.
> 
> Thanks.

Thank you as well.

Jan
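
A minimal sketch, not part of the patch, of why the opcode handling
discussed above can be collapsed. It assumes Opcode_SIMD_IntD is the
single bit 0x10 (the value is defined in opcodes/i386-opc.h; treat the
number used here as an assumption) and uses hypothetical helper names.
The three rewrite functions mirror the first patch, the revised patch,
and the single-XOR suggestion; is_vmovdq() shows the "single comparison
per opcode" form of the match condition.

```c
/* Assumption: Opcode_SIMD_IntD is one bit that toggles the load form
   (0x..6f) into the store form (0x..7f).  */
#define Opcode_SIMD_IntD 0x10u

/* Match condition with one comparison per opcode: masking out the
   direction bit covers both the load and the store variant.  */
static int
is_vmovdq (unsigned int op)
{
  unsigned int base = op & ~Opcode_SIMD_IntD;

  return base == 0x666f || base == 0xf36f || base == 0xf26f;
}

/* Form from the first patch: if / else if.  */
static unsigned int
fix_if_else (unsigned int op)
{
  if (op == 0xf26f)
    op = 0xf36f;
  else if ((op ^ Opcode_SIMD_IntD) == 0xf26f)
    op = 0xf36f ^ Opcode_SIMD_IntD;
  return op;
}

/* Form from the revised patch: keep only the direction bit, then OR
   in the new opcode.  */
static unsigned int
fix_mask_or (unsigned int op)
{
  if ((op & ~Opcode_SIMD_IntD) == 0xf26f)
    {
      op &= Opcode_SIMD_IntD;
      op |= 0xf36f;
    }
  return op;
}

/* Suggested form: 0xf36f ^ 0xf26f is 0x0100, so one XOR flips the only
   bit that differs and leaves the direction bit untouched.  */
static unsigned int
fix_xor (unsigned int op)
{
  if ((op & ~Opcode_SIMD_IntD) == 0xf26f)
    op ^= 0xf36f ^ 0xf26f;
  return op;
}

/* Return 1 if all three rewrites agree on every opcode of interest.  */
static int
forms_agree (void)
{
  static const unsigned int ops[] = {
    0x666f, 0x667f, 0xf36f, 0xf37f, 0xf26f, 0xf27f
  };
  unsigned int k;

  for (k = 0; k < sizeof ops / sizeof ops[0]; k++)
    if (fix_if_else (ops[k]) != fix_mask_or (ops[k])
        || fix_mask_or (ops[k]) != fix_xor (ops[k]))
      return 0;
  return 1;
}
```

Under that assumption, only 0xf26f and 0xf27f are rewritten (to 0xf36f
and 0xf37f respectively); the other four opcodes pass through unchanged
in all three forms.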


