[PATCH] x86: Correct EVEX vector load/store optimization

Tue Mar 19 06:21:00 GMT 2019

On Mon, Mar 18, 2019 at 9:49 PM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 17.03.19 at 21:47, <hjl.tools@gmail.com> wrote:
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -4075,6 +4075,56 @@ optimize_encoding (void)
> >           i.types[j].bitfield.ymmword = 0;
> >         }
> >      }
> > +  else if ((cpu_arch_flags.bitfield.cpuavx
> > +         || cpu_arch_isa_flags.bitfield.cpuavx)
>
> Once again a questionable condition, as per earlier replies to
> other patches of yours.

Fixed.

> > +        && i.vec_encoding != vex_encoding_evex
> > +        && !i.types[0].bitfield.zmmword
> > +        && !i.mask
> > +        && is_evex_encoding (&i.tm)
> > +        && (i.tm.base_opcode == 0x666f
> > +            || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0x666f
> > +            || i.tm.base_opcode == 0xf36f
> > +            || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf36f
> > +            || i.tm.base_opcode == 0xf26f
> > +            || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf26f)
>
> All three of these can be expressed with just a single comparison,
> using & or | instead of ^ and (if necessary) adjusting the literal
> value compared against.

Fixed.
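
For readers following the thread, here is a minimal sketch of the
single-comparison form suggested above. It is not the code from the attached
patch. It assumes Opcode_SIMD_IntD is the single direction bit 0x10 (the
difference between the 0x6f load form and the 0x7f store form), and the
helper name is made up for illustration.

```c
#include <stdbool.h>

/* Assumption: direction bit separating the load form (0x..6f) from the
   store form (0x..7f).  Defined locally so the sketch is self-contained.  */
#define Opcode_SIMD_IntD 0x10u

/* Hypothetical helper: true if OP is LOAD or LOAD with the direction bit
   set.  Masking the direction bit away replaces the pair
     op == load || (op ^ Opcode_SIMD_IntD) == load
   with one comparison.  LOAD must have the direction bit clear.  */
static bool
matches_either_direction (unsigned int op, unsigned int load)
{
  return (op & ~Opcode_SIMD_IntD) == load;
}
```

With this, each of the three pairs in the quoted condition becomes one call,
e.g. matches_either_direction (i.tm.base_opcode, 0x666f).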

> > +        && i.tm.extension_opcode == None)
> > +    {
> > +      /* Optimize: -O1:
> > +        VOP, one of vmovdqa32, vmovdqa64, vmovdqu8, vmovdqu16,
> > +        vmovdqu32 and vmovdqu64:
> > +          EVEX VOP %xmmM, %xmmN
> > +            -> VEX vmovdqa|vmovdqu %xmmM, %xmmN (M and N < 16)
> > +          EVEX VOP %ymmM, %ymmN
> > +            -> VEX vmovdqa|vmovdqu %ymmM, %ymmN (M and N < 16)
> > +          EVEX VOP %xmmM, mem
> > +            -> VEX vmovdqa|vmovdqu %xmmM, mem (M < 16)
> > +          EVEX VOP %ymmM, mem
> > +            -> VEX vmovdqa|vmovdqu %ymmM, mem (M < 16)
> > +          EVEX VOP mem, %xmmN
> > +            -> VEX mvmovdqa|vmovdquem, %xmmN (N < 16)
>
> There's some confusion on this line.
>
> > +          EVEX VOP mem, %ymmN
> > +            -> VEX vmovdqa|vmovdqu mem, %ymmN (N < 16)
> > +       */
>
> For the variants with a memory operand I doubt the conversion
> is always a win, and it may be against the user request in case of
> -Os. This is because of the Disp8 scaling the EVEX encoding permits.

Fixed.

> > +      if (i.tm.base_opcode == 0xf26f)
> > +     i.tm.base_opcode = 0xf36f;
> > +      else if ((i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf26f)
> > +     i.tm.base_opcode = 0xf36f ^ Opcode_SIMD_IntD;
>
> This again can be expressed without "else if()" afaict.
>

Fixed.
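
One way to drop the "else if" is sketched below. It is an illustration only,
not necessarily what the attached patch does, and it again assumes
Opcode_SIMD_IntD is the single bit 0x10. The function name is hypothetical.

```c
/* Assumption: direction bit separating load (0x..6f) from store (0x..7f).  */
#define Opcode_SIMD_IntD 0x10u

/* Hypothetical helper: map the 0xf2-prefixed opcode to the 0xf3-prefixed
   one for both directions with a single test.  The XOR constant
   0xf26f ^ 0xf36f is 0x0100, so only the prefix byte changes and the
   direction bit is left as it was.  Other opcodes pass through.  */
static unsigned int
f2_to_f3 (unsigned int op)
{
  if ((op & ~Opcode_SIMD_IntD) == 0xf26f)
    op ^= 0xf26f ^ 0xf36f;
  return op;
}
```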

Here is the patch.

Thanks.

-- 
H.J.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-x86-Correct-EVEX-vector-load-store-optimization.patch
Type: text/x-patch
Size: 34028 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/binutils/attachments/20190319/ed3f20ea/attachment.bin>