This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.
Re: V2 [PATCH] x86: Optimize EVEX vector load/store instructions
>>> On 17.03.19 at 21:47, <hjl.tools@gmail.com> wrote:
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -4075,6 +4075,56 @@ optimize_encoding (void)
> i.types[j].bitfield.ymmword = 0;
> }
> }
> + else if ((cpu_arch_flags.bitfield.cpuavx
> + || cpu_arch_isa_flags.bitfield.cpuavx)
Once again a questionable condition, as per earlier replies to
other patches of yours.
> + && i.vec_encoding != vex_encoding_evex
> + && !i.types[0].bitfield.zmmword
> + && !i.mask
> + && is_evex_encoding (&i.tm)
> + && (i.tm.base_opcode == 0x666f
> + || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0x666f
> + || i.tm.base_opcode == 0xf36f
> + || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf36f
> + || i.tm.base_opcode == 0xf26f
> + || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf26f)
All three of these can be expressed with just a single comparison,
using & or | instead of ^ and (if necessary) adjusting the literal
value compared against.
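
To illustrate the above (a minimal sketch, not the patch itself): reading the remark as applying to each load/store opcode pair, and assuming Opcode_SIMD_IntD is the single bit 0x10 that distinguishes the 0x6f load form from the 0x7f store form, each pair of tests collapses into one masked comparison. The macro spelling OPCODE_SIMD_INTD and all function names here are hypothetical stand-ins, not gas identifiers.

```c
#include <stdbool.h>

/* Assumption: stand-in for gas's Opcode_SIMD_IntD, taken to be the single
   bit 0x10 (load opcode 0x6f vs. store opcode 0x7f).  */
#define OPCODE_SIMD_INTD 0x10u

/* The shape used in the patch: two comparisons per opcode pair.  */
static bool match_two_tests(unsigned op, unsigned load)
{
  return op == load || (op ^ OPCODE_SIMD_INTD) == load;
}

/* One comparison, using & to clear the direction bit first.  */
static bool match_and(unsigned op, unsigned load)
{
  return (op & ~OPCODE_SIMD_INTD) == load;
}

/* One comparison, using | and an adjusted literal (e.g. 0x667f).  */
static bool match_or(unsigned op, unsigned load)
{
  return (op | OPCODE_SIMD_INTD) == (load | OPCODE_SIMD_INTD);
}

/* Count the 16-bit opcode values for which the forms disagree.  */
static unsigned count_mismatches(unsigned load)
{
  unsigned n = 0;
  for (unsigned op = 0; op <= 0xffffu; op++)
    {
      bool ref = match_two_tests(op, load);
      if (ref != match_and(op, load) || ref != match_or(op, load))
        n++;
    }
  return n;
}
```

The equivalence relies on the load literal (0x666f, 0xf36f, 0xf26f) having the 0x10 bit clear, which holds for all three.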
> + && i.tm.extension_opcode == None)
> + {
> + /* Optimize: -O1:
> + VOP, one of vmovdqa32, vmovdqa64, vmovdqu8, vmovdqu16,
> + vmovdqu32 and vmovdqu64:
> + EVEX VOP %xmmM, %xmmN
> + -> VEX vmovdqa|vmovdqu %xmmM, %xmmN (M and N < 16)
> + EVEX VOP %ymmM, %ymmN
> + -> VEX vmovdqa|vmovdqu %ymmM, %ymmN (M and N < 16)
> + EVEX VOP %xmmM, mem
> + -> VEX vmovdqa|vmovdqu %xmmM, mem (M < 16)
> + EVEX VOP %ymmM, mem
> + -> VEX vmovdqa|vmovdqu %ymmM, mem (M < 16)
> + EVEX VOP mem, %xmmN
> + -> VEX mvmovdqa|vmovdquem, %xmmN (N < 16)
There's some confusion on this line.
> + EVEX VOP mem, %ymmN
> + -> VEX vmovdqa|vmovdqu mem, %ymmN (N < 16)
> + */
For the variants with a memory operand I doubt the conversion
is always a win, and it may be against the user request in case of
-Os. This is because of the Disp8 scaling the EVEX encoding permits.
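
A rough sketch of the size argument, under simplifying assumptions: a plain base register with no SIB byte, no segment override, a base that does not force a displacement, and the full-vector memory tuple where the Disp8 scale N equals the vector length in bytes. The function names are hypothetical and this is not how gas computes lengths.

```c
#include <stdbool.h>

static bool fits_disp8(long v)
{
  return v >= -128 && v <= 127;
}

/* VEX: the 8-bit displacement is not scaled.  */
static int vex_disp_bytes(long disp)
{
  if (disp == 0)
    return 0;
  return fits_disp8(disp) ? 1 : 4;
}

/* EVEX: the 8-bit displacement is implicitly multiplied by n.  */
static int evex_disp_bytes(long disp, int n)
{
  if (disp == 0)
    return 0;
  return (disp % n == 0 && fits_disp8(disp / n)) ? 1 : 4;
}

/* prefix + opcode + ModRM + displacement */
static int vex_len(long disp, bool three_byte_vex)
{
  return (three_byte_vex ? 3 : 2) + 1 + 1 + vex_disp_bytes(disp);
}

static int evex_len(long disp, int n)
{
  return 4 + 1 + 1 + evex_disp_bytes(disp, n);
}
```

With a %ymm operand (n = 32) and a displacement of 256, the EVEX form takes 7 bytes here while the 2-byte-VEX form takes 8, so the conversion grows the instruction; with a displacement of 64 the VEX form is the shorter one (5 vs. 7).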
> + if (i.tm.base_opcode == 0xf26f)
> + i.tm.base_opcode = 0xf36f;
> + else if ((i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf26f)
> + i.tm.base_opcode = 0xf36f ^ Opcode_SIMD_IntD;
This again can be expressed without "else if()" afaict.
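
One possible shape, as a sketch only: 0xf26f and 0xf36f differ in the single bit 0x0100, so both directions can be handled by one test followed by one XOR. This assumes Opcode_SIMD_IntD is the bit 0x10; the macro names and the function name are hypothetical.

```c
/* Assumption: stand-in for gas's Opcode_SIMD_IntD (load 0x6f vs. store 0x7f).  */
#define OPCODE_SIMD_INTD 0x10u
/* 0xf2 ^ 0xf3 == 0x01, shifted into the prefix byte of base_opcode.  */
#define F2_F3_DIFF 0x0100u

/* Map the F2-prefixed forms onto the F3-prefixed ones; leave all else alone.  */
static unsigned f2_to_f3(unsigned base_opcode)
{
  if ((base_opcode | OPCODE_SIMD_INTD) == 0xf27fu)
    base_opcode ^= F2_F3_DIFF;
  return base_opcode;
}
```

Because only the prefix bit is flipped, the load/store direction bit is preserved without a second branch.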
Jan