[PATCH] x86: improve/shorten vector zeroing-idiom optimization conditional

H.J. Lu hjl.tools@gmail.com
Tue Aug 2 15:56:47 GMT 2022


On Tue, Aug 2, 2022 at 8:20 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> - Drop the rounding type check: We're past template matching, and none
>   of the involved insns support embedded rounding.
> - Drop the extension opcode check: None of the involved opcodes have
>   variants with it being other than None.
> - Instead check opcode space, even if just to be on the safe side going
>   forward.
> - Reduce the number of comparisons by folding two groups.
>
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -4329,24 +4329,19 @@ optimize_encoding (void)
>            && !i.types[2].bitfield.xmmword
>            && (i.tm.opcode_modifier.vex
>                || ((!i.mask.reg || i.mask.zeroing)
> -                  && i.rounding.type == rc_none
>                    && is_evex_encoding (&i.tm)
>                    && (i.vec_encoding != vex_encoding_evex
>                        || cpu_arch_isa_flags.bitfield.cpuavx512vl
>                        || i.tm.cpu_flags.bitfield.cpuavx512vl
>                        || (i.tm.operand_types[2].bitfield.zmmword
>                            && i.types[2].bitfield.ymmword))))
> -          && ((i.tm.base_opcode == 0x55
> -               || i.tm.base_opcode == 0x57
> -               || i.tm.base_opcode == 0xdf
> -               || i.tm.base_opcode == 0xef
> -               || i.tm.base_opcode == 0xf8
> -               || i.tm.base_opcode == 0xf9
> -               || i.tm.base_opcode == 0xfa
> -               || i.tm.base_opcode == 0xfb
> -               || i.tm.base_opcode == 0x42
> -               || i.tm.base_opcode == 0x47)
> -              && i.tm.extension_opcode == None))
> +          && i.tm.opcode_modifier.opcodespace == SPACE_0F
> +          && ((i.tm.base_opcode | 2) == 0x57
> +              || i.tm.base_opcode == 0xdf
> +              || i.tm.base_opcode == 0xef
> +              || (i.tm.base_opcode | 3) == 0xfb
> +              || i.tm.base_opcode == 0x42
> +              || i.tm.base_opcode == 0x47))
>      {
>        /* Optimize: -O1:
>            VOP, one of vandnps, vandnpd, vxorps, vxorpd, vpsubb, vpsubd,

OK.

Thanks.
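
For reference, the folding in the last bullet can be checked in isolation.
Setting bit 1 maps both 0x55 and 0x57 onto 0x57, and setting the low two
bits maps 0xf8 through 0xfb onto 0xfb; no other value reaches those results.
The following is only a standalone sketch: old_match, new_match and
count_mismatches are hypothetical names that do not exist in tc-i386.c, the
opcode is passed as a plain unsigned int, and the opcode space and extension
opcode checks are deliberately left out.

```c
#include <stdbool.h>

/* Pre-patch form: ten explicit equality tests on the base opcode. */
bool old_match(unsigned int op)
{
  return op == 0x55 || op == 0x57
         || op == 0xdf || op == 0xef
         || op == 0xf8 || op == 0xf9 || op == 0xfa || op == 0xfb
         || op == 0x42 || op == 0x47;
}

/* Post-patch form: two groups folded by OR-ing in the bits that differ.
   (op | 2) == 0x57 holds only for 0x55 and 0x57.
   (op | 3) == 0xfb holds only for 0xf8, 0xf9, 0xfa and 0xfb.  */
bool new_match(unsigned int op)
{
  return (op | 2) == 0x57
         || op == 0xdf || op == 0xef
         || (op | 3) == 0xfb
         || op == 0x42 || op == 0x47;
}

/* Count opcode values on which the two forms disagree.  The 16-bit range
   is an arbitrary choice for this sketch, not a property of the assembler.  */
unsigned int count_mismatches(void)
{
  unsigned int op, n = 0;

  for (op = 0; op <= 0xffff; op++)
    if (old_match(op) != new_match(op))
      n++;
  return n;
}
```

Calling count_mismatches () from a small test driver should show that the
folded form accepts the same set of values as the original list.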

-- 
H.J.

