[PATCH] x86: Add -munaligned-vector-move to assembler
Jan Beulich
jbeulich@suse.com
Thu Oct 21 16:11:45 GMT 2021
On 21.10.2021 17:44, H.J. Lu wrote:
> Unaligned load/store instructions on aligned memory or register are as
> fast as aligned load/store instructions on modern Intel processors. Add
> a command-line option, -munaligned-vector-move, to x86 assembler to
> encode aligned vector load/store instructions as unaligned
> vector load/store instructions.
But this doesn't yet clarify what the benefit is. For legacy-encoded ones
it might be the shorter insn encoding, but what about the VEX and EVEX
ones? And if encoding size matters, how do modern CPUs behave for
MOV{A,U}PS vs MOV{A,U}PD and VMOVDQ{A,U}?
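For reference, if I'm reading the SDM right, the register-to-register
forms encode like this (so a plain aligned->unaligned swap of the same
flavour doesn't change the length at all; the only byte saved comes from
switching to the prefix-less PS forms, and the VEX forms are all the
same size anyway):

  0f 28 c1       movaps  %xmm1, %xmm0    # 3 bytes, no prefix
  0f 10 c1       movups  %xmm1, %xmm0    # 3 bytes, no prefix
  66 0f 28 c1    movapd  %xmm1, %xmm0    # 4 bytes, 66 prefix
  66 0f 6f c1    movdqa  %xmm1, %xmm0    # 4 bytes, 66 prefix
  f3 0f 6f c1    movdqu  %xmm1, %xmm0    # 4 bytes, f3 prefix
  c5 f9 6f c1    vmovdqa %xmm1, %xmm0    # 4 bytes, VEX
  c5 fa 6f c1    vmovdqu %xmm1, %xmm0    # 4 bytes, VEX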
> @@ -1950,6 +1953,11 @@ cpu_flags_match (const insn_template *t)
> i386_cpu_flags x = t->cpu_flags;
> int match = cpu_flags_check_cpu64 (x) ? CPU_FLAGS_64BIT_MATCH : 0;
>
> + /* Encode aligned vector move as unaligned vector move if asked. */
> + if (!unaligned_vector_move
> + && t->opcode_modifier.unaligned_vector_move)
> + return 0;
New (and effectively redundant) templates just to record this extra flag
look wasteful to me. Couldn't you arrange for this via the Optimize flag
(or some derived logic simply fiddling with the opcodes; the patterns
are sufficiently regular iirc)?
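To illustrate what I mean by "sufficiently regular" (just a sketch with
made-up names, not something against tc-i386.c itself): the aligned forms
map onto their unaligned counterparts with a fixed opcode change for
MOVA* -> MOVU* and a 66 -> f3 prefix change for MOVDQA -> MOVDQU, so the
conversion could be derived at encode time instead of duplicating
templates:

  /* Illustrative sketch only -- the struct, function, and variable names
     here are invented; this is not a patch against tc-i386.c.  */
  #include <stdio.h>

  struct vec_move
  {
    unsigned char prefix;   /* mandatory prefix byte: 0x66, 0xf3, or 0 */
    unsigned char opcode;   /* opcode byte following 0x0f */
  };

  /* Derive the unaligned counterpart of an aligned vector move.  */
  static struct vec_move
  make_unaligned (struct vec_move m)
  {
    switch (m.opcode)
      {
      case 0x28: m.opcode = 0x10; break;  /* mova{ps,pd} load  -> movu{ps,pd} */
      case 0x29: m.opcode = 0x11; break;  /* mova{ps,pd} store -> movu{ps,pd} */
      case 0x6f:                          /* movdqa load  */
      case 0x7f:                          /* movdqa store */
        if (m.prefix == 0x66)
          m.prefix = 0xf3;                /* -> movdqu */
        break;
      }
    return m;
  }

  int
  main (void)
  {
    struct vec_move movaps_load = { 0x00, 0x28 };
    struct vec_move movdqa_store = { 0x66, 0x7f };
    struct vec_move u;

    u = make_unaligned (movaps_load);
    printf ("movups load:  0f %02x\n", u.opcode);                /* 0f 10    */
    u = make_unaligned (movdqa_store);
    printf ("movdqu store: %02x 0f %02x\n", u.prefix, u.opcode); /* f3 0f 7f */
    return 0;
  }

The same mapping holds for the VEX and EVEX forms (there the pp bits
stand in for the legacy prefix), which is why handling this in the
existing Optimize logic rather than via extra templates would seem
sufficient.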
> @@ -13060,6 +13068,7 @@ const char *md_shortopts = "qnO::";
> #define OPTION_MLFENCE_AFTER_LOAD (OPTION_MD_BASE + 31)
> #define OPTION_MLFENCE_BEFORE_INDIRECT_BRANCH (OPTION_MD_BASE + 32)
> #define OPTION_MLFENCE_BEFORE_RET (OPTION_MD_BASE + 33)
> +#define OPTION_MUNALGNED_VECTOR_MOVE (OPTION_MD_BASE + 34)
Did you miss an I here and ...
> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -731,6 +731,7 @@ static bitfield opcode_modifiers[] =
> BITFIELD (SIB),
> BITFIELD (SSE2AVX),
> BITFIELD (NoAVX),
> + BITFIELD (UNALGNED_VECTOR_MOVE),
... here and ...
> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -636,6 +636,9 @@ enum
> /* No AVX equivalent */
> NoAVX,
>
> + /* Encode aligned vector move as unaligned vector move. */
> + UNALGNED_VECTOR_MOVE,
... here?
Jan