[PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3

Richard Biener richard.guenther@gmail.com
Thu Jul 2 14:45:53 GMT 2020


On Thu, Jul 2, 2020 at 3:22 PM xiezhiheng <xiezhiheng@huawei.com> wrote:
>
> Hi,
>
> This is a fix for PR94442.
> I modified get_inner_reference to handle the MEM[ptr, off] case.
> I extract "off", add it to the recorded bit offset, and then build a
> MEM[ptr, 0], which is returned in place of the original reference.
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 3c68b0d754c..8cc18449a0c 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -7362,7 +7362,8 @@ tree
>  get_inner_reference (tree exp, poly_int64_pod *pbitsize,
>                      poly_int64_pod *pbitpos, tree *poffset,
>                      machine_mode *pmode, int *punsignedp,
> -                    int *preversep, int *pvolatilep)
> +                    int *preversep, int *pvolatilep,
> +                    bool include_memref_p)
>  {
>    tree size_tree = 0;
>    machine_mode mode = VOIDmode;
> @@ -7509,6 +7510,21 @@ get_inner_reference (tree exp, poly_int64_pod *pbitsize,
>                 }
>               exp = TREE_OPERAND (TREE_OPERAND (exp, 0), 0);
>             }
> +         else if (include_memref_p
> +                  && TREE_CODE (TREE_OPERAND (exp, 0)) == SSA_NAME)
> +           {
> +             tree off = TREE_OPERAND (exp, 1);
> +             if (!integer_zerop (off))
> +               {
> +                 poly_offset_int boff = mem_ref_offset (exp);
> +                 boff <<= LOG2_BITS_PER_UNIT;
> +                 bit_offset += boff;
> +
> +                 exp = build2 (MEM_REF, TREE_TYPE (exp),
> +                               TREE_OPERAND (exp, 0),
> +                               build_int_cst (TREE_TYPE (off), 0));
> +               }
> +           }
>           goto done;
>
>         default:
> @@ -10786,7 +10802,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>         int reversep, volatilep = 0, must_force_mem;
>         tree tem
>           = get_inner_reference (exp, &bitsize, &bitpos, &offset, &mode1,
> -                                &unsignedp, &reversep, &volatilep);
> +                                &unsignedp, &reversep, &volatilep, true);
>         rtx orig_op0, memloc;
>         bool clear_mem_expr = false;
>
> diff --git a/gcc/tree.h b/gcc/tree.h
> index a74872f5f3e..7df0d15f7f9 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -6139,7 +6139,8 @@ extern bool complete_ctor_at_level_p (const_tree, HOST_WIDE_INT, const_tree);
>     look for the ultimate containing object, which is returned and specify
>     the access position and size.  */
>  extern tree get_inner_reference (tree, poly_int64_pod *, poly_int64_pod *,
> -                                tree *, machine_mode *, int *, int *, int *);
> +                                tree *, machine_mode *, int *, int *, int *,
> +                                bool = false);
>
>  extern tree build_personality_function (const char *);
>
>
> I added an argument "include_memref_p" to control whether to look into the MEM_REF,
> because without it the test case "Warray-bounds-46.c" fails in regression testing.
>
> This is because the function set_base_and_offset in gimple-ssa-warn-restrict.c
>   base = get_inner_reference (expr, &bitsize, &bitpos, &var_off,
>                               &mode, &sign, &reverse, &vol);
>   ...
>   ...
>   if (TREE_CODE (base) == MEM_REF)
>     {
>       tree memrefoff = fold_convert (ptrdiff_type_node, TREE_OPERAND (base, 1));
>       extend_offset_range (memrefoff);
>       base = TREE_OPERAND (base, 0);
>
>       if (refoff != HOST_WIDE_INT_MIN
>           && TREE_CODE (expr) == COMPONENT_REF)
>         {
>           /* Bump up the offset of the referenced subobject to reflect
>              the offset to the enclosing object.  For example, so that
>              in
>                struct S { char a, b[3]; } s[2];
>                strcpy (s[1].b, "1234");
>              REFOFF is set to s[1].b - (char*)s.  */
>           offset_int off = tree_to_shwi (memrefoff);
>           refoff += off;
>         }
>
>       if (!integer_zerop (memrefoff))       <=================
>         /* A non-zero offset into an array of struct with flexible array
>            members implies that the array is empty because there is no
>            way to initialize such a member when it belongs to an array.
>            This must be some sort of a bug.  */
>         refsize = 0;
>     }
>
> needs the MEM_REF offset to decide whether refsize should be set to zero.
> But once I fold the offset into bitpos, the MEM_REF offset is always zero.
>
> Any suggestions?
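
The conflict described in the quoted message can be sketched with a toy model. This is only an illustration, not GCC code: ToyMemRef, ToyInnerRef, toy_inner_reference and kBitsPerUnit are hypothetical names, and the model assumes 8-bit bytes. It shows that once the byte offset of MEM[ptr, off] is folded into the bit position, a caller that later inspects the offset of the returned reference can only ever see zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a MEM[ptr, off] reference; not GCC's tree type.
struct ToyMemRef {
  int ptr_id;            // identifies the base pointer (an SSA name in GCC)
  std::int64_t byte_off; // constant byte offset carried by the reference
};

// Hypothetical result: the base reference plus an accumulated bit position.
struct ToyInnerRef {
  ToyMemRef base;
  std::int64_t bitpos;
};

constexpr std::int64_t kBitsPerUnit = 8; // assumption: 8-bit bytes

// With include_memref_p set, move the byte offset into bitpos and return
// MEM[ptr, 0]; otherwise leave the reference untouched.
ToyInnerRef toy_inner_reference(ToyMemRef ref, std::int64_t bitpos,
                                bool include_memref_p) {
  if (include_memref_p && ref.byte_off != 0) {
    bitpos += ref.byte_off * kBitsPerUnit;
    ref.byte_off = 0;
  }
  return {ref, bitpos};
}
```

In this model, a check like the one on memrefoff in set_base_and_offset would see a non-zero offset only when include_memref_p is false, which is why the patch keeps false as the default.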

What you want to fix is not get_inner_reference but the aarch64 backend,
so that __builtin_aarch64_sqaddv16qi is not treated as clobbering global
memory.  That way CSE can happen on GIMPLE, which can handle the difference
in the IL just fine.
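
To illustrate the point, here is a toy model of why a call that is assumed to clobber memory blocks CSE of loads. It is a sketch under stated assumptions, not how GCC's passes are implemented: ToyStmt and toy_count_loads_after_cse are hypothetical names, and the "pass" only tracks which addresses have already been loaded.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical statement kinds for a straight-line toy IL.
struct ToyStmt {
  enum Kind { Load, Call } kind;
  int addr;             // for Load: which address is read
  bool clobbers_memory; // for Call: whether it may write any memory
};

// Returns how many loads remain after removing redundant ones.
// A load is redundant if the same address was loaded earlier and no
// memory-clobbering call has been seen since.
int toy_count_loads_after_cse(const std::vector<ToyStmt>& stmts) {
  std::set<int> available; // addresses whose loaded value is still valid
  int kept = 0;
  for (const ToyStmt& s : stmts) {
    if (s.kind == ToyStmt::Load) {
      if (available.insert(s.addr).second)
        ++kept; // first load of this address since the last clobber
    } else if (s.clobbers_memory) {
      available.clear(); // every previously loaded value may be stale
    }
  }
  return kept;
}
```

With a load, a call, and a second load of the same address, the model keeps both loads if the call clobbers memory and only one if it does not.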

Richard.

