[PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3
Richard Biener
richard.guenther@gmail.com
Thu Jul 2 14:45:53 GMT 2020
On Thu, Jul 2, 2020 at 3:22 PM xiezhiheng <xiezhiheng@huawei.com> wrote:
>
> Hi,
>
> This is a fix for PR94442.
> I modified get_inner_reference to handle the MEM[ptr, off] case.
> I extract "off" and add it to the recorded offset, then build a
> MEM[ptr, 0] and return that instead.
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 3c68b0d754c..8cc18449a0c 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -7362,7 +7362,8 @@ tree
> get_inner_reference (tree exp, poly_int64_pod *pbitsize,
> poly_int64_pod *pbitpos, tree *poffset,
> machine_mode *pmode, int *punsignedp,
> - int *preversep, int *pvolatilep)
> + int *preversep, int *pvolatilep,
> + bool include_memref_p)
> {
> tree size_tree = 0;
> machine_mode mode = VOIDmode;
> @@ -7509,6 +7510,21 @@ get_inner_reference (tree exp, poly_int64_pod *pbitsize,
> }
> exp = TREE_OPERAND (TREE_OPERAND (exp, 0), 0);
> }
> + else if (include_memref_p
> + && TREE_CODE (TREE_OPERAND (exp, 0)) == SSA_NAME)
> + {
> + tree off = TREE_OPERAND (exp, 1);
> + if (!integer_zerop (off))
> + {
> + poly_offset_int boff = mem_ref_offset (exp);
> + boff <<= LOG2_BITS_PER_UNIT;
> + bit_offset += boff;
> +
> + exp = build2 (MEM_REF, TREE_TYPE (exp),
> + TREE_OPERAND (exp, 0),
> + build_int_cst (TREE_TYPE (off), 0));
> + }
> + }
> goto done;
>
> default:
> @@ -10786,7 +10802,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> int reversep, volatilep = 0, must_force_mem;
> tree tem
> = get_inner_reference (exp, &bitsize, &bitpos, &offset, &mode1,
> - &unsignedp, &reversep, &volatilep);
> + &unsignedp, &reversep, &volatilep, true);
> rtx orig_op0, memloc;
> bool clear_mem_expr = false;
>
> diff --git a/gcc/tree.h b/gcc/tree.h
> index a74872f5f3e..7df0d15f7f9 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -6139,7 +6139,8 @@ extern bool complete_ctor_at_level_p (const_tree, HOST_WIDE_INT, const_tree);
> look for the ultimate containing object, which is returned and specify
> the access position and size. */
> extern tree get_inner_reference (tree, poly_int64_pod *, poly_int64_pod *,
> - tree *, machine_mode *, int *, int *, int *);
> + tree *, machine_mode *, int *, int *, int *,
> + bool = false);
>
> extern tree build_personality_function (const char *);
>
>
> I added an argument "include_memref_p" to control whether to go into MEM_REF,
> because without it the test case "Warray-bounds-46.c" fails in regression testing.
>
> This is because the function set_base_and_offset in gimple-ssa-warn-restrict.c
> base = get_inner_reference (expr, &bitsize, &bitpos, &var_off,
> &mode, &sign, &reverse, &vol);
> ...
> ...
> if (TREE_CODE (base) == MEM_REF)
> {
> tree memrefoff = fold_convert (ptrdiff_type_node, TREE_OPERAND (base, 1));
> extend_offset_range (memrefoff);
> base = TREE_OPERAND (base, 0);
>
> if (refoff != HOST_WIDE_INT_MIN
> && TREE_CODE (expr) == COMPONENT_REF)
> {
> /* Bump up the offset of the referenced subobject to reflect
> the offset to the enclosing object. For example, so that
> in
> struct S { char a, b[3]; } s[2];
> strcpy (s[1].b, "1234");
> REFOFF is set to s[1].b - (char*)s. */
> offset_int off = tree_to_shwi (memrefoff);
> refoff += off;
> }
>
> if (!integer_zerop (memrefoff)) <=================
> /* A non-zero offset into an array of struct with flexible array
> members implies that the array is empty because there is no
> way to initialize such a member when it belongs to an array.
> This must be some sort of a bug. */
> refsize = 0;
> }
>
> needs the MEM_REF offset to decide whether refsize should be set to zero.
> But my patch folds the offset into bitpos, so the MEM_REF offset is always zero.
>
> Any suggestions?
The thing you want to fix is not get_inner_reference but the aarch64 backend,
so that __builtin_aarch64_sqaddv16qi does not clobber global memory. That way
CSE can happen on GIMPLE, which can handle the difference in the IL just
fine.
Richard.
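
The conflict described in the quoted message can be modelled in a few lines of
C. This is only a toy sketch, not GCC code: struct toy_mem_ref, fold_offset and
has_nonzero_offset are hypothetical names, and BITS_PER_UNIT is assumed to be 8.
It shows that once the constant offset of MEM[ptr, off] has been folded into the
bit position, a later check in the style of !integer_zerop (memrefoff) can no
longer see it.

```c
#define BITS_PER_UNIT 8  /* assumption: 8-bit bytes */

/* Toy stand-in for a MEM[ptr, off] reference; not GCC's tree type. */
struct toy_mem_ref {
    int ptr_id;        /* identifies the pointer being dereferenced */
    long byte_offset;  /* constant offset operand, in bytes */
};

/* Fold the constant offset into *bitpos and return MEM[ptr, 0],
   mirroring what the quoted patch does inside get_inner_reference. */
static struct toy_mem_ref fold_offset(struct toy_mem_ref ref, long *bitpos)
{
    *bitpos += ref.byte_offset * BITS_PER_UNIT;
    ref.byte_offset = 0;
    return ref;
}

/* Models a consumer that inspects the MEM_REF offset of the returned base,
   as set_base_and_offset does in the quoted excerpt. */
static int has_nonzero_offset(struct toy_mem_ref ref)
{
    return ref.byte_offset != 0;
}
```

Starting from a reference with a 16-byte offset and a bit position of 0,
fold_offset leaves the bit position at 128, and has_nonzero_offset on the
returned base reports 0 even though it reported 1 for the original reference.
That is the information the warning code loses.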