This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PR24444] speed up locview resolution wiht relaxable frags


Targets such as xtensa incur a much higher overhead to resolve
location view numbers than e.g. x86, because the expressions used to
compute view numbers cannot be resolved soon enough.

Each view number is computed by incrementing the previous view, if
they are both at the same address, or by resetting it to zero
otherwise.  If PV is the previous view number, PL is its location, and
NL is the location of the next view, its number is computed by
evaluating NV = !(NL > PL) * (PV + 1).

set_or_check_view uses resolve_expression to decide whether portions
of this expression can be simplified to constants.  The (NL > PL)
subexpression is one that can often be resolved to a constant,
breaking chains of view number computations at instructions of nonzero
length, but not after alignment that might be unnecessary.

Alas, when nearly every frag ends with a relaxable instruction,
frag_offset_fixed_p will correctly fail to determine a known offset
between two unresolved addresses in neighboring frags, so the
unresolved symbolic operation will be constructed and used in the
computation of most view numbers.  This results in very deep
expressions.

As view numbers get referenced in location view lists, each operand in
the list goes through symbol_clone_if_forward_ref, which recurses on
every subexpression.  If each view number were to be referenced, this
would exhibit O(n^2) behavior, where n is the depth of the view number
expressions, i.e., the length of view number sequences without an
early resolution that cuts the expression short.

I see two ways to go about this:

- make forward_ref a tri-state, so that symbol_clone_if_forward_ref
can avoid revisiting the same nodes over and over when they have
already been visited and the conclusion was that their expression does
not contain forward refs.

- enable address compares used by view numbering to be resolved even
when exact offsets are not known, using new logic to determine when
the location either remained the same or changed for sure, even with
the possibility of relaxation.

This patch implements the latter, assuming fr_fix to be the minimum
size the frag will end up with, even for machine-dependent frags, but
not rs_org: for those, or for negative fr_fix, it conservatively
refrains from resolving the expression.  This enables most view number
expressions to be resolved with a small, reasonable depth.

Now, is the assumption on fr_fix a reasonable one to make across all
supported targets?

Tested on x86_64-linux-gnu, native and cross to xtensa-elf.  No new
testcase; the issue was just performance, not output correctness.  Ok to
install?


for  gas/ChangeLog

	PR gas/24444
	* frags.c (frag_gtoffset_p): New.
	* frags.h (frag_gtoffset_p): Declare it.
	* expr.c (resolve_expression): Use it.
---
 gas/expr.c  |    5 +++-
 gas/frags.c |   74 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 gas/frags.h |    2 ++
 3 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/gas/expr.c b/gas/expr.c
index ee85bda1cc60..3efde88cc0a2 100644
--- a/gas/expr.c
+++ b/gas/expr.c
@@ -2179,7 +2179,10 @@ resolve_expression (expressionS *expressionP)
 		|| op == O_lt || op == O_le || op == O_ge || op == O_gt)
 	       && seg_left == seg_right
 	       && (finalize_syms
-		   || frag_offset_fixed_p (frag_left, frag_right, &frag_off))
+		   || frag_offset_fixed_p (frag_left, frag_right, &frag_off)
+		   || (op == O_gt
+		       && frag_gtoffset_p (left, frag_left,
+					   right, frag_right, &frag_off)))
 	       && (seg_left != reg_section || left == right)
 	       && (seg_left != undefined_section || add_symbol == op_symbol)))
 	{
diff --git a/gas/frags.c b/gas/frags.c
index 90096ff696d7..e2ff79fbcc04 100644
--- a/gas/frags.c
+++ b/gas/frags.c
@@ -460,3 +460,77 @@ frag_offset_fixed_p (const fragS *frag1, const fragS *frag2, offsetT *offset)
 
   return FALSE;
 }
+
+/* Return TRUE if we can determine whether FRAG1 OFF1 appears after
+   (strict >, not >=) FRAG2 OFF2, assuming it is not before.  Set
+   *OFFSET so that resolve_expression will resolve an O_gt operation
+   between them to false (0) if they are guaranteed to be at the same
+   location, or to true (-1) if they are guaranteed to be at different
+   locations.  Return FALSE conservatively, e.g. if neither result can
+   be guaranteed (yet).
+
+   They are known to be in the same segment, and not the same frag
+   (this is a fallback for frag_offset_fixed_p, that always takes care
+   of this case), and it is expected (from the uses this is designed
+   to simplify, namely location view increments) that frag1 is
+   reachable from frag2 following the fr_next links, rather than the
+   other way round.  */
+
+bfd_boolean
+frag_gtoffset_p (valueT off1, const fragS *frag1,
+		 valueT off2, const fragS *frag2, offsetT *offset)
+{
+  if (frag1 == frag2 || off2 > (valueT)frag2->fr_fix)
+    return FALSE;
+
+  /* If any frag from frag2, inclusive, to frag1, exclusive, has a
+     nonzero fixed length, the addresses won't be the same as long as
+     we reach frag1.  */
+  bfd_boolean maybe_same_addr = off1 == 0 && off2 == (valueT)frag2->fr_fix;
+  bfd_boolean found_varlength = FALSE;
+  const fragS *frag = frag2;
+  for (;;)
+    {
+      switch (frag->fr_type)
+	{
+	case rs_dummy:
+	  break;
+
+	case rs_fill:
+	case rs_align:
+	case rs_align_code:
+	case rs_align_test:
+	case rs_space:
+	case rs_space_nop:
+	case rs_fill_nop:
+	case rs_leb128:
+	case rs_cfa:
+	case rs_dwarf2dbg:
+	case rs_machine_dependent:
+	  found_varlength = TRUE;
+	  break;
+
+	default:
+     /* case rs_broken_word: */
+	case rs_org:
+	  return FALSE;
+	}
+
+      frag = frag->fr_next;
+      if (frag == frag1 || !frag)
+	break;
+      if (maybe_same_addr && frag->fr_fix > 0)
+	maybe_same_addr = FALSE;
+      if (frag->fr_fix < 0)
+	return FALSE;
+    }
+
+  /* If both are TRUE, we don't know whether or not it will resolve to
+     the same address.  */
+  if (maybe_same_addr && found_varlength)
+    return FALSE;
+
+  if (frag)
+    *offset = (off1 - off2 - !maybe_same_addr) * OCTETS_PER_BYTE;
+  return !!frag;
+}
diff --git a/gas/frags.h b/gas/frags.h
index 6ac5d4426c79..9eb066229085 100644
--- a/gas/frags.h
+++ b/gas/frags.h
@@ -154,6 +154,8 @@ char *frag_var (relax_stateT type,
 		char *opcode);
 
 bfd_boolean frag_offset_fixed_p (const fragS *, const fragS *, offsetT *);
+bfd_boolean frag_gtoffset_p (valueT, const fragS *, valueT, const fragS *,
+			     offsetT *);
 
 int get_frag_count (void);
 void clear_frag_count (void);


-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free!         FSF Latin America board member
GNU Toolchain Engineer                Free Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás-GNUChe


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]