[gold][patch] Reduce heap usage for string merge sections

Cary Coutant ccoutant@google.com
Fri Jul 23 00:30:00 GMT 2010


> I'm worried that any fixed number I choose would be too benchmark
> specific, so I'm trying something more adaptive. I'll let you know how
> that works out.

The changes below seem to help somewhat (ignore the debug printf's --
this isn't for submission yet). I form a fairly conservative initial
estimate that's designed to get us about halfway through a typical
string merge section, then refine the estimate when we hit the
vector's capacity. From the debug output, I can calculate that the
total overhead (capacity / actual count - 1) is about 4.2% for
.debug_str sections, and 24% for .rodata sections. Out of 9385 input
string merge sections, it had to refine the estimate and resize the
vector 4 times for 5 sections, 3 times for 81 sections, twice for 827
sections, and once for 6687 sections. 1785 of the sections were fine
with the original estimate.

The penalty is a slight increase in link time, although I'm not sure
whether it's in the noise or not; I'd need to do this several times
under more isolated conditions to determine if it's really noticeable.

An earlier experiment with (len / 64 + 10) for the initial estimate
and (len / (avg - 1) + 5) for the revised estimate showed fewer
refinements but 33% overhead for the .rodata sections, with no
significant difference in link time.

Thoughts? Does this seem worth it?

-cary


diff --git a/gold/merge.cc b/gold/merge.cc
index 2b1c526..cc7fd78 100644
--- a/gold/merge.cc
+++ b/gold/merge.cc
@@ -551,7 +551,21 @@
Output_merge_string<Char_type>::do_add_input_section(Relobj* object,
   Merged_strings& merged_strings = merged_strings_list->merged_strings;

   // Estimate the number of strings in the section.
-  merged_strings.reserve(len / 64 + 2);
+  // We form a conservative estimate based on an average string size of 64,
+  // and bias the result to guarantee at least 5 strings.  If we later
+  // reach the capacity, we refine the estimate based on the strings seen
+  // so far.
+  typedef typename Output_merge_string<Char_type>::Merged_strings::size_type
+      Size_type;
+  Size_type capacity = len / 64 + 5;
+  merged_strings.reserve(capacity);
+  capacity = merged_strings.capacity();
+  // FIXME: Remove printf.
+  fprintf(stderr, "Output_merge_string (%s: %s): initial estimate:
%lu strings (%lu bytes)\n",
+	  object->name().c_str(),
+	  object->section_name(shndx).c_str(),
+	  static_cast<unsigned long>(capacity),
+	  static_cast<unsigned long>(len));

   // The index I is in bytes, not characters.
   section_size_type i = 0;
@@ -574,6 +588,40 @@
Output_merge_string<Char_type>::do_add_input_section(Relobj* object,
       this->stringpool_.add_with_length(p, pl - p, true, &key);

       section_size_type bytelen_with_null = ((pl - p) + 1) * sizeof(Char_type);
+
+      if (count + 2 > capacity)
+	{
+	  // Refine our estimate of the number of strings in the section,
+	  // using average string size so far.  We want to overestimate
+	  // slightly, to reduce the likelihood of having to expand the
+	  // vector again.
+	  section_size_type avg = (i + bytelen_with_null) / (count + 1);
+	  if (avg > 1)
+	    {
+	      Size_type new_capacity = len / (avg - 1) + 1;
+	      merged_strings.reserve(new_capacity);
+	      capacity = merged_strings.capacity();
+	      // FIXME: Remove printf.
+	      fprintf(stderr, "Output_merge_string (%s: %s): refined
estimate: %lu strings (%lu * %lu / %lu bytes)\n",
+		      object->name().c_str(),
+		      object->section_name(shndx).c_str(),
+		      static_cast<unsigned long>(capacity),
+		      static_cast<unsigned long>(len),
+		      static_cast<unsigned long>(count + 1),
+		      static_cast<unsigned long>(i + bytelen_with_null));
+	    }
+	  else
+	    {
+	      // FIXME: Remove printf.
+	      fprintf(stderr, "Output_merge_string (%s: %s): could not
estimate (%lu * %lu / %lu bytes)\n",
+		      object->name().c_str(),
+		      object->section_name(shndx).c_str(),
+		      static_cast<unsigned long>(len),
+		      static_cast<unsigned long>(count + 1),
+		      static_cast<unsigned long>(i + bytelen_with_null));
+	    }
+	}
+
       merged_strings.push_back(Merged_string(i, key));

       p = pl + 1;
@@ -585,6 +633,13 @@
Output_merge_string<Char_type>::do_add_input_section(Relobj* object,
   // compute the length of the last string.
   merged_strings.push_back(Merged_string(i, 0));

+  // FIXME: Remove printf.
+  fprintf(stderr, "Output_merge_string (%s: %s): final count: %lu
strings (capacity %lu)\n",
+	  object->name().c_str(),
+	  object->section_name(shndx).c_str(),
+	  static_cast<unsigned long>(count + 1),
+	  static_cast<unsigned long>(merged_strings.capacity()));
+
   this->input_count_ += count;
   this->input_size_ += len;



More information about the Binutils mailing list