[patch] ld speedup 1/3 (suffix merge)

David Mosberger davidm@napali.hpl.hp.com
Wed Sep 10 17:30:00 GMT 2003


>>>>> On Wed, 10 Sep 2003 18:42:44 +0930, Alan Modra <amodra@bigpond.net.au> said:

  Alan> On Wed, Sep 10, 2003 at 12:31:20AM -0700, David Mosberger
  Alan> wrote:
  >> dj: expand count 3070 size 4093 to 8191
  >> dj: expand count 6144 size 8191 to 16381
  >> dj: expand count 3070 size 4093 to 8191
  >> dj: expand count 6144 size 8191 to 16381
  >> dj: expand count 12286 size 16381 to 32749
  >> dj: expand count 24562 size 32749 to 65521

  >> Perhaps it would help to default to a larger initial value?

  Alan> Probably.  If you try it out, please post the results.  :)

Well, I'm not really supposed to work on this, but it was easy enough
to try out, so here we go: if I up the default size to the maximum
used during the kernel build (65521), performance does improve
significantly: from ~30 seconds down to about 25 seconds (real time).

I'm guessing it would make sense to increase the default hash-table
size (4093 feels rather small to me).
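To put a rough number on what the small default costs, here is a
back-of-the-envelope sketch in C.  It assumes that every expansion
re-inserts each live entry once (typical for this kind of table, but not
checked against the actual code), and it takes the counts from the
longest growth chain in the quoted "dj: expand" log.  The struct and
function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One growth step as reported in the quoted "dj: expand" log.  */
struct expand_step
{
  unsigned long count;     /* live entries when the table grew */
  unsigned long old_size;  /* slots before the expansion */
  unsigned long new_size;  /* slots after the expansion */
};

/* Longest chain in the log: one table growing from 4093 to 65521.  */
const struct expand_step chain[] =
{
  {  3070,  4093,  8191 },
  {  6144,  8191, 16381 },
  { 12286, 16381, 32749 },
  { 24562, 32749, 65521 },
};

/* Total number of entries re-inserted over all growth steps, under the
   assumption that each expansion moves every live entry exactly once.  */
unsigned long
rehash_moves (const struct expand_step *steps, size_t n)
{
  unsigned long total = 0;
  size_t i;

  assert (steps != NULL || n == 0);
  for (i = 0; i < n; i++)
    total += steps[i].count;
  return total;
}
```

With the four steps above, rehash_moves (chain, 4) comes to 46062
re-insertions for that one table, all of which go away if the table
starts out at 65521 slots.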

In any case, here is the top of the flat profile with the default
hash-table size increased:

 Each histogram sample counts as 1.00056m seconds
% time      self     cumul     calls self/call  tot/call name
 17.90      0.88      0.88      170k     5.21u     5.60u get_dyn_sym_info
  9.69      0.48      1.36     1.05M      457n      494n vfprintf
  7.24      0.36      1.72     1.79M      200n      201n sec_merge_hash_lookup
  5.98      0.30      2.01     5.93M     49.7n     49.7n _IO_str_overflow

get_dyn_sym_info() is ia64-specific and looks like it's doing a linear
list-traversal.  Seems like a fairly obvious candidate for
optimization.
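As a sketch of one possible direction (not what the ia64 backend does
today): if the per-symbol records can be kept in an array, they could be
sorted once and then looked up with bsearch() instead of walking a list.
The struct layout, the "addend" key and the function names below are
hypothetical stand-ins, since the real data structure isn't shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-symbol record keyed by addend.  */
struct dyn_info
{
  long addend;
  int got_offset;   /* placeholder payload */
};

/* Comparison callback for qsort/bsearch: order records by addend.  */
int
cmp_addend (const void *a, const void *b)
{
  long x = ((const struct dyn_info *) a)->addend;
  long y = ((const struct dyn_info *) b)->addend;

  return (x > y) - (x < y);
}

/* Sort the records once, after they have all been collected.  */
void
sort_dyn_info (struct dyn_info *v, size_t n)
{
  assert (v != NULL || n == 0);
  if (n > 1)
    qsort (v, n, sizeof *v, cmp_addend);
}

/* O(log n) lookup in a sorted array; returns NULL when no record
   has this addend.  */
struct dyn_info *
find_dyn_info (struct dyn_info *v, size_t n, long addend)
{
  struct dyn_info key;

  if (n == 0)
    return NULL;
  key.addend = addend;
  key.got_offset = 0;
  return bsearch (&key, v, n, sizeof *v, cmp_addend);
}
```

This only pays off if lookups mostly happen after the records have been
created; if creation and lookup are interleaved, the array would need to
be kept sorted on insert, or a hash table may be the better fit.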

The time spent in vfprintf and _IO_str_overflow is also interesting:
those come from get_local_sym_hash(), which does:

  sprintf (addr_name, "%x:%lx", ..)

Surely we can do better than that... ;-)
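For instance, a hand-rolled formatter along these lines would avoid the
stdio machinery entirely.  This is only a sketch: put_hex(),
format_hex_pair() and HEX_PAIR_BUFSZ are hypothetical names, and it
assumes 8-bit bytes and that the caller supplies a large enough buffer.
The output is meant to match what "%x:%lx" produces (lowercase hex, no
leading zeros).

```c
#include <assert.h>
#include <stddef.h>

/* Enough room for "<id>:<addr>" plus the terminating NUL, assuming
   two hex digits per byte.  */
#define HEX_PAIR_BUFSZ \
  (2 * sizeof (unsigned int) + 1 + 2 * sizeof (unsigned long) + 1)

/* Write VALUE at OUT in lowercase hex without leading zeros (as "%lx"
   would) and return a pointer just past the last digit written.  */
char *
put_hex (char *out, unsigned long value)
{
  char tmp[2 * sizeof value];
  size_t n = 0;

  /* Produce digits least-significant first; always emit at least one.  */
  do
    {
      tmp[n++] = "0123456789abcdef"[value & 0xf];
      value >>= 4;
    }
  while (value != 0);

  /* Copy them out in the right order.  */
  while (n > 0)
    *out++ = tmp[--n];
  return out;
}

/* Build "<id>:<addr>" in BUF, which must hold HEX_PAIR_BUFSZ bytes.
   Returns the length of the string, not counting the NUL.  */
size_t
format_hex_pair (char *buf, unsigned int id, unsigned long addr)
{
  char *p;

  assert (buf != NULL);
  p = put_hex (buf, id);
  *p++ = ':';
  p = put_hex (p, addr);
  *p = '\0';
  return (size_t) (p - buf);
}
```

If the name is only ever used as a hash key, it may be simpler still to
hash the two integers directly and skip the string altogether.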

	--david


