This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
dynsym/dynstr sort speedup
- From: michael meeks <michael dot meeks at novell dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: binutils at sourceware dot org, Michael Matz <matz at suse dot de>
- Date: Thu, 22 Dec 2005 11:44:10 +0000
- Subject: dynsym/dynstr sort speedup
- Reply-to: michael dot meeks at novell dot com
Hi there,
It seems that by sorting dynsym & dynstr by (elf_hash % bucketcount) we
get >10% improvement for linking - of course, the majority of this is
L1/L2 cache miss speedups - since of course walking the chain comparing
strings is much more likely to be in cache after the 1st miss. Of
course, the total win is hopefully greater on a full system linked with
this, particularly for apps with lots of libraries with lots of
symbols :-) [ this is all based on the cachgrind output for OO.o startup
showing severe L2 cache hammering ].
Anyway - the patch is quite simple, but I'd love some feedback /
suggestions for improvement. I also created a standalone test harness to
demonstrate the improvement, just fixup LD path & run 'make test':
http://www.gnome.org/~michael/dynsort-test.tar.gz
time-wise:
UnSorted 29ms
Sorted 26ms
more reliable cachegrind metrics appended.
and of course, that's without libstdc++ being sorted etc. So - the
patch is a prototype / proof of concept & needs a little cleanup.
issues / queries:
* leaks sorted_syms [easy to fix]
* calculates elf hash of strtab symbols again
+ could we store the elf_link_hash_entry
pointer from bfd_elf_record_dynamic_symbol ?
where possible to avoid calculating that.
* qsort - has no closure, nasty global variable for
bucket count - how should that be fixed ?
* conditional
+ the sorting takes some time & space at link
time - should it be conditional ?
Thoughts appreciated,
Thanks,
Michael.
Cachegrind numbers:
** Before **
==27907== D refs: 98,069,403 (73,455,788 rd + 24,613,615 wr)
==27907== D1 misses: 3,963,722 ( 3,943,112 rd + 20,610 wr)
==27907== L2d misses: 179,710 ( 178,441 rd + 1,269 wr)
==27907== D1 miss rate: 4.0% ( 5.3% + 0.0% )
==27907== L2d miss rate: 0.1% ( 0.2% + 0.0% )
** After **
==27871== D refs: 98,067,932 (73,454,588 rd + 24,613,344 wr)
==27871== D1 misses: 1,916,141 ( 1,911,775 rd + 4,366 wr)
==27871== L2d misses: 175,893 ( 174,620 rd + 1,273 wr)
==27871== D1 miss rate: 1.9% ( 2.6% + 0.0% )
==27871== L2d miss rate: 0.1% ( 0.2% + 0.0% )
--
michael.meeks@novell.com <><, Pseudo Engineer, itinerant idiot