Dynamic linking optimizations
Tue Mar 3 17:51:00 GMT 2009
On Tue, 2009-03-03 at 07:57 -0800, Ian Lance Taylor wrote:
> John Richard Moser <firstname.lastname@example.org> writes:
> > Let's start playing with theory (nobody cares, show us the code! I
> > don't have any). I say something blindingly obvious, and you tell me
> > why this is an obviously bad approach. If I can A) prove you wrong;
> > or B) change the approach to erase the bad stuff, then I have a better
> > solution. If not, then the obvious has happened: A bunch of people
> > who are a hell of a lot smarter than me came up with something much
> > better than I could this morning over coffee, and I'll be unplugging
> > my speakers before they start emitting penis-shaped sound waves.
> Sorry, you have to show us the code. I skimmed your algorithm, and I
Figured as much. Maybe I'll write a proof of concept later.
> agree that it should work. However, you are ignoring constant factors.
> Character comparisons are fast. It doesn't help to minimize them if it
> makes each operation cost more on average.
Well, the "operations" would be following a branch, which involves
jumping by a computed offset and reading in a byte (to use as an index).
Everything else is effectively a character comparison. In the naive
case (no hash table, flat symbol list), iterating through the list of
branches off a particular branch effectively skips $PREFIX_LENGTH
character comparisons in favor of one (you are precisely reading a
(char) from an array, and then incrementing an index if that (char)
> Also, the number of hash
> table collisions is highly relevant. The linker tries to minimize them,
> via the simple expedient of using more hash table buckets. In
> particular the linker will normally use at more buckets than there are
> symbols, so the average number of hash table collisions is zero.
This is the part where my logic falls over. If you have a O(1) lookup
you're gold, because my algorithm would follow a matching prefix until a
mismatch (i.e. "some_long_symbol_name_A" exists, looking up
"some_long_function_name_A" would necessarily read down the tree as far
as "some_long_s" rather than just bouncing off the hash table and going
"no match"). Of course, if you land in a hash bucket with a symbol that
follows the same general behavior this happens too; but it's not likely.
As I said, it is possible to do some optimizations by maintaining the
state you're in, traversing the tries in alphabetical order during
symbol resolution, and in general trying to work from the furthest
branch out and step backwards rather than starting from the root on each
lookup. This logic would have to be somewhat simple to work out
efficiently though, and would eat a measure of memory because each
symbol resolution for each loaded object would go through the whole
process, and each loaded object would have its own state in memory
during the symbol resolution phase.
> So, you might be right that your algorithm is faster, but you aren't
> obviously right. You have to actually implement it and measure it.
I'll think about it for a while. Maybe I'll pop back in another year
when I find a weekend free and show a library that does it with
arbitrary data as a proof of concept.
> A more relevant problem is that when you have lots of shared libraries,
> you have to look in lots of hash tables. A couple of different
> solutions have been proposed for that, but the glibc maintainers have
> declined to accept them.
Right. You can easily run through the data structure I've proposed and
build one massive one for fast lookups; however, you still have to run
through each data structure and insert elements into a completely new
copy, which takes time and memory and probably isn't an overall win.
Thanks, you've been helpful, I'll mill around with it for a while.
More information about the Binutils