This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

An idea on library loading and symbol lookup


Hi all,

I have been trying to get smart about shared library loading on Linux
and wanted to float an idea to get an understanding of the related
issues and if the idea is feasible at all. From reading Ulrich's paper
(http://people.redhat.com/drepper/dsohowto.pdf) it appears that the good
people that came up with the C++ ABI standard threw performance concerns
out the window when coming up with the name mangling convention.

namespace - class name - function name - arguments

Since namespace - class name will be the same quite often and in many
cases even the function name will be the same between 2 symbols.

Taking for example a symbol from a template function which looks as
follows:

_ZNK7cow_COWI16bas_ShortcutImplI23kamC_BaselineCorrectionE11cow_VirtualIS2_EE8ConstGetEv

and then another function from the same template:

_ZNK7cow_COWI16bas_ShortcutImplI23kamC_BaselineCorrectionE11cow_VirtualIS2_EE6IsNullEv

Now if you put these two symbols next to each other and consider that
the loader has to walk the string and do a character comparison for each
char, at leat that is my understanding of what's happening, you can see 
and count that there are 77 characters that are equal. Thus, the loader
compares 77 characters before discarding the "wrong" symbol. Now you do
that a few times and I think there should not be an argument on why it
takes quite a while to load shared objects. If I would have a namespace
then the number of characters that are equal in these 2 symbols would be
even bigger since the namespace name comes first in the mangled name.

Now the idea:

Lets say I do a reverse lookup (I'm sure I'm not the first one to come
up with this idea but I couldn't find any discussions on it), i.e. I
start at the back of the string then I only compare 2 characters before
I discard the symbol. I just improved my lookup performance by a
gazillion. 

I know there are other issues to deal with such as symbol versioning
etc. and these are the issues I'd like to understand. Overall I think 
there's got to be a better way then comparing all these characters from
the beginning.

Comments?

Thanks,
Robert


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]