What does ld.so do that dlopen doesn't do when loading shared libraries

Celelibi celelibi@gmail.com
Tue Apr 30 00:06:00 GMT 2013


2013/4/30, Ángel González <keisial@gmail.com>:
> On 29/04/13 22:09, Celelibi wrote:
>> (triple reply combo :p)
>>
>> After a bit of investigation, it looks like it's not only TAU or
>> profilers that suffer from this bug, but every tool relying on
>> instrumentation by gcc.
>>
>> gcc's option -finstrument-functions adds calls to
>> __cyg_profile_func_enter@plt and __cyg_profile_func_exit@plt at the
>> beginning and end of every function. Thus, a tool just has to define
>> the real functions and can then do whatever it wants.
>>
>> And the thing is that, even though the calls to these PLT entries are
>> actually executed, execution never reaches the actual functions (when
>> the library is opened with dlopen). I tried to debug
>> instruction-by-instruction, but got lost inside ld-linux-x86-64.so.2.
>> I think it just refuses to resolve the symbol or something like that.
>> I think you should be able to explain a bit what happens. :)
>>
>> I attach a small example (independent of TAU or anything else) that
>> shows this behavior.
>>
>> Compilation commands:
>> gcc -shared -fPIC -finstrument-functions -o foo.so foo.c
>> gcc -o dyn dyn.c -ldl
>>
>> (dyn.c didn't change, but I attach it again so that this message is
>> self-contained.)
>>
>>
>> Regards,
>>
>> Celelibi
> Yes, it's strange. (I confirm the observed behavior)
>
> Kind of a workaround, if you do:
> LD_PRELOAD=$PWD/foo.so ./dyn
>
> Then the profile_funcs are called. Which supports your assumption, as
> foo.so is loaded by ld.so in that case.
> I have no idea why it does so. Looks as if dlopen() missed doing something.
>
>
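
For readers without the attachments, the hooks described in the quoted
message can be sketched as below. This is NOT the attached foo.c, only
a minimal illustration of the hook signatures documented by gcc; the
counter names and the function foo() are hypothetical.

```c
#include <stdio.h>

/* Counters updated by the hooks (hypothetical names, for illustration). */
static unsigned long enter_calls;
static unsigned long exit_calls;

/* The hooks must not be instrumented themselves, otherwise each hook
 * would end up calling itself recursively. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    enter_calls++;
    fprintf(stderr, "enter %p\n", this_fn);
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;
    exit_calls++;
    fprintf(stderr, "exit  %p\n", this_fn);
}

/* An ordinary function. When this file is built with
 * -finstrument-functions, gcc inserts a call to each hook at its
 * entry and exit. */
int foo(int x)
{
    return x + 1;
}
```

Built as a shared object with -finstrument-functions, every call to
foo() should go through the two hooks, provided the dynamic linker
binds the PLT entries to these definitions and not to another one.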

Oh god, I think I found it.
glibc actually defines default __cyg_profile_func_* functions that do
nothing. I'm not completely familiar with the symbol resolution
process, but I think it searches the libraries in the order they were
loaded. As libc always appears last in the dependencies of an ELF,
this is normally not a problem: the libc implementations of these
functions are called only when no other symbol with this name is found.
And I think dlopen puts the new library *after* libc. Thus symbol
resolution finds the libc no-op first and never reaches the right
implementation.

And LD_PRELOAD would work because it puts the library *before* any other.

Addendum: "LD_DEBUG=symbols ./dyn" confirms this theory.
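
As a complement to LD_DEBUG, the same thing can be checked from inside
a program. The sketch below asks whether the global scope (the
executable plus the libraries loaded at startup, libc included)
already provides a given symbol. The helper name in_global_scope is
hypothetical, and the check assumes a dynamically linked program.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Return 1 if `name` is already defined somewhere in the global
 * lookup scope, 0 otherwise. A NULL result from dlsym is treated as
 * "not found", which is good enough for function symbols. */
int in_global_scope(const char *name)
{
    /* dlopen(NULL, ...) returns a handle for the main program; dlsym
     * on it searches the program and the libraries loaded with it. */
    void *self = dlopen(NULL, RTLD_NOW);
    if (self == NULL)
        return 0;

    void *addr = dlsym(self, name);
    dlclose(self);
    return addr != NULL;
}
```

If in_global_scope("__cyg_profile_func_enter") returns 1 before foo.so
is loaded, a plain dlopen of foo.so will bind its PLT entries to that
earlier definition instead of the one inside foo.so.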

Is there a mechanism to say "this symbol has a high/low priority"?
Weak symbols? (I don't know exactly what they do.) In that case, I
think a low priority should be given to the libc definitions of these
functions. My small tests show that even with a weak symbol in libc I
would still have to set LD_DYNAMIC_WEAK=1. But at least that would
give me a chance. :)
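
For what weak symbols do: a minimal sketch, with a hypothetical
function name. A weak definition is a fallback at static link time,
where a strong definition of the same name in another object file
silently replaces it. At run time, per the ld.so manual page, the
dynamic linker by default takes the first definition it finds whether
it is weak or strong, and only prefers strong ones when
LD_DYNAMIC_WEAK is set, which matches the test result above.

```c
/* Weak default: at static link time, a strong definition of the same
 * name in another object file replaces this one without a
 * duplicate-symbol error. (hook_priority is a hypothetical name.) */
__attribute__((weak))
int hook_priority(void)
{
    return 0; /* value returned by the fallback definition */
}
```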

I see that the dlopen flag RTLD_DEEPBIND makes things work, but using
it would mean modifying the real application code that calls dlopen,
which I can't do (I could still file a bug report, though).
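
For reference, the RTLD_DEEPBIND variant would look roughly like this
on the application side. It is only a sketch: the wrapper name
open_deepbind is hypothetical, and RTLD_DEEPBIND is a glibc extension.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* RTLD_DEEPBIND is a glibc extension */
#endif
#include <dlfcn.h>
#include <stdio.h>

/* Load `path` so that the library's own definitions are searched
 * before the global scope. Returns NULL and reports the error on
 * failure. */
void *open_deepbind(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_DEEPBIND);
    if (handle == NULL)
        fprintf(stderr, "dlopen: %s\n", dlerror());
    return handle;
}
```

With this flag, the calls to __cyg_profile_func_* made from inside the
library are resolved against the library itself first, so the libc
defaults no longer win.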

Well, those are all the solutions I see. Did I miss something?


Celelibi


