pthread_mutex_lock hang during tls_get_addr_tail()

Paul Smith paul@mad-scientist.net
Sun Sep 11 19:38:00 GMT 2016


On Sun, 2016-09-11 at 06:17 -0400, Carlos O'Donell wrote:
> To me this looks like you have a thread A which calls dlopen, and
> starts running constructors, which in turn create thread B which
> touches a tls variable and therefore needs to wait for thread A to
> finish with dlopen, but that can't happen because thread A waits on
> thread B, and you have a deadlock only if you touch the tls variable.

That could be happening, but it's not in my code.  I have no variables
marked with a constructor attribute and very few global objects whose
C++ type has a constructor.  Further, everything other than
libc/libpthread/libm is statically linked with my library and I don't
use dlopen() in my library at all.
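
For reference, the pattern described in the quoted text has roughly the
shape below.  This is only a minimal sketch, not code from my library or
from any of the libraries mentioned in this thread; the names
init_library, worker, tls_counter and worker_result are hypothetical.
It assumes GCC or Clang (for the constructor attribute and __thread) and
linking with -pthread.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical thread-local variable.  In a dlopen()ed shared object the
   first access from a new thread may have to go through the dynamic
   loader to set up this thread's copy. */
static __thread int tls_counter;

/* Written by thread B so a caller can check that it actually ran. */
static int worker_result;

/* Thread B: touches the TLS variable. */
static void *worker(void *arg)
{
    (void)arg;
    tls_counter = 42;
    worker_result = tls_counter;
    return NULL;
}

/* Thread A: runs as a constructor, starts thread B and waits for it.
   If this ran while dlopen() was still in progress, thread A would be
   waiting on B while B waited for the loader to finish. */
__attribute__((constructor))
static void init_library(void)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
}
```

Linked directly into an ordinary executable, the constructor runs before
main() and I would expect it to complete normally.  As I understand the
description above, the hang needs the same code to live in a shared
object whose constructors run from inside dlopen().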

However, the environment around my library is complex and without being
able to reproduce the issue myself it'll be almost impossible to discover
what's at fault.  We link with jemalloc as a replacement memory
allocator, which I know does some stuff at startup to initialize itself.
We also link with the Google coredumper library to allow control over
coredumps, and that may do other magic on initialization (I haven't
delved into it).  Finally, the user is loading our library into a JVM
and they are also loading libjsig.so to deal with signal handling, so
there's definitely some magic happening there.

Fortunately for me, as a result of this conversation I suggested to the
user that they add our library to LD_PRELOAD to ensure that it's loaded
before anything else starts up and that seems to have solved their
problem.  Until they can provide me with enough details for a repro case
locally I don't plan to dig into it any further than that.

If I do come up with an interesting answer to this issue in the future
I'll be sure to post it here.

Thanks all!



More information about the Libc-help mailing list