In this case an application calls a library via dlopen/dlsym/dlclose. The library creates a service thread that it needs to terminate when dlclose is called. But if the service thread throws an exception after the call to dlclose and the library's call to pthread_join, the service thread hangs in the unwind code and the join never completes.
Created attachment 879: dlclose/exception testcase. Untar this file and cd into the dllock_bug directory. Then run "make", which builds and runs the testcase.
Created attachment 880: hand-built backtrace of the two threads after the hang. gdb does not give a useful backtrace on powerpc for this condition, so the attached text file contains a hand-built backtrace for both threads.
Example test case log:

    parent main: called dlopen
    parent main: calling dlsym for lib_func
    parent main: calling lib_func
    parent lib_func: arg1=test string
    parent lib_func: thread created, rc=0
    parent main: back, sleep(2)...
    child run_it: new thread
    child run_it: sleep(3)...
    parent main: calling dlclose
    parent fini_Lib: calling pthread_join
    child run_it: awake again
    child run_it: throw (this might hang)
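For readers without the attachment, the library side of this sequence might look roughly like the sketch below. It is a reconstruction, not the attached testcase: the names lib_func, run_it and fini_Lib come from the log, while the globals, the shortened sleep, the exception type and the return values are assumptions. In the real testcase this code would be built as a shared object, with fini_Lib presumably registered as the library's finalizer so that dlclose runs it.

```cpp
// Hypothetical reconstruction of the library side of the testcase.
// Only the function names and log messages are taken from the bug report.
#include <pthread.h>
#include <unistd.h>
#include <atomic>
#include <cstdio>
#include <stdexcept>

static pthread_t g_service_thread;                    // assumed global
static std::atomic<bool> g_thread_started{false};     // assumed bookkeeping
static std::atomic<bool> g_exception_caught{false};   // assumed, for checking

// Service thread: sleeps long enough that the caller is already blocked in
// pthread_join, then throws and catches a C++ exception. The throw forces
// the thread through the unwinder, which is where the report sees the hang.
static void *run_it(void *) {
    std::printf("child run_it: new thread\n");
    sleep(1);  // the log shows sleep(3); shortened here
    std::printf("child run_it: awake again\n");
    try {
        std::printf("child run_it: throw (this might hang)\n");
        throw std::runtime_error("service thread exception");
    } catch (const std::exception &) {
        g_exception_caught = true;
    }
    return nullptr;
}

// Entry point the application looks up with dlsym.
extern "C" int lib_func(const char *arg1) {
    std::printf("parent lib_func: arg1=%s\n", arg1);
    int rc = pthread_create(&g_service_thread, nullptr, run_it, nullptr);
    std::printf("parent lib_func: thread created, rc=%d\n", rc);
    if (rc == 0) g_thread_started = true;
    return rc;
}

// Finalizer: waits for the service thread. When this runs from inside
// dlclose, it is the join that never completes in the reported failure.
extern "C" int fini_Lib() {
    if (!g_thread_started.exchange(false)) return 0;  // nothing to join
    std::printf("parent fini_Lib: calling pthread_join\n");
    return pthread_join(g_service_thread, nullptr);
}
```

Called directly from a normal executable (no dlopen involved), lib_func followed by fini_Lib is expected to complete, since the exception is thrown and caught outside of any dlclose call. The hang described above needs fini_Lib to run as part of dlclose.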
This testcase hangs on all the systems I have tested so far, including powerpc and powerpc64 with gcc-3.3.3/glibc-2.3.3, gcc-3.4.4/glibc-2.3.4, and gcc-4.1.0/glibc-trunk. I have also tried this on i686 with gcc-3.3.3/glibc-2.3.3 and see the same failure there. I suspect this is a general problem involving dlclose and g++ exception/unwind processing.
I experienced the same problem on x86. Target: i486-linux-gnu, gcc-4.0.3/glibc-2.3.5. I'll test on x86-64 if I can find some hardware lying around.
I tested this on x86-64 and it fails there as well. Target: x86-64 gcc-4.0.1/glibc-2.3.5
Nothing related to C++, exceptions, and dlopen can be critical.
Appears to work fine in glibc 2.12.