glibc 2.34 introduced GL(dl_load_lock) in pthread_create to fix bug 19329. ctors in dlopen run while GL(dl_load_lock) which can cause deadlocks as explained in bug 15686. that bug is hard to fix so the deadlock at thread creation is dealt with separately here. copying the deadlock report from bug 19329: If I have a c++ dynamic library(named libA.so) that contains a global object, the global object will call the post-constructor at initialization and hold it's own lock(named A_lock) when dlopen loads libA.so. Assume that two threads execute the following process: Thread1:dlopen(libA.so) => hold dl_load_lock => load libA.so => init global object from libA.so => wait for hold A_lock Thread2:my own code hold A_lock => pthread_create => _dl_allocate_tls_init => wait for hold dl_load_lock In this case, an ABBA deadlock occurs. Is this a bug? My stack looks like this: Thread 1 (LWP 136013): #0 0x00007f57a108510d in ?? () from /usr/lib64/libpthread.so.0 #1 0x00007f57a107e4d1 in pthread_mutex_lock () from /usr/lib64/libpthread.so.0 #1 stack waiting for holding A_lock ... #6 0x00007f5781c1bb8b in LogProcess::Init (strProcName=..., nProcHandle=nProcHandle@entry=0) at ./service/biz_frame/code/server/src/logging/logprocess.cpp:107 ... #20 0x00007f57a0fef21f in _dl_catch_exception () from /usr/lib64/libc.so.6 #21 0x00007f57a786442b in ?? () from /lib64/ld-linux-x86-64.so.2 #22 0x00007f57a3de2296 in ?? () from /usr/lib64/libdl.so.2 #23 0x00007f57a0fef21f in _dl_catch_exception () from /usr/lib64/libc.so.6 #24 0x00007f57a0fef2af in _dl_catch_error () from /usr/lib64/libc.so.6 #25 0x00007f57a3de2985 in ?? () from /usr/lib64/libdl.so.2 #26 0x00007f57a3de2351 in dlopen () from /usr/lib64/libdl.so.2 ... ... #38 0x00007f57a0fb3520 in clone () from /usr/lib64/libc.so.6 Thread 2 (LWP 134627): #0 0x00007f57a108510d in ?? () from /usr/lib64/libpthread.so.0 #1 0x00007f57a107e580 in pthread_mutex_lock () from /usr/lib64/libpthread.so.0 #2 0x00007f57a7863835 in _dl_allocate_tls_init () from /lib64/ld-linux-x86-64.so.2 #3 0x00007f57a107cb7c in pthread_create () from /usr/lib64/libpthread.so.0 ... #10 Stack holding A_lock ... #14 0x0000561689e0d579 in main ()
The master branch has been updated by Szabolcs Nagy <nsz@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=83b5323261bb72313bffcf37476c1b8f0847c736 commit 83b5323261bb72313bffcf37476c1b8f0847c736 Author: Szabolcs Nagy <szabolcs.nagy@arm.com> Date: Wed Sep 15 15:16:19 2021 +0100 elf: Avoid deadlock between pthread_create and ctors [BZ #28357] The fix for bug 19329 caused a regression such that pthread_create can deadlock when concurrent ctors from dlopen are waiting for it to finish. Use a new GL(dl_load_tls_lock) in pthread_create that is not taken around ctors in dlopen. The new lock is also used in __tls_get_addr instead of GL(dl_load_lock). The new lock is held in _dl_open_worker and _dl_close_worker around most of the logic before/after the init/fini routines. When init/fini routines are running then TLS is in a consistent, usable state. In _dl_open_worker the new lock requires catching and reraising dlopen failures that happen in the critical section. The new lock is reinitialized in a fork child, to keep the existing behaviour and it is kept recursive in case malloc interposition or TLS access from signal handlers can retake it. It is not obvious if this is necessary or helps, but avoids changing the preexisting behaviour. The new lock may be more appropriate for dl_iterate_phdr too than GL(dl_load_write_lock), since TLS state of an incompletely loaded module may be accessed. If the new lock can replace the old one, that can be a separate change. Fixes bug 28357. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
fixed for glibc 2.35, will backport it later.
The release/2.34/master branch has been updated by Florian Weimer <fw@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=024a7640ab9ecea80e527f4e4d7f7a1868e952c5 commit 024a7640ab9ecea80e527f4e4d7f7a1868e952c5 Author: Szabolcs Nagy <szabolcs.nagy@arm.com> Date: Wed Sep 15 15:16:19 2021 +0100 elf: Avoid deadlock between pthread_create and ctors [BZ #28357] The fix for bug 19329 caused a regression such that pthread_create can deadlock when concurrent ctors from dlopen are waiting for it to finish. Use a new GL(dl_load_tls_lock) in pthread_create that is not taken around ctors in dlopen. The new lock is also used in __tls_get_addr instead of GL(dl_load_lock). The new lock is held in _dl_open_worker and _dl_close_worker around most of the logic before/after the init/fini routines. When init/fini routines are running then TLS is in a consistent, usable state. In _dl_open_worker the new lock requires catching and reraising dlopen failures that happen in the critical section. The new lock is reinitialized in a fork child, to keep the existing behaviour and it is kept recursive in case malloc interposition or TLS access from signal handlers can retake it. It is not obvious if this is necessary or helps, but avoids changing the preexisting behaviour. The new lock may be more appropriate for dl_iterate_phdr too than GL(dl_load_write_lock), since TLS state of an incompletely loaded module may be accessed. If the new lock can replace the old one, that can be a separate change. Fixes bug 28357. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit 83b5323261bb72313bffcf37476c1b8f0847c736)