Created attachment 15391 [details]
Demonstrates infinite recursion with "-H on"

When setting "-H on" with "gprofng collect app", starting a new thread triggers infinite recursion. The crash looks like this in gdb:

#26129 0x00007f81ed538de5 in calloc (size=32, esize=16) at heaptrace.c:447
#26130 0x00007f81e7c9bbaf in ___pthread_setspecific (key=key@entry=38, value=value@entry=0x7f81e19fcac0) at ./nptl/pthread_setspecific.c:69
#26131 0x00007f81ed40bff7 in __collector_tsd_get_by_key (key_index=<optimized out>) at tsd.c:138
#26132 0x00007f81ed538de5 in calloc (size=32, esize=16) at heaptrace.c:447
#26133 0x00007f81e7c9bbaf in ___pthread_setspecific (key=key@entry=38, value=value@entry=0x7f81e19fcad0) at ./nptl/pthread_setspecific.c:69
#26134 0x00007f81ed40bff7 in __collector_tsd_get_by_key (key_index=<optimized out>) at tsd.c:138
#26135 0x00007f81ed538de5 in calloc (size=32, esize=16) at heaptrace.c:447
#26136 0x00007f81e7c9bbaf in ___pthread_setspecific (key=key@entry=40, value=value@entry=0x7f81e19fcae0) at ./nptl/pthread_setspecific.c:69
#26137 0x00007f81ed40bff7 in __collector_tsd_get_by_key (key_index=<optimized out>) at tsd.c:138
#26138 0x00007f81ed428bfc in __collector_ext_unwind_key_init (isPthread=isPthread@entry=1, stack=stack@entry=0x0) at unwind.c:342
#26139 0x00007f81ed40552a in collector_root (cargs=<optimized out>) at dispatcher.c:1103
#26140 0x00007f81e7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#26141 0x00007f81e7d26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Attached is a simple program that demonstrates the problem.
I cannot reproduce the problem on OL8 with libpthread-2.28.
It looks like calloc() is called in ./nptl/pthread_setspecific.c:69.
But this calloc() must be from libc, not from the user application.

Can you run this test in your environment:

% cat t.c
#include <pthread.h>

long long calloc = 0;

void *func(void *arg)
{
  return NULL;
}

int main(int argc, char **argv)
{
  pthread_t thr;
  void *val;
  pthread_create(&thr, NULL, func, NULL);
  pthread_join(thr, &val);
  return 0;
}
% gcc -pthread t.c
% ./a.out
First of all, your test case:

% gcc -pthread t.c
t.c:3:11: warning: built-in function 'calloc' declared as non-function [-Wbuiltin-declaration-mismatch]
    3 | long long calloc = 0;
      |           ^~~~~~
% ./a.out
% echo $?
0

That works fine. The machine itself is running Ubuntu 22.04.4 LTS (Jammy Jellyfish).

I made a clean sandbox and found no trouble at all with "-H on", so it's clearly related to something else in my environment. I'll spend some time searching. In the meantime, this can probably be closed out as "cannot reproduce" until I get to the bottom of whatever's going on.

It's definitely reproducible and happens all the time in the giant proprietary C++ application I'm working on and in these tiny test programs when compiled inside that same environment, but I don't know yet how or why. Something causes collector's calloc hook to end up invoking itself.
I found a clue: the application uses libtcmalloc. It seems like this recursion problem happens when that library is either dlopen'd into the address space or loaded via LD_PRELOAD.
It looks like the bug is in gprofng/libcollector/heaptrace.c. I see that init_heap_intf() is not thread safe.
The problem is:
We use pthread_getspecific() and pthread_setspecific() to access thread-local memory. We use this memory to check that our interposed functions (like malloc, calloc or free) do not recurse.
For example, the first time we call calloc(), we call pthread_setspecific() to create a thread-specific value.
On your machine, pthread_setspecific() calls calloc(), and we cannot intercept such recursion because the value that guards against it does not exist yet.

gcc supports thread-local storage. For example,
  static __thread int reentrance = 0;

I rewrote the code using this instead of pthread_getspecific() and pthread_setspecific().
It works on OL8, but may not work on Ubuntu.
I will try to find an Ubuntu machine.
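For anyone following along, here is a minimal sketch of the __thread guard idea. It is not the code from heaptrace.c: traced_calloc(), record_event() and the recorded counter are hypothetical names made up for illustration, and the wrapper is called directly rather than interposed on calloc(). The point is only that checking a __thread variable needs no library call, so the guard cannot itself re-enter the allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Per-thread guard; reading or writing it makes no library call. */
static __thread int reentrance = 0;
/* Hypothetical per-thread count of traced allocations. */
static __thread int recorded = 0;

static void record_event(size_t bytes);

/* Hypothetical tracing wrapper, standing in for an interposed calloc(). */
static void *traced_calloc(size_t n, size_t size)
{
    void *p = calloc(n, size);
    if (reentrance)
        return p;               /* nested call from our own bookkeeping: skip tracing */
    reentrance++;
    record_event(n * size);     /* may allocate; the guard stops the recursion */
    reentrance--;
    return p;
}

/* Bookkeeping that itself allocates, the way pthread_setspecific() did here. */
static void record_event(size_t bytes)
{
    assert(reentrance == 1);    /* only ever runs with the guard held */
    (void) bytes;
    void *scratch = traced_calloc(1, 16);   /* would recurse forever without the guard */
    free(scratch);
    recorded++;
}
```

Without the reentrance check, record_event() and traced_calloc() would call each other until the stack overflows, which is the same shape as the backtrace in the original report.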
Neat. I'd be more than happy to try out a patch.
Created attachment 15433 [details] proposed patch
(In reply to Vladimir Mezentsev from comment #7)
> Created attachment 15433 [details]
> proposed patch

That patch works perfectly. Thank you!
The master branch has been updated by Vladimir Mezentsev <vmezents@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=99c3fe52d237eae546d7de484d0cfbd615ac192c

commit 99c3fe52d237eae546d7de484d0cfbd615ac192c
Author: Vladimir Mezentsev <vladimir.mezentsev@oracle.com>
Date:   Sat Mar 23 18:31:03 2024 -0700

    gprofng: fix infinite recursion on calloc with multi-threaded applications

    libcollector uses pthread_getspecific() and pthread_setspecific() to access
    thread local memory. libcollector uses this memory to check that
    interposed functions (like malloc, calloc or free) don't have recursion.
    The first time we call calloc(), we call pthread_setspecific() to create
    a thread-specific value.
    On Ubuntu machine, pthread_setspecific() calls calloc(), and we cannot
    intercept such recursion.
    gcc supports thread-local storage. For example,
      static __thread int reentrance = 0;
    I rewrote code using this instead of pthread_setspecific().

    gprofng/ChangeLog
    2024-03-23  Vladimir Mezentsev  <vladimir.mezentsev@oracle.com>

            PR gprofng/31460
            * libcollector/heaptrace.c: Use the __thread variable to check for
            * reentry. Clean up code.
Update status as resolved/fixed.