With git systemtap, rawhide 3.8.0-rc7 kernel, dyninst 8.0,

# make installcheck RUNTESTFLAGS=sdt.exp

About half the time, we get a stapdyn hang/loop in the test:

executing: stap --runtime=dyninst -w /home/fche/Private/DEVEL/DEVEL-systemtap/git/systemtap3/testsuite/systemtap.base/sdt.stp sdt.c.exe.2 -c ./sdt.c.exe.2
...

top shows one of the stapdyn threads spinning with 100% cpu

(gdb) bt
#0  0x00000031e18da4c7 in sched_yield () at ../sysdeps/unix/syscall-template.S:81
#1  0x00002aaaabc10ff5 in GeneratorLinux::evictFromWaitpid() () from /usr/lib64/dyninst/libpcontrol.so.8.0
#2  0x00002aaaabc11079 in GeneratorLinux::~GeneratorLinux() () from /usr/lib64/dyninst/libpcontrol.so.8.0
#3  0x00002aaaabc110a9 in GeneratorLinux::~GeneratorLinux() () from /usr/lib64/dyninst/libpcontrol.so.8.0
#4  0x00000031e183944a in __cxa_finalize (d=0x2aaaabf04da0) at cxa_finalize.c:55
#5  0x00002aaaabc0f513 in __do_global_dtors_aux () from /usr/lib64/dyninst/libpcontrol.so.8.0
#6  0x00007fffeb5c2ab0 in ?? ()
#7  0x00000031e140f9da in _dl_fini () at dl-fini.c:253
Backtrace stopped: frame did not save the PC
(gdb) info thread
  Id   Target Id         Frame
  2    Thread 0x2aaaad232700 (LWP 2167) "stapdyn" (running)
* 1    Thread 0x2aaaacc20a00 (LWP 2165) "stapdyn" (running)
(gdb) quit
...

Detaching sometimes kills the errant stapdyn and lets the test resume.
global_var-m64-O2-dyninst (global_var.exp) has also triggered it.
Additional backtrace info from the affected process:

(gdb) bt
#0  0x00000031e18da4c7 in sched_yield () at ../sysdeps/unix/syscall-template.S:81
#1  0x00002aaaabc10ff5 in GeneratorLinux::evictFromWaitpid() () from /usr/lib64/dyninst/libpcontrol.so.8.0
#2  0x00002aaaabc11079 in GeneratorLinux::~GeneratorLinux() () from /usr/lib64/dyninst/libpcontrol.so.8.0
#3  0x00002aaaabc110a9 in GeneratorLinux::~GeneratorLinux() () from /usr/lib64/dyninst/libpcontrol.so.8.0
#4  0x00000031e183944a in __cxa_finalize (d=0x2aaaabf04da0) at cxa_finalize.c:55
#5  0x00002aaaabc0f513 in __do_global_dtors_aux () from /usr/lib64/dyninst/libpcontrol.so.8.0
#6  0x00007fff2a0c2cc0 in ?? ()
#7  0x00000031e140f9da in _dl_fini () at dl-fini.c:253
Backtrace stopped: frame did not save the PC
(gdb) info thread
  Id   Target Id         Frame
  2    Thread 0x2aaaad24c700 (LWP 4634) "stapdyn" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
* 1    Thread 0x2aaaacc20a00 (LWP 4632) "stapdyn" 0x00000031e18da4c7 in sched_yield () at ../sysdeps/unix/syscall-template.S:81
(gdb) thread 2
[Switching to thread 2 (Thread 0x2aaaad24c700 (LWP 4634))]
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
185     62:     movl    (%rsp), %edi
(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00002aaaabc248cd in CondVar::wait() () from /usr/lib64/dyninst/libpcontrol.so.8.0
#2  0x00002aaaabc0eb48 in LinuxPtrace::main() () from /usr/lib64/dyninst/libpcontrol.so.8.0
#3  0x00002aaaabc2453c in thread_init(void*) () from /usr/lib64/dyninst/libpcontrol.so.8.0
#4  0x00000031e2407c63 in start_thread (arg=0x2aaaad24c700) at pthread_create.c:308
#5  0x00000031e18f524d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
I don't think we've seen this exact hang in a while -- perhaps fixed by this?

http://git.dyninst.org/?p=dyninst.git;a=commit;h=8e556b1d0ebd9ab1c5c47cdcfabafeb683ba90d8
"Make sure SIGUSR2 is cleared from masked signals before using it in PC."

(SIGUSR2 is how evictFromWaitpid() works.)
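For anyone wondering why a masked SIGUSR2 would matter here: a blocked signal just stays pending and never reaches the thread it was aimed at, so anything waiting for that signal to take effect can keep retrying indefinitely. The sketch below is not Dyninst code and does not reproduce the stapdyn hang; it only illustrates the general POSIX behaviour. The function name masked_signal_demo is made up for the example.

```python
import signal
import threading


def masked_signal_demo():
    """Return (pending_while_masked, handled_while_masked, handled_after_unmask).

    Must run in the main thread, since Python only allows installing
    signal handlers there.
    """
    seen = []
    old_handler = signal.signal(
        signal.SIGUSR2, lambda signum, frame: seen.append(signum)
    )
    # Block SIGUSR2 in this thread, remembering the previous mask.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR2})
    try:
        # Aim the signal at this thread specifically.
        signal.pthread_kill(threading.get_ident(), signal.SIGUSR2)
        # While blocked, the signal is queued but the handler does not run.
        pending = signal.SIGUSR2 in signal.sigpending()
        handled_masked = bool(seen)
        # Unblocking lets the pending signal be delivered.
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR2})
        handled_unmasked = bool(seen)
    finally:
        # Restore the original mask and handler.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        if old_handler is not None:
            signal.signal(signal.SIGUSR2, old_handler)
    return pending, handled_masked, handled_unmasked


if __name__ == "__main__":
    print(masked_signal_demo())
```

The first two values show the signal sitting in the pending set with the handler not yet run; the third shows it being handled once the mask is cleared, which is the state the commit above says it makes sure of before using SIGUSR2.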
Seems consistently okay now with: dyninst-9.3.1-1.fc25.x86_64