Running

  ./frysk-core/TestRunner -r 1000 'testStressMultiThreadedDetach(frysk.util.StressTestFStack)'

eventually gives output of the form

  Task #5174
  #0 0xe99402 in __kernel_vsyscall ()
  #1 0x1392b7 in sigsuspend ()
  #2 0x804913a in server ()
  #3 0x29b3db in start_thread ()
  #4 0x1dd26e in clone ()
  Task #5175
  #0 0x9305faa in [unknown]
  #1 0x104b5ce2 in [unknown]
  #2 0x8707f83 in [unknown]
  #3 0x8649e16 in [unknown]

when in fact each task should be exactly the same (except the main task).
*** Bug 3691 has been marked as a duplicate of this bug. ***
Should note that this is similar, but not identical, to #3728. In that case, stepping through J.T.R. will accumulate unknown function names in various frames, seemingly at random.
Although it is not a direct dependency, Bug 3791 should also be fixed, as otherwise the addresses in the failing case (where symbols are no longer resolved) get corrupted, which makes debugging somewhat harder.
Disowning this Bug as it is not libunwind related. libunwind only fails to resolve the symbols after the process's fd (file descriptor) table fills up (>1024 fds).

You can debug it with:

  valgrind --track-fds=yes ./frysk-core/TestRunner -r 1000 'testStressMultiThreadedDetach(frysk.util.StressTestFStack)'

which shows libunwind's failures:

  Warning: invalid file descriptor 1019 in syscall open()

and the final report of leaked fds:

  Open file descriptor 178: /lib/libc-2.5.so
     at 0x40515C2: open64 (open64.c:45)
     by 0x85A5E5A: dwfl_linux_proc_find_elf (linux-proc-maps.c:297)
     by 0x85A407D: find_file (dwfl_module_getdwarf.c:103)
     by 0x85A4B71: find_dw (dwfl_module_getdwarf.c:395)
     by 0x85A4CC6: dwfl_module_getdwarf (dwfl_module_getdwarf.c:462)
     by 0x85AADB2: dwfl_module_getsrc (dwfl_module_getsrc.c:57)
     by 0x85A6CA3: dwfl_getsrc (dwfl_getsrc.c:55)
     by 0x8161192: _ZN3lib2dw4Dwfl11dwfl_getsrcEJxx (Dwfl.cxx:138)
     by 0x81583B6: _ZN3lib2dw4Dwfl13getSourceLineEJPNS0_8DwflLineEx (Dwfl.java:110)
     by 0x8133C36: frysk::rt::StackFrame::StackFrame(lib::unwind::FrameCursor*, frysk::proc::Task*, frysk::rt::StackFrame*) (StackFrame.java:127)
     by 0x8133952: _ZN5frysk2rt12StackFactory16createStackFrameEJPNS0_10StackFrameEPNS_4proc4TaskEi (StackFactory.java:79)
     by 0x8133AB4: _ZN5frysk2rt12StackFactory16createStackFrameEJPNS0_10StackFrameEPNS_4proc4TaskE (StackFactory.java:112)

  Open file descriptor 177: /usr/lib/debug/lib/ld-2.5.so.debug
     at 0x40515C2: open64 (open64.c:45)
     by 0x85A501E: try_open (find-debuginfo.c:79)
     by 0x85A53D4: dwfl_standard_find_debuginfo (find-debuginfo.c:178)
     by 0x85A45AF: find_debuginfo (dwfl_module_getdwarf.c:178)
     by 0x85A4BEA: find_dw (dwfl_module_getdwarf.c:417)
     by 0x85A4CC6: dwfl_module_getdwarf (dwfl_module_getdwarf.c:462)
     by 0x85AADB2: dwfl_module_getsrc (dwfl_module_getsrc.c:57)
     by 0x85A6CA3: dwfl_getsrc (dwfl_getsrc.c:55)
     by 0x8161192: _ZN3lib2dw4Dwfl11dwfl_getsrcEJxx (Dwfl.cxx:138)
     by 0x81583B6: _ZN3lib2dw4Dwfl13getSourceLineEJPNS0_8DwflLineEx (Dwfl.java:110)
     by 0x8133C08: frysk::rt::StackFrame::StackFrame(lib::unwind::FrameCursor*, frysk::proc::Task*, frysk::rt::StackFrame*) (StackFrame.java:125)
     by 0x8133B12: frysk::rt::StackFrame::StackFrame(lib::unwind::FrameCursor*, frysk::proc::Task*) (StackFrame.java:92)

There is some leakage: libdwfl's dwfl_end() is not being called appropriately. As it is called from the Dwfl binding's finalize(), I assume there are some leaked Java object references. I did not analyse it further, as the bug appears to lie in Java land. Running the testcase above also eats about 0.5GB of memory, which likewise suggests a leak there.
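For a quick check of descriptor growth without valgrind, something along the following lines might help. This is only a rough sketch, not frysk code: it is written in Python, works only on Linux (it reads /proc/<pid>/fd), and the helper names count_open_fds and fd_headroom are invented here for illustration. It assumes the 1024 limit mentioned above is the soft RLIMIT_NOFILE of the process.

```python
import os
import resource


def count_open_fds(pid="self"):
    """Return the number of entries in /proc/<pid>/fd (Linux only).

    Listing the directory briefly opens one descriptor of its own when
    pid is "self", so the absolute value is off by one; differences
    between two calls are still accurate.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_headroom():
    """Return how many more descriptors this process may open.

    Uses the soft RLIMIT_NOFILE of the calling process.
    """
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft_limit - count_open_fds()


if __name__ == "__main__":
    before = count_open_fds()
    # Simulate a leak: open a few files and keep them open.
    leaked = [open(os.devnull) for _ in range(5)]
    after = count_open_fds()
    print("descriptors gained:", after - before)
    print("headroom left:", fd_headroom())
    for f in leaked:
        f.close()
```

Calling count_open_fds() with the PID of a running TestRunner at intervals should show the count climbing steadily if descriptors are being leaked, provided the caller has permission to read that process's /proc entry.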
*** Bug 3241 has been marked as a duplicate of this bug. ***
Found that this may not be a Java code problem (i.e. finalizers failing to call dwfl_end()). The leak is present in elfutils even if one calls dwfl_end(): https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=230793
elfutils 0.127 fixes this.