There are a number of places in the translator that call exit(), which means that session tempdirs are not removed. The most obvious case is "stap -V". Some of these exit() calls may have been an issue for a while, but I think some are also a regression due to PR13516 commit b96901b7, which creates the tempdir much earlier than before. For example, exits during option parsing now happen after the tempdir exists, so they need to clean it up.
Note, there is a call to _exit() in handle_interrupt(), which is sort of an emergency case that doesn't need to clean up. For all plain exit() calls, we can probably get clever with setting up atexit() or on_exit(). Or perhaps those exit() calls should convert to the new interrupt_exception instead.
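To illustrate the atexit() idea, here is a minimal sketch, not the actual systemtap code. The names g_session_tmpdir, create_session_tmpdir() and remove_session_tmpdir() are hypothetical, and the sketch assumes a single tempdir per process on a POSIX system. It relies on the fact that exit() runs atexit() handlers but does not unwind the stack (so a session object living in main() never gets its destructor called), while _exit() skips the handlers as well.

```cpp
#include <ftw.h>      // nftw (POSIX)
#include <stdlib.h>   // mkdtemp, getenv (POSIX)
#include <cassert>
#include <cstdio>     // std::remove
#include <cstdlib>    // std::atexit
#include <string>

// Hypothetical global: path of the one session tempdir, empty if none.
static std::string g_session_tmpdir;

// nftw callback: delete one file or (already emptied) directory.
static int remove_entry(const char* path, const struct stat*, int, struct FTW*)
{
  return std::remove(path);
}

// Remove the tempdir recursively. Safe to call more than once, so it can
// be used both as an atexit() handler and from a normal cleanup path.
void remove_session_tmpdir()
{
  if (g_session_tmpdir.empty())
    return;
  // FTW_DEPTH visits children before their directory; FTW_PHYS does not
  // follow symlinks out of the tree.
  nftw(g_session_tmpdir.c_str(), remove_entry, 16, FTW_DEPTH | FTW_PHYS);
  g_session_tmpdir.clear();
}

// Create the tempdir and register its cleanup for any later exit() call.
// Returns the new path, or an empty string on failure.
std::string create_session_tmpdir()
{
  assert(g_session_tmpdir.empty()); // sketch assumption: one tempdir only
  const char* base = getenv("TMPDIR");
  std::string tmpl = std::string((base && *base) ? base : "/tmp") + "/stapXXXXXX";
  if (mkdtemp(&tmpl[0]) == NULL)
    return std::string();
  g_session_tmpdir = tmpl;
  static bool registered = false;
  if (!registered)
    registered = (std::atexit(remove_session_tmpdir) == 0);
  return g_session_tmpdir;
}
```

With something like this, calling create_session_tmpdir() early in main() would let a later plain exit() during option parsing still remove the directory, while the _exit() in handle_interrupt() would bypass the handler, matching the emergency behaviour described above.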
commit e2d0f787a648eefe4e5a152058f92c3f3274242e
On RHEL5 systems commit e2d0f787a648eefe4e5a152058f92c3f3274242e causes the tests to hang on systemtap.base/cmd_parse.exp. It reports "PASS: cmd_parse14" but doesn't seem to progress past that message.
When comparing the output of "stap -v -v --vp 01020 -h" of the working and hanging versions of stap, the hanging one has the following lines at the end of the output:

+Running rm -rf /tmp/stappdKUJd
+Spawn waitpid result (0x0): 0
+Removed temporary directory "/tmp/stappdKUJd"
For cmd_parse14, stap doesn't seem to be exiting. I see something like the following pstree output when running the test:

$ pstree -p 15099
make(15099)───sh(15100)───execrc(15136)───expect(15137)─┬─sh(15424)───stap(15427)
                                                        └─{expect}(15158)

Started gdb to see where stap is stuck:

(gdb) where
#0  0x00000034a52c5630 in __write_nocancel () from /lib64/libc.so.6
#1  0x00000034a526a6b3 in _IO_new_file_write (f=0x34a5551860, data=0xfbd14a8, n=29) at fileops.c:1260
#2  0x00000034a526baf3 in _IO_new_file_xsputn (f=0x34a5551860, data=<value optimized out>, n=29) at fileops.c:514
#3  0x00000034a5260dfb in _IO_fwrite (buf=0xfbd14a8, size=1, count=29, fp=0x34a5551860) at iofwrite.c:45
#4  0x00000034aba8e5dd in std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /usr/lib64/libstdc++.so.6
#5  0x000000000053dcee in stap_waitpid (verbose=<value optimized out>, pid=15428) at ../systemtap/util.cxx:618
#6  0x0000000000540adc in stap_system (verbose=2, description="rm", args=std::vector of length 3, capacity 4 = {...}, null_out=false, null_err=<value optimized out>) at ../systemtap/util.cxx:817
#7  0x000000000041818a in stap_system (this=0x7fffe37b6830) at ../systemtap/util.h:73
#8  systemtap_session::remove_tmp_dir (this=0x7fffe37b6830) at ../systemtap/session.cxx:1781
#9  0x0000000000418497 in systemtap_session::~systemtap_session (this=0x2, __in_chrg=<value optimized out>) at ../systemtap/session.cxx:360
#10 0x000000000041332b in main (argc=6, argv=0x7fffe37b71f8) at ../systemtap/main.cxx:1135
(gdb)
(In reply to comment #4)
> When comparing the output of "stap -v -v --vp 01020 -h" of the working and
> hanging versions of stap the hanging one has the following lines at the end of
> the output:
>
> +Running rm -rf /tmp/stappdKUJd
> +Spawn waitpid result (0x0): 0
> +Removed temporary directory "/tmp/stappdKUJd"

This much is a good thing - exactly what the commit was intended to solve. But it doesn't gel with:

(In reply to comment #5)
> #5 0x000000000053dcee in stap_waitpid (verbose=<value optimized out>,
> pid=15428) at ../systemtap/util.cxx:618

This line is the clog which prints the "Spawn waitpid..." above, plus

> #8 systemtap_session::remove_tmp_dir (this=0x7fffe37b6830)
> at ../systemtap/session.cxx:1781

This is a couple lines before that which prints "Removed temporary directory". So I don't see how that backtrace could possibly correspond with the new output.
I believe cmd_parse's issues are more of a testcase bug -- cloned to PR14560.