Built gdb 8.2.1 using GCC-8 under OSX Mojave. Properly code-signed the executable and assigned the necessary entitlements (see https://gist.github.com/kemlath/7285356f97f67af5023cf99ee57317d6).

Debugging a simple hello world C++ app:

    #include <iostream>
    #include <string>

    int main() {
        std::string buffer("herbert go bome");
        std::cout << "Hello, World!" << std::endl;
        std::cout << buffer << std::endl;
        return 0;
    }

    gdb ./Helloworld
    (gdb) run

-> Starts the program but then locks up. Could only terminate via Force Quit.

Taking a sample from the hanging gdb process revealed that it hangs in darwin_decode_message at a wait4 call.

Digging into darwin-nat.c, I think I found the problem: in darwin_decode_message(), at line 1154, wait4 is called a second time for a terminating thread. This second call sometimes hangs because there seems to be nothing left to wait for. According to the comment in the code, this second call (whose results are not used in any way) was introduced for OS X Leopard.

To fix this I've changed line 1154 in darwin-nat.c from

    wait4 (inf->pid, &wstatus, 0, NULL);

to

    wait4 (inf->pid, &wstatus, WNOHANG, NULL);

WNOHANG means that wait4 does not block if there is nothing to wait for.

I've now tested this version of gdb from Eclipse and from the command line, on C++ and Fortran code, and have had no problems so far.
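To illustrate the WNOHANG behaviour described above, here is a minimal C sketch that is independent of gdb. The helper names (spawn_idle_child, poll_child, kill_and_reap) are made up for this example; only fork, pause, kill and wait4 are real system calls. It does not reproduce the gdb hang itself; it only shows that wait4 with WNOHANG returns 0 immediately when the child has no status change to report, whereas a call with options set to 0 would sleep until one arrives.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1  /* glibc: expose wait4 under strict -std= modes */
#endif

#include <signal.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that sleeps in pause() until it
   receives a signal.  Returns the child's pid, or -1 on failure.  */
static pid_t
spawn_idle_child (void)
{
  pid_t pid = fork ();
  if (pid == 0)
    {
      for (;;)
        pause ();
    }
  return pid;
}

/* Poll the child without blocking.  With WNOHANG, wait4 returns 0 when
   the child exists but has nothing to report, and -1 when there is no
   such child left to wait for.  */
static pid_t
poll_child (pid_t pid, int *wstatus)
{
  return wait4 (pid, wstatus, WNOHANG, NULL);
}

/* Kill the child and reap it with a blocking wait4 (options == 0).
   This returns promptly only because SIGKILL guarantees a status
   change; without one, the call would block.  */
static pid_t
kill_and_reap (pid_t pid, int *wstatus)
{
  kill (pid, SIGKILL);
  return wait4 (pid, wstatus, 0, NULL);
}
```

Calling poll_child on a freshly spawned idle child should return 0; after kill_and_reap has collected the child, a further poll_child returns -1 instead of blocking.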
I had the same problem with Ada, and this has fixed it (GDB 8.2.50.20190218-git, GCC 9.0.1 20190219 (experimental)). I still have problems ("During startup program terminated with signal ?, Unknown signal") but a combination of run (see the error), start (hits the temporary breakpoint), run (all OK) has got me past that so far.
(In reply to Simon Wright from comment #1)
> I had the same problem with Ada, and this has fixed it (GDB
> 8.2.50.20190218-git, GCC 9.0.1 20190219 (experimental)).
>
> I still have problems ("During startup program terminated with signal ?,
> Unknown signal") but a combination of run (see the error), start (hits the
> temporary breakpoint), run (all OK) has got me past that so far.

I should have said "run, start, continue". And that only happened on Mojave.
(In reply to Simon Wright from comment #2)
> (In reply to Simon Wright from comment #1)
> > I had the same problem with Ada, and this has fixed it (GDB
> > 8.2.50.20190218-git, GCC 9.0.1 20190219 (experimental)).
> >
> > I still have problems ("During startup program terminated with signal ?,
> > Unknown signal") but a combination of run (see the error), start (hits the
> > temporary breakpoint), run (all OK) has got me past that so far.
>
> I should have said "run, start, continue". And that only happened on Mojave.

Indeed, I can confirm that. Strangely enough, for 3 to 5 runs after rebuilding an executable I also get "During startup program terminated with signal ?, Unknown signal"; then the problem goes away. I have been unable to find the origin of this intermittent problem so far.
Problem is still there with OS X Catalina (10.15.1) and the latest gdb installed via Homebrew:

brew install gdb
gdb -v
GNU gdb (GDB) 8.3

sudo gdb ./hello_world
(gdb) r
Starting program: /private/tmp/z00
[New Thread 0xd03 of process 47436]
[...hangs here...]

lldb -p <pid of gdb>
(lldb) bt

```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff71d698b2 libsystem_kernel.dylib __wait4_nocancel + 10
    frame #1: 0x00000001000f1b22 gdb darwin_decode_message(mach_msg_header_t*, darwin_thread_info**, inferior**, target_waitstatus*) + 1065
    frame #2: 0x00000001000ef00b gdb darwin_wait(ptid_t, target_waitstatus*) + 322
    frame #3: 0x00000001000eeebf gdb darwin_nat_target::wait(ptid_t, target_waitstatus*, int) + 37
    frame #4: 0x0000000100369664 gdb target_wait(ptid_t, target_waitstatus*, int) + 61
    frame #5: 0x0000000100222556 gdb startup_inferior(int, int, target_waitstatus*, ptid_t*) + 190
    frame #6: 0x00000001001567a2 gdb gdb_startup_inferior(int, int) + 22
    frame #7: 0x00000001000f0022 gdb darwin_ptrace_him(int) + 99
    frame #8: 0x0000000100222308 gdb fork_inferior(char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char**, void (*)(), void (*)(int), void (*)(), char const*, void (*)(char const*, char* const*, char* const*)) + 384
    frame #9: 0x00000001000efb8b gdb darwin_nat_target::create_inferior(char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char**, int) + 951
    frame #10: 0x00000001001a2674 gdb run_command_1(char const*, int, run_how) + 475
    frame #11: 0x00000001000a79bc gdb cmd_func(cmd_list_element*, char const*, int) + 104
    frame #12: 0x0000000100385957 gdb execute_command(char const*, int) + 464
    frame #13: 0x0000000100147b92 gdb command_handler(char const*) + 97
    frame #14: 0x0000000100147e8a gdb command_line_handler(std::__1::unique_ptr<char, gdb::xfree_deleter<char> >&&) + 112
    frame #15: 0x00000001001476cb gdb gdb_rl_callback_handler(char*) + 59
    frame #16: 0x00000001003e67fc gdb rl_callback_read_char + 506
    frame #17: 0x0000000100148538 gdb gdb_rl_callback_read_char_wrapper_noexcept() + 61
    frame #18: 0x0000000100147460 gdb gdb_rl_callback_read_char_wrapper(void*) + 9
    frame #19: 0x0000000100147a89 gdb stdin_event_handler(int, void*) + 93
    frame #20: 0x00000001001468a9 gdb gdb_wait_for_event(int) + 835
    frame #21: 0x00000001001464e8 gdb gdb_do_one_event() + 261
    frame #22: 0x000000010014698e gdb start_event_loop() + 149
    frame #23: 0x00000001001e596a gdb captured_command_loop() + 47
    frame #24: 0x00000001001e5408 gdb gdb_main(captured_main_args*) + 3908
    frame #25: 0x000000010000332c gdb main + 44
    frame #26: 0x00007fff71c182e5 libdyld.dylib start + 1
```
I also encounter freezing of gdb on Mac OS Catalina 10.15.4. Tried both installing via Homebrew and building from source (gdb version 9.1).
I tried building 8.3.1 on macOS 10.15.5 (latest Catalina), and about half of the time GDB hangs just as kemlath and timothee cour describe. I also attached LLDB, and GDB is indeed stuck in `darwin_decode_message`.

If I recompile with the change that kemlath suggests,

diff --git a/gdb/darwin-nat.c b/gdb/darwin-nat.c
index 8c34aa8a3f..d30374b673 100644
--- a/gdb/darwin-nat.c
+++ b/gdb/darwin-nat.c
@@ -1151,7 +1151,7 @@ darwin_decode_message (mach_msg_header_t *hdr,
                   res, wstatus);
 
           /* Looks necessary on Leopard and harmless...  */
-          wait4 (inf->pid, &wstatus, 0, NULL);
+          wait4 (inf->pid, &wstatus, WNOHANG, NULL);
 
           inferior_ptid = ptid_t (inf->pid, 0, 0);
           return inferior_ptid;

then a little more than half the time I get "During startup program terminated with signal ?, Unknown signal" and the rest of the time it works fine.

I'll try with 9.1 and 9.2.
Same behavior with 9.2: hangs without the patch (about half the time), "Unknown signal" with the patch.
As of today, with gdb 10.1 (forked with `brew tap-new` + `brew extract`), the result is target.c:2149: internal-error: void target_mourn_inferior(ptid_t): Assertion `ptid == inferior_ptid' failed.
(In reply to DomQ from comment #8)
> As of today, with gdb 10.1 (forked with `brew tap-new` + `brew extract`),
> the result is
>
> target.c:2149: internal-error: void target_mourn_inferior(ptid_t): Assertion
> `ptid == inferior_ptid' failed.

https://sourceware.org/bugzilla/show_bug.cgi?id=26861 ?
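For readers unfamiliar with the assertion quoted above, here is a rough C sketch of one way a full-equality check on process/thread identifiers can fail. It assumes that a ptid is a (pid, lwp, tid) triple compared field by field; the struct, its field types and the helper names are hypothetical stand-ins, not gdb's actual C++ ptid_t. Under that assumption, a value naming a whole process and a value naming one thread of that process compare unequal even though they share a pid. Whether this is what happens in the failure reported here is not established in this thread.

```c
#include <stdbool.h>

/* Hypothetical C stand-in for a ptid: a process id, a lightweight-process
   id and a thread id.  Field types are assumed for illustration.  */
struct ptid
{
  int pid;
  long lwp;
  long tid;
};

/* Strict equality: all three fields must match.  */
static bool
ptid_equal (struct ptid a, struct ptid b)
{
  return a.pid == b.pid && a.lwp == b.lwp && a.tid == b.tid;
}

/* Looser check: only the process id must match.  */
static bool
ptid_same_pid (struct ptid a, struct ptid b)
{
  return a.pid == b.pid;
}
```

With illustrative values { 1000, 0, 0 } and { 1000, 0, 0x1a03 }, ptid_equal is false while ptid_same_pid is true.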
The correct patch these days is

--- gdb-10.1-ORIG/gdb/darwin-nat.c 2020-10-24 06:23:02.000000000 +0200
+++ gdb-10.1/gdb/darwin-nat.c 2021-04-07 19:52:06.000000000 +0200
@@ -1108,10 +1108,8 @@
           inferior_debug (4, _("darwin_wait: pid=%d exit, status=0x%x\n"),
                           res_pid, wstatus);
 
-          /* Looks necessary on Leopard and harmless...  */
-          wait4 (inf->pid, &wstatus, 0, NULL);
-
-          return ptid_t (inf->pid);
+          inferior_ptid = ptid_t (inf->pid, 0, 0);
+          return inferior_ptid;
         }
       else
         {

and indeed applying it brings back a behavior similar to the one observed by Philippe Blain in 2020. There are a few additional observations to make though:

• I don't get nearly as good a success rate anymore: the inferior process starts successfully less than 10% of the time
• I still got one deadlock (look for "killed" in the dump below)
• The whole experiment leaks zombie processes. ps(1) says they are children of PID 12, and killing launchd with -HUP doesn't help, which leads me to believe that these processes are in some kind of half-debug state that was not cleaned up correctly.

The conclusion: it appears that gdb 10.1 + Mac OS Big Sur don't quite do the Mach dance correctly together.

My ~/.gdbinit contains

=-=-=-=-=-=-=-=-=-=-=
set startup-with-shell off
set debug darwin 10
=-=-=-=-=-=-=-=-=-=-=

Below is a transcript of a session with the patched gdb 10.1 (where ~/Dev/tmp/ls-x86_64-unsigned was produced using lipo -thin x86_64 -output ls-x86_64-unsigned /bin/ls):

=-=-=-=-=-=-=-=-=-=-=
/usr/local/Cellar/gdb@10.1/10.1_1/bin/gdb ~/Dev/tmp/ls-x86_64-unsigned
GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin20.3.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /Users/quatrava/Dev/tmp/ls-x86_64-unsigned...
(No debugging symbols found in /Users/quatrava/Dev/tmp/ls-x86_64-unsigned)
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[39620 inferior]: inferior task: 0x2803, pid: 39622
[New Thread 0x1a03 of process 39622]
[39620 inferior]: darwin_wait: waiting for a message pid=39622 thread=0
[39622 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[39622 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[39620 inferior]: darwin_wait: pid=39622 exit, status=0x57f
[39620 inferior]: task=0x2803, prev=0x0, notify_port=0x2503
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[39620 inferior]: inferior task: 0x2407, pid: 39723
[New Thread 0x1b03 of process 39723]
[39620 inferior]: darwin_wait: waiting for a message pid=39723 thread=0
[39620 inferior]: darwin_decode_exception_message: unknown task 0x1d03
[39723 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[39723 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[39620 inferior]: darwin_wait: pid=39723 exit, status=0x57f
[39620 inferior]: task=0x2407, prev=0x0, notify_port=0x2507
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[39620 inferior]: inferior task: 0x1a0b, pid: 39824
[New Thread 0x1c07 of process 39824]
[39620 inferior]: darwin_wait: waiting for a message pid=39824 thread=0
[39620 inferior]: darwin_decode_exception_message: unknown task 0x1d07
[39824 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[39824 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[39620 inferior]: darwin_wait: pid=39824 exit, status=0x57f
[39620 inferior]: task=0x1a0b, prev=0x0, notify_port=0x250b
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[39620 inferior]: inferior task: 0x1b0b, pid: 39925
[New Thread 0x1e0b of process 39925]
[39620 inferior]: darwin_wait: waiting for a message pid=39925 thread=0
[39620 inferior]: darwin_decode_exception_message: unknown task 0x1d0b
[39925 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[39925 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[1] 39620 killed /usr/local/Cellar/gdb@10.1/10.1_1/bin/gdb ~/Dev/tmp/ls-x86_64-unsigned
~ /usr/local/Cellar/gdb@10.1/10.1_1/bin/gdb ~/Dev/tmp/ls-x86_64-unsigned
GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin20.3.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /Users/quatrava/Dev/tmp/ls-x86_64-unsigned...
(No debugging symbols found in /Users/quatrava/Dev/tmp/ls-x86_64-unsigned)
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[40289 inferior]: inferior task: 0x2603, pid: 40291
[New Thread 0x1903 of process 40291]
[40289 inferior]: darwin_wait: waiting for a message pid=40291 thread=0
[40291 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[40291 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[40289 inferior]: darwin_wait: pid=40291 exit, status=0x57f
[40289 inferior]: task=0x2603, prev=0x0, notify_port=0x2303
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[40289 inferior]: inferior task: 0x1807, pid: 40293
[New Thread 0x1a03 of process 40293]
[40289 inferior]: darwin_wait: waiting for a message pid=40293 thread=0
[40289 inferior]: darwin_decode_exception_message: unknown task 0x2103
[40293 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[40293 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[40289 inferior]: darwin_wait: pid=40293 exit, status=0x57f
[40289 inferior]: task=0x1807, prev=0x0, notify_port=0x2307
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[40289 inferior]: inferior task: 0x190b, pid: 40295
[New Thread 0x2207 of process 40295]
[40289 inferior]: darwin_wait: waiting for a message pid=40295 thread=0
[40289 inferior]: darwin_decode_exception_message: unknown task 0x2107
[40295 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[40295 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[40289 inferior]: darwin_wait: pid=40295 exit, status=0x57f
[40289 inferior]: task=0x190b, prev=0x0, notify_port=0x230b
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[40289 inferior]: inferior task: 0x1a0b, pid: 40297
[New Thread 0x200b of process 40297]
[40289 inferior]: darwin_wait: waiting for a message pid=40297 thread=0
[40289 inferior]: darwin_decode_exception_message: unknown task 0x210b
[40297 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[40297 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[40289 inferior]: darwin_wait: pid=40297 exit, status=0x57f
[40289 inferior]: task=0x1a0b, prev=0x0, notify_port=0x230f
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[40289 inferior]: inferior task: 0x220f, pid: 40299
[New Thread 0x1f0b of process 40299]
[40289 inferior]: darwin_wait: waiting for a message pid=40299 thread=0
[40289 inferior]: darwin_decode_exception_message: unknown task 0x210f
[40299 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[40299 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[40289 inferior]: darwin_wait: pid=40299 exit, status=0x57f
[40289 inferior]: task=0x220f, prev=0x0, notify_port=0x2313
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[40289 inferior]: inferior task: 0x2013, pid: 40301
[New Thread 0x1b0b of process 40301]
[40289 inferior]: darwin_wait: waiting for a message pid=40301 thread=0
[40289 inferior]: darwin_decode_exception_message: unknown task 0x2113
[40301 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[40301 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[40289 inferior]: darwin_wait: pid=40301 exit, status=0x57f
[40289 inferior]: task=0x2013, prev=0x0, notify_port=0x2317
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[40289 inferior]: inferior task: 0x1f13, pid: 40303
[New Thread 0x1c0b of process 40303]
[40289 inferior]: darwin_wait: waiting for a message pid=40303 thread=0
[40289 inferior]: darwin_decode_exception_message: unknown task 0x2117
[40303 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[40303 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[40289 inferior]: darwin_wait: pid=40303 exit, status=0x57f
[40289 inferior]: task=0x1f13, prev=0x0, notify_port=0x231b
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[40289 inferior]: inferior task: 0x1b13, pid: 40305
[New Thread 0x1e0b of process 40305]
[40289 inferior]: darwin_wait: waiting for a message pid=40305 thread=0
[40289 inferior]: darwin_decode_exception_message: unknown task 0x211b
[40305 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[40305 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[40289 inferior]: darwin_wait: pid=40305 exit, status=0x57f
[40289 inferior]: task=0x1b13, prev=0x0, notify_port=0x231f
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[40289 inferior]: inferior task: 0x1c13, pid: 40307
[New Thread 0x1d0b of process 40307]
[40289 inferior]: darwin_wait: waiting for a message pid=40307 thread=0
[40289 inferior]: darwin_decode_exception_message: unknown task 0x211f
[40307 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[40307 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[New Thread 0x2123 of process 40307]
[40289 inferior]: darwin_wait: thread=0x2123, got EXC_SOFTWARE
[40289 inferior]: (signal 5: SIGTRAP)
[40289 inferior]: darwin_wait: unhandled pending message
[40289 inferior]: darwin_xfer_partial(0x0000000000000000, 4096, rbuf=0x7fcb2a82ba00, wbuf=0x0) pid=40307
[40289 inferior]: darwin_xfer_partial(0x0000000000000000, 8, rbuf=0x7ffee76a4798, wbuf=0x0) pid=40307
[40289 inferior]: darwin_xfer_partial(0x00000001000b4000, 24, rbuf=0x7ffee76a4730, wbuf=0x0) pid=40307
[40289 inferior]: darwin_read_write_inferior(task=0x5407, 0x00000001000b4000, len=24)
warning: unhandled dyld version (17)
[40289 inferior]: darwin_xfer_partial(0x00000001000b4000, 24, rbuf=0x7ffee76a45e0, wbuf=0x0) pid=40307
[40289 inferior]: darwin_read_write_inferior(task=0x5407, 0x00000001000b4000, len=24)
[40289 inferior]: darwin_resume: pid=40307, tid=0x0, step=0, signal=0
[40289 inferior]: darwin_resume_thread: state=2, thread=0x2123, step=0 nsignal=0
[40289 inferior]: ptrace (PT_THUPDATE, 40307, 0x2123, 0): 0 (no error)
[40289 inferior]: darwin_set_sstep (thread=0x2123, enable=0)
[40289 inferior]: darwin_wait: waiting for a message pid=-1 thread=0
Applications      MdcsDataLocation  _build
Archive           MetroGit          access.log.sample
Autodesk          Movies            bin
Desktop           Music             diff.GOLD
Dev               MyJabberFiles     diff.NEW
Documents         Pictures          dvbern-tax
Downloads         Public            iterm2
Fusion 360 CAM    VaudTax           out.json
Google Drive      VaudTax2019       script
Library           VirtualBox VMs    test.mv.db
[40289 inferior]: darwin_wait: pid=40307 exit, status=0x0
--Type <RET> for more, q to quit, c to continue without paging--q
Quit
Error calling thread_get_state for GP registers for thread 0x2123
warning: Mach error at "../../gdb/i386-darwin-nat.c:83" in function "virtual void i386_darwin_nat_target::fetch_registers(struct regcache *, int)": (ipc/send) invalid destination port (0x10000003)
(gdb)
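Every failed run in the transcript above reports status=0x57f, while the successful run ends with status=0x0. The following C sketch classifies such values with the standard wait-status macros from <sys/wait.h>. It assumes the conventional encoding used on macOS and Linux, where a low byte of 0x7f marks a stopped process and the next byte holds the signal number; under that assumption 0x57f reads as "stopped by signal 5 (SIGTRAP)" rather than as an exit. That might be related to the "signal ?, Unknown signal" message, but the thread does not confirm it. The enum and function names are made up for this example.

```c
#include <signal.h>
#include <sys/wait.h>

/* Hypothetical classification of a wait status.  */
enum wait_kind
{
  WAIT_EXITED,    /* normal exit; *detail is the exit code */
  WAIT_SIGNALED,  /* killed by a signal; *detail is the signal */
  WAIT_STOPPED,   /* stopped by a signal; *detail is the signal */
  WAIT_UNKNOWN    /* none of the above; *detail is the raw status */
};

/* Classify WSTATUS as returned by wait4/waitpid and store the exit
   code or signal number in *DETAIL.  */
static enum wait_kind
classify_wait_status (int wstatus, int *detail)
{
  if (WIFEXITED (wstatus))
    {
      *detail = WEXITSTATUS (wstatus);
      return WAIT_EXITED;
    }
  if (WIFSIGNALED (wstatus))
    {
      *detail = WTERMSIG (wstatus);
      return WAIT_SIGNALED;
    }
  if (WIFSTOPPED (wstatus))
    {
      *detail = WSTOPSIG (wstatus);
      return WAIT_STOPPED;
    }
  *detail = wstatus;
  return WAIT_UNKNOWN;
}
```

Passing 0x57f yields WAIT_STOPPED with a detail of 5; passing 0x0 yields WAIT_EXITED with a detail of 0.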
(In reply to Simon Marchi from comment #9)
> https://sourceware.org/bugzilla/show_bug.cgi?id=26861 ?

I don't know, maybe... Building upon your suggestion I tried

diff -U3 gdb-10.1-ORIG/gdb/darwin-nat.c gdb-10.1/gdb/darwin-nat.c
--- gdb-10.1-ORIG/gdb/darwin-nat.c 2020-10-24 06:23:02.000000000 +0200
+++ gdb-10.1/gdb/darwin-nat.c 2021-04-07 20:17:15.000000000 +0200
@@ -1108,9 +1108,6 @@
           inferior_debug (4, _("darwin_wait: pid=%d exit, status=0x%x\n"),
                           res_pid, wstatus);
 
-          /* Looks necessary on Leopard and harmless...  */
-          wait4 (inf->pid, &wstatus, 0, NULL);
-
           return ptid_t (inf->pid);
         }
       else
diff -U3 gdb-10.1-ORIG/gdb/target.c gdb-10.1/gdb/target.c
--- gdb-10.1-ORIG/gdb/target.c 2020-10-24 06:23:02.000000000 +0200
+++ gdb-10.1/gdb/target.c 2021-04-07 20:18:56.000000000 +0200
@@ -2146,7 +2146,7 @@
 void
 target_mourn_inferior (ptid_t ptid)
 {
-  gdb_assert (ptid == inferior_ptid);
+  gdb_assert (ptid.pid () == inferior_ptid.pid ());
   current_top_target ()->mourn_inferior ();
 
   /* We no longer need to keep handles on any of the object files.

and I got far fewer zombies and an improved success rate (see the new transcript below, which produced 4 zombies)

=-=-=-=-=-=-=-=-=-=-=-=-=-=
/usr/local/Cellar/gdb@10.1/10.1_1/bin/gdb ~/Dev/tmp/ls-x86_64-unsigned
GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin20.3.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /Users/quatrava/Dev/tmp/ls-x86_64-unsigned...
(No debugging symbols found in /Users/quatrava/Dev/tmp/ls-x86_64-unsigned)
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[8076 inferior]: inferior task: 0x1703, pid: 8078
[New Thread 0x2203 of process 8078]
[8076 inferior]: darwin_wait: waiting for a message pid=8078 thread=0
[8078 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[8078 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[1] 8076 killed /usr/local/Cellar/gdb@10.1/10.1_1/bin/gdb ~/Dev/tmp/ls-x86_64-unsigned
~ /usr/local/Cellar/gdb@10.1/10.1_1/bin/gdb ~/Dev/tmp/ls-x86_64-unsigned
GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin20.3.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /Users/quatrava/Dev/tmp/ls-x86_64-unsigned...
(No debugging symbols found in /Users/quatrava/Dev/tmp/ls-x86_64-unsigned)
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[8220 inferior]: inferior task: 0x1a03, pid: 8222
[New Thread 0x2703 of process 8222]
[8220 inferior]: darwin_wait: waiting for a message pid=8222 thread=0
[8222 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[8222 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[New Thread 0x2503 of process 8222]
[8220 inferior]: darwin_wait: thread=0x2503, got EXC_SOFTWARE
[8220 inferior]: (signal 5: SIGTRAP)
[8220 inferior]: darwin_wait: unhandled pending message
[8220 inferior]: darwin_xfer_partial(0x0000000000000000, 4096, rbuf=0x7f861c072c00, wbuf=0x0) pid=8222
[8220 inferior]: darwin_xfer_partial(0x0000000000000000, 8, rbuf=0x7ffeee847798, wbuf=0x0) pid=8222
[8220 inferior]: darwin_xfer_partial(0x00000001000b4000, 24, rbuf=0x7ffeee847730, wbuf=0x0) pid=8222
[8220 inferior]: darwin_read_write_inferior(task=0x1e03, 0x00000001000b4000, len=24)
warning: unhandled dyld version (17)
[8220 inferior]: darwin_xfer_partial(0x00000001000b4000, 24, rbuf=0x7ffeee8475e0, wbuf=0x0) pid=8222
[8220 inferior]: darwin_read_write_inferior(task=0x1e03, 0x00000001000b4000, len=24)
[8220 inferior]: darwin_resume: pid=8222, tid=0x0, step=0, signal=0
[8220 inferior]: darwin_resume_thread: state=2, thread=0x2503, step=0 nsignal=0
[8220 inferior]: ptrace (PT_THUPDATE, 8222, 0x2503, 0): 0 (no error)
[8220 inferior]: darwin_set_sstep (thread=0x2503, enable=0)
[8220 inferior]: darwin_wait: waiting for a message pid=-1 thread=0
Applications      MdcsDataLocation  _build
Archive           MetroGit          access.log.sample
Autodesk          Movies            bin
Desktop           Music             diff.GOLD
Dev               MyJabberFiles     diff.NEW
Documents         Pictures          dvbern-tax
Downloads         Public            iterm2
Fusion 360 CAM    VaudTax           out.json
Google Drive      VaudTax2019       script
Library           VirtualBox VMs    test.mv.db
[8220 inferior]: darwin_wait: pid=8222 exit, status=0x0
--Type <RET> for more, q to quit, c to continue without paging--q
Quit
Error calling thread_get_state for GP registers for thread 0x2503
warning: Mach error at "../../gdb/i386-darwin-nat.c:83" in function "virtual void i386_darwin_nat_target::fetch_registers(struct regcache *, int)": (ipc/send) invalid destination port (0x10000003)
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
warning: Mach error at "../../gdb/darwin-nat.c:1499" in function "virtual void darwin_nat_target::kill()": (ipc/send) invalid destination port (0x10000003)
[8220 inferior]: darwin_wait: waiting for a message pid=8222 thread=0
[1] 8220 killed /usr/local/Cellar/gdb@10.1/10.1_1/bin/gdb ~/Dev/tmp/ls-x86_64-unsigned
~ /usr/local/Cellar/gdb@10.1/10.1_1/bin/gdb ~/Dev/tmp/ls-x86_64-unsigned
GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin20.3.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /Users/quatrava/Dev/tmp/ls-x86_64-unsigned...
(No debugging symbols found in /Users/quatrava/Dev/tmp/ls-x86_64-unsigned)
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[8363 inferior]: inferior task: 0x2703, pid: 8365
[New Thread 0x1b03 of process 8365]
[8363 inferior]: darwin_wait: waiting for a message pid=8365 thread=0
[8365 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[8365 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[8363 inferior]: darwin_wait: pid=8365 exit, status=0x57f
[8363 inferior]: task=0x2703, prev=0x0, notify_port=0x1a03
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[8363 inferior]: inferior task: 0x2507, pid: 8367
[New Thread 0x2403 of process 8367]
[8363 inferior]: darwin_wait: waiting for a message pid=8367 thread=0
[8363 inferior]: darwin_decode_exception_message: unknown task 0x1d03
[8367 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[8367 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[New Thread 0x1d07 of process 8367]
[8363 inferior]: darwin_wait: thread=0x1d07, got EXC_SOFTWARE
[8363 inferior]: (signal 5: SIGTRAP)
[8363 inferior]: darwin_wait: unhandled pending message
[8363 inferior]: darwin_xfer_partial(0x0000000000000000, 4096, rbuf=0x7f995082ee00, wbuf=0x0) pid=8367
[8363 inferior]: darwin_xfer_partial(0x0000000000000000, 8, rbuf=0x7ffeecc57798, wbuf=0x0) pid=8367
[8363 inferior]: darwin_xfer_partial(0x00000001000b4000, 24, rbuf=0x7ffeecc57730, wbuf=0x0) pid=8367
[8363 inferior]: darwin_read_write_inferior(task=0x1e07, 0x00000001000b4000, len=24)
warning: unhandled dyld version (17)
[8363 inferior]: darwin_xfer_partial(0x00000001000b4000, 24, rbuf=0x7ffeecc575e0, wbuf=0x0) pid=8367
[8363 inferior]: darwin_read_write_inferior(task=0x1e07, 0x00000001000b4000, len=24)
[8363 inferior]: darwin_resume: pid=8367, tid=0x0, step=0, signal=0
[8363 inferior]: darwin_resume_thread: state=2, thread=0x1d07, step=0 nsignal=0
[8363 inferior]: ptrace (PT_THUPDATE, 8367, 0x1d07, 0): 0 (no error)
[8363 inferior]: darwin_set_sstep (thread=0x1d07, enable=0)
[8363 inferior]: darwin_wait: waiting for a message pid=-1 thread=0
Applications      MdcsDataLocation  _build
Archive           MetroGit          access.log.sample
Autodesk          Movies            bin
Desktop           Music             diff.GOLD
Dev               MyJabberFiles     diff.NEW
Documents         Pictures          dvbern-tax
Downloads         Public            iterm2
Fusion 360 CAM    VaudTax           out.json
Google Drive      VaudTax2019       script
Library           VirtualBox VMs    test.mv.db
[8363 inferior]: darwin_wait: pid=8367 exit, status=0x0
--Type <RET> for more, q to quit, c to continue without paging--q
Quit
Error calling thread_get_state for GP registers for thread 0x1d07
warning: Mach error at "../../gdb/i386-darwin-nat.c:83" in function "virtual void i386_darwin_nat_target::fetch_registers(struct regcache *, int)": (ipc/send) invalid destination port (0x10000003)
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
warning: Mach error at "../../gdb/darwin-nat.c:1499" in function "virtual void darwin_nat_target::kill()": (ipc/send) invalid destination port (0x10000003)
[8363 inferior]: darwin_wait: waiting for a message pid=8367 thread=0
[1] 8363 killed /usr/local/Cellar/gdb@10.1/10.1_1/bin/gdb ~/Dev/tmp/ls-x86_64-unsigned
~ /usr/local/Cellar/gdb@10.1/10.1_1/bin/gdb ~/Dev/tmp/ls-x86_64-unsigned
GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin20.3.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /Users/quatrava/Dev/tmp/ls-x86_64-unsigned...
(No debugging symbols found in /Users/quatrava/Dev/tmp/ls-x86_64-unsigned)
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[8657 inferior]: inferior task: 0x2603, pid: 8659
[New Thread 0x2203 of process 8659]
[8657 inferior]: darwin_wait: waiting for a message pid=8659 thread=0
[8659 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[8659 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[8657 inferior]: darwin_wait: pid=8659 exit, status=0x57f
[8657 inferior]: task=0x2603, prev=0x0, notify_port=0x2303
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[8657 inferior]: inferior task: 0x1807, pid: 8735
[New Thread 0x2103 of process 8735]
[8657 inferior]: darwin_wait: waiting for a message pid=8735 thread=0
[8657 inferior]: darwin_decode_exception_message: unknown task 0x1903
[8735 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[8735 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[8657 inferior]: darwin_wait: pid=8735 exit, status=0x57f
[8657 inferior]: task=0x1807, prev=0x0, notify_port=0x2307
During startup program terminated with signal ?, Unknown signal.
(gdb) run
Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned
[8657 inferior]: inferior task: 0x220b, pid: 8737
[New Thread 0x2007 of process 8737]
[8657 inferior]: darwin_wait: waiting for a message pid=8737 thread=0
[8657 inferior]: darwin_decode_exception_message: unknown task 0x1907
[8737 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error)
[8737 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error)
[8657 inferior]: darwin_wait: pid=8737 exit, status=0x57f
[8657 inferior]: task=0x220b, prev=0x0, notify_port=0x230b
During startup program terminated with signal ?, Unknown signal.
(gdb) run Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned [8657 inferior]: inferior task: 0x210b, pid: 8739 [New Thread 0x1a0b of process 8739] [8657 inferior]: darwin_wait: waiting for a message pid=8739 thread=0 [8657 inferior]: darwin_decode_exception_message: unknown task 0x190b [8739 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error) [8739 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error) [8657 inferior]: darwin_wait: pid=8739 exit, status=0x57f [8657 inferior]: task=0x210b, prev=0x0, notify_port=0x230f During startup program terminated with signal ?, Unknown signal. (gdb) run Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned [8657 inferior]: inferior task: 0x200f, pid: 8741 [New Thread 0x1b0b of process 8741] [8657 inferior]: darwin_wait: waiting for a message pid=8741 thread=0 [8657 inferior]: darwin_decode_exception_message: unknown task 0x190f [8741 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error) [8741 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error) [8657 inferior]: darwin_wait: pid=8741 exit, status=0x57f [8657 inferior]: task=0x200f, prev=0x0, notify_port=0x2313 During startup program terminated with signal ?, Unknown signal. (gdb) run Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned [8657 inferior]: inferior task: 0x1a13, pid: 8743 [New Thread 0x1f0b of process 8743] [8657 inferior]: darwin_wait: waiting for a message pid=8743 thread=0 [8657 inferior]: darwin_decode_exception_message: unknown task 0x1913 [8743 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error) [8743 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error) [8657 inferior]: darwin_wait: pid=8743 exit, status=0x57f [8657 inferior]: task=0x1a13, prev=0x0, notify_port=0x2317 During startup program terminated with signal ?, Unknown signal. 
(gdb) run Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned [8657 inferior]: inferior task: 0x1b13, pid: 8745 [New Thread 0x1c0b of process 8745] [8657 inferior]: darwin_wait: waiting for a message pid=8745 thread=0 [8657 inferior]: darwin_decode_exception_message: unknown task 0x1917 [8745 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error) [8745 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error) [8657 inferior]: darwin_wait: pid=8745 exit, status=0x57f [8657 inferior]: task=0x1b13, prev=0x0, notify_port=0x231b During startup program terminated with signal ?, Unknown signal. (gdb) quit ~ /usr/local/Cellar/gdb@10.1/10.1_1/bin/gdb ~/Dev/tmp/ls-x86_64-unsigned GNU gdb (GDB) 10.1 Copyright (C) 2020 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-apple-darwin20.3.0". Type "show configuration" for configuration details. For bug reporting instructions, please see: <https://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from /Users/quatrava/Dev/tmp/ls-x86_64-unsigned... 
(No debugging symbols found in /Users/quatrava/Dev/tmp/ls-x86_64-unsigned) (gdb) run Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned [8890 inferior]: inferior task: 0x2703, pid: 8892 [New Thread 0x1b03 of process 8892] [8890 inferior]: darwin_wait: waiting for a message pid=8892 thread=0 [8892 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error) [8892 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error) [8890 inferior]: darwin_wait: pid=8892 exit, status=0x57f [8890 inferior]: task=0x2703, prev=0x0, notify_port=0x1a03 During startup program terminated with signal ?, Unknown signal. (gdb) run Starting program: /Users/quatrava/Dev/tmp/ls-x86_64-unsigned [8890 inferior]: inferior task: 0x2507, pid: 8894 [New Thread 0x2403 of process 8894] [8890 inferior]: darwin_wait: waiting for a message pid=8894 thread=0 [8890 inferior]: darwin_decode_exception_message: unknown task 0x1c03 [8894 inferior]: ptrace (PT_TRACE_ME, 0, 0x0, 0): 0 (no error) [8894 inferior]: ptrace (PT_SIGEXC, 0, 0x0, 0): 0 (no error) [New Thread 0x1c07 of process 8894] [8890 inferior]: darwin_wait: thread=0x1c07, got EXC_SOFTWARE [8890 inferior]: (signal 5: SIGTRAP) [8890 inferior]: darwin_wait: unhandled pending message [8890 inferior]: darwin_xfer_partial(0x0000000000000000, 4096, rbuf=0x7f9d5f864000, wbuf=0x0) pid=8894 [8890 inferior]: darwin_xfer_partial(0x0000000000000000, 8, rbuf=0x7ffee3a90798, wbuf=0x0) pid=8894 [8890 inferior]: darwin_xfer_partial(0x00000001000b4000, 24, rbuf=0x7ffee3a90730, wbuf=0x0) pid=8894 [8890 inferior]: darwin_read_write_inferior(task=0x2207, 0x00000001000b4000, len=24) warning: unhandled dyld version (17) [8890 inferior]: darwin_xfer_partial(0x00000001000b4000, 24, rbuf=0x7ffee3a905e0, wbuf=0x0) pid=8894 [8890 inferior]: darwin_read_write_inferior(task=0x2207, 0x00000001000b4000, len=24) [8890 inferior]: darwin_resume: pid=8894, tid=0x0, step=0, signal=0 [8890 inferior]: darwin_resume_thread: state=2, thread=0x1c07, step=0 nsignal=0 [8890 inferior]: 
ptrace (PT_THUPDATE, 8894, 0x1c07, 0): 0 (no error) [8890 inferior]: darwin_set_sstep (thread=0x1c07, enable=0) [8890 inferior]: darwin_wait: waiting for a message pid=-1 thread=0 Applications MdcsDataLocation _build Archive MetroGit access.log.sample Autodesk Movies bin Desktop Music diff.GOLD Dev MyJabberFiles diff.NEW Documents Pictures dvbern-tax Downloads Public iterm2 Fusion 360 CAM VaudTax out.json Google Drive VaudTax2019 script Library VirtualBox VMs test.mv.db [8890 inferior]: darwin_wait: pid=8894 exit, status=0x0 --Type <RET> for more, q to quit, c to continue without paging--q Quit Error calling thread_get_state for GP registers for thread 0x1c07 warning: Mach error at "../../gdb/i386-darwin-nat.c:83" in function "virtual void i386_darwin_nat_target::fetch_registers(struct regcache *, int)": (ipc/send) invalid destination port (0x10000003)
I did the same experiments on git HEAD, with almost the same results (still a lot of zombies leaking, still way less than 20% of the runs resulting in a successful execution, and still the occasional hang despite the NOHANG part in WNOHANG). There is one improvement in HEAD though: the debugger no longer crashes after the debuggee exits. One may reproduce my work by running `brew install --build-from-source domq/gdb/gdb` or `brew install --build-from-source --head domq/gdb/gdb` (for 10.1 and HEAD respectively), or browse https://github.com/domq/homebrew-gdb . I'm afraid I don't have much more to contribute right now in terms of time or competence.
Thanks for investigating that. As far as I know, no current GDB contributor uses macOS (or at least, none develops and uses GDB on macOS), so it's a bit hard to do any fixes, and that's why things are not in good shape. To really fix things, it takes somebody willing to invest the time to understand the debug API on macOS and understand what GDB is doing wrong. If you need some help with the GDB internals, we can answer your questions.
(In reply to Simon Marchi from comment #13) > To really > fix things, it takes somebody willing to invest the time to understand the > debug API on macOS and understand what GDB is doing wrong. If you need some > help about the GDB internals, we can answer your questions. Thanks for the kind words Simon - You prompted me to put my nose to the grindstone again, with encouraging results (patch below). Now gdb works well on (a lipo'd copy of) /bin/ls: no more zombies, (close to) 100% `run` success rate — but only if `set startup-with-shell off` is set. I'll be doing some thinking about WIFSTOPPED and the two-subprocesses state machine (i.e. when `set startup-with-shell on`), and get back to you if I have questions. Thanks again, Dominique =-=-=-=-=-=-=-=-=-=-=-=-= From 4ac07c7f51b67d909d20a3e4f2dde0af1c0214e8 Mon Sep 17 00:00:00 2001 From: Dominique Quatravaux <dominique.quatravaux@epfl.ch> Date: Thu, 8 Apr 2021 18:01:04 +0200 Subject: [PATCH] [fix] Skip over WIFSTOPPED wait4 status On modern Darwin's, there appears to be a new circumstance in which a MACH_NOTIFY_DEAD_NAME message can be received, and which was not previously accounted for: to signal the WIFSTOPPED condition in the debuggee. In that case the debuggee is not quite dead (and in fact, treating it as dead, as we previously did, would cause a zombie leak - A process in such a state reparents to PID 1 when killed, but remains in the zombie state with no obvious way to fix it, besides rebooting). 
- Read and ignore WIFSTOPPED outcomes to the wait4 that we do upon receiving a MACH_NOTIFY_DEAD_NAME message (counting on the next exception message to let us know about the inferior's new state) - Refactor logging so as to clearly distinguish between the three MACH_NOTIFY_DEAD_NAME cases (WIFEXITED, WIFSTOPPED, signal) --- gdb/darwin-nat.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/gdb/darwin-nat.c b/gdb/darwin-nat.c index b23a8051b09..5eaee743c4a 100644 --- a/gdb/darwin-nat.c +++ b/gdb/darwin-nat.c @@ -1053,7 +1053,7 @@ darwin_nat_target::decode_message (mach_msg_header_t *hdr, } else if (hdr->msgh_id == 0x48) { - /* MACH_NOTIFY_DEAD_NAME: notification for exit. */ + /* MACH_NOTIFY_DEAD_NAME: notification for exit *or* WIFSTOPPED. */ int res; res = darwin_decode_notify_message (hdr, &inf); @@ -1096,16 +1096,23 @@ darwin_nat_target::decode_message (mach_msg_header_t *hdr, { status->kind = TARGET_WAITKIND_EXITED; status->value.integer = WEXITSTATUS (wstatus); + inferior_debug (4, _("darwin_wait: pid=%d exit, status=0x%x\n"), + res_pid, wstatus); + } + else if (WIFSTOPPED (wstatus)) + { + status->kind = TARGET_WAITKIND_IGNORE; + inferior_debug (4, _("darwin_wait: pid %d received WIFSTOPPED\n"), res_pid); + return minus_one_ptid; } else { status->kind = TARGET_WAITKIND_SIGNALLED; status->value.sig = gdb_signal_from_host (WTERMSIG (wstatus)); + inferior_debug (4, _("darwin_wait: pid=%d received signal %d\n"), + res_pid, status->value.sig); } - inferior_debug (4, _("darwin_wait: pid=%d exit, status=0x%x\n"), - res_pid, wstatus); - /* Looks necessary on Leopard and harmless... */ wait4 (inf->pid, &wstatus, WNOHANG, NULL);
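For readers following the patch, here is a minimal sketch (not GDB code) of the three-way classification it introduces. The helper name `classify_wait_status` and the `child_state` enum are hypothetical and exist only for this illustration; the `WIF*`/`W*` macros from `<sys/wait.h>` are the only real API used. The final branch mirrors the patch's plain `else`, so anything that is neither an exit nor a stop is treated as a termination by signal.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/wait.h>

/* Hypothetical names for illustration; GDB itself uses TARGET_WAITKIND_*. */
typedef enum {
    CHILD_EXITED,    /* normal exit; detail = exit code */
    CHILD_STOPPED,   /* still alive, merely stopped; detail = stop signal */
    CHILD_SIGNALLED  /* killed by a signal; detail = terminating signal */
} child_state;

/* Classify a wait4() status in the same order as the patch above:
   exited first, then stopped, and everything else as "signalled". */
child_state classify_wait_status(int wstatus, int *detail)
{
    assert(detail != NULL);

    if (WIFEXITED(wstatus)) {
        *detail = WEXITSTATUS(wstatus);
        return CHILD_EXITED;
    }
    if (WIFSTOPPED(wstatus)) {
        /* The process is not dead; reporting it as dead is what
           the patch description blames for the zombie leak. */
        *detail = WSTOPSIG(wstatus);
        return CHILD_STOPPED;
    }
    *detail = WTERMSIG(wstatus);
    return CHILD_SIGNALLED;
}
```

Applied to the value seen in the logs earlier in this thread, `classify_wait_status(0x57f, &detail)` returns `CHILD_STOPPED` with `detail == 5` (SIGTRAP), which suggests that the "exit, status=0x57f" lines were reporting a stop rather than an exit.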
Happy to report further good news: • both `brew install --build-from-source --head domq/gdb/gdb` and `brew install --force --build-from-source domq/gdb/gdb` (based on released version 10.1) now work well (no crashes, no zombies); • the latter doesn't appear to have trouble with `set startup-with-shell on`, so that symptom must be an unrelated regression between 10.1 and HEAD \o/
(In reply to DomQ from comment #14) > Thanks for the kind words Simon - You prompted me to put my nose to the > grindstone again, with encouraging results (patch below). Awesome, would you mind sending the patch to gdb-patches@sourceware.org using git-send-email? It will be easier to apply locally like this. 
I'll take a closer look then and probably just merge it, given that you gave a good explanation of what happens in the commit message, and the current state is simply broken. One boring and annoying detail though: to let us merge your patch (and subsequent ones), since it's not trivial, you'll need to have a copyright assignment on file (if you don't have one already): https://sourceware.org/gdb/wiki/ContributionChecklist#FSF_copyright_Assignment It can take a bit of time for the FSF to process them, so the earlier you send it the better. You can still send the patch to the mailing list in the meantime; it will just have to sit there until you hear back from the FSF.
Done — I now have a proper patch list (also after the __END__ in https://github.com/domq/homebrew-gdb/blob/master/Formula/gdb.rb#L117 ) and I think I succeeded in sending it as you requested (please confirm). The first step for my copyright assignment paperwork is also pending (you were Cc'd). Yours truly, Dominique
FYI: I included your WIFSTOPPED patch in my old gdb 8.3 build and it worked miracles! I now have a 100% launch rate and no lock-ups anymore. Thanx a lot!
(In reply to DomQ from comment #17) > The first step for my copyright assignment paperwork is also pending (you were Cc'd). Thanks a lot for your work patching this bug! I've been having trouble with it over the last day and was thinking of using lldb instead. Do you know if the gdb team is working on merging your patch? I'm still having the problems in gdb version 10.2.
(In reply to Jordan Mandel from comment #19) > Do you know if the gdb team is > working on merging your patch? Still having the problems in gdb version 10.2 The thread is here: https://sourceware.org/pipermail/gdb-patches/2021-April/177598.html I was waiting for Dominique to tell us his copyright assignment process was completed, but it looks like it has been done since May 18th :). I'll reply to the thread.
Hi, I am a bit confused, as there appear to be three threads of emails: a) Dominique's remove, Simon's push (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=83a559f7b93f2a06306e46d0d9ac094c599396ae) b) Spurious call to wait4 (can't find these patched files on git for sourceware.org) c) WIFSTOPPED in debuggee fixes (again, can't find these patched files). Are these all in 10.2 already? Sam PS: I found the 'a' changes at https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=83a559f7b93f2a06306e46d0d9ac094c599396ae
Simon showed me how to get the two patches: download the raw emails from the pipermail archive, then run git-am on each raw email to apply the not-yet-committed patches to my local git repository.
Created attachment 13544 [details] log of 'run' commands with 2 of 3 changes in
I have two of the three changes in. GDB is more reliable on macOS with these modifications, yet it still does not reliably start the debug target; it simply no longer loses it (see the "log of run" attachment). I will run the test suite and then, after studying the third change, add that to my tree too. The third change uses some artistic macros, so I want to make sure I fully understand what it is doing. At the moment, without this last change, the return value is not accurate when the run fails.
Created attachment 13547 [details] after applying the two patches Test suite results on macOS with two of three patches in place. The test suite failure rate appears to be due to bit-rot.
Hi, after studying the third patch a bit I found out a few things: 1) It was submitted with an editor that seems to have altered some of the characters; I mistook the odd characters for possible macros. 2) The intent of the change was to send a unique exit code when the debug target wasn't being traced. 3) The particular routine appears to have rather complex control flow in the present implementation. Given this combination, I am going to study whether there is a more concise implementation than adding further conditionals. So I will go and investigate how to get the first two of Dominique's patches resubmitted by me. Sam
Created attachment 13548 [details] attachment-4189443-0.html On Thu, Jul 8, 2021 at 22:52, Samuel.r.warner at me dot com <sourceware-bugzilla@sourceware.org> wrote: > So, I go > investigate how to get the first two patches of Dominique - resubmitted > from me. Just to let you know I'm still alive, even if a bit under the weather as far as the day job is concerned; so, many thanks to Samuel for taking over! (And yes, I should have known about the test suite.) Yours truly, -- Dominique Quatravaux dominique@quatravaux.org
(In reply to DomQ from comment #27) > Just to let you know I'm still alive, even if a bit under the weather as > far as the day job is concerned; so, many thanks to Samuel for taking over! Thanks for the notice. The issue is marked for the 11.1 release. Do you think you'll have time to send the v2 within the next 2-3 weeks? If not, I could try to pick it up, but I don't think I would be able to test it as well as you.
Created attachment 13549 [details] Patch file redone for Dominique 2cnd (remove wait4) change Hi Simon, Here's a patch file for Dominique's 2nd change. You've already committed his 1st change since the 10.2 release. Is this good? Sam
Removing the 11.1 target milestone due to lack of progress for the past couple of months.
Dominique, just a friendly ping to see if you would have time to send an updated patch for this.
I could jump on it too, if needed.
Created attachment 13803 [details] attachment-1445541-0.html On Sun, Nov 21, 2021 at 18:21, Samuel.r.warner at me dot com <sourceware-bugzilla@sourceware.org> wrote: > I could jump on it too, if needed Very much appreciated Samuel, as right now I don't have the time. Best regards to all involved, -- Dominique Quatravaux dominique@quatravaux.org
I'll jump on this next week. Sam
Hi all, I am new to the GDB community and would like to contribute to this bug report. The bug still persists in the latest GDB 11.2 release. Based on the change from DomQ: https://github.com/domq/homebrew-gdb/blob/master/Formula/gdb.rb I am not sure why the changes (lines 137 - 178) in the above link are not included in the latest GDB release. I tried to move that part of the code to the latest GDB 11.2, and most of the time GDB on the Darwin machine works. The modified source code is located in my GitHub repo: https://github.com/Louis-He/gdb_darwin_hang_fix I have tested it on my Intel-based MacBook Pro with macOS 12.2. Best, Louis
(In reply to Louis He from comment #35) > I tried to move that part of the code to the latest GDB 11.2 and most of the > time, the GDB on the darwin machine works. The modified source code locates > in my GitHub Repo: https://github.com/Louis-He/gdb_darwin_hang_fix Can you please provide a patch that applies on master? With the latest attempt, I wasn't able to apply the patch on master.
Created attachment 13951 [details] The patch for PR-24069
(In reply to Simon Marchi from comment #36) > Can you please provide a patch that applies on master? The latest attempt, > I wasn't able to apply the patch on master. Hi Simon, I have created a patch and attached it to this thread. Best, Louis
Comment on attachment 13951 [details] The patch for PR-24069 From 951136540042788b78ef4aec8895a889a3b083f1 Mon Sep 17 00:00:00 2001 From: Louis-He <1726110778@qq.com> Date: Wed, 2 Feb 2022 14:51:43 -0500 Subject: [PATCH] gdb: A potential fix for PR-24069 After this fix, the possibility of successful run on mac os increases. However, it is still possible that the gdb hangs. A workaround is issue "control+c" and the gdb will continue running as expected. The root cause of the bug hasn't been identified, and this fix is a temporary patch to PR-24069. --- gdb/darwin-nat.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/gdb/darwin-nat.c b/gdb/darwin-nat.c index d96ce1a6c65..844dbf499fe 100644 --- a/gdb/darwin-nat.c +++ b/gdb/darwin-nat.c @@ -1102,8 +1102,17 @@ darwin_nat_target::decode_message (mach_msg_header_t *hdr, status->set_ignore (); return minus_one_ptid; } + if (WIFEXITED (wstatus)) - status->set_exited (WEXITSTATUS (wstatus)); + { + status->set_exited (WEXITSTATUS (wstatus)); + } + else if (WIFSTOPPED (wstatus)) + { + status->set_ignore (); + inferior_debug (4, _("darwin_wait: pid %d received WIFSTOPPED\n"), res_pid); + return minus_one_ptid; + } else { status->set_signalled -- 2.32.0 (Apple Git-132)
Hi all, I am so sorry that I created some duplicate comments in the thread. I didn't notice that modifying the attachment is equivalent to creating a new comment. Sorry for the inconvenience. I am requesting editbugs group access so that I can redo my operations. Best, Louis
Created attachment 13952 [details] The patch for PR-24069
Created attachment 13953 [details] The patch for PR-24069
I've just sent a rebased version of Dominique's patches, including Louis' recent rework of patch 3/3, at https://sourceware.org/pipermail/gdb-patches/2022-February/185936.html (also at https://pi.simark.ca/gdb-patches/20220216141540.96514-1-levraiphilippeblain@gmail.com/)
The master branch has been updated by Simon Marchi <simark@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=9cca177baec32a1ed1422a87a1f57cda2d2eb21a commit 9cca177baec32a1ed1422a87a1f57cda2d2eb21a Author: Dominique Quatravaux <dominique.quatravaux@epfl.ch> Date: Wed Feb 16 09:15:39 2022 -0500 gdb/darwin: remove not-so-harmless spurious call to `wait4` As seen in https://sourceware.org/bugzilla/show_bug.cgi?id=24069 this code will typically wait4() a second time on the same process that was already wait4()'d a few lines above. While this used to be harmless/idempotent (when we assumed that the process already exited), this now causes a deadlock in the WIFSTOPPED case. The early (~2019) history of bug #24069 cautiously suggests to use WNOHANG instead of outright deleting the call. However, tests on the current version of Darwin (Big Sur) demonstrate that gdb runs just fine without a redundant call to wait4(), as would be expected. Notwithstanding the debatable value of conserving bug compatibility with an OS release that is more than a decade old, there is scant evidence of what that double-wait4() was supposed to achieve in the first place - A cursory investigation with `git blame` pinpoints commits bb00b29d7802 and a80b95ba67e2 from the 2008-2009 era, but fails to answer the "why" question conclusively. Co-Authored-By: Philippe Blain <levraiphilippeblain@gmail.com> Change-Id: Id4e4415d66d6ff6b3552b60d761693f17015e4a0
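The deadlock described in this commit message can be approximated outside GDB. What follows is only a rough sketch under stated assumptions: it uses an untraced child that stops itself with SIGSTOP, and `WUNTRACED` on the first wait, as a stand-in for the ptrace-induced stop that GDB sees on Darwin. The struct and function names are invented for the example and do not exist in GDB.

```c
#define _DEFAULT_SOURCE 1 /* assumption: exposes wait4() on strict glibc builds */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical record of what the two consecutive waits returned. */
struct double_wait_result {
    pid_t child;        /* pid of the forked child */
    pid_t first;        /* return value of the first wait4() */
    int   first_status; /* status reported by the first wait4() */
    pid_t second;       /* return value of the second, non-blocking wait4() */
};

/* Fork a child that stops itself, then wait on it twice in a row.
   Returns 0 on success, -1 if fork() failed. */
int double_wait_on_stopped_child(struct double_wait_result *out)
{
    int ignored = 0;
    pid_t pid;

    assert(out != NULL);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        raise(SIGSTOP);   /* stand-in for a stop under the debugger */
        _exit(0);
    }

    out->child = pid;
    /* First wait: consumes the "stopped" notification. */
    out->first = wait4(pid, &out->first_status, WUNTRACED, NULL);
    /* Second wait on the same pid: nothing new to report. With flags 0
       this call would block until the next state change; with WNOHANG
       it returns immediately. */
    out->second = wait4(pid, &ignored, WNOHANG, NULL);

    /* Clean up so the sketch itself leaves no zombie behind. */
    kill(pid, SIGKILL);
    wait4(pid, &ignored, 0, NULL);
    return 0;
}
```

In this setup the first call should report a stopped child and the second should return 0, meaning the child still exists but has no pending status. That is the situation in which a blocking second `wait4` never returns on its own, which matches the reasoning for removing the call rather than keeping it.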
Pushed: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=7ff917016a203cdff3074abfcf96c1553944af94 Should be fixed now, if not feel free to re-open.
Hey, just stopping by a little late (since you've already merged a patch for this) with information that might be helpful. I looked into this several years back but never finished a patch for it. My impression of the sporadic failures was that GDB multiplexes Mach messages through a port set and that they race to arrive: if they came in the right order, things would work, and if not, it would hang. It's definitely possible that there's more to the story or that I completely misdiagnosed the problem, but my impression (from way back, mind you) was that the second wait was actually useful and that the hang came from this race rather than from the extra call. Just my 2 cents :)