Between fetching a thread stop event (on Linux, after waitpid returns a stop) and fully handling that event (e.g., evaluating a breakpoint condition all the way through and re-resuming if it evaluates to false), some other thread that is still running may cause the whole process to exit. In that situation, the lower-level code that tries to access memory or registers usually throws an error (typically perror_with_name with ESRCH), and many code paths in GDB don't expect this, resulting in a broken debug session.
I'm adding a test to the testsuite that exposes several of these, at least on native GNU/Linux. E.g., on one run:

  [Thread 0x7ffff6fbe700 (LWP 1332) exited]
  Cannot access memory at address 0x400872

And on another:

  [Thread 0x7ffff3fb8700 (LWP 2161) exited]
  Cannot find user-level thread for LWP 2163: generic error
  (gdb) [Thread 0x7ffff2fb6700 (LWP 2163) exited]
  [Thread 0x7ffff47b9700 (LWP 2160) exited]
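To make the failure mode concrete, here is a minimal sketch of a stop handler that tolerates the thread vanishing under it. It is not GDB code: `handle_breakpoint_stop`, `read_memory_fn`, the `stop_result` values and the `fake_read_*` helpers are all hypothetical names, and the accessor is assumed to return -1 with errno set to ESRCH once the process is gone.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical outcomes of handling one stop event.  */
enum stop_result { STOP_REPORT, STOP_RESUME, STOP_THREAD_GONE };

/* Hypothetical low-level accessor: returns 0 on success, or -1 with
   errno set (ESRCH when the thread no longer exists).  */
typedef int (*read_memory_fn) (int lwp, unsigned long addr, long *out);

/* Evaluate a breakpoint condition stored at COND_ADDR.  If the thread
   disappeared in the meantime, say so instead of raising an error the
   caller is not prepared for.  */
enum stop_result
handle_breakpoint_stop (int lwp, unsigned long cond_addr,
                        read_memory_fn read_memory)
{
  long value = 0;

  assert (read_memory != NULL);
  errno = 0;
  if (read_memory (lwp, cond_addr, &value) != 0)
    return errno == ESRCH ? STOP_THREAD_GONE : STOP_REPORT;

  /* Condition true: report the stop.  Condition false: re-resume.  */
  return value != 0 ? STOP_REPORT : STOP_RESUME;
}

/* Stand-ins for the real accessor, for illustration only.  */
int
fake_read_true (int lwp, unsigned long addr, long *out)
{
  (void) lwp; (void) addr;
  *out = 1;
  return 0;
}

int
fake_read_false (int lwp, unsigned long addr, long *out)
{
  (void) lwp; (void) addr;
  *out = 0;
  return 0;
}

int
fake_read_gone (int lwp, unsigned long addr, long *out)
{
  (void) lwp; (void) addr; (void) out;
  errno = ESRCH;   /* What the kernel reports once the LWP is gone.  */
  return -1;
}
```

Calling `handle_breakpoint_stop (1332, 0x400872, fake_read_gone)` yields `STOP_THREAD_GONE`, which a caller could treat as "drop this event and wait for the exit status" rather than as a fatal error.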
The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=465a859e0a991d3bfe8a9ee65a29a223c42e2ce7

commit 465a859e0a991d3bfe8a9ee65a29a223c42e2ce7
Author: Pedro Alves <palves@redhat.com>
Date:   Fri Aug 21 19:52:36 2015 +0100

    Fix gdbserver crash exposed by gdb.threads/process-dies-while-handling-bp.exp

    Running that test in a loop, I found a gdbserver core dump with the
    following back trace:

     Core was generated by `../gdbserver/gdbserver --once --multi :2346'.
     Program terminated with signal SIGSEGV, Segmentation fault.
     #0  0x0000000000406ab6 in inferior_regcache_data (inferior=0x0) at src/gdb/gdbserver/inferiors.c:236
     236       return inferior->regcache_data;
     (gdb) up
     #1  0x0000000000406d7f in get_thread_regcache (thread=0x0, fetch=1) at src/gdb/gdbserver/regcache.c:31
     31        regcache = (struct regcache *) inferior_regcache_data (thread);
     (gdb) bt
     #0  0x0000000000406ab6 in inferior_regcache_data (inferior=0x0) at src/gdb/gdbserver/inferiors.c:236
     #1  0x0000000000406d7f in get_thread_regcache (thread=0x0, fetch=1) at src/gdb/gdbserver/regcache.c:31
     #2  0x0000000000409271 in prepare_resume_reply (buf=0x20dd593 "", ptid=..., status=0x20edce0) at src/gdb/gdbserver/remote-utils.c:1147
     #3  0x000000000040ab0a in vstop_notif_reply (event=0x20edcc0, own_buf=0x20dd590 "T05") at src/gdb/gdbserver/server.c:183
     #4  0x0000000000426b38 in notif_write_event (notif=0x66e6c0 <notif_stop>, own_buf=0x20dd590 "T05") at src/gdb/gdbserver/notif.c:69
     #5  0x0000000000426c55 in handle_notif_ack (own_buf=0x20dd590 "T05", packet_len=8) at src/gdb/gdbserver/notif.c:113
     #6  0x000000000041118f in handle_v_requests (own_buf=0x20dd590 "T05", packet_len=8, new_packet_len=0x7fff742c77b8) at src/gdb/gdbserver/server.c:2862
     #7  0x0000000000413850 in process_serial_event () at src/gdb/gdbserver/server.c:4148
     #8  0x0000000000413945 in handle_serial_event (err=0, client_data=0x0) at src/gdb/gdbserver/server.c:4196
     #9  0x000000000041a1ef in handle_file_event (event_file_desc=5) at src/gdb/gdbserver/event-loop.c:429
     #10 0x00000000004199b6 in process_event () at src/gdb/gdbserver/event-loop.c:184
     #11 0x000000000041a735 in start_event_loop () at src/gdb/gdbserver/event-loop.c:547
     #12 0x00000000004123d2 in captured_main (argc=4, argv=0x7fff742c7ac8) at src/gdb/gdbserver/server.c:3562
     #13 0x000000000041252e in main (argc=4, argv=0x7fff742c7ac8) at src/gdb/gdbserver/server.c:3631

    Clearly this means that a thread pushed a stop reply in the event
    queue, and then before GDB consumed the event, the whole process died,
    along with its thread.  But the pending thread event was left
    dangling.  When GDB fetched that event, gdbserver looked up the
    corresponding thread, but found NULL; not expecting this, gdbserver
    crashes when it tries to read this thread's registers.

    gdb/gdbserver/
    2015-08-21  Pedro Alves  <palves@redhat.com>

    	PR gdb/18749
    	* inferiors.c (remove_thread): Discard any pending stop reply for
    	this thread.
    	* server.c (remove_all_on_match_pid): Rename to ...
    	(remove_all_on_match_ptid): ... this.  Work with a filter ptid
    	instead of a pid.
    	(discard_queued_stop_replies): Change parameter to a ptid.  Now
    	extern.
    	(handle_v_kill, kill_inferior_callback)
    	(process_serial_event): Adjust.
    	(captured_main): Call initialize_notif before starting the
    	program, thus before threads are created.
    	* server.h (discard_queued_stop_replies): Declare.
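The idea behind this fix, dropping queued stop replies that match a filter ptid when a thread goes away, can be sketched as below. This is a simplified model, not the gdbserver implementation: `ptid_sketch`, `stop_queue`, `queue_stop_reply` and `discard_queued_sketch` are hypothetical, and the wildcard convention (pid -1 matches everything, lwp 0 matches any thread of the pid) is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified thread identifier (hypothetical).  */
struct ptid_sketch { int pid; long lwp; };

/* One pending stop reply waiting to be fetched by the client.  */
struct stop_reply { struct ptid_sketch ptid; int signo; };

#define MAX_PENDING 16
struct stop_queue { struct stop_reply items[MAX_PENDING]; size_t count; };

/* Assumed filter convention: pid == -1 matches everything, lwp == 0
   matches every thread of the given pid.  */
static bool
ptid_matches (struct ptid_sketch p, struct ptid_sketch filter)
{
  if (filter.pid == -1)
    return true;
  if (filter.pid != p.pid)
    return false;
  return filter.lwp == 0 || filter.lwp == p.lwp;
}

/* Append a stop reply; returns false if the queue is full.  */
bool
queue_stop_reply (struct stop_queue *q, struct ptid_sketch ptid, int signo)
{
  assert (q != NULL);
  if (q->count == MAX_PENDING)
    return false;
  q->items[q->count].ptid = ptid;
  q->items[q->count].signo = signo;
  q->count++;
  return true;
}

/* Drop every queued reply matching FILTER, keeping the rest in order.
   Returns the number of replies discarded.  */
size_t
discard_queued_sketch (struct stop_queue *q, struct ptid_sketch filter)
{
  size_t kept = 0, removed;

  assert (q != NULL);
  for (size_t i = 0; i < q->count; i++)
    if (!ptid_matches (q->items[i].ptid, filter))
      q->items[kept++] = q->items[i];

  removed = q->count - kept;
  q->count = kept;
  return removed;
}
```

With a filter naming a single thread, the thread-removal path can purge only that thread's reply; with a pid-only filter, the kill and detach paths can purge the whole process, so no dangling event is left for a later lookup to trip over.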
*** Bug 19508 has been marked as a duplicate of this bug. ***
The master branch has been updated by Yao Qi <qiyao@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=9a70f35c8d872bb5542e98e34b8b044228dcb844

commit 9a70f35c8d872bb5542e98e34b8b044228dcb844
Author: Yao Qi <yao.qi@linaro.org>
Date:   Tue Jan 16 09:05:39 2018 +0000

    Mark register unavailable when PTRACE_PEEKUSER fails

    As described in PR 18749, GDB/GDBserver may get an error when accessing
    memory or registers because the thread may disappear.  However, some
    code paths do not expect the error.  This patch fixes this problem by
    marking the register unavailable when PTRACE_PEEKUSER fails instead
    of throwing an error.

    gdb/gdbserver:

    2018-01-16  Yao Qi  <yao.qi@linaro.org>

    	PR gdb/18749
    	* linux-low.c (fetch_register): Call supply_register instead of
    	error.
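A rough sketch of the approach described in this commit message: when the peek fails, record the register as unavailable rather than raising an error. The types and names here (`regcache_sketch`, `fetch_register_sketch`, `peek_fn`, the `fake_peek_*` helpers) are hypothetical and much simpler than the real gdbserver regcache; the sketch only illustrates the control flow.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-register state.  */
enum reg_status { REG_STATUS_UNKNOWN, REG_STATUS_VALID, REG_STATUS_UNAVAILABLE };

struct reg_slot { long value; enum reg_status status; };

#define NUM_REGS 4
struct regcache_sketch { struct reg_slot regs[NUM_REGS]; };

/* Hypothetical stand-in for a PTRACE_PEEKUSER wrapper: returns 0 on
   success, or -1 with errno set on failure.  */
typedef int (*peek_fn) (int lwp, int regno, long *out);

/* Fetch one register into the cache.  A failed peek marks the slot
   unavailable instead of aborting the whole operation.  */
void
fetch_register_sketch (struct regcache_sketch *rc, int lwp, int regno,
                       peek_fn peek)
{
  long value = 0;

  assert (rc != NULL && peek != NULL);
  assert (regno >= 0 && regno < NUM_REGS);

  if (peek (lwp, regno, &value) != 0)
    {
      /* The thread may have disappeared; callers see "unavailable".  */
      rc->regs[regno].value = 0;
      rc->regs[regno].status = REG_STATUS_UNAVAILABLE;
      return;
    }

  rc->regs[regno].value = value;
  rc->regs[regno].status = REG_STATUS_VALID;
}

/* Stand-ins for the real accessor, for illustration only.  */
int
fake_peek_ok (int lwp, int regno, long *out)
{
  (void) lwp;
  *out = 100 + regno;
  return 0;
}

int
fake_peek_gone (int lwp, int regno, long *out)
{
  (void) lwp; (void) regno; (void) out;
  errno = ESRCH;
  return -1;
}
```

The benefit of this shape is that code paths which never expected an exception from a register read keep working: they see an unavailable register and can continue until the process exit is reported.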
(In reply to Sourceware Commits from comment #4)
> The master branch has been updated by Yao Qi <qiyao@sourceware.org>:
> [...]

Can this be closed?