Bug 18749 - problems if whole process dies while (ptrace-) stopped
Status: NEW
Alias: None
Product: gdb
Classification: Unclassified
Component: gdb
Version: HEAD
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Duplicates: 19508
Depends on:
Blocks:
 
Reported: 2015-07-31 11:18 UTC by Pedro Alves
Modified: 2024-01-03 14:49 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Description Pedro Alves 2015-07-31 11:18:46 UTC
Between fetching a thread stop event (on Linux, after waitpid returns a stop) and fully handling it (e.g., evaluating a breakpoint condition all the way through and re-resuming if it evaluates to false), some other thread that is still running may cause the whole process to exit.  In that situation, the lower-level code that tries to access memory or registers usually throws an error (typically a perror_with_name/ESRCH).  Many code paths in GDB don't expect this, and the result is a broken debug session.
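
To make the failure mode concrete, here is a minimal sketch of the errno involved. It does not use ptrace at all: it only shows that once a process has exited and been reaped, a low-level request naming its pid fails with ESRCH, and that a caller can treat that as "the target is gone" rather than as an unexpected error. The helper names (target_is_gone, spawn_and_reap) are hypothetical and are not GDB functions.

```python
import errno
import os


def target_is_gone(pid):
    """Return True if the kernel reports ESRCH for pid, i.e. it no longer exists."""
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check only, nothing is delivered
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            return True
        raise  # any other failure (e.g. EPERM) is a different problem
    return False


def spawn_and_reap():
    """Fork a child that exits at once, reap it, and return its (now dead) pid."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without running parent cleanup
    os.waitpid(pid, 0)  # parent: reap the child so the pid is really gone
    return pid
```

A caller that checks for ESRCH explicitly, as target_is_gone does, can unwind cleanly; a caller that assumes the access always succeeds is the kind of code path this bug is about.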
Comment 1 Pedro Alves 2015-07-31 11:18:56 UTC
I'm adding a test to the testsuite that exposes several of these on at least native GNU/Linux:

E.g., on one run:

 [Thread 0x7ffff6fbe700 (LWP 1332) exited]
 Cannot access memory at address 0x400872

And on another:
 [Thread 0x7ffff3fb8700 (LWP 2161) exited]
 Cannot find user-level thread for LWP 2163: generic error
 (gdb) [Thread 0x7ffff2fb6700 (LWP 2163) exited]
 [Thread 0x7ffff47b9700 (LWP 2160) exited]
Comment 2 Sourceware Commits 2015-08-21 19:13:17 UTC
The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=465a859e0a991d3bfe8a9ee65a29a223c42e2ce7

commit 465a859e0a991d3bfe8a9ee65a29a223c42e2ce7
Author: Pedro Alves <palves@redhat.com>
Date:   Fri Aug 21 19:52:36 2015 +0100

    Fix gdbserver crash exposed by gdb.threads/process-dies-while-handling-bp.exp
    
    Running that test in a loop, I found a gdbserver core dump with the
    following back trace:
    
     Core was generated by `../gdbserver/gdbserver --once --multi :2346'.
     Program terminated with signal SIGSEGV, Segmentation fault.
     #0  0x0000000000406ab6 in inferior_regcache_data (inferior=0x0) at src/gdb/gdbserver/inferiors.c:236
     236       return inferior->regcache_data;
     (gdb) up
     #1  0x0000000000406d7f in get_thread_regcache (thread=0x0, fetch=1) at src/gdb/gdbserver/regcache.c:31
     31        regcache = (struct regcache *) inferior_regcache_data (thread);
     (gdb) bt
     #0  0x0000000000406ab6 in inferior_regcache_data (inferior=0x0) at src/gdb/gdbserver/inferiors.c:236
     #1  0x0000000000406d7f in get_thread_regcache (thread=0x0, fetch=1) at src/gdb/gdbserver/regcache.c:31
     #2  0x0000000000409271 in prepare_resume_reply (buf=0x20dd593 "", ptid=..., status=0x20edce0) at src/gdb/gdbserver/remote-utils.c:1147
     #3  0x000000000040ab0a in vstop_notif_reply (event=0x20edcc0, own_buf=0x20dd590 "T05") at src/gdb/gdbserver/server.c:183
     #4  0x0000000000426b38 in notif_write_event (notif=0x66e6c0 <notif_stop>, own_buf=0x20dd590 "T05") at src/gdb/gdbserver/notif.c:69
     #5  0x0000000000426c55 in handle_notif_ack (own_buf=0x20dd590 "T05", packet_len=8) at src/gdb/gdbserver/notif.c:113
     #6  0x000000000041118f in handle_v_requests (own_buf=0x20dd590 "T05", packet_len=8, new_packet_len=0x7fff742c77b8)
         at src/gdb/gdbserver/server.c:2862
     #7  0x0000000000413850 in process_serial_event () at src/gdb/gdbserver/server.c:4148
     #8  0x0000000000413945 in handle_serial_event (err=0, client_data=0x0) at src/gdb/gdbserver/server.c:4196
     #9  0x000000000041a1ef in handle_file_event (event_file_desc=5) at src/gdb/gdbserver/event-loop.c:429
     #10 0x00000000004199b6 in process_event () at src/gdb/gdbserver/event-loop.c:184
     #11 0x000000000041a735 in start_event_loop () at src/gdb/gdbserver/event-loop.c:547
     #12 0x00000000004123d2 in captured_main (argc=4, argv=0x7fff742c7ac8) at src/gdb/gdbserver/server.c:3562
     #13 0x000000000041252e in main (argc=4, argv=0x7fff742c7ac8) at src/gdb/gdbserver/server.c:3631
    
    Clearly this means that a thread pushed a stop reply in the event
    queue, and then before GDB consumed the event, the whole process died,
    along with its thread.  But the pending thread event was left
    dangling.  When GDB fetched that event, gdbserver looked up the
    corresponding thread, but found NULL; not expecting this, gdbserver
    crashed when it tried to read this thread's registers.
    
    gdb/gdbserver/
    2015-08-21  Pedro Alves  <palves@redhat.com>
    
    	PR gdb/18749
    	* inferiors.c (remove_thread): Discard any pending stop reply for
    	this thread.
    	* server.c (remove_all_on_match_pid): Rename to ...
    	(remove_all_on_match_ptid): ... this.  Work with a filter ptid
    	instead of a pid.
    	(discard_queued_stop_replies): Change parameter to a ptid.  Now
    	extern.
    	(handle_v_kill, kill_inferior_callback)
    	(process_serial_event): Adjust.
    	(captured_main): Call initialize_notif before starting the
    	program, thus before threads are created.
    	* server.h (discard_queued_stop_replies): Declare.
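
The idea behind this change, purging queued stop replies when their thread is removed, can be sketched with a toy model. This is only an illustration under stated assumptions: Ptid, WILDCARD, StopReplyQueue and ThreadTable are hypothetical stand-ins, not gdbserver's real types, and the real code is C rather than Python.

```python
from collections import deque, namedtuple

# Hypothetical stand-ins; gdbserver's real ptid and queue types differ.
Ptid = namedtuple("Ptid", ["pid", "lwp"])
WILDCARD = -1  # assumed marker meaning "any LWP of this process"


def ptid_matches(ptid, filt):
    """True if ptid is selected by filt (whole process when filt.lwp is WILDCARD)."""
    if ptid.pid != filt.pid:
        return False
    return filt.lwp == WILDCARD or ptid.lwp == filt.lwp


class StopReplyQueue:
    """Queue of (ptid, status) stop replies not yet fetched by the client."""

    def __init__(self):
        self._events = deque()

    def push(self, ptid, status):
        self._events.append((ptid, status))

    def discard(self, filt):
        """Drop every queued reply whose thread matches filt."""
        self._events = deque(e for e in self._events if not ptid_matches(e[0], filt))

    def pending(self):
        return [ptid for ptid, _ in self._events]


class ThreadTable:
    """Thread list that keeps the stop reply queue consistent on removal."""

    def __init__(self, queue):
        self._threads = set()
        self._queue = queue

    def add(self, ptid):
        self._threads.add(ptid)

    def remove(self, ptid):
        # Purge replies first so no queued event outlives its thread.
        self._queue.discard(ptid)
        self._threads.discard(ptid)
```

Filtering on a ptid rather than a bare pid lets the same routine serve both cases: removing a single thread, and discarding everything that belongs to a process that was killed or detached.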
Comment 3 Yao Qi 2016-01-21 17:03:23 UTC
*** Bug 19508 has been marked as a duplicate of this bug. ***
Comment 4 Sourceware Commits 2018-01-16 09:09:35 UTC
The master branch has been updated by Yao Qi <qiyao@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=9a70f35c8d872bb5542e98e34b8b044228dcb844

commit 9a70f35c8d872bb5542e98e34b8b044228dcb844
Author: Yao Qi <yao.qi@linaro.org>
Date:   Tue Jan 16 09:05:39 2018 +0000

    Mark register unavailable when PTRACE_PEEKUSER fails
    
    As described in PR 18749, GDB/GDBserver may get an error on accessing
    memory or registers because the thread may disappear.  However, some
    code paths don't expect the error.  This patch fixes the problem by
    marking the register unavailable when PTRACE_PEEKUSER fails, instead
    of throwing an error.
    
    gdb/gdbserver:
    
    2018-01-16  Yao Qi  <yao.qi@linaro.org>
    
    	PR gdb/18749
    	* linux-low.c (fetch_register): Call supply_register instead of
    	error.
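
The approach described in this commit can be illustrated with a small model: when the low-level read fails, the register is recorded as unavailable and the caller carries on, instead of an error propagating up through code that does not expect one. RegCache and the peek callback below are hypothetical; using None to mean "unavailable" is an assumption made for the sketch and is not taken from gdbserver's actual supply_register interface.

```python
import errno


class RegCache:
    """Toy register cache: each register is either a value or unavailable."""

    def __init__(self, nregs):
        self._values = [None] * nregs
        self._available = [False] * nregs

    def supply(self, regno, value):
        # Assumption for this sketch: value=None records "unavailable".
        self._values[regno] = value
        self._available[regno] = value is not None

    def is_available(self, regno):
        return self._available[regno]

    def value(self, regno):
        return self._values[regno]


def fetch_register(cache, regno, peek):
    """Read one register via peek(regno); on OSError mark it unavailable."""
    try:
        cache.supply(regno, peek(regno))
    except OSError:
        # The thread may have vanished (e.g. ESRCH); do not propagate.
        cache.supply(regno, None)
```

The trade-off is that a failed read is no longer loud: callers have to check availability before using a value, which is why this fits paths that already know how to present unavailable registers.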
Comment 5 Hannes Domani 2024-01-03 14:49:44 UTC
(In reply to Sourceware Commits from comment #4)
Can this be closed?