[PATCH v3 5/5] gdb: better handling of 'S' packets

Andrew Burgess andrew.burgess@embecosm.com
Fri Jan 8 18:19:45 GMT 2021


* Simon Marchi <simon.marchi@polymtl.ca> [2021-01-07 23:17:34 -0500]:

> From: Andrew Burgess <andrew.burgess@embecosm.com>
> 
> New in v3 (Simon Marchi): rely on the resume state saved in
> remote_thread_info to find the first non-exited, resumed thread.  This
> simplifies the code a bit, as we don't need to fall back on the first
> non-exited thread on initial connection.
> 
> This commit builds on work started in the following two commits:
> 
>   commit 24ed6739b699f329c2c45aedee5f8c7d2f54e493
>   Date:   Thu Jan 30 14:35:40 2020 +0000
> 
>       gdb/remote: Restore support for 'S' stop reply packet
> 
>   commit cada5fc921e39a1945c422eea055c8b326d8d353
>   Date:   Wed Mar 11 12:30:13 2020 +0000
> 
>       gdb: Handle W and X remote packets without giving a warning
> 
> This is related to how GDB handles remote targets that send back 'S'
> packets.
> 
> In the first of the above commits we fixed GDB's ability to handle a
> single process, single threaded target that sends back 'S' packets.
> Although the 'T' packet would always be preferred to 'S' these days,
> there's nothing really wrong with 'S' for this situation.
> 
> The second commit above fixed an oversight in the first commit, a
> single-process, multi-threaded target can send back a process wide
> event, for example the process exited event 'W' without including a
> process-id, this also is fine as there is no ambiguity in this case.
> 
> In PR gdb/26819 we run into yet another problem with the above
> commits.  In this case we have a single process with two threads, GDB
> hits a breakpoint in thread 2 and then performs a stepi:
> 
>   (gdb) b main
>   Breakpoint 1 at 0x1212340830: file infinite_loop.S, line 10.
>   (gdb) c
>   Continuing.
> 
>   Thread 2 hit Breakpoint 1, main () at infinite_loop.S:10
>   10    in infinite_loop.S
>   (gdb) set debug remote 1
>   (gdb) stepi
>   Sending packet: $vCont;s:2#24...Packet received: S05
>   ../binutils-gdb/gdb/infrun.c:5807: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
> 
> What happens in this case is that on the RISC-V target displaced
> stepping is not supported, so when the stepi is issued GDB steps just
> thread 2.  As only a single thread was set running the target decides
> that is can get away with sending back an 'S' packet without a
> thread-id.  GDB then associates the stop with thread 1 (the first
> non-exited thread), but as thread 1 was not previously set executing
> the assertion seen above triggers.
> 
> As an aside I am surprised that the target sends pack 'S' in this
> situation.  The target is happy to send back 'T' (including thread-id)
> when multiple threads are set running, so (to me) it would seem easier
> to just always use the 'T' packet when multiple threads are in use.
> However, the target only uses 'T' when multiple threads are actually
> executing, otherwise an 'S' packet it used.
> 
> Still, when looking at the above situation we can see that GDB should
> be able to understand which thread the 'S' reply is referring too.
> 
> The problem is that is that in commit 24ed6739b699 (above) when a stop
> reply comes in with no thread-id we look for the first non-exited
> thread and select that as the thread the stop applies too.
> 
> What we should really do is select the first non-exited, resumed thread,
> and associate the stop event with this thread.  In the above example
> both thread 1 and 2 are non-exited, but only thread 2 is resumed, so
> this is what we should use.
> 
> There's a test for this issue included which works with stock
> gdbserver by disabling use of the 'T' packet, and enabling
> 'scheduler-locking' within GDB so only one thread is set running.
> 
> gdb/ChangeLog:
> 
> 	PR gdb/26819
> 	* remote.c
> 	(remote_target::select_thread_for_ambiguous_stop_reply): New
> 	member function.
> 	(remote_target::process_stop_reply): Call
> 	select_thread_for_ambiguous_stop_reply.
> 
> gdb/testsuite/ChangeLog:
> 
> 	PR gdb/26819
> 	* gdb.server/stop-reply-no-thread-multi.c: New file.
> 	* gdb.server/stop-reply-no-thread-multi.exp: New file.

Simon,

Thanks for integrating this with your series.  I like that this patch
has gotten even simpler (and clearer) now.

I had a few minor nits, but otherwise I'm happy with this.

Thanks,
Andrew

> 
> Change-Id: I9b49d76c2a99063dcc76203fa0f5270a72825d15
> ---
>  gdb/remote.c                                  | 153 +++++++++++-------
>  .../gdb.server/stop-reply-no-thread-multi.c   |  77 +++++++++
>  .../gdb.server/stop-reply-no-thread-multi.exp | 136 ++++++++++++++++
>  3 files changed, 312 insertions(+), 54 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
>  create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
> 
> diff --git a/gdb/remote.c b/gdb/remote.c
> index be53886c1837..f12a86f66a14 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c
> @@ -747,6 +747,9 @@ class remote_target : public process_stratum_target
>    ptid_t process_stop_reply (struct stop_reply *stop_reply,
>  			     target_waitstatus *status);
>  
> +  ptid_t select_thread_for_ambiguous_stop_reply
> +    (const struct target_waitstatus *status);
> +
>    void remote_notice_new_inferior (ptid_t currthread, int executing);
>  
>    void process_initial_stop_replies (int from_tty);
> @@ -7796,75 +7799,117 @@ remote_notif_get_pending_events (remote_target *remote, notif_client *nc)
>    remote->remote_notif_get_pending_events (nc);
>  }
>  
> -/* Called when it is decided that STOP_REPLY holds the info of the
> -   event that is to be returned to the core.  This function always
> -   destroys STOP_REPLY.  */
> +/* Called from process_stop_reply when the stop packet we are responding
> +   to didn't include a process-id or thread-id.  STATUS is the stop event
> +   we are responding to.
> +
> +   It is the task of this function to select a suitable thread (or process)
> +   and return its ptid, this is the thread (or process) we will assume the
> +   stop event came from.
> +
> +   In some cases there isn't really any choice about which thread (or
> +   process) is selected, a basic remote with a single process containing a
> +   single thread might choose not to send any process-id or thread-id in
> +   its stop packets, this function will select and return the one and only
> +   thread.
> +
> +   However, if a target supports multiple threads (or processes) and still
> +   doesn't include a thread-id (or process-id) in its stop packet then
> +   first, this is a badly behaving target, and second, we're going to have
> +   to select a thread (or process) at random and use that.  This function
> +   will print a warning to the user if it detects that there is the
> +   possibility that GDB is guessing which thread (or process) to
> +   report.  */
>  
>  ptid_t
> -remote_target::process_stop_reply (struct stop_reply *stop_reply,
> -				   struct target_waitstatus *status)
> +remote_target::select_thread_for_ambiguous_stop_reply
> +  (const struct target_waitstatus *status)
>  {
> -  ptid_t ptid;
> +  /* Some stop events apply to all threads in an inferior, while others
> +     only apply to a single thread.  */
> +  bool is_stop_for_all_threads
> +    = (status->kind == TARGET_WAITKIND_EXITED
> +       || status->kind == TARGET_WAITKIND_SIGNALLED);
>  
> -  *status = stop_reply->ws;
> -  ptid = stop_reply->ptid;
> +  thread_info *first_resumed_thread = nullptr;
> +  bool multiple_resumed_thread = false;

This might be better named 'multiple_resumed_threads'.  Apologies if
this was just copied from my original code.

>  
> -  /* If no thread/process was reported by the stub then use the first
> -     non-exited thread in the current target.  */
> -  if (ptid == null_ptid)
> +  /* Consider all non-exited threads of the target, find the first resumed
> +     one.  */
> +  for (thread_info *thr : all_non_exited_threads (this))
>      {
> -      /* Some stop events apply to all threads in an inferior, while others
> -	 only apply to a single thread.  */
> -      bool is_stop_for_all_threads
> -	= (status->kind == TARGET_WAITKIND_EXITED
> -	   || status->kind == TARGET_WAITKIND_SIGNALLED);
> +      remote_thread_info *remote_thr =get_remote_thread_info (thr);

Missing space after '='.

> +
> +      if (remote_thr->resume_state () != resume_state::RESUMED)
> +	continue;
> +
> +      if (first_resumed_thread == nullptr)
> +	first_resumed_thread = thr;
> +      else if (!is_stop_for_all_threads
> +	       || first_resumed_thread->ptid.pid () != thr->ptid.pid ())
> +	multiple_resumed_thread = true;
> +    }
>  
> -      for (thread_info *thr : all_non_exited_threads (this))
> +  gdb_assert (first_resumed_thread != nullptr);
> +
> +  /* Warn if the remote target is sending ambiguous stop replies.  */
> +  if (multiple_resumed_thread)
> +    {
> +      static bool warned = false;
> +
> +      if (!warned)
>  	{
> -	  if (ptid != null_ptid
> -	      && (!is_stop_for_all_threads
> -		  || ptid.pid () != thr->ptid.pid ()))
> -	    {
> -	      static bool warned = false;
> +	  /* If you are seeing this warning then the remote target has
> +	     stopped without specifying a thread-id, but the target
> +	     does have multiple threads (or inferiors), and so GDB is
> +	     having to guess which thread stopped.
>  
> -	      if (!warned)
> -		{
> -		  /* If you are seeing this warning then the remote target
> -		     has stopped without specifying a thread-id, but the
> -		     target does have multiple threads (or inferiors), and
> -		     so GDB is having to guess which thread stopped.
> -
> -		     Examples of what might cause this are the target
> -		     sending and 'S' stop packet, or a 'T' stop packet and
> -		     not including a thread-id.
> -
> -		     Additionally, the target might send a 'W' or 'X
> -		     packet without including a process-id, when the target
> -		     has multiple running inferiors.  */
> -		  if (is_stop_for_all_threads)
> -		    warning (_("multi-inferior target stopped without "
> -			       "sending a process-id, using first "
> -			       "non-exited inferior"));
> -		  else
> -		    warning (_("multi-threaded target stopped without "
> -			       "sending a thread-id, using first "
> -			       "non-exited thread"));
> -		  warned = true;
> -		}
> -	      break;
> -	    }
> +	     Examples of what might cause this are the target sending
> +	     and 'S' stop packet, or a 'T' stop packet and not
> +	     including a thread-id.
>  
> -	  /* If this is a stop for all threads then don't use a particular
> -	     threads ptid, instead create a new ptid where only the pid
> -	     field is set.  */
> +	     Additionally, the target might send a 'W' or 'X packet
> +	     without including a process-id, when the target has
> +	     multiple running inferiors.  */
>  	  if (is_stop_for_all_threads)
> -	    ptid = ptid_t (thr->ptid.pid ());
> +	    warning (_("multi-inferior target stopped without "
> +		       "sending a process-id, using first "
> +		       "non-exited inferior"));
>  	  else
> -	    ptid = thr->ptid;
> +	    warning (_("multi-threaded target stopped without "
> +		       "sending a thread-id, using first "
> +		       "non-exited thread"));
> +	  warned = true;
>  	}
> -      gdb_assert (ptid != null_ptid);
>      }
>  
> +  /* If this is a stop for all threads then don't use a particular threads
> +     ptid, instead create a new ptid where only the pid field is set.  */
> +  if (is_stop_for_all_threads)
> +    return ptid_t (first_resumed_thread->ptid.pid ());
> +  else
> +    return first_resumed_thread->ptid;
> +}
> +
> +/* Called when it is decided that STOP_REPLY holds the info of the
> +   event that is to be returned to the core.  This function always
> +   destroys STOP_REPLY.  */
> +
> +ptid_t
> +remote_target::process_stop_reply (struct stop_reply *stop_reply,
> +				   struct target_waitstatus *status)
> +{
> +  ptid_t ptid;

Shouldn't this just be inline below?

> +
> +  *status = stop_reply->ws;
> +  ptid = stop_reply->ptid;
> +
> +  /* If no thread/process was reported by the stub then select a suitable
> +     thread/process.  */
> +  if (ptid == null_ptid)
> +    ptid = select_thread_for_ambiguous_stop_reply (status);
> +  gdb_assert (ptid != null_ptid);
> +
>    if (status->kind != TARGET_WAITKIND_EXITED
>        && status->kind != TARGET_WAITKIND_SIGNALLED
>        && status->kind != TARGET_WAITKIND_NO_RESUMED)
> diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
> new file mode 100644
> index 000000000000..01f6d3c07ff4
> --- /dev/null
> +++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
> @@ -0,0 +1,77 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2020 Free Software Foundation, Inc.

The copyright year will need updating now.

> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include <stdlib.h>
> +#include <pthread.h>
> +#include <unistd.h>
> +
> +volatile int worker_blocked = 1;
> +volatile int main_blocked = 1;
> +
> +void
> +unlock_worker (void)
> +{
> +  worker_blocked = 0;
> +}
> +
> +void
> +unlock_main (void)
> +{
> +  main_blocked = 0;
> +}
> +
> +void
> +breakpt (void)
> +{
> +  /* Nothing.  */
> +}
> +
> +static void *
> +worker (void *data)
> +{
> +  unlock_main ();
> +
> +  while (worker_blocked)
> +    ;
> +
> +  breakpt ();
> +
> +  return NULL;
> +}
> +
> +int
> +main (void)
> +{
> +  pthread_t thr;
> +  void *retval;
> +
> +  /* Ensure the test doesn't run forever.  */
> +  alarm (99);
> +
> +  if (pthread_create (&thr, NULL, worker, NULL) != 0)
> +    abort ();
> +
> +  while (main_blocked)
> +    ;
> +
> +  unlock_worker ();
> +
> +  if (pthread_join (thr, &retval) != 0)
> +    abort ();
> +
> +  return 0;
> +}
> diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
> new file mode 100644
> index 000000000000..f394ca8ed0c4
> --- /dev/null
> +++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
> @@ -0,0 +1,136 @@
> +# This testcase is part of GDB, the GNU debugger.
> +#
> +# Copyright 2020 Free Software Foundation, Inc.

And again.

> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Test how GDB handles the case where a target either doesn't use 'T'
> +# packets at all or doesn't include a thread-id in a 'T' packet, AND,
> +# where the test program contains multiple threads.
> +#
> +# In general if multiple threads are executing and the target doesn't
> +# include a thread-id in its stop response then GDB will not be able
> +# to correctly figure out which thread the stop applies to.
> +#
> +# However, this test covers a very specific case, there are multiple
> +# threads but only a single thread is actually executing.  So, when
> +# the stop comes from the target, without a thread-id, GDB should be
> +# able to correctly figure out which thread has stopped.
> +
> +load_lib gdbserver-support.exp
> +
> +if { [skip_gdbserver_tests] } {
> +    verbose "skipping gdbserver tests"
> +    return -1
> +}
> +
> +standard_testfile
> +if { [build_executable "failed to prepare" $testfile $srcfile {debug pthreads}] == -1 } {
> +    return -1
> +}
> +
> +# Run the tests with different features of GDBserver disabled.
> +proc run_test { disable_feature } {
> +    global binfile gdb_prompt decimal hex
> +
> +    clean_restart ${binfile}
> +
> +    # Make sure we're disconnected, in case we're testing with an
> +    # extended-remote board, therefore already connected.
> +    gdb_test "disconnect" ".*"
> +
> +    set packet_arg ""
> +    if { $disable_feature != "" } {
> +	set packet_arg "--disable-packet=${disable_feature}"
> +    }
> +    set res [gdbserver_start $packet_arg $binfile]
> +    set gdbserver_protocol [lindex $res 0]
> +    set gdbserver_gdbport [lindex $res 1]
> +
> +    # Disable XML-based thread listing, and multi-process extensions.
> +    gdb_test_no_output "set remote threads-packet off"
> +    gdb_test_no_output "set remote multiprocess-feature-packet off"
> +
> +    set res [gdb_target_cmd $gdbserver_protocol $gdbserver_gdbport]
> +    if ![gdb_assert {$res == 0} "connect"] {
> +	return
> +    }
> +
> +    # There should be only one thread listed at this point.
> +    gdb_test_multiple "info threads" "" {
> +	-re "2 Thread.*$gdb_prompt $" {
> +	    fail $gdb_test_name
> +	}
> +	-re "has terminated.*$gdb_prompt $" {
> +	    fail $gdb_test_name
> +	}
> +	-re "\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n$gdb_prompt $" {
> +	    pass $gdb_test_name
> +	}
> +    }
> +
> +    gdb_breakpoint "unlock_worker"
> +    gdb_continue_to_breakpoint "run to unlock_worker"
> +
> +    # There should be two threads at this point with thread 1 selected.
> +    gdb_test "info threads" \
> +	"\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n  2\[\t \]*Thread\[^\r\n\]*" \
> +	"second thread should now exist"
> +
> +    # Switch threads.
> +    gdb_test "thread 2" ".*" "switch to second thread"
> +
> +    # Now turn on scheduler-locking so that when we step thread 2 only
> +    # that one thread will be set running.
> +    gdb_test_no_output "set scheduler-locking on"
> +
> +    # Single step thread 2.  Only the one thread will step.  When the
> +    # thread stops, if the stop packet doesn't include a thread-id
> +    # then GDB should still understand which thread stopped.
> +    gdb_test_multiple "stepi" "" {
> +	-re "Thread 1 received signal SIGTRAP" {
> +	    fail $gdb_test_name
> +	}
> +	-re -wrap "$hex.*$decimal.*while \\(worker_blocked\\).*" {
> +	    pass $gdb_test_name
> +	}
> +    }
> +
> +    # Check that thread 2 is still selected.
> +    gdb_test "info threads" \
> +	"  1\[\t \]*Thread\[^\r\n\]*\r\n\\\* 2\[\t \]*Thread\[^\r\n\]*" \
> +	"second thread should still be selected after stepi"
> +
> +    # Turn scheduler locking off again so that when we continue all
> +    # threads will be set running.
> +    gdb_test_no_output "set scheduler-locking off"
> +
> +    # Continue until exit.  The server sends a 'W' with no PID.
> +    # Bad GDB gave an error like below when target is nonstop:
> +    #  (gdb) c
> +    #  Continuing.
> +    #  No process or thread specified in stop reply: W00
> +    gdb_continue_to_end "" continue 1
> +}
> +
> +# Disable different features within gdbserver:
> +#
> +# Tthread: Start GDBserver, with ";thread:NNN" in T stop replies disabled,
> +#          emulating old gdbservers when debugging single-threaded programs.
> +#
> +# T: Start GDBserver with the entire 'T' stop reply packet disabled,
> +#    GDBserver will instead send the 'S' stop reply.
> +foreach_with_prefix to_disable { "" Tthread T } {
> +    run_test $to_disable
> +}
> -- 
> 2.29.2
> 


More information about the Gdb-patches mailing list