[PATCH v5 5/5] gdb/infrun: handle already-exited threads when attempting to stop

Aktemur, Tankut Baris tankut.baris.aktemur@intel.com
Mon Apr 20 20:43:40 GMT 2020


On Thursday, April 16, 2020 7:07 PM, Pedro Alves wrote:
> On 4/6/20 4:45 PM, Tankut Baris Aktemur via Gdb-patches wrote:
> > In stop_all_threads, GDB sends signals to other threads in an attempt
> > to stop them.  While in a typical scenario the expected wait status is
> > TARGET_WAITKIND_STOPPED, it is possible that the thread GDB attempted
> > to stop has already terminated.  If so, a waitstatus other than
> > TARGET_WAITKIND_STOPPED would be received.  Handle this case
> > appropriately.
> >
> > If a wait status that denotes thread termination is ignored, GDB goes
> > into an infinite loop in stop_all_threads.
> > E.g.:
> >
> >   $ gdb ./a.out
> >   (gdb) start
> >   ...
> >   (gdb) add-inferior -exec ./a.out
> >   ...
> >   (gdb) inferior 2
> >   ...
> >   (gdb) start
> >   ...
> >   (gdb) set schedule-multiple on
> >   (gdb) set debug infrun 2
> >   (gdb) continue
> >   Continuing.
> >   infrun: clear_proceed_status_thread (process 10449)
> >   infrun: clear_proceed_status_thread (process 10453)
> >   infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
> >   infrun: proceed: resuming process 10449
> >   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10449] at 0x55555555514e
> >   infrun: infrun_async(1)
> >   infrun: prepare_to_wait
> >   infrun: proceed: resuming process 10453
> >   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10453] at 0x55555555514e
> >   infrun: prepare_to_wait
> >   infrun: Found 2 inferiors, starting at #0
> >   infrun: target_wait (-1.0.0, status) =
> >   infrun:   10449.10449.0 [process 10449],
> >   infrun:   status->kind = exited, status = 0
> >   infrun: handle_inferior_event status->kind = exited, status = 0
> >   [Inferior 1 (process 10449) exited normally]
> >   infrun: stop_waiting
> >   infrun: stop_all_threads
> >   infrun: stop_all_threads, pass=0, iterations=0
> >   infrun:   process 10453 executing, need stop
> >   infrun: target_wait (-1.0.0, status) =
> >   infrun:   10453.10453.0 [process 10453],
> >   infrun:   status->kind = exited, status = 0
> >   infrun: stop_all_threads status->kind = exited, status = 0 process 10453
> >   infrun:   process 10453 executing, already stopping
> >   infrun: target_wait (-1.0.0, status) =
> >   infrun:   -1.0.0 [process -1],
> >   infrun:   status->kind = no-resumed
> >   infrun: infrun_async(0)
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   ...
> >
> > And this polling goes on forever.  This patch prevents the infinite
> > looping behavior.  For the same scenario above, we obtain the
> > following behavior:
> >
> >   ...
> >   (gdb) continue
> >   Continuing.
> >   infrun: clear_proceed_status_thread (process 31229)
> >   infrun: clear_proceed_status_thread (process 31233)
> >   infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
> >   infrun: proceed: resuming process 31229
> >   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31229] at 0x55555555514e
> >   infrun: infrun_async(1)
> >   infrun: prepare_to_wait
> >   infrun: proceed: resuming process 31233
> >   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31233] at 0x55555555514e
> >   infrun: prepare_to_wait
> >   infrun: Found 2 inferiors, starting at #0
> >   infrun: target_wait (-1.0.0, status) =
> >   infrun:   31229.31229.0 [process 31229],
> >   infrun:   status->kind = exited, status = 0
> >   infrun: handle_inferior_event status->kind = exited, status = 0
> >   [Inferior 1 (process 31229) exited normally]
> >   infrun: stop_waiting
> >   infrun: stop_all_threads
> >   infrun: stop_all_threads, pass=0, iterations=0
> >   infrun:   process 31233 executing, need stop
> >   infrun: target_wait (-1.0.0, status) =
> >   infrun:   31233.31233.0 [process 31233],
> >   infrun:   status->kind = exited, status = 0
> >   infrun: stop_all_threads status->kind = exited, status = 0 process 31233
> >   infrun: saving status status->kind = exited, status = 0 for 31233.31233.0
> >   infrun:   process 31233 not executing
> >   infrun: stop_all_threads, pass=1, iterations=1
> >   infrun:   process 31233 not executing
> >   infrun: stop_all_threads done
> >   (gdb)
> >
> > The exit event from Inferior 1 is received and shown to the user.
> > The exit event from Inferior 2 is not displayed, but kept pending.
> >
> >   (gdb) info inferiors
> >     Num  Description       Connection           Executable
> >   * 1    <null>                                 a.out
> >     2    process 31233     1 (native)           a.out
> >   (gdb) inferior 2
> >   [Switching to inferior 2 [process 31233] (a.out)]
> >   [Switching to thread 2.1 (process 31233)]
> >   Couldn't get registers: No such process.
> >   (gdb)
> >
> > Note the "Couldn't get regsiters" error above.  As of writing this patch,
> > GDB normally goes into another hang, infinitely trying register access.
> > A patch such as
> >
> >   https://sourceware.org/pipermail/gdb-patches/2020-March/166982.html
> >
> 
> I've reviewed that patch today.  Once that is in, remember to update
> this commit log.

Done.
 
> > eliminates the infinite polling.  Resuming Inferior 2 processes the
> > pending exit event.
> >
> >   (gdb) continue
> >   Continuing.
> >   infrun: clear_proceed_status_thread (process 31233)
> >   infrun: clear_proceed_status_thread: thread process 31233 has pending wait status status->kind = exited, status
> = 0 (currently_stepping=0).
> >   infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
> >   infrun: proceed: resuming process 31233
> >   infrun: resume: thread process 31233 has pending wait status status->kind = exited, status = 0
> (currently_stepping=0).
> >   infrun: prepare_to_wait
> >   infrun: Using pending wait status status->kind = exited, status = 0 for process 31233.
> >   infrun: target_wait (-1.0.0, status) =
> >   infrun:   31233.31233.0 [process 31233],
> >   infrun:   status->kind = exited, status = 0
> >   infrun: handle_inferior_event status->kind = exited, status = 0
> >   [Inferior 2 (process 31233) exited normally]
> >   infrun: stop_waiting
> >   (gdb) info inferiors
> >     Num  Description       Connection           Executable
> >     1    <null>                                 a.out
> >   * 2    <null>                                 a.out
> >   (gdb)
> >
> > Regression-tested on X86_64 Linux.
> 
> Awesome, thanks much for tackling this.

Thanks for your review.  I repeated regression testing for v6 using
the default board and native-extended-gdbserver.
 
> >
> > gdb/ChangeLog:
> > 2020-02-05  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
> > 	    Tom de Vries  <tdevries@suse.de>
> >
> > 	PR threads/25478
> > 	* infrun.c (stop_all_threads): Do NOT ignore
> > 	TARGET_WAITKIND_NO_RESUMED, TARGET_WAITKIND_THREAD_EXITED,
> > 	TARGET_WAITKIND_EXITED, TARGET_WAITKIND_SIGNALLED wait statuses
> > 	received from threads we attempt to stop.
> >
> > gdb/testsuite/ChangeLog:
> > 2019-11-04  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
> >
> > 	* gdb.multi/multi-exit.c: New file.
> > 	* gdb.multi/multi-exit.exp: New file.
> > 	* gdb.multi/multi-kill.c: New file.
> > 	* gdb.multi/multi-kill.exp: New file.
> >
> > Change-Id: I7cec98f40283773b79255d998511da434e9cd408
> > ---
> >  gdb/infrun.c                           |  59 +++++++++-
> >  gdb/testsuite/gdb.multi/multi-exit.c   |  22 ++++
> >  gdb/testsuite/gdb.multi/multi-exit.exp | 147 +++++++++++++++++++++++++
> >  gdb/testsuite/gdb.multi/multi-kill.c   |  34 ++++++
> >  gdb/testsuite/gdb.multi/multi-kill.exp | 104 +++++++++++++++++
> >  5 files changed, 360 insertions(+), 6 deletions(-)
> >  create mode 100644 gdb/testsuite/gdb.multi/multi-exit.c
> >  create mode 100644 gdb/testsuite/gdb.multi/multi-exit.exp
> >  create mode 100644 gdb/testsuite/gdb.multi/multi-kill.c
> >  create mode 100644 gdb/testsuite/gdb.multi/multi-kill.exp
> >
> > diff --git a/gdb/infrun.c b/gdb/infrun.c
> > index ce8b544ab8d..ead60a0d152 100644
> > --- a/gdb/infrun.c
> > +++ b/gdb/infrun.c
> > @@ -4854,13 +4854,60 @@ stop_all_threads (void)
> >  				  target_pid_to_str (event.ptid).c_str ());
> >  	    }
> >
> > -	  if (event.ws.kind == TARGET_WAITKIND_NO_RESUMED
> > -	      || event.ws.kind == TARGET_WAITKIND_THREAD_EXITED
> > -	      || event.ws.kind == TARGET_WAITKIND_EXITED
> > -	      || event.ws.kind == TARGET_WAITKIND_SIGNALLED)
> > +	  if (event.ws.kind == TARGET_WAITKIND_NO_RESUMED)
> > +	    {
> > +	      /* All resumed threads exited.  */
> > +	    }
> > +	  else if (event.ws.kind == TARGET_WAITKIND_THREAD_EXITED
> > +		   || event.ws.kind == TARGET_WAITKIND_EXITED
> > +		   || event.ws.kind == TARGET_WAITKIND_SIGNALLED)
> >  	    {
> > -	      /* All resumed threads exited
> > -		 or one thread/process exited/signalled.  */
> > +	      /* One thread/process exited/signalled.  */
> > +
> > +	      thread_info *t = nullptr;
> > +
> > +	      /* The target may have reported just a pid.  If so, try
> > +		 the first non-exited thread.  */
> > +	      if (event.ptid.is_pid ())
> > +		{
> > +		  int pid  = event.ptid.pid ();
> > +		  inferior *inf = find_inferior_pid (event.target, pid);
> > +		  for (thread_info *tp : inf->non_exited_threads ())
> > +		    {
> > +		      t = tp;
> > +		      break;
> > +		    }
> > +
> > +		  /* FIXME: If there is no available thread, the event
> > +		     would have to be appended to a per-inferior event
> > +		     list, which, unfortunately, does not exist yet.  We
> > +		     assert here instead of going into an infinite loop.  */
> > +		  gdb_assert (t != nullptr);
> > +
> > +		  if (debug_infrun)
> > +		    fprintf_unfiltered (gdb_stdlog,
> > +					"infrun: stop_all_threads, using %s\n",
> > +					target_pid_to_str (t->ptid).c_str ());
> > +		}
> > +	      else
> > +		{
> > +		  t = find_thread_ptid (event.target, event.ptid);
> > +		  /* Check if this is the first time we see this thread.
> > +		     Don't bother adding if it individually exited.  */
> > +		  if (t == nullptr
> > +		      && event.ws.kind != TARGET_WAITKIND_THREAD_EXITED)
> > +		    t = add_thread (event.target, event.ptid);
> > +		}
> > +
> > +	      if (t != nullptr)
> > +		{
> > +		  /* Set the threads as non-executing to avoid
> > +		     another stop attempt on them.  */
> > +		  mark_non_executing_threads (event.target, event.ptid,
> > +					      event.ws);
> > +		  save_waitstatus (t, &event.ws);
> > +		  t->stop_requested = false;
> > +		}
> >  	    }
> >  	  else
> >  	    {
> > diff --git a/gdb/testsuite/gdb.multi/multi-exit.c b/gdb/testsuite/gdb.multi/multi-exit.c
> > new file mode 100644
> > index 00000000000..f4825c8a7c1
> > --- /dev/null
> > +++ b/gdb/testsuite/gdb.multi/multi-exit.c
> > @@ -0,0 +1,22 @@
> > +/* This testcase is part of GDB, the GNU debugger.
> > +
> > +   Copyright 2020 Free Software Foundation, Inc.
> > +
> > +   This program is free software; you can redistribute it and/or modify
> > +   it under the terms of the GNU General Public License as published by
> > +   the Free Software Foundation; either version 3 of the License, or
> > +   (at your option) any later version.
> > +
> > +   This program is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +   GNU General Public License for more details.
> > +
> > +   You should have received a copy of the GNU General Public License
> > +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> > +
> > +int
> > +main ()
> > +{
> > +  return 0;
> > +}
> > diff --git a/gdb/testsuite/gdb.multi/multi-exit.exp b/gdb/testsuite/gdb.multi/multi-exit.exp
> > new file mode 100644
> > index 00000000000..2236243461d
> > --- /dev/null
> > +++ b/gdb/testsuite/gdb.multi/multi-exit.exp
> > @@ -0,0 +1,147 @@
> > +# This testcase is part of GDB, the GNU debugger.
> > +
> > +# Copyright 2020 Free Software Foundation, Inc.
> > +
> > +# This program is free software; you can redistribute it and/or modify
> > +# it under the terms of the GNU General Public License as published by
> > +# the Free Software Foundation; either version 3 of the License, or
> > +# (at your option) any later version.
> > +#
> > +# This program is distributed in the hope that it will be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> > +
> > +# Test receiving TARGET_WAITKIND_EXITED events from multiple
> > +# inferiors.  In all stop-mode, upon receiving the exit event from one
> > +# of the inferiors, GDB will try to stop the other inferior, too.  So,
> > +# a stop request will be sent.  Receiving a TARGET_WAITKIND_EXITED
> > +# status kind as a response to that stop request instead of a
> > +# TARGET_WAITKIND_STOPPED should be handled by GDB without problems.
> > +
> > +load_lib gdbserver-support.exp
> > +
> > +if {[skip_gdbserver_tests]} {
> > +    return 0
> > +}
> > +
> > +standard_testfile
> > +
> > +# The plain remote target can't do multiple inferiors.
> > +if {[target_info gdb_protocol] != ""} {
> 
> Use:
> 
>  if [use_gdb_stub] {
> 
> instead.
> 
> multi-target.exp is a bit special, in that it really needs to
> control which targets are run.  It is not the case in this
> testcase, AFAICT.  See more below.

Done.

> 
> > +    return 0
> > +}
> > +
> > +if {[build_executable "failed to build" $testfile $srcfile {debug}] == -1} {
> > +    return -1
> > +}
> > +
> > +# Set up the current inferior with a gdbserver in multi mode as its
> > +# target, if TARGET is "extended-remote".  Otherwise the target is native.
> > +
> > +proc setup_inferior {target infnum} {
> > +    global binfile
> > +
> > +    gdb_test "file ${binfile}" ".*" "load file in inferior $infnum"
> > +
> > +    if {$target == "extended-remote"} {
> > +	gdb_test_no_output "set remote exec-file ${binfile}" \
> > +	    "set remote-exec file in inferior $infnum"
> > +	set res [gdbserver_start "--multi" ""]
> > +	set gdbserver_gdbport [lindex $res 1]
> > +	if {[gdb_target_cmd "extended-remote" $gdbserver_gdbport]} {
> > +	    return 0
> > +	}
> > +    }
> > +
> > +    if {![runto_main]} {
> > +	fail "starting inferior $infnum"
> > +	return 0
> > +    }
> > +    return 1
> > +}
> > +
> > +# Set up two inferiors and start the processes.  The underlying target
> > +# of each inferior is determined by the TARGET argument.
> > +
> > +proc setup {target} {
> > +    clean_restart
> > +
> > +    # This is a test specific for GDB's ability to stop all threads.
> > +    gdb_test_no_output "maint set target-non-stop on"
> > +
> > +    if {![setup_inferior $target 1]} {
> > +	return 0
> > +    }
> > +
> > +    gdb_test "add-inferior -no-connection" "Added inferior 2" \
> > +	"add the second inferior"
> > +    gdb_test "inferior 2" "Switching to inferior 2.*" \
> > +	"switch to inferior 2"
> > +
> > +    if {![setup_inferior $target 2]} {
> > +	return 0
> > +    }
> > +
> > +    # We want to continue both processes.
> > +    gdb_test_no_output "set schedule-multiple on"
> > +
> > +    return 1
> > +}
> > +
> > +# Run the test using TARGET as the target of the inferiors.
> > +
> > +proc test {target} {
> > +    if {![setup $target]} {
> > +	untested "could not do setup"
> > +	return
> > +    }
> > +
> > +    # We want GDB to complete the command and return the prompt
> > +    # instead of going into an infinite loop.
> > +    global decimal gdb_prompt inferior_exited_re
> > +    gdb_test_multiple "continue" "first continue" {
> > +	-re "Inferior ($decimal) \[^\n\r\]+ exited normally.*$gdb_prompt $" {
> > +	    set exited_inferior $expect_out(1,string)
> > +	    pass $gdb_test_name
> > +	}
> > +    }
> > +
> > +    if {![info exists exited_inferior]} {
> > +	fail "first continue"
> > +	return 0
> > +    }
> > +
> > +    if {$exited_inferior == 1} {
> > +	set other_inferior 2
> > +    } else {
> > +	set other_inferior 1
> > +    }
> > +
> > +    # Switch to the other inferior and check it, too, continues to the end.
> > +    gdb_test "inferior $other_inferior" \
> > +	".*Switching to inferior $other_inferior.*" \
> > +	"switch to the other inferior"
> > +
> > +    gdb_continue_to_end
> > +
> > +    # Finally, check if we can re-run both inferiors.
> > +    delete_breakpoints
> > +
> > +    gdb_test "run" "$inferior_exited_re normally\]" \
> > +	"re-run the other inferior"
> > +
> > +    gdb_test "inferior $exited_inferior" \
> > +	".*Switching to inferior $exited_inferior.*" \
> > +	"switch to the first exited inferior"
> > +
> > +    gdb_test "run" "$inferior_exited_re normally\]" \
> > +	"re-run the first exited inferior"
> > +}
> > +
> > +foreach_with_prefix target {"native" "extended-remote"} {
> 
> This is really usually not the right thing to do.  Better write
> the testcase in a target-neutral form, and then let
> 
>  make check RUNTESTFLAGS="--target_board=native-extended-gdbserver"
> 
> cover testing with extended-remote.

I believe this comment is addressed in v6.
 
> AFAICT, when testing with extended-remote, you're spawning
> a gdbserver for each inferior.  You don't really need that,
> right?  A single gdbserver with both inferiors should
> trigger the problem as well.

Correct.  New problems arose, though, that I explained in the reply
to Patch #3.
 
> I.e., remove this specific target stuff, and just use regular
> commands.
> 
> Instead of issuing the "file" + "set remote exec-file", use
> gdb_load.

Done.

> Instead of the "run" command, try gdb_run_cmd.

Done.
 
> > +gdb_test_multiple "continue" "continue processes" {
> > +    -re "Continuing.\[\r\n\]+" {
> > +	# Kill both processes at once.
> > +	remote_exec build "kill -9 ${testpid1} ${testpid2}"
> 
> Pedantically, "remote_exec target"

Done.
 
> > +# Find the current inferior's id.
> > +set current_inferior 1
> > +gdb_test_multiple "info inferiors" "find the current inf id" {
> > +    -re "\\* 1 .*$gdb_prompt $" {
> > +        set current_inferior 1
> > +	set other_inferior 2
> > +	pass $gdb_test_name
> > +    }
> > +    -re "\\* 2 .*$gdb_prompt $" {
> > +        set current_inferior 2
> > +	set other_inferior 1
> > +	pass $gdb_test_name
> > +    }
> 
> Watch out for tabs vs spaces above.

Fixed.
 
> Does multi-kill.exp really need to attach to processes instead of
> spawning them via GDB?  If we do the latter, then the testcase
> will run on more configurations.

No, it does not have to attach.  I revised the test and it can now
run with native-extended-gdbserver, too.

Thanks.
Baris


Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Gary Kershaw
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928


More information about the Gdb-patches mailing list