[PATCH v5 5/5] gdb/infrun: handle already-exited threads when attempting to stop
Aktemur, Tankut Baris
tankut.baris.aktemur@intel.com
Mon Apr 20 20:43:40 GMT 2020
On Thursday, April 16, 2020 7:07 PM, Pedro Alves wrote:
> On 4/6/20 4:45 PM, Tankut Baris Aktemur via Gdb-patches wrote:
> > In stop_all_threads, GDB sends signals to other threads in an attempt
> > to stop them. While in a typical scenario the expected wait status is
> > TARGET_WAITKIND_STOPPED, it is possible that the thread GDB attempted
> > to stop has already terminated. If so, a waitstatus other than
> > TARGET_WAITKIND_STOPPED would be received. Handle this case
> > appropriately.
> >
> > If a wait status that denotes thread termination is ignored, GDB goes
> > into an infinite loop in stop_all_threads.
> > E.g.:
> >
> > $ gdb ./a.out
> > (gdb) start
> > ...
> > (gdb) add-inferior -exec ./a.out
> > ...
> > (gdb) inferior 2
> > ...
> > (gdb) start
> > ...
> > (gdb) set schedule-multiple on
> > (gdb) set debug infrun 2
> > (gdb) continue
> > Continuing.
> > infrun: clear_proceed_status_thread (process 10449)
> > infrun: clear_proceed_status_thread (process 10453)
> > infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
> > infrun: proceed: resuming process 10449
> > infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10449] at 0x55555555514e
> > infrun: infrun_async(1)
> > infrun: prepare_to_wait
> > infrun: proceed: resuming process 10453
> > infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10453] at 0x55555555514e
> > infrun: prepare_to_wait
> > infrun: Found 2 inferiors, starting at #0
> > infrun: target_wait (-1.0.0, status) =
> > infrun: 10449.10449.0 [process 10449],
> > infrun: status->kind = exited, status = 0
> > infrun: handle_inferior_event status->kind = exited, status = 0
> > [Inferior 1 (process 10449) exited normally]
> > infrun: stop_waiting
> > infrun: stop_all_threads
> > infrun: stop_all_threads, pass=0, iterations=0
> > infrun: process 10453 executing, need stop
> > infrun: target_wait (-1.0.0, status) =
> > infrun: 10453.10453.0 [process 10453],
> > infrun: status->kind = exited, status = 0
> > infrun: stop_all_threads status->kind = exited, status = 0 process 10453
> > infrun: process 10453 executing, already stopping
> > infrun: target_wait (-1.0.0, status) =
> > infrun: -1.0.0 [process -1],
> > infrun: status->kind = no-resumed
> > infrun: infrun_async(0)
> > infrun: stop_all_threads status->kind = no-resumed process -1
> > infrun: process 10453 executing, already stopping
> > infrun: stop_all_threads status->kind = no-resumed process -1
> > infrun: process 10453 executing, already stopping
> > infrun: stop_all_threads status->kind = no-resumed process -1
> > infrun: process 10453 executing, already stopping
> > infrun: stop_all_threads status->kind = no-resumed process -1
> > infrun: process 10453 executing, already stopping
> > infrun: stop_all_threads status->kind = no-resumed process -1
> > infrun: process 10453 executing, already stopping
> > infrun: stop_all_threads status->kind = no-resumed process -1
> > infrun: process 10453 executing, already stopping
> > infrun: stop_all_threads status->kind = no-resumed process -1
> > infrun: process 10453 executing, already stopping
> > infrun: stop_all_threads status->kind = no-resumed process -1
> > infrun: process 10453 executing, already stopping
> > infrun: stop_all_threads status->kind = no-resumed process -1
> > infrun: process 10453 executing, already stopping
> > infrun: stop_all_threads status->kind = no-resumed process -1
> > infrun: process 10453 executing, already stopping
> > ...
> >
> > And this polling goes on forever. This patch prevents the infinite
> > looping behavior. For the same scenario above, we obtain the
> > following behavior:
> >
> > ...
> > (gdb) continue
> > Continuing.
> > infrun: clear_proceed_status_thread (process 31229)
> > infrun: clear_proceed_status_thread (process 31233)
> > infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
> > infrun: proceed: resuming process 31229
> > infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31229] at 0x55555555514e
> > infrun: infrun_async(1)
> > infrun: prepare_to_wait
> > infrun: proceed: resuming process 31233
> > infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31233] at 0x55555555514e
> > infrun: prepare_to_wait
> > infrun: Found 2 inferiors, starting at #0
> > infrun: target_wait (-1.0.0, status) =
> > infrun: 31229.31229.0 [process 31229],
> > infrun: status->kind = exited, status = 0
> > infrun: handle_inferior_event status->kind = exited, status = 0
> > [Inferior 1 (process 31229) exited normally]
> > infrun: stop_waiting
> > infrun: stop_all_threads
> > infrun: stop_all_threads, pass=0, iterations=0
> > infrun: process 31233 executing, need stop
> > infrun: target_wait (-1.0.0, status) =
> > infrun: 31233.31233.0 [process 31233],
> > infrun: status->kind = exited, status = 0
> > infrun: stop_all_threads status->kind = exited, status = 0 process 31233
> > infrun: saving status status->kind = exited, status = 0 for 31233.31233.0
> > infrun: process 31233 not executing
> > infrun: stop_all_threads, pass=1, iterations=1
> > infrun: process 31233 not executing
> > infrun: stop_all_threads done
> > (gdb)
> >
> > The exit event from Inferior 1 is received and shown to the user.
> > The exit event from Inferior 2 is not displayed, but kept pending.
> >
> > (gdb) info inferiors
> > Num Description Connection Executable
> > * 1 <null> a.out
> > 2 process 31233 1 (native) a.out
> > (gdb) inferior 2
> > [Switching to inferior 2 [process 31233] (a.out)]
> > [Switching to thread 2.1 (process 31233)]
> > Couldn't get registers: No such process.
> > (gdb)
> >
> > Note the "Couldn't get regsiters" error above. As of writing this patch,
> > GDB normally goes into another hang, infinitely trying register access.
> > A patch such as
> >
> > https://sourceware.org/pipermail/gdb-patches/2020-March/166982.html
> >
>
> I've reviewed that patch today. Once that is in, remember to update
> this commit log.
Done.
> > eliminates the infinite polling. Resuming Inferior 2 processes the
> > pending exit event.
> >
> > (gdb) continue
> > Continuing.
> > infrun: clear_proceed_status_thread (process 31233)
> > infrun: clear_proceed_status_thread: thread process 31233 has pending wait status status->kind = exited, status
> = 0 (currently_stepping=0).
> > infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
> > infrun: proceed: resuming process 31233
> > infrun: resume: thread process 31233 has pending wait status status->kind = exited, status = 0
> (currently_stepping=0).
> > infrun: prepare_to_wait
> > infrun: Using pending wait status status->kind = exited, status = 0 for process 31233.
> > infrun: target_wait (-1.0.0, status) =
> > infrun: 31233.31233.0 [process 31233],
> > infrun: status->kind = exited, status = 0
> > infrun: handle_inferior_event status->kind = exited, status = 0
> > [Inferior 2 (process 31233) exited normally]
> > infrun: stop_waiting
> > (gdb) info inferiors
> > Num Description Connection Executable
> > 1 <null> a.out
> > * 2 <null> a.out
> > (gdb)
> >
> > Regression-tested on X86_64 Linux.
>
> Awesome, thanks much for tackling this.
Thanks for your review. I repeated regression testing for v6 using
the default board and native-extended-gdbserver.
> >
> > gdb/ChangeLog:
> > 2020-02-05 Tankut Baris Aktemur <tankut.baris.aktemur@intel.com>
> > Tom de Vries <tdevries@suse.de>
> >
> > PR threads/25478
> > * infrun.c (stop_all_threads): Do NOT ignore
> > TARGET_WAITKIND_NO_RESUMED, TARGET_WAITKIND_THREAD_EXITED,
> > TARGET_WAITKIND_EXITED, TARGET_WAITKIND_SIGNALLED wait statuses
> > received from threads we attempt to stop.
> >
> > gdb/testsuite/ChangeLog:
> > 2019-11-04 Tankut Baris Aktemur <tankut.baris.aktemur@intel.com>
> >
> > * gdb.multi/multi-exit.c: New file.
> > * gdb.multi/multi-exit.exp: New file.
> > * gdb.multi/multi-kill.c: New file.
> > * gdb.multi/multi-kill.exp: New file.
> >
> > Change-Id: I7cec98f40283773b79255d998511da434e9cd408
> > ---
> > gdb/infrun.c | 59 +++++++++-
> > gdb/testsuite/gdb.multi/multi-exit.c | 22 ++++
> > gdb/testsuite/gdb.multi/multi-exit.exp | 147 +++++++++++++++++++++++++
> > gdb/testsuite/gdb.multi/multi-kill.c | 34 ++++++
> > gdb/testsuite/gdb.multi/multi-kill.exp | 104 +++++++++++++++++
> > 5 files changed, 360 insertions(+), 6 deletions(-)
> > create mode 100644 gdb/testsuite/gdb.multi/multi-exit.c
> > create mode 100644 gdb/testsuite/gdb.multi/multi-exit.exp
> > create mode 100644 gdb/testsuite/gdb.multi/multi-kill.c
> > create mode 100644 gdb/testsuite/gdb.multi/multi-kill.exp
> >
> > diff --git a/gdb/infrun.c b/gdb/infrun.c
> > index ce8b544ab8d..ead60a0d152 100644
> > --- a/gdb/infrun.c
> > +++ b/gdb/infrun.c
> > @@ -4854,13 +4854,60 @@ stop_all_threads (void)
> > target_pid_to_str (event.ptid).c_str ());
> > }
> >
> > - if (event.ws.kind == TARGET_WAITKIND_NO_RESUMED
> > - || event.ws.kind == TARGET_WAITKIND_THREAD_EXITED
> > - || event.ws.kind == TARGET_WAITKIND_EXITED
> > - || event.ws.kind == TARGET_WAITKIND_SIGNALLED)
> > + if (event.ws.kind == TARGET_WAITKIND_NO_RESUMED)
> > + {
> > + /* All resumed threads exited. */
> > + }
> > + else if (event.ws.kind == TARGET_WAITKIND_THREAD_EXITED
> > + || event.ws.kind == TARGET_WAITKIND_EXITED
> > + || event.ws.kind == TARGET_WAITKIND_SIGNALLED)
> > {
> > - /* All resumed threads exited
> > - or one thread/process exited/signalled. */
> > + /* One thread/process exited/signalled. */
> > +
> > + thread_info *t = nullptr;
> > +
> > + /* The target may have reported just a pid. If so, try
> > + the first non-exited thread. */
> > + if (event.ptid.is_pid ())
> > + {
> > + int pid = event.ptid.pid ();
> > + inferior *inf = find_inferior_pid (event.target, pid);
> > + for (thread_info *tp : inf->non_exited_threads ())
> > + {
> > + t = tp;
> > + break;
> > + }
> > +
> > + /* FIXME: If there is no available thread, the event
> > + would have to be appended to a per-inferior event
> > + list, which, unfortunately, does not exist yet. We
> > + assert here instead of going into an infinite loop. */
> > + gdb_assert (t != nullptr);
> > +
> > + if (debug_infrun)
> > + fprintf_unfiltered (gdb_stdlog,
> > + "infrun: stop_all_threads, using %s\n",
> > + target_pid_to_str (t->ptid).c_str ());
> > + }
> > + else
> > + {
> > + t = find_thread_ptid (event.target, event.ptid);
> > + /* Check if this is the first time we see this thread.
> > + Don't bother adding if it individually exited. */
> > + if (t == nullptr
> > + && event.ws.kind != TARGET_WAITKIND_THREAD_EXITED)
> > + t = add_thread (event.target, event.ptid);
> > + }
> > +
> > + if (t != nullptr)
> > + {
> > + /* Set the threads as non-executing to avoid
> > + another stop attempt on them. */
> > + mark_non_executing_threads (event.target, event.ptid,
> > + event.ws);
> > + save_waitstatus (t, &event.ws);
> > + t->stop_requested = false;
> > + }
> > }
> > else
> > {
> > diff --git a/gdb/testsuite/gdb.multi/multi-exit.c b/gdb/testsuite/gdb.multi/multi-exit.c
> > new file mode 100644
> > index 00000000000..f4825c8a7c1
> > --- /dev/null
> > +++ b/gdb/testsuite/gdb.multi/multi-exit.c
> > @@ -0,0 +1,22 @@
> > +/* This testcase is part of GDB, the GNU debugger.
> > +
> > + Copyright 2020 Free Software Foundation, Inc.
> > +
> > + This program is free software; you can redistribute it and/or modify
> > + it under the terms of the GNU General Public License as published by
> > + the Free Software Foundation; either version 3 of the License, or
> > + (at your option) any later version.
> > +
> > + This program is distributed in the hope that it will be useful,
> > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > + GNU General Public License for more details.
> > +
> > + You should have received a copy of the GNU General Public License
> > + along with this program. If not, see <http://www.gnu.org/licenses/>. */
> > +
> > +int
> > +main ()
> > +{
> > + return 0;
> > +}
> > diff --git a/gdb/testsuite/gdb.multi/multi-exit.exp b/gdb/testsuite/gdb.multi/multi-exit.exp
> > new file mode 100644
> > index 00000000000..2236243461d
> > --- /dev/null
> > +++ b/gdb/testsuite/gdb.multi/multi-exit.exp
> > @@ -0,0 +1,147 @@
> > +# This testcase is part of GDB, the GNU debugger.
> > +
> > +# Copyright 2020 Free Software Foundation, Inc.
> > +
> > +# This program is free software; you can redistribute it and/or modify
> > +# it under the terms of the GNU General Public License as published by
> > +# the Free Software Foundation; either version 3 of the License, or
> > +# (at your option) any later version.
> > +#
> > +# This program is distributed in the hope that it will be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> > +
> > +# Test receiving TARGET_WAITKIND_EXITED events from multiple
> > +# inferiors. In all stop-mode, upon receiving the exit event from one
> > +# of the inferiors, GDB will try to stop the other inferior, too. So,
> > +# a stop request will be sent. Receiving a TARGET_WAITKIND_EXITED
> > +# status kind as a response to that stop request instead of a
> > +# TARGET_WAITKIND_STOPPED should be handled by GDB without problems.
> > +
> > +load_lib gdbserver-support.exp
> > +
> > +if {[skip_gdbserver_tests]} {
> > + return 0
> > +}
> > +
> > +standard_testfile
> > +
> > +# The plain remote target can't do multiple inferiors.
> > +if {[target_info gdb_protocol] != ""} {
>
> Use:
>
> if [use_gdb_stub] {
>
> instead.
>
> multi-target.exp is a bit special, in that it really needs to
> control which targets are run. It is not the case in this
> testcase, AFAICT. See more below.
Done.
>
> > + return 0
> > +}
> > +
> > +if {[build_executable "failed to build" $testfile $srcfile {debug}] == -1} {
> > + return -1
> > +}
> > +
> > +# Set up the current inferior with a gdbserver in multi mode as its
> > +# target, if TARGET is "extended-remote". Otherwise the target is native.
> > +
> > +proc setup_inferior {target infnum} {
> > + global binfile
> > +
> > + gdb_test "file ${binfile}" ".*" "load file in inferior $infnum"
> > +
> > + if {$target == "extended-remote"} {
> > + gdb_test_no_output "set remote exec-file ${binfile}" \
> > + "set remote-exec file in inferior $infnum"
> > + set res [gdbserver_start "--multi" ""]
> > + set gdbserver_gdbport [lindex $res 1]
> > + if {[gdb_target_cmd "extended-remote" $gdbserver_gdbport]} {
> > + return 0
> > + }
> > + }
> > +
> > + if {![runto_main]} {
> > + fail "starting inferior $infnum"
> > + return 0
> > + }
> > + return 1
> > +}
> > +
> > +# Set up two inferiors and start the processes. The underlying target
> > +# of each inferior is determined by the TARGET argument.
> > +
> > +proc setup {target} {
> > + clean_restart
> > +
> > + # This is a test specific for GDB's ability to stop all threads.
> > + gdb_test_no_output "maint set target-non-stop on"
> > +
> > + if {![setup_inferior $target 1]} {
> > + return 0
> > + }
> > +
> > + gdb_test "add-inferior -no-connection" "Added inferior 2" \
> > + "add the second inferior"
> > + gdb_test "inferior 2" "Switching to inferior 2.*" \
> > + "switch to inferior 2"
> > +
> > + if {![setup_inferior $target 2]} {
> > + return 0
> > + }
> > +
> > + # We want to continue both processes.
> > + gdb_test_no_output "set schedule-multiple on"
> > +
> > + return 1
> > +}
> > +
> > +# Run the test using TARGET as the target of the inferiors.
> > +
> > +proc test {target} {
> > + if {![setup $target]} {
> > + untested "could not do setup"
> > + return
> > + }
> > +
> > + # We want GDB to complete the command and return the prompt
> > + # instead of going into an infinite loop.
> > + global decimal gdb_prompt inferior_exited_re
> > + gdb_test_multiple "continue" "first continue" {
> > + -re "Inferior ($decimal) \[^\n\r\]+ exited normally.*$gdb_prompt $" {
> > + set exited_inferior $expect_out(1,string)
> > + pass $gdb_test_name
> > + }
> > + }
> > +
> > + if {![info exists exited_inferior]} {
> > + fail "first continue"
> > + return 0
> > + }
> > +
> > + if {$exited_inferior == 1} {
> > + set other_inferior 2
> > + } else {
> > + set other_inferior 1
> > + }
> > +
> > + # Switch to the other inferior and check it, too, continues to the end.
> > + gdb_test "inferior $other_inferior" \
> > + ".*Switching to inferior $other_inferior.*" \
> > + "switch to the other inferior"
> > +
> > + gdb_continue_to_end
> > +
> > + # Finally, check if we can re-run both inferiors.
> > + delete_breakpoints
> > +
> > + gdb_test "run" "$inferior_exited_re normally\]" \
> > + "re-run the other inferior"
> > +
> > + gdb_test "inferior $exited_inferior" \
> > + ".*Switching to inferior $exited_inferior.*" \
> > + "switch to the first exited inferior"
> > +
> > + gdb_test "run" "$inferior_exited_re normally\]" \
> > + "re-run the first exited inferior"
> > +}
> > +
> > +foreach_with_prefix target {"native" "extended-remote"} {
>
> This is really usually not the right thing to do. Better write
> the testcase in a target-neutral form, and then let
>
> make check RUNTESTFLAGS="--target_board=native-extended-gdbserver"
>
> cover testing with extended-remote.
I believe this comment is addressed in v6.
> AFAICT, when testing with extended-remote, you're spawning
> a gdbserver for each inferior. You don't really need that,
> right? A single gdbserver with both inferiors should
> trigger the problem as well.
Correct. New problems arose, though, that I explained in the reply
to Patch #3.
> I.e., remove this specific target stuff, and just use regular
> commands.
>
> Instead of issuing the "file" + "set remote exec-file", use
> gdb_load.
Done.
> Instead of the "run" command, try gdb_run_cmd.
Done.
> > +gdb_test_multiple "continue" "continue processes" {
> > + -re "Continuing.\[\r\n\]+" {
> > + # Kill both processes at once.
> > + remote_exec build "kill -9 ${testpid1} ${testpid2}"
>
> Pedantically, "remote_exec target"
Done.
> > +# Find the current inferior's id.
> > +set current_inferior 1
> > +gdb_test_multiple "info inferiors" "find the current inf id" {
> > + -re "\\* 1 .*$gdb_prompt $" {
> > + set current_inferior 1
> > + set other_inferior 2
> > + pass $gdb_test_name
> > + }
> > + -re "\\* 2 .*$gdb_prompt $" {
> > + set current_inferior 2
> > + set other_inferior 1
> > + pass $gdb_test_name
> > + }
>
> Watch out for tabs vs spaces above.
Fixed.
> Does multi-kill.exp really need to attach to processes instead of
> spawning them via GDB? If we do the latter, then the testcase
> will run on more configurations.
No, it does not have to attach. I revised the test and it can now
run with native-extended-gdbserver, too.
Thanks.
Baris
Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Gary Kershaw
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
More information about the Gdb-patches
mailing list