[PATCH v2 2/2] gdb: raise and handle NOT_AVAILABLE_ERROR when accessing frame PC
Bruno Larsen
blarsen@redhat.com
Wed Mar 16 17:26:58 GMT 2022
Hello Tankut,
On 2/8/22 06:15, Tankut Baris Aktemur via Gdb-patches wrote:
> This patch can be considered a continuation of
>
> commit 4778a5f87d253399083565b4919816f541ebe414
> Author: Tom de Vries <tdevries@suse.de>
> Date: Tue Apr 21 15:45:57 2020 +0200
>
> [gdb] Fix hang after ext sigkill
>
> and
>
> commit 47f1aceffa02be4726b854082d7587eb259136e0
> Author: Tankut Baris Aktemur <tankut.baris.aktemur@intel.com>
> Date: Thu May 14 13:59:54 2020 +0200
>
> gdb/infrun: handle already-exited threads when attempting to stop
>
> If a process dies before GDB reports the exit error to the user, we
> may see the "Couldn't get registers: No such process." error message
> in various places. For instance:
>
> (gdb) start
> ...
> (gdb) info inferior
> Num Description Connection Executable
> * 1 process 31943 1 (native) /tmp/a.out
> (gdb) shell kill -9 31943
> (gdb) maintenance flush register-cache
> Register cache flushed.
> Couldn't get registers: No such process.
> (gdb) info threads
> Id Target Id Frame
> * 1 process 31943 "a.out" Couldn't get registers: No such process.
> (gdb) backtrace
> Python Exception <class 'gdb.error'>: Couldn't get registers: No such process.
> Couldn't get registers: No such process.
> (gdb) inferior 1
> Couldn't get registers: No such process.
> (gdb) thread
> [Current thread is 1 (process 31943)]
> Couldn't get registers: No such process.
> (gdb)
>
> The gdb.threads/killed-outside.exp, gdb.multi/multi-kill.exp, and
> gdb.multi/multi-exit.exp tests also check related scenarios.
>
> To improve the situation,
>
> 1. when printing the frame info, catch and process a NOT_AVAILABLE_ERROR.
>
> 2. when accessing the target to fetch registers, if the operation
> fails, raise a NOT_AVAILABLE_ERROR instead of a generic error, so
> that clients can attempt to recover accordingly. This patch updates
> the amd64_linux_nat_target and remote_target in this direction.
>
> With this patch, we obtain the following behavior:
>
> (gdb) start
> ...
> (gdb) info inferior
> Num Description Connection Executable
> * 1 process 748 1 (native) /tmp/a.out
> (gdb) shell kill -9 748
> (gdb) maintenance flush register-cache
> Register cache flushed.
> (gdb) info threads
> Id Target Id Frame
> * 1 process 748 "a.out" <PC register is not available>
> (gdb) backtrace
> #0 <PC register is not available>
> Backtrace stopped: not enough registers or memory available to unwind further
> (gdb) inferior 1
> [Switching to inferior 1 [process 748] (/tmp/a.out)]
> [Switching to thread 1 (process 748)]
> #0 <PC register is not available>
> (gdb) thread
> [Current thread is 1 (process 748)]
> (gdb)
>
> Here is another "before/after" case. Suppose we have two inferiors,
> each having its own remote target underneath. Before this patch, we
> get the following output:
>
> # Create two inferiors on two remote targets, resume both until
> # termination. Exit event from one of them is shown first, but the
> # other also exited -- just not yet shown.
> (gdb) maint set target-non-stop on
> (gdb) target remote | gdbserver - ./a.out
> (gdb) add-inferior -no-connection
> (gdb) inferior 2
> (gdb) target remote | gdbserver - ./a.out
> (gdb) set schedule-multiple on
> (gdb) continue
> ...
> [Inferior 2 (process 22127) exited normally]
> (gdb) inferior 1
> [Switching to inferior 1 [process 22111] (target:/tmp/a.out)]
> [Switching to thread 1.1 (Thread 22111.22111)]
> Could not read registers; remote failure reply 'E01'
> (gdb) info threads
> Id Target Id Frame
> * 1.1 Thread 22111.22111 "a.out" Could not read registers; remote failure reply 'E01'
> (gdb) backtrace
> Python Exception <class 'gdb.error'>: Could not read registers; remote failure reply 'E01'
> Could not read registers; remote failure reply 'E01'
> (gdb) thread
> [Current thread is 1.1 (Thread 22111.22111)]
> Could not read registers; remote failure reply 'E01'
> (gdb)
>
> With this patch, it becomes:
>
> ...
> [Inferior 1 (process 11759) exited normally]
> (gdb) inferior 2
> [Switching to inferior 2 [process 13440] (target:/path/to/a.out)]
> [Switching to thread 2.1 (Thread 13440.13440)]
> #0 <unavailable> in ?? ()
> (gdb) info threads
> Id Target Id Frame
> * 2.1 Thread 13440.13440 "a.out" <unavailable> in ?? ()
> (gdb) backtrace
> #0 <unavailable> in ?? ()
> Backtrace stopped: not enough registers or memory available to unwind further
> (gdb) thread
> [Current thread is 2.1 (Thread 13440.13440)]
> (gdb)
>
> Finally, together with its predecessor, this patch also fixes PR gdb/26877.
While I think this is a good idea, it doesn't seem to fix the root cause of the bug you mentioned. It does stop the crash that the bug reports, but I would say the actual issue is that GDB is not noticing that the second inferior is also finished. My 2 cents, for what they're worth.
The explanation in the commit message is great! It explains the problem quite well, I just don't understand why you only changed amd64_linux_nat_target and remote. I imagine this issue happens with all targets. I'd ask at least that some of the most common ones be changed and validated. Also, some extra testing revealed that the previous patch is not actually necessary to fix the crash.
As for technical review, I don't have any questions or comments, but I can't approve patches.
--
Cheers!
Bruno Larsen
>
> Regression-tested on X86_64-Linux.
> ---
> gdb/amd64-linux-nat.c | 3 +-
> gdb/remote.c | 20 +++--
> gdb/stack.c | 33 +++++++-
> gdb/testsuite/gdb.threads/killed-outside.exp | 7 +-
> .../gdb.tui/multi-exit-remove-inferior.c | 21 +++++
> .../gdb.tui/multi-exit-remove-inferior.exp | 80 +++++++++++++++++++
> 6 files changed, 152 insertions(+), 12 deletions(-)
> create mode 100644 gdb/testsuite/gdb.tui/multi-exit-remove-inferior.c
> create mode 100644 gdb/testsuite/gdb.tui/multi-exit-remove-inferior.exp
>
> diff --git a/gdb/amd64-linux-nat.c b/gdb/amd64-linux-nat.c
> index 3d28d7e1d57..bdcc93eb498 100644
> --- a/gdb/amd64-linux-nat.c
> +++ b/gdb/amd64-linux-nat.c
> @@ -222,7 +222,8 @@ amd64_linux_nat_target::fetch_registers (struct regcache *regcache, int regnum)
> elf_gregset_t regs;
>
> if (ptrace (PTRACE_GETREGS, tid, 0, (long) ®s) < 0)
> - perror_with_name (_("Couldn't get registers"));
> + throw_perror_with_name (NOT_AVAILABLE_ERROR,
> + _("Couldn't get registers"));
>
> amd64_supply_native_gregset (regcache, ®s, -1);
> if (regnum != -1)
> diff --git a/gdb/remote.c b/gdb/remote.c
> index 49eeb63445d..936084b9b4a 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c
> @@ -8398,10 +8398,14 @@ remote_target::fetch_register_using_p (struct regcache *regcache,
> case PACKET_UNKNOWN:
> return 0;
> case PACKET_ERROR:
> - error (_("Could not fetch register \"%s\"; remote failure reply '%s'"),
> - gdbarch_register_name (regcache->arch (),
> - reg->regnum),
> - buf);
> + {
> + const char *regname = gdbarch_register_name (regcache->arch (),
> + reg->regnum);
> + std::string msg
> + = string_printf (_("Could not fetch register \"%s\"; remote "
> + "failure reply '%s'"), regname, buf);
> + throw_perror_with_name (NOT_AVAILABLE_ERROR, msg.c_str ());
> + }
> }
>
> /* If this register is unfetchable, tell the regcache. */
> @@ -8438,8 +8442,12 @@ remote_target::send_g_packet ()
> putpkt (rs->buf);
> getpkt (&rs->buf, 0);
> if (packet_check_result (rs->buf) == PACKET_ERROR)
> - error (_("Could not read registers; remote failure reply '%s'"),
> - rs->buf.data ());
> + {
> + std::string msg
> + = string_printf (_("Could not read registers; remote failure reply "
> + "'%s'"), rs->buf.data ());
> + throw_perror_with_name (NOT_AVAILABLE_ERROR, msg.c_str ());
> + }
>
> /* We can get out of synch in various cases. If the first character
> in the buffer is not a hex character, assume that has happened
> diff --git a/gdb/stack.c b/gdb/stack.c
> index b1bbf7d0f44..47faec5a124 100644
> --- a/gdb/stack.c
> +++ b/gdb/stack.c
> @@ -1107,7 +1107,38 @@ print_frame_info (const frame_print_options &fp_opts,
> the next frame is a SIGTRAMP_FRAME or a DUMMY_FRAME, then the
> next frame was not entered as the result of a call, and we want
> to get the line containing FRAME->pc. */
> - symtab_and_line sal = find_frame_sal (frame);
> + symtab_and_line sal;
> + try
> + {
> + sal = find_frame_sal (frame);
> + }
> + catch (const gdb_exception_error &ex)
> + {
> + if (ex.error == NOT_AVAILABLE_ERROR)
> + {
> + ui_out_emit_tuple tuple_emitter (uiout, "frame");
> +
> + annotate_frame_begin (print_level ? frame_relative_level (frame) : 0,
> + gdbarch, 0);
> +
> + if (print_level)
> + {
> + uiout->text ("#");
> + uiout->field_fmt_signed (2, ui_left, "level",
> + frame_relative_level (frame));
> + }
> +
> + std::string frame_info = "<";
> + frame_info += ex.what ();
> + frame_info += ">";
> + uiout->field_string ("func", frame_info.c_str (),
> + metadata_style.style ());
> + uiout->text ("\n");
> + annotate_frame_end ();
> + return;
> + }
> + throw;
> + }
>
> location_print = (print_what == LOCATION
> || print_what == SRC_AND_LOC
> diff --git a/gdb/testsuite/gdb.threads/killed-outside.exp b/gdb/testsuite/gdb.threads/killed-outside.exp
> index 3d4543e088c..8e156c863dc 100644
> --- a/gdb/testsuite/gdb.threads/killed-outside.exp
> +++ b/gdb/testsuite/gdb.threads/killed-outside.exp
> @@ -37,16 +37,15 @@ remote_exec target "kill -9 ${testpid}"
> # Give it some time to die.
> sleep 2
>
> -set no_such_process_msg "Couldn't get registers: No such process\."
> +set no_pc_available_msg "PC register is not available"
> set killed_msg "Program terminated with signal SIGKILL, Killed\."
> set no_longer_exists_msg "The program no longer exists\."
> set not_being_run_msg "The program is not being run\."
>
> gdb_test_multiple "continue" "prompt after first continue" {
> - -re "Continuing\.\r\n$no_such_process_msg\r\n$gdb_prompt " {
> + -re "Continuing\.\r\n$no_pc_available_msg\r\n$gdb_prompt " {
> pass $gdb_test_name
> - # Saw $no_such_process_msg. The bug condition was triggered, go
> - # check for it.
> + # The bug condition was triggered, go check for it.
> gdb_test_multiple "" "messages" {
> -re ".*$killed_msg.*$no_longer_exists_msg\r\n" {
> pass $gdb_test_name
> diff --git a/gdb/testsuite/gdb.tui/multi-exit-remove-inferior.c b/gdb/testsuite/gdb.tui/multi-exit-remove-inferior.c
> new file mode 100644
> index 00000000000..d934d74fbb7
> --- /dev/null
> +++ b/gdb/testsuite/gdb.tui/multi-exit-remove-inferior.c
> @@ -0,0 +1,21 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> + Copyright 2020 Free Software Foundation, Inc.
> +
> + This program is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3 of the License, or
> + (at your option) any later version.
> +
> + This program is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with this program. If not, see <http://www.gnu.org/licenses/>. */
> +
> +int main() {
> + int a = 42;
> + return 0; /* break-here */
> +}
> diff --git a/gdb/testsuite/gdb.tui/multi-exit-remove-inferior.exp b/gdb/testsuite/gdb.tui/multi-exit-remove-inferior.exp
> new file mode 100644
> index 00000000000..10eb69500cc
> --- /dev/null
> +++ b/gdb/testsuite/gdb.tui/multi-exit-remove-inferior.exp
> @@ -0,0 +1,80 @@
> +# Copyright 2020 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> +#
> +# This is a regression test for PR gdb/26877.
> +
> +tuiterm_env
> +
> +standard_testfile
> +
> +if {[use_gdb_stub]} {
> + return 0
> +}
> +
> +if {[build_executable "failed to prepare" ${testfile} ${srcfile}] == -1} {
> + return -1
> +}
> +
> +# Make sure TUI is supported before continuing.
> +with_test_prefix "initial check" {
> + Term::clean_restart 24 80 $testfile
> + if {![Term::enter_tui]} {
> + unsupported "TUI not supported"
> + return
> + }
> +}
> +
> +Term::clean_restart 24 80 $testfile
> +
> +# Create a setting with two inferiors, where both are stopped
> +# at a breakpoint at the end of main. Then resume both.
> +set bp [gdb_get_line_number "break-here"]
> +gdb_breakpoint "$bp"
> +
> +with_test_prefix "inferior 1" {
> + gdb_run_cmd
> + gdb_test "" ".*reakpoint \[^\r\n\]+${srcfile}.*" "run until bp"
> +}
> +
> +with_test_prefix "inferior 2" {
> + gdb_test "add-inferior -exec [standard_output_file $testfile]" \
> + "Added inferior 2.*" "add inferior"
> + gdb_test "inferior 2" "Switching to inferior 2.*" "switch"
> + gdb_run_cmd
> + gdb_test "" ".*reakpoint \[^\r\n\]+${srcfile}.*" "run until bp"
> +}
> +
> +gdb_test_no_output "set schedule-multiple on"
> +gdb_continue_to_end
> +
> +# Find out which inferior is current. It is the inferior that exited.
> +set exited_inf [get_integer_valueof {$_inferior} 1 "current inf"]
> +
> +# Switch to the other inferior and remove the exited one.
> +# Bad GDB used to crash when this is done under TUI.
> +if {![Term::enter_tui]} {
> + unsupported "TUI not supported"
> + return
> +}
> +
> +if {$exited_inf == 1} {
> + Term::command "inferior 2"
> +} else {
> + Term::command "inferior 1"
> +}
> +
> +Term::command "remove-inferiors $exited_inf"
> +Term::command "info inferior $exited_inf"
> +Term::check_contents "inferior is removed" "No inferiors."
More information about the Gdb-patches
mailing list