This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [review v2] infrun: handle already-exited threads when attempting to stop
- From: Pedro Alves <palves at redhat dot com>
- To: "Aktemur, Tankut Baris" <tankut dot baris dot aktemur at intel dot com>
- Cc: gdb-patches at sourceware dot org, Luis Machado <luis dot machado at linaro dot org>, gnutoolchain-gerrit at osci dot io
- Date: Thu, 9 Jan 2020 16:58:26 +0000
- Subject: Re: [review v2] infrun: handle already-exited threads when attempting to stop
- References: <gerrit.1571405222000.I7cec98f40283773b79255d998511da434e9cd408@gnutoolchain-gerrit.osci.io> <20191209150906.A35B620AF6@gnutoolchain-gerrit.osci.io>
Hi,
(I tried to reply via gerrit, but I couldn't find how to hit "Quote"
to quote/reply to this message of yours -- seems like via the web ui you
can only reply to the last comment in a thread, which in this case is a
ping, not the message I intended to quote...)
Sorry for the delay, and thanks for the ping. I've been thinking on and off
about this. Below you'll find my current thoughts.
On 12/9/19 3:09 PM, Tankut Baris Aktemur (Code Review) wrote:
> I'd like to ask for your opinion on making the second exit event
> pending. One problem is, because the event has not been reported to
> the user yet, the user still thinks that the inferior is alive. So,
> after getting the prompt because of the first exit event, they may be
> tempted to do "info threads" or switch to the not-yet-reported-
> inferior and inspect its state. This triggers a query (e.g. of
> registers) on the process that is already gone. I tried the following
> scenario with the current master branch (the patch that I proposed was
> not applied):
>
> ~~~
> $ gdb ./a.out
> (gdb) maint set target-non-stop off
> (gdb) start
> ...
> (gdb) add-inferior -exec ./a.out
> [New inferior 2]
> Added inferior 2
> ...
> (gdb) inferior 2
> [Switching to inferior 2 [<null>] (/tmp/a.out)]
> (gdb) start
> ...
> (gdb) set schedule-multiple on
> (gdb) c
> Continuing.
> [Inferior 2 (process 16331) exited normally]
> (gdb) i inferiors
> Num Description Executable
> 1 process 16137 /tmp/a.out
> * 2 <null> /tmp/a.out
> (gdb) inferior 1
> [Switching to inferior 1 [process 16137] (/tmp/a.out)]
> [Switching to thread 1.1 (process 16137)]
> Couldn't get registers: No such process.
> (gdb) i threads
> Id Target Id Frame
> Couldn't get registers: No such process.
> (gdb) c
> Continuing.
> Couldn't get registers: No such process.
> ~~~
>
> If I save the exit event in my patch as a pending event (and omit
> 'maint set target-non-stop off'), I get essentially the same problem.
> What is the expected GDB behavior here? Would it be alright to
> actually print both exit events, followed by the gdb-prompt, where the
> user can now query $_exitcode or $_exitsignal by switching between
> inferiors, assuming those special variables are set correctly per
> inferior?
I'm really not sure about that.
As you've seen, this happens in true all-stop too, which can't report
multiple events at the same time, so I think from that angle alone,
GDB should cope better with it.
Plus, this can happen even when one inferior stops for some other
event while, at the same time, another inferior exits.
Say, inferior 1 hits a breakpoint, and while stopping everything,
inferior 2 exits. And GDB happens to report the breakpoint hit
first. And now the user does "info threads" and sees the "No such process"
errors.
You might then think that we should prioritize
inferior exits over breakpoint hits. But then, what if inferior 1
stops for a breakpoint, gdb manages to stop all threads without
inferior 2 exiting, and then a SIGKILL is sent to the
supposedly-stopped inferior, from outside GDB?
Or to make it even simpler, that SIGKILL use case can even
happen in single-inferior debugging.
Or, in "set non-stop on" mode, the inferior is running and you do
"info threads" in the window between the process dying and GDB getting
the SIGCHLD and collecting the ptrace event.
So I think that the state where the inferior has died but its exit
event has not yet been reported to the user is just something that
GDB needs to cope with well.
I.e., report the failures to read registers, in "info threads",
"print" etc., and importantly -- _be sure to not get into a state
where the user is stuck_.
The "not get stuck" part is where I think we should improve things.
Your example already shows where we need improvement, in the
last "c". A simplified version, using an external SIGKILL, is:
$ gdb --args /usr/bin/tail -f /dev/null
GNU gdb (GDB) 10.0.50.20200106-git
...
Program received signal SIGINT, Interrupt.
0x00007ffff7b08881 in __GI___libc_read (fd=4, buf=0x555555766410, nbytes=26) at ../sysdeps/unix/sysv/linux/read.c:26
26 return SYSCALL_CANCEL (read, fd, buf, nbytes);
(gdb) info inferiors
Num Description Executable
* 1 process 9425 /usr/bin/tail
(gdb) shell kill -9 9425
(gdb) flushregs
Register cache flushed.
Couldn't get registers: No such process.
(gdb) info threads
Id Target Id Frame
Couldn't get registers: No such process.
(gdb) c
Continuing.
Couldn't get registers: No such process.
(gdb)
This error comes from the regcache_read_pc call from
within infrun.c:proceed:
...
#4 0x000000000097e04e in perror_with_name (string=0xb213e9 "Couldn't get registers") at /home/pedro/gdb/binutils-gdb/src/gdb/utils.c:612
#5 0x0000000000452156 in amd64_linux_nat_target::fetch_registers (this=0x11a64b0 <the_amd64_linux_nat_target>, regcache=0x1eac160, regnum=16) at /home/pedro/gdb/binutils-gdb/src/gdb/amd64-linux-nat.c:225
#6 0x0000000000901cc4 in target_fetch_registers (regcache=0x1eac160, regno=16) at /home/pedro/gdb/binutils-gdb/src/gdb/target.c:3427
#7 0x0000000000829f94 in regcache::raw_update (this=0x1eac160, regnum=16) at /home/pedro/gdb/binutils-gdb/src/gdb/regcache.c:471
#8 0x000000000082a039 in readable_regcache::raw_read (this=0x1eac160, regnum=16, buf=0x7fffffffcc00 "\302%") at /home/pedro/gdb/binutils-gdb/src/gdb/regcache.c:485
#9 0x000000000082a371 in readable_regcache::cooked_read (this=0x1eac160, regnum=16, buf=0x7fffffffcc00 "\302%") at /home/pedro/gdb/binutils-gdb/src/gdb/regcache.c:577
#10 0x000000000082eefd in readable_regcache::cooked_read<unsigned long, void> (this=0x1eac160, regnum=16, val=0x7fffffffcca8) at /home/pedro/gdb/binutils-gdb/src/gdb/regcache.c:664
#11 0x000000000082a7d8 in regcache_cooked_read_unsigned (regcache=0x1eac160, regnum=16, val=0x7fffffffcca8) at /home/pedro/gdb/binutils-gdb/src/gdb/regcache.c:678
#12 0x000000000082bcf5 in regcache_read_pc (regcache=0x1eac160) at /home/pedro/gdb/binutils-gdb/src/gdb/regcache.c:1182
#13 0x00000000006e6f62 in proceed (addr=0xffffffffffffffff, siggnal=GDB_SIGNAL_DEFAULT) at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:2855
#14 0x00000000006d7af6 in continue_1 (all_threads=0) at /home/pedro/gdb/binutils-gdb/src/gdb/infcmd.c:804
...
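To illustrate the underlying failure outside of GDB, here is a minimal
standalone sketch (Linux only, and not GDB code). It assumes a tracer that
stops a child with ptrace, sends it SIGKILL, and then tries to fetch
registers before collecting the exit with waitpid. The function names
try_fetch_registers and killed_tracee_demo are made up for this example.
On Linux the second fetch is expected to fail with ESRCH, the same
"No such process" seen in the session above.

```c
#define _GNU_SOURCE
#include <elf.h>        /* NT_PRSTATUS */
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/user.h>   /* struct user_regs_struct */
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: try to fetch the general-purpose registers of
   PID.  Returns 0 on success, or the errno value on failure.  */
static int
try_fetch_registers (pid_t pid)
{
  struct user_regs_struct regs;
  struct iovec iov = { &regs, sizeof regs };

  if (ptrace (PTRACE_GETREGSET, pid,
              (void *) (uintptr_t) NT_PRSTATUS, &iov) == -1)
    return errno;
  return 0;
}

/* Hypothetical demo: fork a tracee, wait for it to stop, SIGKILL it,
   and try to read its registers before and after the kill.  Returns 0
   if the scenario could be set up, -1 otherwise.  */
static int
killed_tracee_demo (int *err_before, int *err_after)
{
  int status;
  pid_t child = fork ();

  if (child == -1)
    return -1;

  if (child == 0)
    {
      /* Child: ask to be traced, then stop so the parent can inspect us.  */
      if (ptrace (PTRACE_TRACEME, 0, NULL, NULL) == -1)
        _exit (1);
      raise (SIGSTOP);
      _exit (0);
    }

  if (waitpid (child, &status, 0) != child || !WIFSTOPPED (status))
    return -1;

  /* The tracee is stopped and alive: reading registers works.  */
  *err_before = try_fetch_registers (child);

  /* Kill it "from outside", and deliberately do not wait for the exit
     status yet.  This mirrors the window in which the debugger still
     believes the thread is stopped.  */
  kill (child, SIGKILL);
  *err_after = try_fetch_registers (child);

  /* Finally collect the exit status so no zombie is left behind.  */
  waitpid (child, &status, 0);
  return 0;
}
```

Calling killed_tracee_demo (&before, &after) should leave before at 0 and
after at ESRCH. A debugger has to treat that second result as "the thread
is gone" rather than as a reason to abort the command.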
I think PTRACE_EVENT_EXIT could help with at least some of
the use cases on Linux, but still, GDB should cope better on systems
that do not have that feature.
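For reference, a rough sketch of what PTRACE_EVENT_EXIT gives a tracer
(again standalone Linux code, not GDB's implementation). It assumes the
tracer sets PTRACE_O_TRACEEXIT, so that a tracee exiting normally stops
once more before it finishes exiting, while its registers can still be
read. The function names and the exit code 42 are made up for this
example, and only the normal-exit path is covered.

```c
#define _GNU_SOURCE
#include <elf.h>        /* NT_PRSTATUS */
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/user.h>   /* struct user_regs_struct */
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical cleanup helper: kill and reap CHILD, report failure.  */
static int
reap_and_fail (pid_t child)
{
  int status;

  kill (child, SIGKILL);
  waitpid (child, &status, 0);
  return -1;
}

/* Hypothetical demo: run a tracee that calls _exit (42) and observe the
   PTRACE_EVENT_EXIT stop.  Sets *REGS_READABLE to 1 if registers could
   be fetched at that stop, and *EXIT_CODE to the exit code reported by
   PTRACE_GETEVENTMSG.  Returns 0 on success, -1 otherwise.  */
static int
observe_exit_event (int *regs_readable, int *exit_code)
{
  int status;
  unsigned long msg = 0;
  struct user_regs_struct regs;
  struct iovec iov = { &regs, sizeof regs };
  pid_t child = fork ();

  if (child == -1)
    return -1;

  if (child == 0)
    {
      if (ptrace (PTRACE_TRACEME, 0, NULL, NULL) == -1)
        _exit (1);
      raise (SIGSTOP);          /* Let the parent set options first.  */
      _exit (42);
    }

  if (waitpid (child, &status, 0) != child)
    return -1;
  if (!WIFSTOPPED (status))
    return -1;                  /* Child already exited; nothing to reap.  */

  /* Ask for a stop when the tracee is about to exit, then resume it
     without delivering the initial SIGSTOP.  */
  if (ptrace (PTRACE_SETOPTIONS, child, NULL,
              (void *) (uintptr_t) PTRACE_O_TRACEEXIT) == -1
      || ptrace (PTRACE_CONT, child, NULL, NULL) == -1)
    return reap_and_fail (child);

  if (waitpid (child, &status, 0) != child)
    return -1;
  if (!WIFSTOPPED (status))
    return -1;                  /* Exited without the event stop.  */
  if ((status >> 8) != (SIGTRAP | (PTRACE_EVENT_EXIT << 8)))
    return reap_and_fail (child);

  /* The tracee is exiting but not yet gone: registers are readable and
     the pending exit status is available as the event message.  */
  *regs_readable = ptrace (PTRACE_GETREGSET, child,
                           (void *) (uintptr_t) NT_PRSTATUS, &iov) != -1;
  if (ptrace (PTRACE_GETEVENTMSG, child, NULL, &msg) == -1)
    return reap_and_fail (child);
  *exit_code = WEXITSTATUS ((int) msg);

  /* Let the tracee finish exiting and collect the real exit status.  */
  ptrace (PTRACE_CONT, child, NULL, NULL);
  waitpid (child, &status, 0);
  return 0;
}
```

With this, the tracer learns about the exit while the process can still
be inspected, which is the kind of window that could help the cases above.
It does not remove the need to handle a register read failing at some
other point.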
Thanks,
Pedro Alves