This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
GDB hangs on kill or quit (after following a fork child, not detaching from the parent)
- From: Pedro Alves <pedro at codesourcery dot com>
- To: gdb-patches at sourceware dot org
- Cc: Michael Snyder <msnyder at vmware dot com>
- Date: Fri, 12 Dec 2008 21:13:56 +0000
- Subject: GDB hangs on kill or quit (after following a fork child, not detaching from the parent)
[ Michael, you're the forks man. CCing you in case see an issue with
the attached patch? ]
GDB hangs if you do:
> ./gdb -ex "set follow-fork-mode child" -ex "set detach-on-fork off" ./testsuite/gdb.base/foll-fork
GNU gdb (GDB) 6.8.50.20081212-cvs
(gdb) start
Temporary breakpoint 1 at 0x40054f: file ../../../src/gdb/testsuite/gdb.base/foll-fork.c, line 21.
Starting program: /home/pedro/gdb/baseline/build/gdb/testsuite/gdb.base/foll-fork
Temporary breakpoint 1, main () at ../../../src/gdb/testsuite/gdb.base/foll-fork.c:21
21 int v = 5;
(gdb) n
During symbol reading, incomplete CFI data; unspecified registers (e.g., rax) at 0x40054b.
23 pid = fork ();
(gdb)
[Switching to process 14900]
24 if (pid == 0)
(gdb) kill
Kill the program being debugged? (y or n) y
<HANGS HERE>
The same will happen if you issues 'quit' instead of 'kill', as quitting
tries to 'kill' the inferior.
When there are forks involved, linux_nat_kill calls into linux_fork_killall
to do the killing. But, when following a fork child, and not
detaching from the parent, we defer adding the child fork to the
list of forks (which is confusing IMHO, see below), so linux_fork_killall
misses killing it. Then, we hang waiting for it to die, but it
won't happen.
"kill"
-> linux_nat_kill
-> linux_fork_killall.
This kills all forks listed. At this point, since only the parent was listed,
the child was left alive.
After killing, we call target_mourn_inferior, which goes:
-> linux_nat_kill
-> linux_nat_mourn_inferior
->inf_ptrace_mourn_inferior
Now, inferior_ptid was pointing at the child (we followed it),
and, we didn't kill it so, we hang here:
184 /* Clean up a rotting corpse of an inferior after it died. */
185
186 static void
187 inf_ptrace_mourn_inferior (struct target_ops *ops)
188 {
189 int status;
190
191 /* Wait just one more time to collect the inferior's exit status.
192 Do not check whether this succeeds though, since we may be
193 dealing with a process that we attached to. Such a process will
194 only report its exit status to its original parent. */
(top-gdb)
195 waitpid (ptid_get_pid (inferior_ptid), &status, 0);
^^^^^^^
#0 0x00007fb6a45364a5 in waitpid () from /lib/libc.so.6
#1 0x00000000005df07b in inf_ptrace_mourn_inferior (ops=0xaf4ce0) at ../../src/gdb/inf-ptrace.c:195
#2 0x0000000000477c55 in linux_nat_mourn_inferior (ops=0xaf4ce0) at ../../src/gdb/linux-nat.c:3194
#3 0x0000000000543d82 in target_mourn_inferior () at ../../src/gdb/target.c:1899
#4 0x0000000000477c09 in linux_nat_kill () at ../../src/gdb/linux-nat.c:3180
#5 0x00000000004622b6 in kill_command (arg=0x0, from_tty=1) at ../../src/gdb/inflow.c:602
...
=================
The attached patch fixes it by also adding the child to the fork
list in this case.
Also, since we're now adding the child, one bit of special casing
from linux-fork.c can be removed, as in the patch. The 'error' call
in linux-fork.c that I'm removing is really dead code, as all callers
do the same check.
This is the current output of info forks, when following a parent,
and when following a child:
./gdb -ex "set follow-fork-mode parent" -ex "set detach-on-fork off" ./testsuite/gdb.base/foll-fork
...
(gdb) info forks
1 process 15396 at 0x7ffff789ac4b, <fork>
* 0 process 15393 (main process) at 0x40055e, file foll-fork.c, line 24
./gdb -ex "set follow-fork-mode child" -ex "set detach-on-fork off" ./testsuite/gdb.base/foll-fork
...
(gdb) info forks
1 process 15319 at 0x7ffff789ac4b, <fork>
Notice how the 'following the child' case is a bit convoluted:
- Only the parent is listed, but it isn't that obvious that fork 1
is the parent.
- There's no indication of which fork is current.
If you do:
(gdb) fork 1
Switching to process 16458
#0 0x00007ffff789ac4b in fork () from /lib/libc.so.6
(gdb) info forks
2 process 16461 at 0x40055e, file foll-fork.c, line 24
* 1 process 16458 at 0x7ffff789ac4b, <fork>
(gdb)
Voila! - the child fork was added (fork 2). I find this confusing.
=================
This is the output after the patch is applied:
> ./gdb -ex "set follow-fork-mode child" -ex "set detach-on-fork off" ./testsuite/gdb.base/foll-fork
...
(gdb) info forks
* 2 process 15934 at 0x40055e, file foll-fork.c, line 24
1 process 15927 at 0x7ffff789ac4b, <fork>
>./gdb -ex "set follow-fork-mode parent" -ex "set detach-on-fork off" ./testsuite/gdb.base/foll-fork
...
(gdb) info forks
1 process 15979 at 0x7ffff789ac4b, <fork>
* 0 process 15974 (main process) at 0x40055e, file foll-fork.c, line 24
A bit more uniform, and the hang bug goes away, of course. That special
casing to have a fork number 0 could go away too, IMVHO.
Tested on x86_64-linux-gnu, no regressions.
--
Pedro Alves
2008-12-12 Pedro Alves <pedro@codesourcery.com>
* linux-nat.c (linux_child_follow_fork): If following the child,
and not detaching the parent, also add the child fork to the fork
list.
* linux-fork.c (linux_fork_context): Remove dead error call.
Assert that the incoming newfp argument is not null. Do not add a
new fork for inferior_ptid. Assert that there is one already.
---
gdb/linux-fork.c | 9 ++++-----
gdb/linux-nat.c | 6 ++++++
2 files changed, 10 insertions(+), 5 deletions(-)
Index: src/gdb/linux-nat.c
===================================================================
--- src.orig/gdb/linux-nat.c 2008-12-12 20:06:17.000000000 +0000
+++ src/gdb/linux-nat.c 2008-12-12 20:21:00.000000000 +0000
@@ -857,6 +857,12 @@ linux_child_follow_fork (struct target_o
if (!fp)
fp = add_fork (parent_pid);
fork_save_infrun_state (fp, 0);
+
+ /* Also add an entry for the child fork. */
+ fp = find_fork_pid (child_pid);
+ if (!fp)
+ fp = add_fork (child_pid);
+ fork_save_infrun_state (fp, 0);
}
else
target_detach (NULL, 0);
Index: src/gdb/linux-fork.c
===================================================================
--- src.orig/gdb/linux-fork.c 2008-12-12 20:09:43.000000000 +0000
+++ src/gdb/linux-fork.c 2008-12-12 20:09:21.000000000 +0000
@@ -600,15 +600,14 @@ static void
linux_fork_context (struct fork_info *newfp, int from_tty)
{
/* Now we attempt to switch processes. */
- struct fork_info *oldfp = find_fork_ptid (inferior_ptid);
+ struct fork_info *oldfp;
ptid_t ptid;
int id, i;
- if (!newfp)
- error (_("No such fork/process"));
+ gdb_assert (newfp != NULL);
- if (!oldfp)
- oldfp = add_fork (ptid_get_pid (inferior_ptid));
+ oldfp = find_fork_ptid (inferior_ptid);
+ gdb_assert (oldfp != NULL);
fork_save_infrun_state (oldfp, 1);
remove_breakpoints ();