[PATCH, remote] Handle 'k' packet errors gracefully

Pedro Alves palves@redhat.com
Mon Dec 2 10:57:00 GMT 2013


On 11/30/2013 06:22 PM, Maciej W. Rozycki wrote:
> On Mon, 22 Apr 2013, Pedro Alves wrote:
> 
>>> This is not a real problem with gdbserver, but other types of remote 
>>> targets (other stubs, QEMU etc) may cut the connection abruptly since 
>>> they are not required to reply to a 'k' (Kill) packet sent from GDB.
>>>
>>> The following patch addresses any issues arising from such scenario, 
>>> which leads to a GDB internal error due to an attempt to pop the 
>>> target more than once. With the patch, this failure is handled 
>>> gracefully.
>>
>> Hard to say without at least seeing the backtrace, but this may no longer
>> be applicable due to changes in this area since (some quite recent).
> 
>  It still is.  Here's the backtrace requested.  Unfortunately there seems 
> to be a race here between GDB and QEMU and the problem only triggers in 
> the testsuite, I've been unable to trigger it manually.  The backtrace has 
> therefore been obtained from a core dump (I've edited out full paths for 
> brevity).
> 
> #0  0x55573430 in __kernel_vsyscall ()
> #1  0x557a2951 in raise () from /lib32/libc.so.6
> #2  0x557a5d82 in abort () from /lib32/libc.so.6
> #3  0x0826e2e4 in dump_core ()
>     at .../gdb/utils.c:635
> #4  0x0826e5b6 in internal_vproblem (problem=0x85200c0, 
>     file=0x8416be8 ".../gdb/target.c", line=2861, 
>     fmt=0x84174ac "could not find a target to follow mourn inferior", 
>     ap=0xffa4796c "\f")
>     at .../gdb/utils.c:804
> #5  0x0826e5fb in internal_verror (
>     file=0x8416be8 ".../gdb/target.c", line=2861, 
>     fmt=0x84174ac "could not find a target to follow mourn inferior", 
>     ap=0xffa4796c "\f")
>     at .../gdb/utils.c:820
> #6  0x0826e633 in internal_error (
>     file=0x8416be8 ".../gdb/target.c", line=2861, 
>     string=0x84174ac "could not find a target to follow mourn inferior")
>     at .../gdb/utils.c:830
> #7  0x081b4ad0 in target_mourn_inferior ()
>     at .../gdb/target.c:2861
> #8  0x08082283 in remote_kill (ops=0x85245e0)
>     at .../gdb/remote.c:7840
> #9  0x081b06d1 in target_kill ()
>     at .../gdb/target.c:486
> #10 0x081b42f6 in dispose_inferior (inf=0xa501c60, args=0x0)
>     at .../gdb/target.c:2570
> #11 0x08290cfc in iterate_over_inferiors (
>     callback=0x81b42af <dispose_inferior>, data=0x0)
>     at .../gdb/inferior.c:396
> #12 0x081b435a in target_preopen (from_tty=1)
>     at .../gdb/target.c:2591
> #13 0x0807c2c6 in remote_open_1 (name=0xa5538b6 "localhost:1237", from_tty=1, 
>     target=0x85245e0, extended_p=0)
>     at .../gdb/remote.c:4292
> #14 0x0807b7a8 in remote_open (name=0xa5538b6 "localhost:1237", from_tty=1)
>     at .../gdb/remote.c:3655
> #15 0x080a23d4 in do_cfunc (c=0xa464f30, args=0xa5538b6 "localhost:1237", 
>     from_tty=1)
>     at .../gdb/cli/cli-decode.c:107
> #16 0x080a4c3b in cmd_func (cmd=0xa464f30, args=0xa5538b6 "localhost:1237", 
>     from_tty=1)
>     at .../gdb/cli/cli-decode.c:1882
> #17 0x0826bebf in execute_command (p=0xa5538c3 "7", from_tty=1)
>     at .../gdb/top.c:467
> #18 0x08193f2d in command_handler (command=0xa5538a8 "")
>     at .../gdb/event-top.c:435
> #19 0x08194463 in command_line_handler (
>     rl=0xa778198 "target remote localhost:1237")
>     at .../gdb/event-top.c:633
> #20 0x082ba92b in rl_callback_read_char ()
>     at .../readline/callback.c:220
> #21 0x08193adf in rl_callback_read_char_wrapper (client_data=0x0)
>     at .../gdb/event-top.c:164
> #22 0x08193e57 in stdin_event_handler (error=0, client_data=0x0)
>     at .../gdb/event-top.c:375
> #23 0x08192f29 in handle_file_event (data=...)
>     at .../gdb/event-loop.c:768
> #24 0x0819266a in process_event ()
>     at .../gdb/event-loop.c:342
> #25 0x08192708 in gdb_do_one_event ()
>     at .../gdb/event-loop.c:394
> #26 0x08192781 in start_event_loop ()
>     at .../gdb/event-loop.c:431
> #27 0x08193b08 in cli_command_loop (data=0x0)
>     at .../gdb/event-top.c:179
> #28 0x0818bc26 in current_interp_command_loop ()
>     at .../gdb/interps.c:327
> #29 0x0818c4e5 in captured_command_loop (data=0x0)
>     at .../gdb/main.c:267
> #30 0x0818a37f in catch_errors (func=0x818c4d0 <captured_command_loop>, 
>     func_args=0x0, errstring=0x8402108 "", mask=RETURN_MASK_ALL)
>     at .../gdb/exceptions.c:524
> #31 0x0818d736 in captured_main (data=0xffa47f10)
>     at .../gdb/main.c:1067
> #32 0x0818a37f in catch_errors (func=0x818c723 <captured_main>, 
>     func_args=0xffa47f10, errstring=0x8402108 "", mask=RETURN_MASK_ALL)
>     at .../gdb/exceptions.c:524
> #33 0x0818d76c in gdb_main (args=0xffa47f10)
>     at .../gdb/main.c:1076
> #34 0x0804dd1b in main (argc=5, argv=0xffa47fd4)
>     at .../gdb/gdb.c:34
> 
> The corresponding gdb.log excerpt:
> 
> (gdb) PASS: gdb.base/bitfields.exp: bitfield uniqueness (u9)
> cont
> Continuing.
> 
> Breakpoint 1, break1 () at .../gdb/testsuite/gdb.base/bitfields.c:44
> 44	}
> (gdb) PASS: gdb.base/bitfields.exp: continuing to break1 #9
> print flags
> $10 = {uc = 0 '\000', s1 = 0, u1 = 0, s2 = 0, u2 = 0, s3 = 0, u3 = 0, s9 = 0, u9 = 0, sc = 1 '\001'}
> (gdb) PASS: gdb.base/bitfields.exp: bitfield uniqueness (sc)
> delete breakpoints
> Delete all breakpoints? (y or n) y
> (gdb) info breakpoints
> No breakpoints or watchpoints.
> (gdb) delete breakpoints
> (gdb) info breakpoints
> No breakpoints or watchpoints.
> (gdb) break break2
> Breakpoint 2 at 0x85f8: file .../gdb/testsuite/gdb.base/bitfields.c, line 48.
> (gdb) entering gdb_reload
> target remote localhost:1235
> A program is being debugged already.  Kill it? (y or n) y
> Remote connection closed
> .../gdb/target.c:2861: internal-error: could not find a target to follow mourn inferior
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n) ^Ccontinue
> Please answer y or n.
> .../gdb/target.c:2861: internal-error: could not find a target to follow mourn inferior
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n) Resyncing due to internal error.
> n
> .../gdb/target.c:2861: internal-error: could not find a target to follow mourn inferior
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Create a core file of GDB? (y or n) y
> Command aborted.
> (gdb) print/x flags
> $11 = {uc = 0x0, s1 = 0x0, u1 = 0x0, s2 = 0x0, u2 = 0x0, s3 = 0x0, u3 = 0x0, s9 = 0x0, u9 = 0x0, sc = 0x0}
> (gdb) FAIL: gdb.base/bitfields.exp: bitfield containment #1
> cont
> The program is not being run.
> (gdb) FAIL: gdb.base/bitfields.exp: continuing to break2 (the program is no longer running)
> print/x flags
> $12 = {uc = 0x0, s1 = 0x0, u1 = 0x0, s2 = 0x0, u2 = 0x0, s3 = 0x0, u3 = 0x0, s9 = 0x0, u9 = 0x0, sc = 0x0}
> (gdb) FAIL: gdb.base/bitfields.exp: bitfield containment #2
> delete breakpoints
> Delete all breakpoints? (y or n) y
> (gdb) info breakpoints
> No breakpoints or watchpoints.
> (gdb) delete breakpoints
> (gdb) info breakpoints
> No breakpoints or watchpoints.
> (gdb) break break3
> Breakpoint 3 at 0x8604: file .../gdb/testsuite/gdb.base/bitfields.c, line 52.
> (gdb) entering gdb_reload
> target remote localhost:1236
> Remote debugging using localhost:1236
> Reading symbols from .../lib/ld-linux.so.3...done.
> Loaded symbols for .../lib/ld-linux.so.3
> 0x41001b80 in _start () from .../lib/ld-linux.so.3
> (gdb) continue
> Continuing.
> 
> Breakpoint 3, break3 () at .../gdb/testsuite/gdb.base/bitfields.c:52
> 52	}
> (gdb) print flags
> $13 = {uc = 0 '\000', s1 = 0, u1 = 1, s2 = 0, u2 = 3, s3 = 0, u3 = 7, s9 = 0, u9 = 511, sc = 0 '\000'}
> (gdb) PASS: gdb.base/bitfields.exp: unsigned bitfield ranges
> 
> Let me know if you need any further details.

No, that's excellent.  I took the liberty of using all that
for the git commit log, and pushed the patch in, as below.

Thanks.
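
For anyone reading along without the sources at hand, the double pop
can be modelled roughly as below.  This is only a toy sketch, not GDB
code: every toy_* name is made up for illustration, and the "target
stack" is reduced to a counter.  It assumes, as described above, that
a connection closed by the remote end already takes the remote target
off the stack before remote_kill gets to mourn the inferior.

```c
/* Toy model only.  None of these names exist in GDB.  */

static int toy_targets_pushed;   /* depth of the toy target stack */
static int toy_internal_errors;  /* attempts to pop an empty stack */

static void
toy_push_target (void)
{
  toy_targets_pushed++;
}

/* Pop the toy remote target.  Popping an empty stack stands in for
   "could not find a target to follow mourn inferior".  */
static void
toy_mourn (void)
{
  if (toy_targets_pushed == 0)
    {
      toy_internal_errors++;
      return;
    }
  toy_targets_pushed--;
}

/* Send the toy kill packet.  If the remote end hangs up, tearing
   down the connection already pops the target (the model assumes a
   target is pushed at this point).  Returns 0 on success, -1 if the
   connection was closed.  */
static int
toy_send_kill (int remote_hangs_up)
{
  if (remote_hangs_up)
    {
      toy_targets_pushed--;
      return -1;
    }
  return 0;
}

/* Old flow: swallow any failure, then always mourn.  */
static void
toy_kill_old (int remote_hangs_up)
{
  (void) toy_send_kill (remote_hangs_up);
  toy_mourn ();
}

/* New flow: a closed connection means the target is already gone,
   so there is nothing left to mourn.  */
static void
toy_kill_new (int remote_hangs_up)
{
  if (toy_send_kill (remote_hangs_up) < 0)
    return;
  toy_mourn ();
}
```

With one target pushed and a remote end that hangs up, toy_kill_old
bumps toy_internal_errors, while toy_kill_new leaves it at zero.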

-----
Handle 'k' packet TARGET_CLOSE_ERROR gracefully.

Remote servers may cut the connection abruptly since they are not
required to reply to a 'k' (Kill) packet sent from GDB.

This patch addresses the issue arising from such a scenario, which
leads to a GDB internal error due to an attempt to pop the target more
than once.  With the patch, this failure is handled gracefully.

Here's the GDB backtrace Maciej got running the testsuite against
QEMU.  Full paths edited out for brevity.

#0  0x55573430 in __kernel_vsyscall ()
#1  0x557a2951 in raise () from /lib32/libc.so.6
#2  0x557a5d82 in abort () from /lib32/libc.so.6
#3  0x0826e2e4 in dump_core ()
    at .../gdb/utils.c:635
#4  0x0826e5b6 in internal_vproblem (problem=0x85200c0,
    file=0x8416be8 ".../gdb/target.c", line=2861,
    fmt=0x84174ac "could not find a target to follow mourn inferior",
    ap=0xffa4796c "\f")
    at .../gdb/utils.c:804
#5  0x0826e5fb in internal_verror (
    file=0x8416be8 ".../gdb/target.c", line=2861,
    fmt=0x84174ac "could not find a target to follow mourn inferior",
    ap=0xffa4796c "\f")
    at .../gdb/utils.c:820
#6  0x0826e633 in internal_error (
    file=0x8416be8 ".../gdb/target.c", line=2861,
    string=0x84174ac "could not find a target to follow mourn inferior")
    at .../gdb/utils.c:830
#7  0x081b4ad0 in target_mourn_inferior ()
    at .../gdb/target.c:2861
#8  0x08082283 in remote_kill (ops=0x85245e0)
    at .../gdb/remote.c:7840
#9  0x081b06d1 in target_kill ()
    at .../gdb/target.c:486
#10 0x081b42f6 in dispose_inferior (inf=0xa501c60, args=0x0)
    at .../gdb/target.c:2570
#11 0x08290cfc in iterate_over_inferiors (
    callback=0x81b42af <dispose_inferior>, data=0x0)
    at .../gdb/inferior.c:396
#12 0x081b435a in target_preopen (from_tty=1)
    at .../gdb/target.c:2591
#13 0x0807c2c6 in remote_open_1 (name=0xa5538b6 "localhost:1237", from_tty=1,
    target=0x85245e0, extended_p=0)
    at .../gdb/remote.c:4292
#14 0x0807b7a8 in remote_open (name=0xa5538b6 "localhost:1237", from_tty=1)
    at .../gdb/remote.c:3655
#15 0x080a23d4 in do_cfunc (c=0xa464f30, args=0xa5538b6 "localhost:1237",
    from_tty=1)
    at .../gdb/cli/cli-decode.c:107
#16 0x080a4c3b in cmd_func (cmd=0xa464f30, args=0xa5538b6 "localhost:1237",
    from_tty=1)
    at .../gdb/cli/cli-decode.c:1882
#17 0x0826bebf in execute_command (p=0xa5538c3 "7", from_tty=1)
    at .../gdb/top.c:467
#18 0x08193f2d in command_handler (command=0xa5538a8 "")
    at .../gdb/event-top.c:435
#19 0x08194463 in command_line_handler (
    rl=0xa778198 "target remote localhost:1237")
    at .../gdb/event-top.c:633
#20 0x082ba92b in rl_callback_read_char ()
    at .../readline/callback.c:220
#21 0x08193adf in rl_callback_read_char_wrapper (client_data=0x0)
    at .../gdb/event-top.c:164
#22 0x08193e57 in stdin_event_handler (error=0, client_data=0x0)
    at .../gdb/event-top.c:375
#23 0x08192f29 in handle_file_event (data=...)
    at .../gdb/event-loop.c:768
#24 0x0819266a in process_event ()
    at .../gdb/event-loop.c:342
#25 0x08192708 in gdb_do_one_event ()
    at .../gdb/event-loop.c:394
#26 0x08192781 in start_event_loop ()
    at .../gdb/event-loop.c:431
#27 0x08193b08 in cli_command_loop (data=0x0)
    at .../gdb/event-top.c:179
#28 0x0818bc26 in current_interp_command_loop ()
    at .../gdb/interps.c:327
#29 0x0818c4e5 in captured_command_loop (data=0x0)
    at .../gdb/main.c:267
#30 0x0818a37f in catch_errors (func=0x818c4d0 <captured_command_loop>,
    func_args=0x0, errstring=0x8402108 "", mask=RETURN_MASK_ALL)
    at .../gdb/exceptions.c:524
#31 0x0818d736 in captured_main (data=0xffa47f10)
    at .../gdb/main.c:1067
#32 0x0818a37f in catch_errors (func=0x818c723 <captured_main>,
    func_args=0xffa47f10, errstring=0x8402108 "", mask=RETURN_MASK_ALL)
    at .../gdb/exceptions.c:524
#33 0x0818d76c in gdb_main (args=0xffa47f10)
    at .../gdb/main.c:1076
#34 0x0804dd1b in main (argc=5, argv=0xffa47fd4)
    at .../gdb/gdb.c:34

The corresponding gdb.log excerpt:

(gdb) PASS: gdb.base/bitfields.exp: bitfield uniqueness (u9)
cont
Continuing.

Breakpoint 1, break1 () at .../gdb/testsuite/gdb.base/bitfields.c:44
44	}
(gdb) PASS: gdb.base/bitfields.exp: continuing to break1 #9
print flags
$10 = {uc = 0 '\000', s1 = 0, u1 = 0, s2 = 0, u2 = 0, s3 = 0, u3 = 0, s9 = 0, u9 = 0, sc = 1 '\001'}
(gdb) PASS: gdb.base/bitfields.exp: bitfield uniqueness (sc)
delete breakpoints
Delete all breakpoints? (y or n) y
(gdb) info breakpoints
No breakpoints or watchpoints.
(gdb) delete breakpoints
(gdb) info breakpoints
No breakpoints or watchpoints.
(gdb) break break2
Breakpoint 2 at 0x85f8: file .../gdb/testsuite/gdb.base/bitfields.c, line 48.
(gdb) entering gdb_reload
target remote localhost:1235
A program is being debugged already.  Kill it? (y or n) y
Remote connection closed
.../gdb/target.c:2861: internal-error: could not find a target to follow mourn inferior
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) ^Ccontinue
Please answer y or n.
.../gdb/target.c:2861: internal-error: could not find a target to follow mourn inferior
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) Resyncing due to internal error.
n
.../gdb/target.c:2861: internal-error: could not find a target to follow mourn inferior
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) y
Command aborted.
(gdb) print/x flags
$11 = {uc = 0x0, s1 = 0x0, u1 = 0x0, s2 = 0x0, u2 = 0x0, s3 = 0x0, u3 = 0x0, s9 = 0x0, u9 = 0x0, sc = 0x0}
(gdb) FAIL: gdb.base/bitfields.exp: bitfield containment #1
cont
The program is not being run.
(gdb) FAIL: gdb.base/bitfields.exp: continuing to break2 (the program is no longer running)
print/x flags
$12 = {uc = 0x0, s1 = 0x0, u1 = 0x0, s2 = 0x0, u2 = 0x0, s3 = 0x0, u3 = 0x0, s9 = 0x0, u9 = 0x0, sc = 0x0}
(gdb) FAIL: gdb.base/bitfields.exp: bitfield containment #2
delete breakpoints
Delete all breakpoints? (y or n) y
(gdb) info breakpoints
No breakpoints or watchpoints.
(gdb) delete breakpoints
(gdb) info breakpoints
No breakpoints or watchpoints.
(gdb) break break3
Breakpoint 3 at 0x8604: file .../gdb/testsuite/gdb.base/bitfields.c, line 52.
(gdb) entering gdb_reload
target remote localhost:1236
Remote debugging using localhost:1236
Reading symbols from .../lib/ld-linux.so.3...done.
Loaded symbols for .../lib/ld-linux.so.3
0x41001b80 in _start () from .../lib/ld-linux.so.3
(gdb) continue
Continuing.

Breakpoint 3, break3 () at .../gdb/testsuite/gdb.base/bitfields.c:52
52	}
(gdb) print flags
$13 = {uc = 0 '\000', s1 = 0, u1 = 1, s2 = 0, u2 = 3, s3 = 0, u3 = 7, s9 = 0, u9 = 511, sc = 0 '\000'}
(gdb) PASS: gdb.base/bitfields.exp: unsigned bitfield ranges

gdb/
2013-12-02  Pedro Alves  <pedro@codesourcery.com>
            Maciej W. Rozycki  <macro@codesourcery.com>

	* remote.c (putpkt_for_catch_errors): Remove function.
	(remote_kill): Handle TARGET_CLOSE_ERROR from the kill packet
	gracefully.
---
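Not part of the commit: a standalone sketch of the control flow the
remote.c change introduces, for readers unfamiliar with TRY_CATCH.
It swaps GDB's exception machinery for plain setjmp/longjmp, and all
toy_* and TOY_* names are hypothetical stand-ins, so treat it as an
illustration of the shape (swallow the close error, rethrow anything
else, mourn only on success) and not as the real implementation.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-ins; this is not GDB's exception machinery.  */
enum toy_error { TOY_NO_ERROR = 0, TOY_CLOSE_ERROR, TOY_OTHER_ERROR };

static jmp_buf *toy_handler;         /* innermost active handler */
static enum toy_error toy_pending;   /* error currently being thrown */
static int toy_mourned;              /* times the inferior was mourned */

static void
toy_throw (enum toy_error err)
{
  toy_pending = err;
  longjmp (*toy_handler, 1);
}

/* Pretend to send "k".  FAIL_WITH selects the simulated outcome.  */
static void
toy_putpkt_k (enum toy_error fail_with)
{
  if (fail_with != TOY_NO_ERROR)
    toy_throw (fail_with);
}

/* Same shape as the patched remote_kill.  */
static void
toy_remote_kill (enum toy_error fail_with)
{
  jmp_buf here;
  jmp_buf *outer = toy_handler;

  toy_handler = &here;
  if (setjmp (here) != 0)
    {
      /* An error was thrown while sending the packet.  */
      toy_handler = outer;
      if (toy_pending == TOY_CLOSE_ERROR)
        return;                 /* Target already gone; we're done.  */
      toy_throw (toy_pending);  /* Anything else goes to the caller.  */
    }
  toy_putpkt_k (fail_with);
  toy_handler = outer;

  /* The packet went out, so mourn the inferior.  */
  toy_mourned++;
}

/* Call toy_remote_kill under a top-level handler and report which
   error, if any, escaped it.  */
static enum toy_error
toy_run (enum toy_error fail_with)
{
  jmp_buf top;

  toy_handler = &top;
  if (setjmp (top) != 0)
    {
      toy_handler = NULL;
      return toy_pending;
    }
  toy_remote_kill (fail_with);
  toy_handler = NULL;
  return TOY_NO_ERROR;
}
```

Calling toy_run (TOY_CLOSE_ERROR) returns TOY_NO_ERROR without
touching toy_mourned, whereas toy_run (TOY_OTHER_ERROR) hands the
error back to the caller.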

 gdb/ChangeLog |    7 +++++++
 gdb/remote.c  |   40 ++++++++++++++++++++++++++++------------
 2 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index e7998b9..3d8ed20 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,10 @@
+2013-12-02  Pedro Alves  <pedro@codesourcery.com>
+            Maciej W. Rozycki  <macro@codesourcery.com>
+
+	* remote.c (putpkt_for_catch_errors): Remove function.
+	(remote_kill): Handle TARGET_CLOSE_ERROR from the kill packet
+	gracefully.
+
 2013-12-02  Pedro Alves  <palves@redhat.com>
 
 	PR remote/15974
diff --git a/gdb/remote.c b/gdb/remote.c
index aa41264..2ac8c36 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -7816,23 +7816,39 @@ getpkt_or_notif_sane (char **buf, long *sizeof_buf, int forever,
 }
 
 

-/* A helper function that just calls putpkt; for type correctness.  */
-
-static int
-putpkt_for_catch_errors (void *arg)
-{
-  return putpkt (arg);
-}
-
 static void
 remote_kill (struct target_ops *ops)
 {
-  /* Use catch_errors so the user can quit from gdb even when we
+  struct gdb_exception ex;
+
+  /* Catch errors so the user can quit from gdb even when we
      aren't on speaking terms with the remote system.  */
-  catch_errors (putpkt_for_catch_errors, "k", "", RETURN_MASK_ERROR);
+  TRY_CATCH (ex, RETURN_MASK_ERROR)
+    {
+      putpkt ("k");
+    }
+  if (ex.reason < 0)
+    {
+      if (ex.error == TARGET_CLOSE_ERROR)
+	{
+	  /* If we got an (EOF) error that caused the target
+	     to go away, then we're done, that's what we wanted.
+	     "k" is susceptible to cause a premature EOF, given
+	     that the remote server isn't actually required to
+	     reply to "k", and it can happen that it doesn't
+	     even get to reply ACK to the "k".  */
+	  return;
+	}
+
+      /* Otherwise, something went wrong.  We didn't actually kill
+         the target.  Just propagate the exception, and let the
+         user or higher layers decide what to do.  */
+      throw_exception (ex);
+    }
 
-  /* Don't wait for it to die.  I'm not really sure it matters whether
-     we do or not.  For the existing stubs, kill is a noop.  */
+  /* We've killed the remote end, we get to mourn it.  Since this is
+     target remote, single-process, mourning the inferior also
+     unpushes remote_ops.  */
   target_mourn_inferior ();
 }
 


