20392 – gdb "run" command hangs

Status:	NEW

Alias:	None

Product:	gdb
Classification:	Unclassified
Component:	server
Version:	HEAD

Importance:	P2 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

Reported:	2016-07-21 09:10 UTC by Jan Kratochvil
Modified:	2016-07-21 09:10 UTC
CC List:	0 users

Description Jan Kratochvil 2016-07-21 09:10:18 UTC

https://bugzilla.redhat.com/show_bug.cgi?id=1176227

Description of problem:

The gdb "run" command hangs if it's invoked before the user has issued any resume-execution commands ("step", "continue", "stepi", etc.).

Version-Release number of selected component (if applicable):

$ gdb -v
GNU gdb (GDB) Fedora 7.8.1-30.fc21

How reproducible: 100%

Steps to Reproduce:
1. In one shell run |gdbserver --remote-debug :1111 `which ls`|
2. In another run |gdb `which ls`|
3. In the second shell, run |target extended-remote :1111| to connect to the gdbserver in the first shell
4. In the second shell, run |r| to "restart" the inferior
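The steps above could be scripted roughly as follows. This is only a sketch, not a tested harness: the helper names (`gdbserver_argv`, `gdb_argv`, `reproduce`), the one-second startup wait, and the 30-second timeout used to decide that gdb has hung are all assumptions made for illustration. The gdbserver invocation and the gdb commands are the ones from the steps; `-batch` and `-ex` are standard gdb options for running commands non-interactively.

```python
import subprocess
import time

PORT = 1111  # port used in the report


def gdbserver_argv(program, port=PORT):
    """Command line for step 1 (hypothetical helper)."""
    return ["gdbserver", "--remote-debug", f":{port}", program]


def gdb_argv(program, port=PORT):
    """Command line for steps 2-4, run non-interactively (hypothetical helper)."""
    return [
        "gdb", "-batch",
        "-ex", f"target extended-remote :{port}",
        "-ex", "run",
        program,
    ]


def reproduce(program, timeout_s=30.0):
    """Return True if gdb does not finish within timeout_s.

    A timeout is treated as the hang described in this report; the
    threshold is arbitrary.
    """
    server = subprocess.Popen(gdbserver_argv(program))
    try:
        time.sleep(1.0)  # crude wait for gdbserver to start listening
        try:
            subprocess.run(gdb_argv(program), timeout=timeout_s,
                           stdin=subprocess.DEVNULL)
        except subprocess.TimeoutExpired:
            return True
        return False
    finally:
        server.kill()
        server.wait()


if __name__ == "__main__":
    import shutil
    ls_path = shutil.which("ls")
    if ls_path and shutil.which("gdb") and shutil.which("gdbserver"):
        print("hang reproduced:", reproduce(ls_path))
```

A timeout only indicates that gdb did not return; whether the cause is the hang described here still has to be confirmed from the `--remote-debug` output.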

Actual results:

gdb hangs after the following protocol traffic is seen:

getpkt ("vCont;c:p2e6.-1");  [no ack sent] 
putpkt ("$T0506:a0e5f*"7f0* ;07:b0e4f*"7f0* ;10:42e9ddf7ff7f0* ;thread:p2e6.2e6;core:3;#f0"); [noack mode]
getpkt ("g");  [no ack sent] 
putpkt ("$20e1fff7ff7f0* 20e1fff7ff7f0* a748dff7ff7f0*(d8020*@a0e5f*"7f0* b0e4f*"7f0* 67040*:b0ddf7ff7f0*!6020*(98d9fff7ff7f0*048e1fff7ff7f0* 48e1fff7ff7f0* 42e9ddf7ff7f0* 46020* 330*"2b0*}0*}0* 7f030*(f* 0*Hff0*2ff0*"242424242424242424242424242424240*"ff0*2f* 0*2ff0**ff0*}0*}0*}0*o801f0* f*,0*}0*}0*}0*}0*}0*7#35"); [noack mode]
getpkt ("G[snip]");  [no ack sent] 
putpkt ("$OK#9a"); [noack mode]
getpkt ("m7ffff7dde941,1");  [no ack sent] 
putpkt ("$90#69"); [noack mode]
getpkt ("m7ffff7dde941,1");  [no ack sent] 
putpkt ("$90#69"); [noack mode]

This traffic shows that the inferior hits an "internal" gdb breakpoint (one set by gdb itself, not by the user). gdb then reads the memory at that breakpoint address but never resumes execution of the inferior (for whatever reason), so the user-visible result is a "hang".
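This reading of the trace can be checked by decoding the packets by hand. The sketch below follows the remote serial protocol as documented in the GDB manual: the checksum after `#` is the sum of the payload bytes modulo 256, and `X*N` means that `X` is repeated `ord(N) - 29` more times. The function names are made up for this example. It assumes that register 0x10 in the stop reply is the program counter, as in gdb's x86-64 register numbering, and it ignores the `}` escaping used for binary payloads.

```python
def rsp_checksum(payload):
    """Sum of the payload bytes modulo 256 (the two hex digits after '#')."""
    return sum(payload.encode("ascii")) % 256


def rle_decode(payload):
    """Expand run-length encoding: 'X*N' repeats X (ord(N) - 29) more times."""
    out = []
    prev = ""  # last literal character seen
    i = 0
    while i < len(payload):
        ch = payload[i]
        if ch == "*" and prev and i + 1 < len(payload):
            out.append(prev * (ord(payload[i + 1]) - 29))
            i += 2
        else:
            out.append(ch)
            prev = ch
            i += 1
    return "".join(out)


def le_hex_to_int(hex_bytes):
    """Interpret a hex string as a little-endian integer (target byte order)."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


if __name__ == "__main__":
    # Checksums of the two short replies in the trace.
    print(hex(rsp_checksum("OK")))   # 0x9a
    print(hex(rsp_checksum("90")))   # 0x69

    # Register 0x10 from the stop reply "10:42e9ddf7ff7f0* ;".
    pc = le_hex_to_int(rle_decode("42e9ddf7ff7f0* "))
    print(hex(pc))                   # 0x7ffff7dde942
    print(hex(pc - 1))               # 0x7ffff7dde941, the address gdb reads
```

Under those assumptions the reported program counter is 0x7ffff7dde942, one byte past the address 0x7ffff7dde941 that gdb reads with the `m` packets. That is what one would expect after a one-byte x86 breakpoint instruction has trapped, and it fits the interpretation that gdb is inspecting a breakpoint location.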

Expected results:

Execution is restarted normally.  A workaround is to follow the steps above, but just before step (4), issue a "stepi" command.  With that extra step, the "run" command works as expected.

Additional info:

I happily concede that this is an edge-case bug for normal gdb users.  However, this bug bites the rr tool[1] quite hard.  Indeed, I found this bug by running the rr regression tests on a Fedora 21 installation.  We have a gross workaround[2] in hand, but we would like to get rid of it at some point.

[1] http://rr-project.org/
[2] https://github.com/mozilla/rr/pull/1406

Confirmed:
FAIL: gdb-7.11.1-75.fc24.x86_64
FAIL: GNU gdb (GDB) 7.11.50.20160720-git