Bug 17283 - gdbserver stops working in non-stop mode
Status: NEW
Alias: None
Product: gdb
Classification: Unclassified
Component: gdb
Version: HEAD
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2014-08-17 12:16 UTC by dilyan.palauzov@aegee.org
Modified: 2014-08-17 19:14 UTC


Description dilyan.palauzov@aegee.org 2014-08-17 12:16:08 UTC
I use gdb 7.8 and gdbserver 7.8.  When gdb is in non-stop mode, gdbserver terminates with:
../../../gdb-7.8/gdb/gdbserver/server.c:2695: A problem internal to GDBserver has been detected.

I expect gdbserver not to terminate, or at least gdb to warn that remote debugging in non-stop mode will cause gdbserver to terminate.

Moreover, I expect that target remote can be run async:
  target remote   -- stops the program, after connecting to remote
  target remote & -- doesn't stop the program (implies continue&)

# gdbserver --debug --debug-format=timestamp,all --attach :1234 21936 2> gdb-stderr

# gdb ./prog

(gdb) set non-stop on
(gdb) show architecture
The target architecture is set automatically (currently i386:x86-64)
(gdb) target remote localhost:1234 
Remote debugging using localhost:1234
Remote connection closed
(gdb) quit

# cat gdb-stderr
sigchld_handler
1408277153:183341 Found new lwp 21937
sigchld_handler
1408277153:183464 Found new lwp 28894
sigchld_handler
1408277153:183584 Found new lwp 28968
sigchld_handler
Attached; pid = 21936
1408277153:183747 >>>> entering linux_wait_1
1408277153:183774 linux_wait_1: [Process 21936]
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x1): status(137f), 21936
1408277153:183828 LWFE: waitpid(-1, ...) returned 21936, ERRNO-OK
1408277153:183872 LLW: waitpid 21936 received Stopped (signal) (stopped)
1408277153:183976 linux_low_filter_event: pc is 0x7fe35ea42633
1408277153:183995 pc is 0x7fe35ea42633
1408277153:184014 stop pc is 0x7fe35ea42633
my_waitpid (29340, 0x0)
my_waitpid (29340, 0x0): status(137f), 29340
my_waitpid (29340, 0x0)
my_waitpid (29340, 0x0): status(1057f), 29340
my_waitpid (29341, 0x0)
my_waitpid (29341, 0x0): status(137f), 29341
my_waitpid (29341, 0x0)
my_waitpid (29341, 0x0): status(9), 29341
my_waitpid (29340, 0x0)
my_waitpid (29340, 0x0): status(0), 29340
1408277153:184887 Expected stop.
sigchld_handler
1408277153:184930 Hit a non-gdbserver trap event.
1408277153:184950 >>>> entering stop_all_lwps
1408277153:184967 stop_all_lwps (stop, except=none)
1408277153:184988 Have pending sigstop for lwp 21937
1408277153:185008 Have pending sigstop for lwp 28894
1408277153:185025 Have pending sigstop for lwp 28968
1408277153:185045 wait_for_sigstop: pulling events
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x80000001): status(137f), 28968
1408277153:185084 LWFE: waitpid(-1, ...) returned 28968, ERRNO-OK
1408277153:185108 LLW: waitpid 28968 received Stopped (signal) (stopped)
1408277153:185167 linux_low_filter_event: pc is 0x7fe35ea42633
1408277153:185186 pc is 0x7fe35ea42633
1408277153:185203 stop pc is 0x7fe35ea42633
1408277153:185219 Expected stop.
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x80000001): status(137f), 28894
1408277153:185250 LWFE: waitpid(-1, ...) returned 28894, ERRNO-OK
1408277153:185267 LLW: waitpid 28894 received Stopped (signal) (stopped)
1408277153:185307 linux_low_filter_event: pc is 0x7fe35ea42633
1408277153:185324 pc is 0x7fe35ea42633
1408277153:185339 stop pc is 0x7fe35ea42633
1408277153:185356 Expected stop.
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x80000001): status(137f), 21937
1408277153:185387 LWFE: waitpid(-1, ...) returned 21937, ERRNO-OK
1408277153:185403 LLW: waitpid 21937 received Stopped (signal) (stopped)
1408277153:185442 linux_low_filter_event: pc is 0x7fe35f436009
1408277153:185459 pc is 0x7fe35f436009
1408277153:185473 stop pc is 0x7fe35f436009
1408277153:185488 Expected stop.
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x80000001): status(137f), 0
1408277153:185522 LWFE: waitpid(-1, ...) returned 0, ERRNO-OK
1408277153:185628 leader_pid=21936, leader_lp!=NULL=1, num_lwps=4, zombie=0
1408277153:185704 LLW: exit (no unwaited-for LWP)
1408277153:185720 stop_all_lwps done, setting stopping_threads back to !stopping
1408277153:185732 <<<< exiting stop_all_lwps
1408277153:185750 Checking whether LWP 21936 needs to move out of the jump pad...no
1408277153:185766 Checking whether LWP 21937 needs to move out of the jump pad...no
1408277153:185780 Checking whether LWP 28894 needs to move out of the jump pad...no
1408277153:185795 Checking whether LWP 28968 needs to move out of the jump pad...no
1408277153:185815 linux_wait_1 ret = LWP 21936.21936, 1, 0
1408277153:185829 <<<< exiting linux_wait_1
Listening on port 1234
1408277224:689906 handling possible accept event
Remote debugging from host 127.0.0.1
1408277224:690036 linux_async (0), previous=0
1408277224:690069 handling possible serial event
sigchld_handler
sigchld_handler
1408277224:690983 handling possible serial event
1408277224:691120 handling possible serial event
1408277224:691247 handling possible serial event
1408277224:691355 handling possible serial event
1408277224:691832 handling possible serial event
1408277224:692877 handling possible serial event
1408277224:693735 handling possible serial event
1408277224:696486 handling possible serial event
1408277224:696650 handling possible serial event
1408277224:696676 linux_async (1), previous=0
1408277224:696762 handling possible target event
1408277224:696796 >>>> entering linux_wait_1
1408277224:696817 linux_wait_1: [<all threads>]
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x80000001): status(30), 0
1408277224:696850 LWFE: waitpid(-1, ...) returned 0, ERRNO-OK
1408277224:696942 leader_pid=21936, leader_lp!=NULL=1, num_lwps=4, zombie=0
1408277224:697002 LLW: exit (no unwaited-for LWP)
1408277224:697015 linux_wait_1 ret = null_ptid, TARGET_WAITKIND_NO_RESUMED
1408277224:697027 <<<< exiting linux_wait_1
1408277224:697044 handling possible serial event
1408277224:697517 handling possible serial event
1408277224:697650 handling possible serial event
1408277224:697739 handling possible serial event
1408277224:697765 >>>> entering linux_resume
1408277224:697779 linux_resume:
1408277224:697791 already stopped LWP 21936 at GDB's request
1408277224:697805 Need step over [LWP 21936]? Ignoring, should remain stopped
1408277224:697817 Need step over [LWP 21937]? No
1408277224:697857 pc is 0x7fe35f436009
1408277224:697875 Need step over [LWP 21937]? No, no breakpoint found at 0x7fe35f436009
1408277224:697888 Need step over [LWP 28894]? No
1408277224:697920 pc is 0x7fe35ea42633
1408277224:697935 Need step over [LWP 28894]? No, no breakpoint found at 0x7fe35ea42633
1408277224:697947 Need step over [LWP 28968]? No
1408277224:697978 pc is 0x7fe35ea42633
1408277224:697993 Need step over [LWP 28968]? No, no breakpoint found at 0x7fe35ea42633
1408277224:698005 Resuming, no pending status or step over needed
1408277224:698016 linux_resume done
1408277224:698027 <<<< exiting linux_resume
1408277224:698125 handling possible serial event
1408277224:698151 Returning trace status as 0, stop reason tnotrun
1408277224:698259 handling possible serial event
1408277224:698280 Returning first trace state variable definition
1408277224:698361 handling possible serial event
1408277224:698382 Returning additional trace state variable definition
1408277224:698466 handling possible serial event
1408277224:698496 Reporting thread LWP 21936.21936 as already stopped with status->kind = stopped, signal = GDB_SIGNAL_0
1408277224:698513 Reporting thread LWP 21936.21937 as already stopped with status->kind = ignore
../../../gdb-7.8/gdb/gdbserver/server.c:2695: A problem internal to GDBserver has been detected.
queue_stop_reply_callback: Assertion `thread->last_status.kind != TARGET_WAITKIND_IGNORE' failed.
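
The status(...) values in the log above are raw waitpid status words. As a reading aid, here is a minimal sketch of decoding them, assuming Linux wait macros and the Linux ptrace convention that an event code is stored above bit 16. The names decode_status, stop_kind and decoded_status are hypothetical and are not gdbserver code. Under these assumptions, on x86-64 Linux 0x137f decodes to a stop with signal 19 (SIGSTOP), and 0x1057f to a SIGTRAP stop carrying ptrace event code 1 (PTRACE_EVENT_FORK in the Linux headers).

```c
#include <sys/wait.h>

/* Hypothetical helper types for illustration only.  */
enum stop_kind { KIND_STOPPED, KIND_SIGNALED, KIND_EXITED, KIND_OTHER };

struct decoded_status
{
  enum stop_kind kind;
  int value;   /* Stop signal, terminating signal, or exit code.  */
  int event;   /* ptrace event code from bits 16 and up, 0 if none.  */
};

/* Decode a raw waitpid status word such as 0x137f.  */
static struct decoded_status
decode_status (int status)
{
  struct decoded_status d = { KIND_OTHER, 0, 0 };

  if (WIFSTOPPED (status))
    {
      d.kind = KIND_STOPPED;
      d.value = WSTOPSIG (status);
      /* Linux-specific: ptrace events are reported above bit 16.  */
      d.event = (status >> 16) & 0xffff;
    }
  else if (WIFSIGNALED (status))
    {
      d.kind = KIND_SIGNALED;
      d.value = WTERMSIG (status);
    }
  else if (WIFEXITED (status))
    {
      d.kind = KIND_EXITED;
      d.value = WEXITSTATUS (status);
    }
  return d;
}
```

Calling decode_status on each value from a my_waitpid line gives the kind of event and the signal or exit code involved.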
Comment 1 dilyan.palauzov@aegee.org 2014-08-17 13:32:08 UTC
Does gdbserver in fact interrupt the program when attaching to it, before gdb connects to it over target remote?  After doing some experiments, I think so.

I cannot find this behaviour documented, and it is non-intuitive.  But in that case, my remark above that "target remote &", unlike "target remote", should not interrupt the process becomes irrelevant, as the process is already interrupted when gdbserver attaches to it.

Moreover, the gdbserver documentation for the --debug-format option offers three possibilities: none, all, timestamp.  But the documentation does not say which of them applies if only --debug is specified, without --debug-format.  (Even if some intermediate logging level is used when only --debug is supplied, this should also be written down.)

Finally, when gdbserver is aware that it is going to terminate abnormally (e.g. by printing

../../../gdb-7.8/gdb/gdbserver/server.c:2695: A problem internal to GDBserver has been detected.

) does gdbserver remove all breakpoints?  The process I debug remotely crashes from time to time, printing "Trace/breakpoint trap (core dumped)", and I had never seen this before debugging it remotely.  (This has happened several times, but right now I cannot say how to reproduce it.  Maybe this phenomenon can be explained without further details regarding reproducibility.)

The same question, about removal of breakpoints, applies when unpleasant situations are recognized: when it happens that gdb offers to write down its core dump, does it remove all inserted breakpoints and leave the process running further?
Comment 2 dje 2014-08-17 19:02:29 UTC
I agree more clarity is needed here, but first some data points.

Re:
>Moreover, I expect that target remote can be run async:
>  target remote   -- stops the program, after connecting to remote
>  target remote & -- doesn't stop the program (implies continue&)

"target remote ... &" is not, AIUI, expected to do what you think it does.
IOW, "&" has no special meaning to "target remote".
[One could entertain the thought of extending "target ..." to handle "&",
but that's a separate subject.  There are already existing ways to handle
some things, so the additional complexity would need to be justified.]

"target remote" just establishes a connection with gdbserver,
it does not affect the running state of an inferior.

Re:
>Does gdbserver in fact interrupt the program, when attaching to it, before gdb connects to it over target remote?

Yes it does.

If one wants to attach to a running program, and leave it running, one generally
uses "target extended-remote ..." and then "attach ... &" from gdb.

Re:
>Finally, when gdbserver is aware, that it is going to terminate abnormally, [...]
>does gdbserver remove all breakpoints?

Depending on what form of breakpoint is being used, gdbserver may not even be aware that breakpoints have been inserted.  No disagreement that we should be removing breakpoints, just a note to say this will take a bit of cooperation with gdb (to fully handle all possible cases).  For a start, gdbserver could at least remove the ones it is aware of.
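
To illustrate the "remove the ones it is aware of" idea: on x86, a software breakpoint is a trap opcode (int3, 0xCC) written over an instruction byte, and if the debugger goes away without restoring that byte, the process receives SIGTRAP when it executes it. That would be consistent with the "Trace/breakpoint trap" crashes reported in comment 1, though this has not been confirmed here. Below is a minimal sketch, not gdbserver's implementation. It models inferior memory as a plain byte buffer (a real server would write through ptrace or similar), and all names (sw_breakpoint, bp_table, insert_breakpoint, remove_all_breakpoints) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define X86_INT3 0xCC        /* x86 software breakpoint opcode.  */
#define MAX_BREAKPOINTS 16

/* One breakpoint the server itself inserted and therefore knows about.  */
struct sw_breakpoint
{
  size_t addr;          /* Offset into the modelled inferior memory.  */
  uint8_t saved_byte;   /* Original byte replaced by the trap opcode.  */
  int inserted;
};

struct bp_table
{
  struct sw_breakpoint bps[MAX_BREAKPOINTS];
  size_t count;
};

/* Insert a breakpoint at ADDR.  Returns 0 on success, -1 if the table
   is full or ADDR is outside the modelled memory.  */
static int
insert_breakpoint (struct bp_table *t, uint8_t *mem, size_t mem_len,
                   size_t addr)
{
  if (t->count >= MAX_BREAKPOINTS || addr >= mem_len)
    return -1;

  struct sw_breakpoint *bp = &t->bps[t->count++];
  bp->addr = addr;
  bp->saved_byte = mem[addr];
  bp->inserted = 1;
  mem[addr] = X86_INT3;
  return 0;
}

/* Restore the original bytes of every tracked breakpoint.  Intended for
   an error path just before the server exits.  Returns the number of
   breakpoints removed.  */
static size_t
remove_all_breakpoints (struct bp_table *t, uint8_t *mem, size_t mem_len)
{
  size_t removed = 0;

  /* Undo in reverse insertion order, so that two insertions at the
     same address end with the original instruction byte restored.  */
  for (size_t i = t->count; i-- > 0;)
    {
      struct sw_breakpoint *bp = &t->bps[i];
      if (bp->inserted && bp->addr < mem_len)
        {
          mem[bp->addr] = bp->saved_byte;
          bp->inserted = 0;
          removed++;
        }
    }
  return removed;
}
```

This covers only breakpoints recorded in the server's own table. Breakpoints that gdb inserted by writing memory directly would not appear there, which is the cooperation problem noted above.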

Re:
>The same question, about removal of breakpoints, applies when unpleasant situations are recognized: when it happens that gdb offers to write down its core dump, does it remove all inserted breakpoints and leave the process running further?

I can't find any code that does this.
Easy enough to verify of course.
Comment 3 dje 2014-08-17 19:14:36 UTC
(In reply to dje from comment #2)
> Re:
> >The same question, about removal of breakpoints, applies when unpleasant situations are recognized: when it happens that gdb offers to write down its core dump, does it remove all inserted breakpoints and leave the process running further?
> 
> I can't find any code that does this.
> Easy enough to verify of course.

Yeah, a simple experiment confirms that breakpoints aren't removed before gdb terminates.

One issue that needs to be taken into account is that when we're in this situation, gdb has detected an internal error, so nothing it tries to do is guaranteed to succeed.  One can certainly wish gdb would try though.

[Note to self:
If gdb is going to exit after dumping core (including letting the generation of the core dump cause the exit, e.g., SIGABRT), it currently doesn't fork().
If we do try to clean up before we generate a core dump, we'll want to fork() in order to have the core dump more accurately represent gdb's state at the time of internal-error detection.]
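
The fork() idea in the note can be sketched as follows. This is a minimal sketch under stated assumptions, not gdb code: the function name dump_core_snapshot is hypothetical, and whether a core file is actually written depends on system settings such as the core size limit. The child aborts immediately, so any core dump reflects the state at the moment of the call, and the parent is then free to attempt cleanup.

```c
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child whose only job is to abort, so that the core dump (if
   core dumps are enabled) is taken from a copy of the process as it
   was when this function was called.  The parent returns and can go
   on to clean up.

   Returns 1 if the child was terminated by SIGABRT, 0 if it ended some
   other way, and -1 if fork or waitpid failed.  */
static int
dump_core_snapshot (void)
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      /* Child: make sure SIGABRT has its default action, then abort.  */
      signal (SIGABRT, SIG_DFL);
      abort ();
    }

  /* Parent: wait for the child, retrying if interrupted by a signal.  */
  int status;
  pid_t r;
  do
    r = waitpid (pid, &status, 0);
  while (r < 0 && errno == EINTR);

  if (r != pid)
    return -1;

  return WIFSIGNALED (status) && WTERMSIG (status) == SIGABRT;
}
```

An internal-error path could call dump_core_snapshot() first, then attempt to remove breakpoints, then exit. Note that fork() copies only the calling thread, so in a multi-threaded process the snapshot would not include the other threads.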