All-Stop on top of Non-Stop
Non-stop allows finer-grained control of threads. In the non-stop variant of the RSP, thread control is asynchronous.
We should get rid of the all-stop vs non-stop distinction. Itsets will allow merging the user interface aspects. On the backend side, we should implement all-stop behavior on top of the target always running in non-stop mode.
There's been preliminary work posted. The original version is here:
A few revisions were posted later:
However, that work is still a work in progress.
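The backend idea described above, all-stop behavior layered on a target that always runs in non-stop mode, can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyNonStopTarget`, `AllStopFrontend` and their methods are hypothetical names invented for illustration, not GDB interfaces, and the posted patches may work quite differently.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNonStopTarget:
    """Hypothetical target that only offers per-thread (non-stop) control."""
    running: dict = field(default_factory=dict)  # thread id -> is it running?

    def resume_thread(self, tid):
        self.running[tid] = True

    def stop_thread(self, tid):
        self.running[tid] = False


class AllStopFrontend:
    """Hypothetical layer that emulates all-stop with per-thread requests."""

    def __init__(self, target):
        self.target = target

    def resume_all(self):
        # All-stop "continue": resume every thread, one request each.
        for tid in list(self.target.running):
            self.target.resume_thread(tid)

    def handle_event(self, event_tid):
        # The thread that reported the event is stopped by the event itself.
        self.target.running[event_tid] = False
        # All-stop semantics: explicitly stop every thread still running.
        stopped_by_us = []
        for tid, is_running in list(self.target.running.items()):
            if is_running:
                self.target.stop_thread(tid)
                stopped_by_us.append(tid)
        return stopped_by_us
```

In this model the user-visible result of an event is the same as in all-stop (every thread ends up stopped), but the target itself never needs an all-stop mode.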
1. Getting the code / Helping
Discussions are held on the main GDB mailing lists. Patches should be posted to the gdb-patches mailing list. Work is being committed directly to the mainline (i.e., there's no special feature branch).
2. Remaining sub tasks
2.1. Target independent (native/remote)
- Non-stop requires async mode. That should be made the default. WIP here.
- Currently, software single-step targets work in non-stop mode by forcing displaced stepping for all single steps. That's quite inefficient. The reason we don't do the usual "put break at next instruction - continue" dance in non-stop is that the current software single-step implementation assumes only one thread will be single-stepping at a given time (with lots of globals and code marked deprecated). That won't be true in non-stop mode. Software single-step should be reworked to make it work in non-stop mode. Relatedly, the thread-hop code should be handled by the generic 'breakpoint doesn't cause stop / bpstat' code.
- In all-stop, GDB passes the last stop signal to the thread we're resuming, in 'proceed', even if the thread that originally got the signal was some other thread. That's really silly. It means proceed delivers the signal to the wrong thread when one_proc triggers. Instead of making all-stop-on-top-of-non-stop mirror this (the current patch does not do that yet), all-stop should be fixed. GDB should warn, and perhaps query what to do (pass the signal to the original thread, or to the selected thread?).
- The current all-stop-on-top-of-non-stop patch left the “target_pass_signals” mechanism broken in places. That needs fixing.
- Currently, in all-stop, when breakpoint commands run, all threads are stopped. With all-stop-on-top-of-non-stop, only the thread that got the event is stopped. To preserve behavior, that might need to be emulated, as users might expect things like "thread apply all foo" to work from breakpoint commands. This is unfortunate and conflicts with the desire to merge all-stop and non-stop under the itsets umbrella. Needs further thought and discussion.
- We should have a "breakpoints always-inserted when-running" mode (spelling just for illustration). In all-stop, GDB always removes all breakpoints from the target whenever the program stops. This is useful in case GDB crashes, as otherwise there's a good chance breakpoints would be left planted on the target. Non-stop mode requires "breakpoints always-inserted" mode. That is, breakpoints are left planted even if the target is stopped. It'd be nice to come up with a hybrid -- leave breakpoints always inserted, as long as some thread is running. If all threads in the process stop, then remove all breakpoints. Relatedly, unless displaced stepping is in effect, GDB currently removes all breakpoints when it needs to single-step over one, and then puts them all back. It'd be good to make GDB remove only the breakpoint being stepped over, leaving all the others in place.
- infrun.c:infwait_state is a global presently. It looks like it should be per thread in non-stop mode.
- We have a single dummy frame stack/list. That looks like a problem for non-stop mode. Actually it's also a problem for multi-process, as the dummy stack list also doesn't take inferiors/program spaces into account -- a possible test to exercise this would be something along the lines of: start two processes of the same program, run to main in both, set a breakpoint at malloc, and do "print malloc(0)" in both inferiors. At this point, the stacks will be exactly the same in both processes. "maint print dummy-frames" shows only one frame... As there'll usually be only one or a few threads doing function calls at the same time, putting a dummy stack pointer in the thread object ends up being a little bit expensive. We could instead do like inline-frame.c -- record the ptid in struct dummy_frame.
- Actually getting the all-stop-on-top-of-non-stop patch into shape.
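The hybrid breakpoint insertion mode suggested in the list above (keep breakpoints planted while some thread runs, remove them once every thread of the process has stopped) could behave like the following toy model. It is a minimal sketch only: `ToyBreakpointManager` and its methods are hypothetical names, it models a single process, and it says nothing about how GDB would actually implement such a mode.

```python
class ToyBreakpointManager:
    """Toy model of a hypothetical 'always-inserted when-running' policy."""

    def __init__(self, thread_ids):
        # Every thread starts out stopped, so nothing is planted yet.
        self.running = {tid: False for tid in thread_ids}
        self.inserted = False  # are breakpoints planted in the target?

    def _update(self):
        # Keep breakpoints planted while any thread runs; lift them as
        # soon as all threads of the process are stopped.
        self.inserted = any(self.running.values())

    def resume(self, tid):
        self.running[tid] = True
        self._update()

    def stop(self, tid):
        self.running[tid] = False
        self._update()
```

With two threads, stopping only one of them leaves `inserted` true; stopping the second flips it to false, which is the point where a GDB crash would no longer leave breakpoints behind in the target.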
2.2. Remote target
- When you connect to a remote target with "target remote" (or target extended-remote), in non-stop mode, there's a window where the user already has the prompt, and GDB thinks all remote threads are running, even if they were already stopped. This means that scripts that do e.g., "target remote; continue" fail on the "continue" with "can't do that when the current thread is running". This is actually visible in the testsuite -- all the non-stop tests are racy against GDBserver because of this. As we'll always use non-stop mode behind the scenes, this needs to be fixed.
- We go from all-stop always stopping all threads on any event, to having to have GDB always stop all threads individually. That's quite inefficient for remote targets. We should at least have a way to request a whole process be stopped, or all threads of all processes (and be told when that request is complete). This may mean we end up with simplified itsets on the target side.
- Remote File I/O (the F packet) doesn't work in non-stop mode. We need to come up with an alternative.
- The remote O packet doesn't work in non-stop mode. We need to come up with an alternative.
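The connection race described in the first item of this list can be reproduced in a toy model. All names here (`ToyRemoteConnection`, `connect`, `sync`, `continue_thread`) are hypothetical and do not correspond to GDB or GDBserver code, and the error text is simply the wording quoted on this page. The `sync` option shows one conceivable way to close the window, namely processing already-pending stop reports before handing back the prompt; it is an assumption for illustration, not a fix proposed here.

```python
from collections import deque


class ToyRemoteConnection:
    """Toy model of GDB's view of remote threads right after connecting."""

    def __init__(self, stopped_on_target):
        # Threads the remote side already has stopped, but whose stop
        # reports GDB has not processed yet.
        self.pending_stops = deque(stopped_on_target)
        self.believed_running = {}

    def connect(self, thread_ids, sync=False):
        # Non-stop connect: GDB initially assumes every thread is running.
        self.believed_running = {tid: True for tid in thread_ids}
        if sync:
            # Hypothetical fix: catch up before the prompt is returned.
            self.process_pending_stops()

    def process_pending_stops(self):
        while self.pending_stops:
            self.believed_running[self.pending_stops.popleft()] = False

    def continue_thread(self, tid):
        if self.believed_running[tid]:
            raise RuntimeError(
                "can't do that when the current thread is running")
        self.believed_running[tid] = True
```

Calling `connect([1])` followed immediately by `continue_thread(1)` raises in this model, mirroring the "target remote; continue" script failure, while `connect([1], sync=True)` lets the same sequence succeed.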