[PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses

Fri Jan 8 18:34:59 GMT 2021

* Simon Marchi <simon.marchi@polymtl.ca> [2021-01-07 23:17:33 -0500]:

> From: Simon Marchi <simon.marchi@efficios.com>
> 
> The rationale for this patch comes from the ROCm port [1], the goal
> being to reduce the number of back and forths between GDB and the target
> when doing successive operations.  I'll start with explaining the
> rationale and then go over the implementation.  In the ROCm / GPU world,
> the term "wave" is somewhat equivalent to a "thread" in GDB.  So if you
> read if from a GPU stand point, just s/thread/wave/.
> 
> ROCdbgapi, the library used by GDB [2] to communicate with the GPU
> target, gives the illusion that it's possible for the debugger to
> control (start and stop) individual threads.  But in reality, this is
> not how it works.  Under the hood, all threads of a queue are controlled
> as a group.  To stop one thread in a group of running ones, the state of
> all threads is retrieved from the GPU, all threads are destroyed, and all
> threads but the one we want to stop are re-created from the saved state.
> The net result, from the point of view of GDB, is that the library
> stopped one thread.  The same thing goes if we want to resume one thread
> while others are running: the state of all running threads is retrieved
> from the GPU, they are all destroyed, and they are all re-created,
> including the thread we want to resume.
> 
> This leads to some inefficiencies when combined with how GDB works, here
> are two examples:
> 
>  - Stopping all threads: because the target operates in non-stop mode,
>    when the user interface mode is all-stop, GDB must stop all threads
>    individually when presenting a stop.  Let's suppose we have 1000
>    threads and the user does ^C.  GDB asks the target to stop one
>    thread.  Behind the scenes, the library retrieves 1000 thread states
>    and restores the 999 others still running ones.  GDB asks the target
>    to stop another one.  The target retrieves 999 thread states and
>    restores the 998 remaining ones.  That means that to stop 1000
>    threads, we did 1000 back and forths with the GPU.  It would have
>    been much better to just retrieve the states once and stop there.
> 
>  - Resuming with pending events: suppose the 1000 threads hit a
>    breakpoint at the same time.  The breakpoint is conditional and
>    evaluates to true for the first thread, to false for all others.  GDB
>    pulls one event (for the first thread) from the target, decides that
>    it should present a stop, so stops all threads using
>    stop_all_threads.  All these other threads have a breakpoint event to
>    report, which is saved in `thread_info::suspend::waitstatus` for
>    later.  When the user does "continue", GDB resumes that one thread
>    that did hit the breakpoint.  It then processes the pending events
>    one by one as if they just arrived.  It picks one, evaluates the
>    condition to false, and resumes the thread.  It picks another one,
>    evaluates the condition to false, and resumes the thread.  And so on.
>    In between each resumption, there is a full state retrieval and
>    re-creation.  It would be much nicer if we could wait a little bit
>    before sending those threads on the GPU, until it processed all those
>    pending events.
> 
> To address this kind of performance issue, ROCdbgapi has a concept
> called "forward progress required", which is a boolean state that allows
> its user (i.e. GDB) to say "I'm doing a bunch of operations, you can
> hold off putting the threads on the GPU until I'm done" (the "forward
> progress not required" state).  Turning forward progress back on
> indicates to the library that all threads that are supposed to be
> running should now be really running on the GPU.
> 
> It turns out that GDB has a similar concept, though not as general,
> commit_resume.  On difference is that commit_resume is not stateful: the

typo: 'On difference' ?

> target can't look up "does the core need me to schedule resumed threads
> for execution right now".  It is also specifically linked to the resume
> method, it is not used in other contexts.  The target accumulates
> resumption requests through target_ops::resume calls, and then commits
> those resumptions when target_ops::commit_resume is called.  The target
> has no way to check if it's ok to leave resumed threads stopped in other
> target methods.
> 
> To bridge the gap, this patch generalizes the commit_resume concept in
> GDB to match the forward progress concept of ROCdbgapi.  The current
> name (commit_resume) can be interpreted as "commit the previous resume
> calls".  I renamed the concept to "commit_resumed", as in "commit the
> threads that are resumed".
> 
> In the new version, we have two things in process_stratum_target:
> 
>  - the commit_resumed_state field: indicates whether GDB requires this
>    target to have resumed threads committed to the execution
>    target/device.  If false, the target is allowed to leave resumed
>    threads un-committed at the end of whatever method it is executing.
> 
>  - the commit_resumed method: called when commit_resumed_state
>    transitions from false to true.  While commit_resumed_state was
>    false, the target may have left some resumed threads un-committed.
>    This method being called tells it that it should commit them back to
>    the execution device.
> 
> Let's take the "Stopping all threads" scenario from above and see how it
> would work with the ROCm target with this change.  Before stopping all
> threads, GDB would set the target's commit_resumed_state field to false.
> It would then ask the target to stop the first thread.  The target would
> retrieve all threads' state from the GPU and mark that one as stopped.
> Since commit_resumed_state is false, it leaves all the other threads
> (still resumed) stopped.  GDB would then proceed to call target_stop for
> all the other threads.  Since resumed threads are not committed, this
> doesn't do any back and forth with the GPU.
> 
> To simplify the implementation of targets, I made it so that when
> calling certain target methods, the contract between the core and the
> targets guarantees that commit_resumed_state is false.  This way, the
> target doesn't need two paths, one commit_resumed_state == true and one
> for commit_resumed_state == false.  It can just assert that
> commit_resumed_state is false and work with that assumption.  This also
> helps catch places where we forgot to disable commit_resumed_state
> before calling the method, which represents a probable optimization
> opportunity.
> 
> To have some confidence that this contract between the core and the
> targets is respected, I added assertions in the linux-nat target
> methods, even though the linux-nat target doesn't actually use that
> feature.  Since linux-nat is tested much more than other targets, this
> will help catch these issues quicker.
> 
> To ensure that commit_resumed_state is always turned back on (only if
> necessary, see below) and the commit_resumed method is called when doing
> so, I introduced the scoped_disabled_commit_resumed RAII object, which
> replaces make_scoped_defer_process_target_commit_resume.  On
> construction, it clears the commit_resumed_state flag of all process
> targets.  On destruction, it turns it back on (if necessary) and calls
> the commit_resumed method.  The nested case is handled by having a
> "nesting" counter: only when the counter goes back to 0 is
> commit_resumed_state turned back on.
> 
> On destruction, commit-resumed is not re-enabled for a given target if:
> 
>  1. this target has no threads resumed, or
>  2. this target at least one thread with a pending status known to the

Missing word: 'this target at least'?

>     core (saved in thread_info::suspend::waitstatus).
> 
> The first point is not technically necessary, because a proper
> commit_resumed implementation would be a no-op if the target has no
> resumed threads.  But since we have a flag do to a quick check, I think
> it doesn't hurt.
> 
> The second point is more important: together with the
> scoped_disable_commit_resumed instance added in fetch_inferior_event, it
> makes it so the "Resuming with pending events" described above is
> handled efficiently.  Here's what happens in that case:
> 
>  1. The user types "continue".
>  2. Upon destruction, the scoped_disable_commit_resumed in the `proceed`
>     function does not enable commit-resumed, as it sees other threads
>     have pending statuses.
>  3. fetch_inferior_event is called to handle another event, one thread
>     is resumed.  Because there are still more threads with pending
>     statuses, the destructor of scoped_disable_commit_resumed in
>     fetch_inferior_event still doesn't enable commit-resumed.
>  4. Rinse and repeat step 3, until the last pending status is handled by
>     fetch_inferior_event.  In that case, scoped_disable_commit_resumed's
>     destructor sees there are no more threads with pending statues, so
>     it asks the target to commit resumed threads.
> 
> This allows us to avoid all unnecessary back and forths, there is a
> single commit_resumed call.
> 
> This change required remote_target::remote_stop_ns to learn how to
> handle stopping threads that were resumed but pending vCont.  The
> simplest example where that happens is when using the remote target in
> all-stop, but with "maint set target-non-stop on", to force it to
> operate in non-stop mode under the hood.  If two threads hit a
> breakpoint at the same time, GDB will receive two stop replies.  It will
> present the stop for one thread and save the other one in
> thread_info::suspend::waitstatus.
> 
> Before this patch, when doing "continue", GDB first resumes the thread
> without a pending status:
> 
>     Sending packet: $vCont;c:p172651.172676#f3
> 
> It then consumes the pending status in the next fetch_inferior_event
> call:
> 
>     [infrun] do_target_wait_1: Using pending wait status status->kind = stopped, signal = GDB_SIGNAL_TRAP for Thread 1517137.1517137.
>     [infrun] target_wait (-1.0.0, status) =
>     [infrun]   1517137.1517137.0 [Thread 1517137.1517137],
>     [infrun]   status->kind = stopped, signal = GDB_SIGNAL_TRAP
> 
> It then realizes it needs to stop all threads to present the stop, so
> stops the thread it just resumed:
> 
>     [infrun] stop_all_threads:   Thread 1517137.1517137 not executing
>     [infrun] stop_all_threads:   Thread 1517137.1517174 executing, need stop
>     remote_stop called
>     Sending packet: $vCont;t:p172651.172676#04
> 
> This is an unnecessary resume/stop.  With this patch, we don't commit
> resumed threads after proceeding, because of the pending status:
> 
>     [infrun] maybe_commit_resumed_all_process_targets: not requesting commit-resumed for target extended-remote, a thread has a pending waitstatus
> 
> When GDB handles the pending status and stop_all_threads runs, we stop a
> resumed but pending vCont thread:
> 
>     remote_stop_ns: Enqueueing phony stop reply for thread pending vCont-resume (1520940, 1520976, 0)
> 
> That thread was never actually resumed on the remote stub / gdbserver.
> This is why remote_stop_ns needed to learn this new trick of enqueueing
> phony stop replies.
> 
> Note that this patch only considers pending statuses known to the core
> of GDB, that is the events that were pulled out of the target and stored
> in `thread_info::suspend::waitstatus`.  In some cases, we could also
> avoid unnecessary back and forth when the target has events that it has
> not yet reported the core.  I plan to implement this as a subsequent
> patch, once this series has settled.

I read through the commit message and convinced myself that it made
sense.  I ran out of time to look at the actual code.

Thanks,
Andrew

> 
> gdb/ChangeLog:
> 
> 	* infrun.h (struct scoped_disable_commit_resumed): New.
> 	* infrun.c (do_target_resume): Remove
> 	maybe_commit_resume_process_target call.
> 	(maybe_commit_resume_all_process_targets): Rename to...
> 	(maybe_commit_resumed_all_process_targets): ... this.  Skip
> 	targets that have no executing threads or resumed threads with
> 	a pending status.
> 	(scoped_disable_commit_resumed_depth): New.
> 	(scoped_disable_commit_resumed::scoped_disable_commit_resumed):
> 	New.
> 	(scoped_disable_commit_resumed::~scoped_disable_commit_resumed):
> 	New.
> 	(proceed): Use scoped_disable_commit_resumed.
> 	(fetch_inferior_event): Use scoped_disable_commit_resumed.
> 	* process-stratum-target.h (class process_stratum_target):
> 	<commit_resume>: Rename to...
> 	<commit_resumed>: ... this.
> 	<commit_resumed_state>: New.
> 	(all_process_targets): New.
> 	(maybe_commit_resume_process_target): Remove.
> 	(make_scoped_defer_process_target_commit_resume): Remove.
> 	* process-stratum-target.c (all_process_targets): New.
> 	(defer_process_target_commit_resume): Remove.
> 	(maybe_commit_resume_process_target): Remove.
> 	(make_scoped_defer_process_target_commit_resume): Remove.
> 	* linux-nat.c (linux_nat_target::resume): Add gdb_assert.
> 	(linux_nat_target::wait): Add gdb_assert.
> 	(linux_nat_target::stop): Add gdb_assert.
> 	* infcmd.c (run_command_1): Use scoped_disable_commit_resumed.
> 	(attach_command): Use scoped_disable_commit_resumed.
> 	(detach_command): Use scoped_disable_commit_resumed.
> 	(interrupt_target_1): Use scoped_disable_commit_resumed.
> 	* mi/mi-main.c (exec_continue): Use
> 	scoped_disable_commit_resumed.
> 	* record-full.c (record_full_wait_1): Change
> 	commit_resumed_state around calling commit_resumed.
> 	* remote.c (class remote_target) <commit_resume>: Rename to...
> 	<commit_resumed>: ... this.
> 	(remote_target::resume): Add gdb_assert.
> 	(remote_target::commit_resume): Rename to...
> 	(remote_target::commit_resumed): ... this.  Check if there is
> 	any thread pending vCont resume.
> 	(struct stop_reply): Move up.
> 	(remote_target::remote_stop_ns): Generate stop replies for
> 	resumed but pending vCont threads.
> 	(remote_target::wait_ns): Add gdb_assert.
> 
> [1] https://github.com/ROCm-Developer-Tools/ROCgdb/
> [2] https://github.com/ROCm-Developer-Tools/ROCdbgapi
> 
> Change-Id: I836135531a29214b21695736deb0a81acf8cf566
> ---
>  gdb/infcmd.c                 |   8 +++
>  gdb/infrun.c                 | 116 +++++++++++++++++++++++++++++++----
>  gdb/infrun.h                 |  41 +++++++++++++
>  gdb/linux-nat.c              |   5 ++
>  gdb/mi/mi-main.c             |   2 +
>  gdb/process-stratum-target.c |  37 +++++------
>  gdb/process-stratum-target.h |  63 +++++++++++--------
>  gdb/record-full.c            |   4 +-
>  gdb/remote.c                 | 111 +++++++++++++++++++++++----------
>  9 files changed, 292 insertions(+), 95 deletions(-)
> 
> diff --git a/gdb/infcmd.c b/gdb/infcmd.c
> index 6f0ed952de67..b7595e42e265 100644
> --- a/gdb/infcmd.c
> +++ b/gdb/infcmd.c
> @@ -488,6 +488,8 @@ run_command_1 (const char *args, int from_tty, enum run_how run_how)
>        uiout->flush ();
>      }
>  
> +  scoped_disable_commit_resumed disable_commit_resumed ("running");
> +
>    /* We call get_inferior_args() because we might need to compute
>       the value now.  */
>    run_target->create_inferior (exec_file,
> @@ -2591,6 +2593,8 @@ attach_command (const char *args, int from_tty)
>    if (non_stop && !attach_target->supports_non_stop ())
>      error (_("Cannot attach to this target in non-stop mode"));
>  
> +  scoped_disable_commit_resumed disable_commit_resumed ("attaching");
> +
>    attach_target->attach (args, from_tty);
>    /* to_attach should push the target, so after this point we
>       shouldn't refer to attach_target again.  */
> @@ -2746,6 +2750,8 @@ detach_command (const char *args, int from_tty)
>    if (inferior_ptid == null_ptid)
>      error (_("The program is not being run."));
>  
> +  scoped_disable_commit_resumed disable_commit_resumed ("detaching");
> +
>    query_if_trace_running (from_tty);
>  
>    disconnect_tracing ();
> @@ -2814,6 +2820,8 @@ stop_current_target_threads_ns (ptid_t ptid)
>  void
>  interrupt_target_1 (bool all_threads)
>  {
> +  scoped_disable_commit_resumed inhibit ("interrupting");
> +
>    if (non_stop)
>      {
>        if (all_threads)
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 1a27af51b7e9..92a1102cb595 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -2172,8 +2172,6 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
>  
>    target_resume (resume_ptid, step, sig);
>  
> -  maybe_commit_resume_process_target (tp->inf->process_target ());
> -
>    if (target_can_async_p ())
>      target_async (1);
>  }
> @@ -2760,17 +2758,109 @@ schedlock_applies (struct thread_info *tp)
>  					    execution_direction)));
>  }
>  
> -/* Calls maybe_commit_resume_process_target on all process targets.  */
> +/* Maybe require all process stratum targets to commit their resumed threads.
> +
> +   A specific process stratum target is not required to do so if:
> +
> +   - it has no resumed threads
> +   - it has a thread with a pending status  */
>  
>  static void
> -maybe_commit_resume_all_process_targets ()
> +maybe_commit_resumed_all_process_targets ()
>  {
> -  scoped_restore_current_thread restore_thread;
> +  /* This is an optional to avoid unnecessary thread switches. */
> +  gdb::optional<scoped_restore_current_thread> restore_thread;
>  
>    for (process_stratum_target *target : all_non_exited_process_targets ())
>      {
> +      gdb_assert (!target->commit_resumed_state);
> +
> +      if (!target->threads_executing)
> +	{
> +	  infrun_debug_printf ("not re-enabling forward progress for target "
> +			       "%s, no executing threads",
> +			       target->shortname ());
> +	  continue;
> +	}
> +
> +      /* If a thread from this target has some status to report, we better
> +	 handle it before requiring the target to commit its resumed threads:
> +	 handling the status might lead to resuming more threads.  */
> +      bool has_thread_with_pending_status = false;
> +      for (thread_info *thread : all_non_exited_threads (target))
> +	if (thread->resumed && thread->suspend.waitstatus_pending_p)
> +	  {
> +	    has_thread_with_pending_status = true;
> +	    break;
> +	  }
> +
> +      if (has_thread_with_pending_status)
> +	{
> +	  infrun_debug_printf ("not requesting commit-resumed for target %s, a"
> +			       "thread has a pending waitstatus",
> +			       target->shortname ());
> +	  continue;
> +	}
> +
> +      if (!restore_thread.has_value ())
> +	restore_thread.emplace ();
> +
>        switch_to_target_no_thread (target);
> -      maybe_commit_resume_process_target (target);
> +      infrun_debug_printf ("enabling commit-resumed for target %s",
> +			   target->shortname());
> +
> +      target->commit_resumed_state = true;
> +      target->commit_resumed ();
> +    }
> +}
> +
> +/* To track nesting of scoped_disable_commit_resumed objects.  */
> +
> +static int scoped_disable_commit_resumed_depth = 0;
> +
> +scoped_disable_commit_resumed::scoped_disable_commit_resumed
> +  (const char *reason)
> +  : m_reason (reason)
> +{
> +  infrun_debug_printf ("reason=%s", m_reason);
> +
> +  for (process_stratum_target *target : all_process_targets ())
> +    {
> +      if (scoped_disable_commit_resumed_depth == 0)
> +	{
> +	  /* This is the outermost instance.  */
> +	  target->commit_resumed_state = false;
> +	}
> +      else
> +	{
> +	  /* This is not the outermost instance, we expect COMMIT_RESUMED_STATE
> +	     to have been cleared by the outermost instance.  */
> +	  gdb_assert (!target->commit_resumed_state);
> +	}
> +    }
> +
> +  ++scoped_disable_commit_resumed_depth;
> +}
> +
> +scoped_disable_commit_resumed::~scoped_disable_commit_resumed ()
> +{
> +  infrun_debug_printf ("reason=%s", m_reason);
> +
> +  gdb_assert (scoped_disable_commit_resumed_depth > 0);
> +
> +  --scoped_disable_commit_resumed_depth;
> +
> +  if (scoped_disable_commit_resumed_depth == 0)
> +    {
> +      /* This is the outermost instance.  */
> +      maybe_commit_resumed_all_process_targets ();
> +    }
> +  else
> +    {
> +      /* This is not the outermost instance, we expect COMMIT_RESUMED_STATE to
> +	 still be false.  */
> +      for (process_stratum_target *target : all_process_targets ())
> +	gdb_assert (!target->commit_resumed_state);
>      }
>  }
>  
> @@ -2994,8 +3084,7 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
>    cur_thr->prev_pc = regcache_read_pc_protected (regcache);
>  
>    {
> -    scoped_restore save_defer_tc
> -      = make_scoped_defer_process_target_commit_resume ();
> +    scoped_disable_commit_resumed disable_commit_resumed ("proceeding");
>  
>      started = start_step_over ();
>  
> @@ -3065,8 +3154,6 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
>        }
>    }
>  
> -  maybe_commit_resume_all_process_targets ();
> -
>    finish_state.release ();
>  
>    /* If we've switched threads above, switch back to the previously
> @@ -3819,8 +3906,15 @@ fetch_inferior_event ()
>        = make_scoped_restore (&execution_direction,
>  			     target_execution_direction ());
>  
> +    /* Allow process stratum targets to pause their resumed threads while we
> +       handle the event.  */
> +    scoped_disable_commit_resumed disable_commit_resumed ("handling event");
> +
>      if (!do_target_wait (minus_one_ptid, ecs, TARGET_WNOHANG))
> -      return;
> +      {
> +	infrun_debug_printf ("do_target_wait returned no event");
> +	return;
> +      }
>  
>      gdb_assert (ecs->ws.kind != TARGET_WAITKIND_IGNORE);
>  
> diff --git a/gdb/infrun.h b/gdb/infrun.h
> index 7160b60f1368..5c32c0c97f6e 100644
> --- a/gdb/infrun.h
> +++ b/gdb/infrun.h
> @@ -269,4 +269,45 @@ extern void all_uis_check_sync_execution_done (void);
>     started or re-started).  */
>  extern void all_uis_on_sync_execution_starting (void);
>  
> +/* RAII object to temporarily disable the requirement for process stratum
> +   targets to commit their resumed threads.
> +
> +   On construction, set process_stratum_target::commit_resumed_state to false
> +   for all process stratum targets.
> +
> +   On destruction, call maybe_commit_resumed_all_process_targets.
> +
> +   In addition, track creation of nested scoped_disable_commit_resumed objects,
> +   for cases like this:
> +
> +     void
> +     inner_func ()
> +     {
> +       scoped_disable_commit_resumed disable;
> +       // do stuff
> +     }
> +
> +     void
> +     outer_func ()
> +     {
> +       scoped_disable_commit_resumed disable;
> +
> +       for (... each thread ...)
> +	 inner_func ();
> +     }
> +
> +   In this case, we don't want the `disable` in `inner_func` to require targets
> +   to commit resumed threads in its destructor.  */
> +
> +struct scoped_disable_commit_resumed
> +{
> +  scoped_disable_commit_resumed (const char *reason);
> +  ~scoped_disable_commit_resumed ();
> +
> +  DISABLE_COPY_AND_ASSIGN (scoped_disable_commit_resumed);
> +
> +private:
> +  const char *m_reason;
> +};
> +
>  #endif /* INFRUN_H */
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index dc524cf10dc1..9adec81ba132 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -1661,6 +1661,8 @@ linux_nat_target::resume (ptid_t ptid, int step, enum gdb_signal signo)
>  			   ? strsignal (gdb_signal_to_host (signo)) : "0"),
>  			  target_pid_to_str (inferior_ptid).c_str ());
>  
> +  gdb_assert (!this->commit_resumed_state);
> +
>    /* A specific PTID means `step only this process id'.  */
>    resume_many = (minus_one_ptid == ptid
>  		 || ptid.is_pid ());
> @@ -3406,6 +3408,8 @@ linux_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
>    linux_nat_debug_printf ("[%s], [%s]", target_pid_to_str (ptid).c_str (),
>  			  target_options_to_string (target_options).c_str ());
>  
> +  gdb_assert (!this->commit_resumed_state);
> +
>    /* Flush the async file first.  */
>    if (target_is_async_p ())
>      async_file_flush ();
> @@ -4166,6 +4170,7 @@ linux_nat_stop_lwp (struct lwp_info *lwp)
>  void
>  linux_nat_target::stop (ptid_t ptid)
>  {
> +  gdb_assert (!this->commit_resumed_state);
>    iterate_over_lwps (ptid, linux_nat_stop_lwp);
>  }
>  
> diff --git a/gdb/mi/mi-main.c b/gdb/mi/mi-main.c
> index 9a14d78e1e27..e5653ea3e3f5 100644
> --- a/gdb/mi/mi-main.c
> +++ b/gdb/mi/mi-main.c
> @@ -266,6 +266,8 @@ exec_continue (char **argv, int argc)
>  {
>    prepare_execution_command (current_top_target (), mi_async_p ());
>  
> +  scoped_disable_commit_resumed disable_commit_resumed ("mi continue");
> +
>    if (non_stop)
>      {
>        /* In non-stop mode, 'resume' always resumes a single thread.
> diff --git a/gdb/process-stratum-target.c b/gdb/process-stratum-target.c
> index 1436a550ac04..9877f0d81931 100644
> --- a/gdb/process-stratum-target.c
> +++ b/gdb/process-stratum-target.c
> @@ -99,6 +99,20 @@ all_non_exited_process_targets ()
>  
>  /* See process-stratum-target.h.  */
>  
> +std::set<process_stratum_target *>
> +all_process_targets ()
> +{
> +  /* Inferiors may share targets.  To eliminate duplicates, use a set.  */
> +  std::set<process_stratum_target *> targets;
> +  for (inferior *inf : all_inferiors ())
> +    if (inf->process_target () != nullptr)
> +      targets.insert (inf->process_target ());
> +
> +  return targets;
> +}
> +
> +/* See process-stratum-target.h.  */
> +
>  void
>  switch_to_target_no_thread (process_stratum_target *target)
>  {
> @@ -108,26 +122,3 @@ switch_to_target_no_thread (process_stratum_target *target)
>        break;
>      }
>  }
> -
> -/* If true, `maybe_commit_resume_process_target` is a no-op.  */
> -
> -static bool defer_process_target_commit_resume;
> -
> -/* See target.h.  */
> -
> -void
> -maybe_commit_resume_process_target (process_stratum_target *proc_target)
> -{
> -  if (defer_process_target_commit_resume)
> -    return;
> -
> -  proc_target->commit_resume ();
> -}
> -
> -/* See process-stratum-target.h.  */
> -
> -scoped_restore_tmpl<bool>
> -make_scoped_defer_process_target_commit_resume ()
> -{
> -  return make_scoped_restore (&defer_process_target_commit_resume, true);
> -}
> diff --git a/gdb/process-stratum-target.h b/gdb/process-stratum-target.h
> index c8060c46be93..3cea911dee09 100644
> --- a/gdb/process-stratum-target.h
> +++ b/gdb/process-stratum-target.h
> @@ -63,19 +63,10 @@ class process_stratum_target : public target_ops
>    bool has_registers () override;
>    bool has_execution (inferior *inf) override;
>  
> -  /* Commit a series of resumption requests previously prepared with
> -     resume calls.
> +  /* Ensure that all resumed threads are committed to the target.
>  
> -     GDB always calls `commit_resume` on the process stratum target after
> -     calling `resume` on a target stack.  A process stratum target may thus use
> -     this method in coordination with its `resume` method to batch resumption
> -     requests.  In that case, the target doesn't actually resume in its
> -     `resume` implementation.  Instead, it takes note of resumption intent in
> -     `resume`, and defers the actual resumption `commit_resume`.
> -
> -     E.g., the remote target uses this to coalesce multiple resumption requests
> -     in a single vCont packet.  */
> -  virtual void commit_resume () {}
> +     See the description of COMMIT_RESUMED_STATE for more details.  */
> +  virtual void commit_resumed () {}
>  
>    /* True if any thread is, or may be executing.  We need to track
>       this separately because until we fully sync the thread list, we
> @@ -86,6 +77,35 @@ class process_stratum_target : public target_ops
>  
>    /* The connection number.  Visible in "info connections".  */
>    int connection_number = 0;
> +
> +  /* Whether resumed threads must be committed to the target.
> +
> +     When true, resumed threads must be committed to the execution target.
> +
> +     When false, the process stratum target may leave resumed threads stopped
> +     when it's convenient or efficient to do so.  When the core requires resumed
> +     threads to be committed again, this is set back to true and calls the
> +     `commit_resumed` method to allow the target to do so.
> +
> +     To simplify the implementation of process stratum targets, the following
> +     methods are guaranteed to be called with COMMIT_RESUMED_STATE set to
> +     false:
> +
> +       - resume
> +       - stop
> +       - wait
> +
> +     Knowing this, the process stratum target doesn't need to implement
> +     different behaviors depending on the COMMIT_RESUMED_STATE, and can
> +     simply assert that it is false.
> +
> +     Process stratum targets can take advantage of this to batch resumption
> +     requests, for example.  In that case, the target doesn't actually resume in
> +     its `resume` implementation.  Instead, it takes note of the resumption
> +     intent in `resume` and defers the actual resumption to `commit_resumed`.
> +     For example, the remote target uses this to coalesce multiple resumption
> +     requests in a single vCont packet.  */
> +  bool commit_resumed_state = false;
>  };
>  
>  /* Downcast TARGET to process_stratum_target.  */
> @@ -101,24 +121,13 @@ as_process_stratum_target (target_ops *target)
>  
>  extern std::set<process_stratum_target *> all_non_exited_process_targets ();
>  
> +/* Return a collection of all existing process stratum targets.  */
> +
> +extern std::set<process_stratum_target *> all_process_targets ();
> +
>  /* Switch to the first inferior (and program space) of TARGET, and
>     switch to no thread selected.  */
>  
>  extern void switch_to_target_no_thread (process_stratum_target *target);
>  
> -/* Commit a series of resumption requests previously prepared with
> -   target_resume calls.
> -
> -   This function is a no-op if commit resumes are deferred (see
> -   `make_scoped_defer_process_target_commit_resume`).  */
> -
> -extern void maybe_commit_resume_process_target
> -  (process_stratum_target *target);
> -
> -/* Setup to defer `commit_resume` calls, and re-set to the previous status on
> -   destruction.  */
> -
> -extern scoped_restore_tmpl<bool>
> -  make_scoped_defer_process_target_commit_resume ();
> -
>  #endif /* !defined (PROCESS_STRATUM_TARGET_H) */
> diff --git a/gdb/record-full.c b/gdb/record-full.c
> index 56ab29479874..fad355afdf4f 100644
> --- a/gdb/record-full.c
> +++ b/gdb/record-full.c
> @@ -1263,7 +1263,9 @@ record_full_wait_1 (struct target_ops *ops,
>  					    "issuing one more step in the "
>  					    "target beneath\n");
>  		      ops->beneath ()->resume (ptid, step, GDB_SIGNAL_0);
> -		      proc_target->commit_resume ();
> +		      proc_target->commit_resumed_state = true;
> +		      proc_target->commit_resumed ();
> +		      proc_target->commit_resumed_state = false;
>  		      continue;
>  		    }
>  		}
> diff --git a/gdb/remote.c b/gdb/remote.c
> index f8150f39fb5c..be53886c1837 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c
> @@ -421,7 +421,7 @@ class remote_target : public process_stratum_target
>    void detach (inferior *, int) override;
>    void disconnect (const char *, int) override;
>  
> -  void commit_resume () override;
> +  void commit_resumed () override;
>    void resume (ptid_t, int, enum gdb_signal) override;
>    ptid_t wait (ptid_t, struct target_waitstatus *, target_wait_flags) override;
>  
> @@ -6376,6 +6376,8 @@ remote_target::resume (ptid_t ptid, int step, enum gdb_signal siggnal)
>  {
>    struct remote_state *rs = get_remote_state ();
>  
> +  gdb_assert (!this->commit_resumed_state);
> +
>    /* When connected in non-stop mode, the core resumes threads
>       individually.  Resuming remote threads directly in target_resume
>       would thus result in sending one packet per thread.  Instead, to
> @@ -6565,7 +6567,7 @@ vcont_builder::push_action (ptid_t ptid, bool step, gdb_signal siggnal)
>  /* to_commit_resume implementation.  */
>  
>  void
> -remote_target::commit_resume ()
> +remote_target::commit_resumed ()
>  {
>    int any_process_wildcard;
>    int may_global_wildcard_vcont;
> @@ -6640,6 +6642,8 @@ remote_target::commit_resume ()
>       disable process and global wildcard resumes appropriately.  */
>    check_pending_events_prevent_wildcard_vcont (&may_global_wildcard_vcont);
>  
> +  bool any_pending_vcont_resume = false;
> +
>    for (thread_info *tp : all_non_exited_threads (this))
>      {
>        remote_thread_info *priv = get_remote_thread_info (tp);
> @@ -6656,6 +6660,9 @@ remote_target::commit_resume ()
>  	  continue;
>  	}
>  
> +      if (priv->resume_state () == resume_state::RESUMED_PENDING_VCONT)
> +	any_pending_vcont_resume = true;
> +
>        /* If a thread is the parent of an unfollowed fork, then we
>  	 can't do a global wildcard, as that would resume the fork
>  	 child.  */
> @@ -6663,6 +6670,11 @@ remote_target::commit_resume ()
>  	may_global_wildcard_vcont = 0;
>      }
>  
> +  /* We didn't have any resumed thread pending a vCont resume, so nothing to
> +     do.  */
> +  if (!any_pending_vcont_resume)
> +    return;
> +
>    /* Now let's build the vCont packet(s).  Actions must be appended
>       from narrower to wider scopes (thread -> process -> global).  If
>       we end up with too many actions for a single packet vcont_builder
> @@ -6735,7 +6747,35 @@ remote_target::commit_resume ()
>    vcont_builder.flush ();
>  }
>  
> -
> +struct stop_reply : public notif_event
> +{
> +  ~stop_reply ();
> +
> +  /* The identifier of the thread about this event  */
> +  ptid_t ptid;
> +
> +  /* The remote state this event is associated with.  When the remote
> +     connection, represented by a remote_state object, is closed,
> +     all the associated stop_reply events should be released.  */
> +  struct remote_state *rs;
> +
> +  struct target_waitstatus ws;
> +
> +  /* The architecture associated with the expedited registers.  */
> +  gdbarch *arch;
> +
> +  /* Expedited registers.  This makes remote debugging a bit more
> +     efficient for those targets that provide critical registers as
> +     part of their normal status mechanism (as another roundtrip to
> +     fetch them is avoided).  */
> +  std::vector<cached_reg_t> regcache;
> +
> +  enum target_stop_reason stop_reason;
> +
> +  CORE_ADDR watch_data_address;
> +
> +  int core;
> +};
>  
>  /* Non-stop version of target_stop.  Uses `vCont;t' to stop a remote
>     thread, all threads of a remote process, or all threads of all
> @@ -6748,6 +6788,39 @@ remote_target::remote_stop_ns (ptid_t ptid)
>    char *p = rs->buf.data ();
>    char *endp = p + get_remote_packet_size ();
>  
> +  gdb_assert (!this->commit_resumed_state);
> +
> +  /* If any threads that needs to stop are pending a vCont resume, generate
> +     dummy stop_reply events.  */
> +  for (thread_info *tp : all_non_exited_threads (this, ptid))
> +    {
> +      remote_thread_info *remote_thr = get_remote_thread_info (tp);
> +
> +      if (remote_thr->resume_state () == resume_state::RESUMED_PENDING_VCONT)
> +	{
> +	  if (remote_debug)
> +	    {
> +	      fprintf_unfiltered (gdb_stdlog,
> +				  "remote_stop_ns: Enqueueing phony stop reply "
> +				  "for thread pending vCont-resume "
> +				  "(%d, %ld, %ld)\n",
> +				  tp->ptid.pid(), tp->ptid.lwp (),
> +				  tp->ptid.tid ());
> +	    }
> +
> +	  stop_reply *sr = new stop_reply ();
> +	  sr->ptid = tp->ptid;
> +	  sr->rs = rs;
> +	  sr->ws.kind = TARGET_WAITKIND_STOPPED;
> +	  sr->ws.value.sig = GDB_SIGNAL_0;
> +	  sr->arch = tp->inf->gdbarch;
> +	  sr->stop_reason = TARGET_STOPPED_BY_NO_REASON;
> +	  sr->watch_data_address = 0;
> +	  sr->core = 0;
> +	  this->push_stop_reply (sr);
> +	}
> +    }
> +
>    /* FIXME: This supports_vCont_probed check is a workaround until
>       packet_support is per-connection.  */
>    if (packet_support (PACKET_vCont) == PACKET_SUPPORT_UNKNOWN
> @@ -6955,36 +7028,6 @@ remote_console_output (const char *msg)
>    gdb_stdtarg->flush ();
>  }
>  
> -struct stop_reply : public notif_event
> -{
> -  ~stop_reply ();
> -
> -  /* The identifier of the thread about this event  */
> -  ptid_t ptid;
> -
> -  /* The remote state this event is associated with.  When the remote
> -     connection, represented by a remote_state object, is closed,
> -     all the associated stop_reply events should be released.  */
> -  struct remote_state *rs;
> -
> -  struct target_waitstatus ws;
> -
> -  /* The architecture associated with the expedited registers.  */
> -  gdbarch *arch;
> -
> -  /* Expedited registers.  This makes remote debugging a bit more
> -     efficient for those targets that provide critical registers as
> -     part of their normal status mechanism (as another roundtrip to
> -     fetch them is avoided).  */
> -  std::vector<cached_reg_t> regcache;
> -
> -  enum target_stop_reason stop_reason;
> -
> -  CORE_ADDR watch_data_address;
> -
> -  int core;
> -};
> -
>  /* Return the length of the stop reply queue.  */
>  
>  int
> @@ -7877,6 +7920,8 @@ remote_target::wait_ns (ptid_t ptid, struct target_waitstatus *status,
>    int ret;
>    int is_notif = 0;
>  
> +  gdb_assert (!this->commit_resumed_state);
> +
>    /* If in non-stop mode, get out of getpkt even if a
>       notification is received.	*/
>  
> -- 
> 2.29.2
>