[PATCH 5/9] fbsd-nat: Defer any ineligible events reported by wait.

Simon Marchi simark@simark.ca
Mon Mar 20 19:09:34 GMT 2023


On 2/28/23 13:18, John Baldwin wrote:
> If wait_1 finds an event for a thread or process that does not match
> the set of threads and processes previously resumed, defer the event.
> If the event is for a specific thread, suspend the thread and continue
> the associated process before waiting for another event.

I do not understand this, because it doesn't match my mental model of
how things should work (which is derived from working with linux-nat).
In that model, a thread with a pending event should never be
ptrace-resumed.  So the last sentence doesn't make sense, it implies
that a thread has a pending and is ptrace-resumed (since we suspend it).

> One specific example of such an event is if a thread is created while
> another thread in the same process hits a breakpoint.  If the second
> thread's event is reported first, the target resume method does not
> yet "know" about the new thread and will not suspend it via
> PT_SUSPEND.

I was surprised to read that the resume method suspends some threads.  I
don't think it should, but I do see the current code.  The current code
appears to interpret the resume request as: "ensure the state of thread
resumption looks exactly like this", so it suspends threads as needed,
to match the requested resumption state.  I don't think it's supposed to
work like that.  The resume method should only resume some threads, and
if the core of GDB wants some threads stopped, it will call the stop
method.

Anyhow, just to be sure I understand the problem you describe: when
fbsd-nat reports the breakpoint hit from "another thread", is the
"thread created" event still in the kernel, not consumed by GDB?

> When wait is called, it will probably return the event
> from the first thread before the result of the step from second
> thread.  This is the case reported in PR 21497.

> ---
>  gdb/fbsd-nat.c | 34 +++++++++++++++++++++++++++++++++-
>  1 file changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/fbsd-nat.c b/gdb/fbsd-nat.c
> index ca278b871ef..3f7278c6ea0 100644
> --- a/gdb/fbsd-nat.c
> +++ b/gdb/fbsd-nat.c
> @@ -1514,7 +1514,39 @@ fbsd_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
>    if (is_async_p ())
>      async_file_flush ();
>  
> -  wptid = wait_1 (ptid, ourstatus, target_options);
> +  while (1)
> +    {
> +      wptid = wait_1 (ptid, ourstatus, target_options);
> +
> +      /* If no event was found, just return.  */
> +      if (ourstatus->kind () == TARGET_WAITKIND_IGNORE
> +	  || ourstatus->kind () == TARGET_WAITKIND_NO_RESUMED)
> +	break;
> +
> +      /* If an event is reported for a thread or process while
> +	 stepping some other thread, suspend the thread reporting the
> +	 event and defer the event until it can be reported to the
> +	 core.  */
> +      if (!wptid.matches (resume_ptid))
> +	{
> +	  add_pending_event (wptid, *ourstatus);
> +	  fbsd_nat_debug_printf ("deferring event [%s], [%s]",
> +				 target_pid_to_str (wptid).c_str (),
> +				 ourstatus->to_string ().c_str ());
> +	  if (wptid.pid () == resume_ptid.pid ())
> +	    {
> +	      fbsd_nat_debug_printf ("suspending thread [%s]",
> +				     target_pid_to_str (wptid).c_str ());
> +	      if (ptrace (PT_SUSPEND, wptid.lwp (), NULL, 0) == -1)
> +		perror_with_name (("ptrace (PT_SUSPEND)"));


This is the part I don't understand.  After wait_1 returned an event for
a specific thread, I would expect this thread to be ptrace-stopped.  So,
I don't see the need to suspend it.  You'd just leave it in the "resumed
from the
until.

> +	      if (ptrace (PT_CONTINUE, wptid.pid (), (caddr_t) 1, 0) == -1)
> +		perror_with_name (("ptrace (PT_CONTINUE)"));

I don't get why you continue `wptid.pid ()`.  What assures you that this
particular thread was stopped previously?  Perhaps I don't understand
because I assume somethings about pids/lwps and ptrace that are only
true on Linux.

Simon


More information about the Gdb-patches mailing list