This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v2 3/3] PR remote/19496, timeout in forking-threads-plus-bkpt


On 3/15/2016 8:29 AM, Pedro Alves wrote:
> Hi Don,
> 
> Sorry for the delay.  I had to find a moment to think this through.
> 
> On 02/11/2016 12:28 AM, Don Breazeal wrote:
>> On 2/1/2016 12:09 PM, Pedro Alves wrote:
>>> On 02/01/2016 07:29 PM, Don Breazeal wrote:
>>>> On 2/1/2016 4:05 AM, Pedro Alves wrote:
>>>
>>>> Hi Pedro,
>> ---snip---
>>>> A fork event was reported to GDB before GDB knew about the parent thread,
>>>> followed immediately by a breakpoint event in a different thread.  The
>>>> parent thread was subsequently added via remote_notice_new_inferior in
>>>> process_stop_reply, but when the thread was added the thread_info.state
>>>> was set to THREAD_STOPPED.  The fork event was then handled correctly,
>>>> but when the fork parent was resumed via a call to keep_going, the state
>>>> was unchanged.
>>>
>>> Since this is non-stop, then it sounds to me like the bug is that the
>>> thread should have been added in THREAD_RUNNING state.
>>>
>>> Consider that infrun may be pulling target events out of the target_ops
>>> backend into its own event queue, but, not process them immediately.
>>>
>>> E.g., infrun may be stopping all threads temporarily for a step-over-breakpoint
>>> operation for thread A (stop_all_threads).  The waitstatus of all threads
>>> is thus left pending in the thread structure (save_status), including the
>>> fork event of thread B.  Right at this point, if the user
>>> does "info threads", that should show thread B (the fork parent) running,
>>> not stopped, even if internally, gdb is holding it paused for a little bit.
>>>
>>
>> Hi Pedro,
>> Here is a new patch that adds the threads with the state set to
>> THREAD_RUNNING for fork events.
> 
> I really meant it for _all_ kinds of events, not just fork.

Oh.  Oops.

> 
> I don't think we can do this:
> 
> +      if (status->kind == TARGET_WAITKIND_FORKED
> +	  || status->kind == TARGET_WAITKIND_VFORKED)
> +	remote_notice_new_inferior (ptid, 1);
> 
> in all-stop mode, as passing 1 means we'll end up installing a
> continuation that immediately resumes the whole process.

Huh.  Reading through the code I see this now, but don't understand why
I didn't run into any problems in my testing.  I guess I can go back and
do some debugging to clarify that for myself.

> 
> +      else
> +       remote_notice_new_inferior (ptid, 0);
> 
> and with this, if in non-stop, and the event ends up pending
> on the infrun side, "info threads" will be confused for that
> thread too.

I get this part now.

> 
> So the fix requires a bit of plumbing to pass down the correct
> external and internal thread states.  Something like the patch
> below.
> 
> WDYT?  Does it fix the timeout you observe?

This does fix the timeout, and it makes sense.

Nit: blank line in ChangeLog.
> 
> (We should really come up with better, less confusing names for the
> "running" vs "executing" distinction.  Maybe external/internal running
> state, or user/internal, or public/private...)

I like user/internal, like "user_run_state" and "internal_run_state".
But I never won any prizes for my ability to name things. :-S

Along the same lines, I don't get the distinction between
thread_info.resumed and thread_info.state==THREAD_RUNNING.  It seems
like they both mean that the thread is running from the user perspective.

I'm running the full GDB testsuite on the Nios II target to see if
anything else pops up.  I'll let you know the results of that.

Thanks for the review and the rework.
--Don

> 
> --------------
> [PATCH] PR remote/19496, timeout in forking-threads-plus-bkpt
> 
> This patch addresses a failure in
> gdb.threads/forking-threads-plus-breakpoint.exp:
> 
> FAIL: gdb.threads/forking-threads-plus-breakpoint.exp: cond_bp_target=1:
> detach_on_fork=on: inferior 1 exited (timeout)
> 
> Cause:
> 
> A fork event was reported to GDB before GDB knew about the parent
> thread, followed immediately by a breakpoint event in a different
> thread.  The parent thread was subsequently added via
> remote_notice_new_inferior in process_stop_reply, but when the thread
> was added the thread_info.state was set to THREAD_STOPPED.  The fork
> event was then handled correctly, but when the fork parent was resumed
> via a call to keep_going, the state was unchanged.
> 
> The breakpoint event was then handled, which caused all the
> non-breakpoint threads to be stopped.  When the breakpoint thread was
> resumed, all the non-breakpoint threads were resumed via
> infrun.c:restart_threads.  Our old fork parent wasn't restarted,
> because it still had thread_info.state set to THREAD_STOPPED.
> Ultimately the program under debug hung waiting for a pthread_join
> while the old fork parent was stopped forever by GDB.
> 
> Fix:
> 
> Since this is non-stop, then the bug is that the thread should have
> been added in THREAD_RUNNING state.  Consider that infrun may be
> pulling target events out of the target_ops backend into its own event
> queue, but, not process them immediately.  E.g., infrun may be
> stopping all threads temporarily for a step-over-breakpoint operation
> for thread A (stop_all_threads).  The waitstatus of all threads is
> thus left pending in the thread structure (save_status), including the
> fork event of thread B.  Right at this point, if the user does "info
> threads", that should show thread B (the fork parent) running, not
> stopped, even if internally, gdb is holding it paused for a little
> bit.
> 
> Thus if in non-stop mode, always add new threads in the external
> user-visible THREAD_RUNNING state.  Change remote_notice_new_inferior
> to accept the internal executing state of the thread instead, with
> EXECUTING set to 1 when we discover a thread that is running on the
> target (such as through remote_update_thread_list), and 0 when the
> thread is really paused (such as when we see a stop reply).
> 
> Tested on x86_64 Linux and Nios II Linux target with x86 Linux host.
> 
> gdb/ChangeLog:
> 2016-03-15  Pedro Alves  <palves@redhat.com>
> 	    Don Breazeal  <donb@codesourcery.com>
> 
> 	* infcmd.c (notice_new_inferior): Use the 'leave_running' argument
> 	instead of checking the 'non_stop' global.
> 
> 	* remote.c (remote_add_thread): New parameter 'executing'.  Use it
> 	to set the new thread's executing state.
> 	(remote_notice_new_inferior): Rename parameter 'running' to
> 	'executing'.  Always set the thread state to THREAD_RUNNING in
> 	non-stop mode, and to THREAD_STOPPED in all-stop mode.  Pass
> 	EXECUTING to remote_add_thread and notice_new_inferior.
> 	(remote_update_thread_list): Update to pass executing state, not
> 	running state.
> ---
>  gdb/infcmd.c |  7 +------
>  gdb/remote.c | 30 ++++++++++++++++++------------
>  2 files changed, 19 insertions(+), 18 deletions(-)
> 
> diff --git a/gdb/infcmd.c b/gdb/infcmd.c
> index a80bf0a..d687116 100644
> --- a/gdb/infcmd.c
> +++ b/gdb/infcmd.c
> @@ -2903,11 +2903,7 @@ notice_new_inferior (ptid_t ptid, int leave_running, int from_tty)
>  
>    old_chain = make_cleanup (null_cleanup, NULL);
>  
> -  /* If in non-stop, leave threads as running as they were.  If
> -     they're stopped for some reason other than us telling it to, the
> -     target reports a signal != GDB_SIGNAL_0.  We don't try to
> -     resume threads with such a stop signal.  */
> -  mode = non_stop ? ATTACH_POST_WAIT_RESUME : ATTACH_POST_WAIT_NOTHING;
> +  mode = leave_running ? ATTACH_POST_WAIT_RESUME : ATTACH_POST_WAIT_NOTHING;
>  
>    if (!ptid_equal (inferior_ptid, null_ptid))
>      make_cleanup_restore_current_thread ();
> @@ -2943,7 +2939,6 @@ notice_new_inferior (ptid_t ptid, int leave_running, int from_tty)
>        return;
>      }
>  
> -  mode = leave_running ? ATTACH_POST_WAIT_RESUME : ATTACH_POST_WAIT_NOTHING;
>    attach_post_wait ("" /* args */, from_tty, mode);
>  
>    do_cleanups (old_chain);
> diff --git a/gdb/remote.c b/gdb/remote.c
> index f09a06e..af0a08a 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c
> @@ -1801,7 +1801,7 @@ remote_add_inferior (int fake_pid_p, int pid, int attached,
>     according to RUNNING.  */
>  
>  static void
> -remote_add_thread (ptid_t ptid, int running)
> +remote_add_thread (ptid_t ptid, int running, int executing)
>  {
>    struct remote_state *rs = get_remote_state ();
>  
> @@ -1816,7 +1816,7 @@ remote_add_thread (ptid_t ptid, int running)
>    else
>      add_thread (ptid);
>  
> -  set_executing (ptid, running);
> +  set_executing (ptid, executing);
>    set_running (ptid, running);
>  }
>  
> @@ -1824,11 +1824,17 @@ remote_add_thread (ptid_t ptid, int running)
>     It may be the first time we hear about such thread, so take the
>     opportunity to add it to GDB's thread list.  In case this is the
>     first time we're noticing its corresponding inferior, add it to
> -   GDB's inferior list as well.  */
> +   GDB's inferior list as well.  EXECUTING indicates whether the
> +   thread is (internally) executing or stopped.  */
>  
>  static void
> -remote_notice_new_inferior (ptid_t currthread, int running)
> +remote_notice_new_inferior (ptid_t currthread, int executing)
>  {
> +  /* In non-stop mode, we assume new found threads are (externally)
> +     running until proven otherwise with a stop reply.  In all-stop,
> +     we can only get here if all threads are stopped.  */
> +  int running = target_is_non_stop_p () ? 1 : 0;
> +
>    /* If this is a new thread, add it to GDB's thread list.
>       If we leave it up to WFI to do this, bad things will happen.  */
>  
> @@ -1836,7 +1842,7 @@ remote_notice_new_inferior (ptid_t currthread, int running)
>      {
>        /* We're seeing an event on a thread id we knew had exited.
>  	 This has to be a new thread reusing the old id.  Add it.  */
> -      remote_add_thread (currthread, running);
> +      remote_add_thread (currthread, running, executing);
>        return;
>      }
>  
> @@ -1857,7 +1863,7 @@ remote_notice_new_inferior (ptid_t currthread, int running)
>  	    thread_change_ptid (inferior_ptid, currthread);
>  	  else
>  	    {
> -	      remote_add_thread (currthread, running);
> +	      remote_add_thread (currthread, running, executing);
>  	      inferior_ptid = currthread;
>  	    }
>  	  return;
> @@ -1888,7 +1894,7 @@ remote_notice_new_inferior (ptid_t currthread, int running)
>  	}
>  
>        /* This is really a new thread.  Add it.  */
> -      remote_add_thread (currthread, running);
> +      remote_add_thread (currthread, running, executing);
>  
>        /* If we found a new inferior, let the common code do whatever
>  	 it needs to with it (e.g., read shared libraries, insert
> @@ -1899,7 +1905,7 @@ remote_notice_new_inferior (ptid_t currthread, int running)
>  	  struct remote_state *rs = get_remote_state ();
>  
>  	  if (!rs->starting_up)
> -	    notice_new_inferior (currthread, running, 0);
> +	    notice_new_inferior (currthread, executing, 0);
>  	}
>      }
>  }
> @@ -3259,12 +3265,12 @@ remote_update_thread_list (struct target_ops *ops)
>  	    {
>  	      struct private_thread_info *info;
>  	      /* In non-stop mode, we assume new found threads are
> -		 running until proven otherwise with a stop reply.  In
> -		 all-stop, we can only get here if all threads are
> +		 executing until proven otherwise with a stop reply.
> +		 In all-stop, we can only get here if all threads are
>  		 stopped.  */
> -	      int running = target_is_non_stop_p () ? 1 : 0;
> +	      int executing = target_is_non_stop_p () ? 1 : 0;
>  
> -	      remote_notice_new_inferior (item->ptid, running);
> +	      remote_notice_new_inferior (item->ptid, executing);
>  
>  	      info = demand_private_info (item->ptid);
>  	      info->core = item->core;
> 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]