This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.

Re: [non-stop] 08/10 linux native support

From: Pedro Alves <pedro at codesourcery dot com>
To: Daniel Jacobowitz <drow at false dot org>
Cc: gdb-patches at sourceware dot org
Date: Thu, 10 Jul 2008 22:51:27 +0100
Subject: Re: [non-stop] 08/10 linux native support
References: <200806152205.49241.pedro@codesourcery.com> <200807101901.23598.pedro@codesourcery.com> <20080710195855.GA24287@caradoc.them.org>

On Thursday 10 July 2008 20:58:55, Daniel Jacobowitz wrote:
> Right.  Could you try this version?

Thanks!

> Basically the same as your previous posting, except that I moved the
> logic assuring we find the first thread when we find the first child
> into the thread-db layer.

This version cleans it up a bit further.  Basically, it gets rid of
the find_lwp_pid call in thread_db_find_new_threads.  Instead, I'm
using ALL_LWPS, which is already exported.  That removes the need
for the find_lwp_pid -> linux_nat_find_lwp_pid rename throughout,
and for thread_db_find_new_threads_1.  I then reimported a couple
of comments and cleanups from the last patch, since you had picked
up the previous-to-last one.

Otherwise, the logic is the same.
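
To illustrate the scan described above, here is a minimal standalone
sketch of picking the first stopped lwp from a list and bailing out
when there is none.  It does not use GDB's real types: struct
lwp_sketch and first_stopped_lwp are hypothetical names made up for
this example, and the plain for loop stands in for ALL_LWPS.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for GDB's struct lwp_info.
   Only the fields needed for the scan are present.  */
struct lwp_sketch
{
  long lwpid;
  int stopped;
  struct lwp_sketch *next;
};

/* Return the first stopped entry in LIST, or NULL if the list is
   empty or every lwp in it is still running.  */
static struct lwp_sketch *
first_stopped_lwp (struct lwp_sketch *list)
{
  struct lwp_sketch *lp;

  for (lp = list; lp != NULL; lp = lp->next)
    if (lp->stopped)
      break;

  /* LP is NULL here if the loop ran off the end of the list.  */
  return lp;
}
```

A caller would treat a NULL result the way the patch does: there is
no stopped thread to read memory through, so it returns without
touching the inferior.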

Regtested on x86_64-unknown-linux-gnu.

OK?

-- 
Pedro Alves

2008-07-10  Pedro Alves  <pedro@codesourcery.com>

	Non-stop linux native.

	* linux-nat.c (linux_test_for_tracefork): Block events while we're
	here.
	(get_pending_status): Implement non-stop mode.
	(linux_nat_detach): Stop threads before detaching.
	(linux_nat_resume): In non-stop mode, always resume only a single
	PTID.
	(linux_handle_extended_wait): On a clone event, in non-stop mode,
	add new lwp to GDB's thread table, and mark as running, executing
	and stopped appropriately.
	(linux_nat_filter_event): Don't assume there are other running
	threads when a thread exits.
	(linux_nat_wait): Mark the main thread as running and executing.
	In non-stop mode, don't stop all lwps.
	(linux_nat_kill): Stop lwps before killing them.
	(linux_nat_thread_alive): Use signal 0 to detect if a thread is
	alive.
	(send_sigint_callback): New.
	(linux_nat_stop): New.
	(linux_nat_add_target): Set to_stop to linux_nat_stop.

	* linux-nat.h (thread_db_attach_lwp): Declare.

	* linux-thread-db.c (thread_get_info_callback): Check for new
	threads if we have none.
	(thread_from_lwp, enable_thread_event): Set proc_handle.pid to the
	stopped lwp.  Check for new threads if we have none.
	(thread_db_attach_lwp): New.
	(thread_db_init): Set proc_handle.pid to inferior_ptid.
	(check_event): Set proc_handle.pid to the stopped lwp.
	(thread_db_find_new_threads): Set proc_handle.pid to any stopped
	lwp available, bail out if there is none.

	* linux-fork.c (linux_fork_killall): Use SIGKILL instead of
	PTRACE_KILL.

---
 gdb/linux-fork.c      |    4 
 gdb/linux-nat.c       |  246 ++++++++++++++++++++++++++++++++++++++++----------
 gdb/linux-nat.h       |    2 
 gdb/linux-thread-db.c |   77 +++++++++++++++
 4 files changed, 282 insertions(+), 47 deletions(-)

Index: src/gdb/linux-nat.c
===================================================================
--- src.orig/gdb/linux-nat.c	2008-07-10 22:14:19.000000000 +0100
+++ src/gdb/linux-nat.c	2008-07-10 22:19:40.000000000 +0100
@@ -285,6 +285,9 @@ static void linux_nat_async (void (*call
 static int linux_nat_async_mask (int mask);
 static int kill_lwp (int lwpid, int signo);
 
+static int send_sigint_callback (struct lwp_info *lp, void *data);
+static int stop_callback (struct lwp_info *lp, void *data);
+
 /* Captures the result of a successful waitpid call, along with the
    options used in that call.  */
 struct waitpid_result
@@ -487,6 +490,9 @@ linux_test_for_tracefork (int original_p
 {
   int child_pid, ret, status;
   long second_pid;
+  enum sigchld_state async_events_original_state;
+
+  async_events_original_state = linux_nat_async_events (sigchld_sync);
 
   linux_supports_tracefork_flag = 0;
   linux_supports_tracevforkdone_flag = 0;
@@ -517,6 +523,7 @@ linux_test_for_tracefork (int original_p
       if (ret != 0)
 	{
 	  warning (_("linux_test_for_tracefork: failed to kill child"));
+	  linux_nat_async_events (async_events_original_state);
 	  return;
 	}
 
@@ -527,6 +534,7 @@ linux_test_for_tracefork (int original_p
 	warning (_("linux_test_for_tracefork: unexpected wait status 0x%x from "
 		 "killed child"), status);
 
+      linux_nat_async_events (async_events_original_state);
       return;
     }
 
@@ -566,6 +574,8 @@ linux_test_for_tracefork (int original_p
   if (ret != 0)
     warning (_("linux_test_for_tracefork: failed to kill child"));
   my_waitpid (child_pid, &status, 0);
+
+  linux_nat_async_events (async_events_original_state);
 }
 
 /* Return non-zero iff we have tracefork functionality available.
@@ -1376,16 +1386,80 @@ get_pending_status (struct lwp_info *lp,
      events are always cached in waitpid_queue.  */
 
   *status = 0;
-  if (GET_LWP (lp->ptid) == GET_LWP (last_ptid))
+
+  if (non_stop)
     {
-      if (stop_signal != TARGET_SIGNAL_0
-	  && signal_pass_state (stop_signal))
-	*status = W_STOPCODE (target_signal_to_host (stop_signal));
+      enum target_signal signo = TARGET_SIGNAL_0;
+
+      if (is_executing (lp->ptid))
+	{
+	  /* If the core thought this lwp was executing --- e.g., the
+	     executing property hasn't been updated yet, but the
+	     thread has been stopped with a stop_callback /
+	     stop_wait_callback sequence (see linux_nat_detach for
+	     example) --- we can only have pending events in the local
+	     queue.  */
+	  if (queued_waitpid (GET_LWP (lp->ptid), status, __WALL) != -1)
+	    {
+	      if (WIFSTOPPED (status))
+		signo = target_signal_from_host (WSTOPSIG (status));
+
+	      /* If not stopped, then the lwp is gone, no use in
+		 resending a signal.  */
+	    }
+	}
+      else
+	{
+	  /* If the core knows the thread is not executing, then we
+	     have the last signal recorded in
+	     thread_info->stop_signal, unless this is inferior_ptid,
+	     in which case, it's in the global stop_signal, due to
+	     context switching.  */
+
+	  if (ptid_equal (lp->ptid, inferior_ptid))
+	    signo = stop_signal;
+	  else
+	    {
+	      struct thread_info *tp = find_thread_pid (lp->ptid);
+	      gdb_assert (tp);
+	      signo = tp->stop_signal;
+	    }
+	}
+
+      if (signo != TARGET_SIGNAL_0
+	  && !signal_pass_state (signo))
+	{
+	  if (debug_linux_nat)
+	    fprintf_unfiltered (gdb_stdlog, "\
+GPT: lwp %s had signal %s, but it is in no pass state\n",
+				target_pid_to_str (lp->ptid),
+				target_signal_to_string (signo));
+	}
+      else
+	{
+	  if (signo != TARGET_SIGNAL_0)
+	    *status = W_STOPCODE (target_signal_to_host (signo));
+
+	  if (debug_linux_nat)
+	    fprintf_unfiltered (gdb_stdlog,
+				"GPT: lwp %s has pending signal %s\n",
+				target_pid_to_str (lp->ptid),
+				target_signal_to_string (signo));
+	}
     }
-  else if (target_can_async_p ())
-    queued_waitpid (GET_LWP (lp->ptid), status, __WALL);
   else
-    *status = lp->status;
+    {
+      if (GET_LWP (lp->ptid) == GET_LWP (last_ptid))
+	{
+	  if (stop_signal != TARGET_SIGNAL_0
+	      && signal_pass_state (stop_signal))
+	    *status = W_STOPCODE (target_signal_to_host (stop_signal));
+	}
+      else if (target_can_async_p ())
+	queued_waitpid (GET_LWP (lp->ptid), status, __WALL);
+      else
+	*status = lp->status;
+    }
 
   return 0;
 }
@@ -1449,6 +1523,13 @@ linux_nat_detach (char *args, int from_t
   if (target_can_async_p ())
     linux_nat_async (NULL, 0);
 
+  /* Stop all threads before detaching.  ptrace requires that the
+     thread is stopped to successfully detach.  */
+  iterate_over_lwps (stop_callback, NULL);
+  /* ... and wait until all of them have reported back that
+     they're no longer running.  */
+  iterate_over_lwps (stop_wait_callback, NULL);
+
   iterate_over_lwps (detach_callback, NULL);
 
   /* Only the initial process should be left right now.  */
@@ -1538,10 +1619,17 @@ linux_nat_resume (ptid_t ptid, int step,
   /* A specific PTID means `step only this process id'.  */
   resume_all = (PIDGET (ptid) == -1);
 
-  if (resume_all)
-    iterate_over_lwps (resume_set_callback, NULL);
-  else
-    iterate_over_lwps (resume_clear_callback, NULL);
+  if (non_stop && resume_all)
+    internal_error (__FILE__, __LINE__,
+		    "can't resume all in non-stop mode");
+
+  if (!non_stop)
+    {
+      if (resume_all)
+	iterate_over_lwps (resume_set_callback, NULL);
+      else
+	iterate_over_lwps (resume_clear_callback, NULL);
+    }
 
   /* If PID is -1, it's the current inferior that should be
      handled specially.  */
@@ -1551,6 +1639,7 @@ linux_nat_resume (ptid_t ptid, int step,
   lp = find_lwp_pid (ptid);
   gdb_assert (lp != NULL);
 
+  /* Convert to something the lower layer understands.  */
   ptid = pid_to_ptid (GET_LWP (lp->ptid));
 
   /* Remember if we're stepping.  */
@@ -1701,9 +1790,12 @@ linux_handle_extended_wait (struct lwp_i
 	ourstatus->kind = TARGET_WAITKIND_VFORKED;
       else
 	{
+	  struct cleanup *old_chain;
+
 	  ourstatus->kind = TARGET_WAITKIND_IGNORE;
 	  new_lp = add_lwp (BUILD_LWP (new_pid, GET_PID (inferior_ptid)));
 	  new_lp->cloned = 1;
+	  new_lp->stopped = 1;
 
 	  if (WSTOPSIG (status) != SIGSTOP)
 	    {
@@ -1720,13 +1812,38 @@ linux_handle_extended_wait (struct lwp_i
 	  else
 	    status = 0;
 
-	  if (stopping)
-	    new_lp->stopped = 1;
-	  else
+	  if (non_stop)
+	    {
+	      /* Add the new thread to GDB's lists as soon as possible
+		 so that:
+
+		 1) the frontend doesn't have to wait for a stop to
+		 display them, and,
+
+		 2) we tag it with the correct running state.  */
+
+	      /* If the thread_db layer is active, let it know about
+		 this new thread, and add it to GDB's list.  */
+	      if (!thread_db_attach_lwp (new_lp->ptid))
+		{
+		  /* We're not using thread_db.  Add it to GDB's
+		     list.  */
+		  target_post_attach (GET_LWP (new_lp->ptid));
+		  add_thread (new_lp->ptid);
+		}
+
+	      if (!stopping)
+		{
+		  set_running (new_lp->ptid, 1);
+		  set_executing (new_lp->ptid, 1);
+		}
+	    }
+
+	  if (!stopping)
 	    {
+	      new_lp->stopped = 0;
 	      new_lp->resumed = 1;
-	      ptrace (PTRACE_CONT,
-		      PIDGET (lp->waitstatus.value.related_pid), 0,
+	      ptrace (PTRACE_CONT, new_pid, 0,
 		      status ? WSTOPSIG (status) : 0);
 	    }
 
@@ -2463,13 +2580,7 @@ linux_nat_filter_event (int lwpid, int s
 	 not the end of the debugged application and should be
 	 ignored.  */
       if (num_lwps > 0)
-	{
-	  /* Make sure there is at least one thread running.  */
-	  gdb_assert (iterate_over_lwps (running_callback, NULL));
-
-	  /* Discard the event.  */
-	  return NULL;
-	}
+	return NULL;
     }
 
   /* Check if the current LWP has previously exited.  In the nptl
@@ -2599,6 +2710,8 @@ linux_nat_wait (ptid_t ptid, struct targ
       lp->resumed = 1;
       /* Add the main thread to GDB's thread list.  */
       add_thread_silent (lp->ptid);
+      set_running (lp->ptid, 1);
+      set_executing (lp->ptid, 1);
     }
 
   sigemptyset (&flush_mask);
@@ -2826,19 +2939,23 @@ retry:
     fprintf_unfiltered (gdb_stdlog, "LLW: Candidate event %s in %s.\n",
 			status_to_str (status), target_pid_to_str (lp->ptid));
 
-  /* Now stop all other LWP's ...  */
-  iterate_over_lwps (stop_callback, NULL);
+  if (!non_stop)
+    {
+      /* Now stop all other LWP's ...  */
+      iterate_over_lwps (stop_callback, NULL);
 
-  /* ... and wait until all of them have reported back that they're no
-     longer running.  */
-  iterate_over_lwps (stop_wait_callback, &flush_mask);
-  iterate_over_lwps (flush_callback, &flush_mask);
-
-  /* If we're not waiting for a specific LWP, choose an event LWP from
-     among those that have had events.  Giving equal priority to all
-     LWPs that have had events helps prevent starvation.  */
-  if (pid == -1)
-    select_event_lwp (&lp, &status);
+      /* ... and wait until all of them have reported back that
+	 they're no longer running.  */
+      iterate_over_lwps (stop_wait_callback, &flush_mask);
+      iterate_over_lwps (flush_callback, &flush_mask);
+
+      /* If we're not waiting for a specific LWP, choose an event LWP
+	 from among those that have had events.  Giving equal priority
+	 to all LWPs that have had events helps prevent
+	 starvation.  */
+      if (pid == -1)
+	select_event_lwp (&lp, &status);
+    }
 
   /* Now that we've selected our final event LWP, cancel any
      breakpoints in other LWPs that have hit a GDB breakpoint.  See
@@ -2970,6 +3087,13 @@ linux_nat_kill (void)
     }
   else
     {
+      /* Stop all threads before killing them, since ptrace requires
+	 that the thread is stopped to successfully PTRACE_KILL.  */
+      iterate_over_lwps (stop_callback, NULL);
+      /* ... and wait until all of them have reported back that
+	 they're no longer running.  */
+      iterate_over_lwps (stop_wait_callback, NULL);
+
       /* Kill all LWP's ...  */
       iterate_over_lwps (kill_callback, NULL);
 
@@ -3022,22 +3146,22 @@ linux_nat_xfer_partial (struct target_op
 static int
 linux_nat_thread_alive (ptid_t ptid)
 {
+  int err;
+
   gdb_assert (is_lwp (ptid));
 
-  errno = 0;
-  ptrace (PTRACE_PEEKUSER, GET_LWP (ptid), 0, 0);
+  /* Send signal 0 instead of anything ptrace, because ptracing a
+     running thread errors out claiming that the thread doesn't
+     exist.  */
+  err = kill_lwp (GET_LWP (ptid), 0);
+
   if (debug_linux_nat)
     fprintf_unfiltered (gdb_stdlog,
-			"LLTA: PTRACE_PEEKUSER %s, 0, 0 (%s)\n",
+			"LLTA: KILL(SIG0) %s (%s)\n",
 			target_pid_to_str (ptid),
-			errno ? safe_strerror (errno) : "OK");
+			err ? safe_strerror (errno) : "OK");
 
-  /* Not every Linux kernel implements PTRACE_PEEKUSER.  But we can
-     handle that case gracefully since ptrace will first do a lookup
-     for the process based upon the passed-in pid.  If that fails we
-     will get either -ESRCH or -EPERM, otherwise the child exists and
-     is alive.  */
-  if (errno == ESRCH || errno == EPERM)
+  if (err != 0)
     return 0;
 
   return 1;
@@ -4239,6 +4363,35 @@ linux_nat_set_async_mode (int on)
   linux_nat_async_enabled = on;
 }
 
+static int
+send_sigint_callback (struct lwp_info *lp, void *data)
+{
+  /* Use is_running instead of !lp->stopped, because the lwp may be
+     stopped due to an internal event, and we want to interrupt it in
+     that case too.  What we want is to check if the thread is stopped
+     from the point of view of the user.  */
+  if (is_running (lp->ptid))
+    kill_lwp (GET_LWP (lp->ptid), SIGINT);
+  return 0;
+}
+
+static void
+linux_nat_stop (ptid_t ptid)
+{
+  if (non_stop)
+    {
+      if (ptid_equal (ptid, minus_one_ptid))
+	iterate_over_lwps (send_sigint_callback, &ptid);
+      else
+	{
+	  struct lwp_info *lp = find_lwp_pid (ptid);
+	  send_sigint_callback (lp, NULL);
+	}
+    }
+  else
+    linux_ops->to_stop (ptid);
+}
+
 void
 linux_nat_add_target (struct target_ops *t)
 {
@@ -4269,6 +4422,9 @@ linux_nat_add_target (struct target_ops 
   t->to_terminal_inferior = linux_nat_terminal_inferior;
   t->to_terminal_ours = linux_nat_terminal_ours;
 
+  /* Methods for non-stop support.  */
+  t->to_stop = linux_nat_stop;
+
   /* We don't change the stratum; this target will sit at
      process_stratum and thread_db will set at thread_stratum.  This
      is a little strange, since this is a multi-threaded-capable
Index: src/gdb/linux-nat.h
===================================================================
--- src.orig/gdb/linux-nat.h	2008-07-10 22:14:19.000000000 +0100
+++ src/gdb/linux-nat.h	2008-07-10 22:14:30.000000000 +0100
@@ -94,6 +94,8 @@ void check_for_thread_db (void);
 /* Tell the thread_db layer what native target operations to use.  */
 void thread_db_init (struct target_ops *);
 
+int thread_db_attach_lwp (ptid_t ptid);
+
 /* Find process PID's pending signal set from /proc/pid/status.  */
 void linux_proc_pending_signals (int pid, sigset_t *pending, sigset_t *blocked, sigset_t *ignored);
 
Index: src/gdb/linux-thread-db.c
===================================================================
--- src.orig/gdb/linux-thread-db.c	2008-07-10 22:14:19.000000000 +0100
+++ src/gdb/linux-thread-db.c	2008-07-10 22:15:05.000000000 +0100
@@ -283,7 +283,10 @@ thread_get_info_callback (const td_thrha
   if (thread_info == NULL)
     {
       /* New thread.  Attach to it now (why wait?).  */
-      attach_thread (thread_ptid, thp, &ti);
+      if (!have_threads ())
+	thread_db_find_new_threads ();
+      else
+	attach_thread (thread_ptid, thp, &ti);
       thread_info = find_thread_pid (thread_ptid);
       gdb_assert (thread_info != NULL);
     }
@@ -308,6 +311,8 @@ thread_from_lwp (ptid_t ptid)
      LWP.  */
   gdb_assert (GET_LWP (ptid) != 0);
 
+  /* Access an lwp we know is stopped.  */
+  proc_handle.pid = GET_LWP (ptid);
   err = td_ta_map_lwp2thr_p (thread_agent, GET_LWP (ptid), &th);
   if (err != TD_OK)
     error (_("Cannot find user-level thread for LWP %ld: %s"),
@@ -332,6 +337,48 @@ thread_from_lwp (ptid_t ptid)
 }
 
 
+/* Attach to lwp PTID, doing whatever else is required to have this
+   LWP under the debugger's control --- e.g., enabling event
+   reporting.  Returns true on success.  */
+int
+thread_db_attach_lwp (ptid_t ptid)
+{
+  td_thrhandle_t th;
+  td_thrinfo_t ti;
+  td_err_e err;
+
+  if (!using_thread_db)
+    return 0;
+
+  /* This ptid comes from linux-nat.c, which should always fill in the
+     LWP.  */
+  gdb_assert (GET_LWP (ptid) != 0);
+
+  /* Access an lwp we know is stopped.  */
+  proc_handle.pid = GET_LWP (ptid);
+
+  /* If we have only looked at the first thread before libpthread was
+     initialized, we may not know its thread ID yet.  Make sure we do
+     before we add another thread to the list.  */
+  if (!have_threads ())
+    thread_db_find_new_threads ();
+
+  err = td_ta_map_lwp2thr_p (thread_agent, GET_LWP (ptid), &th);
+  if (err != TD_OK)
+    /* Cannot find user-level thread.  */
+    return 0;
+
+  err = td_thr_get_info_p (&th, &ti);
+  if (err != TD_OK)
+    {
+      warning (_("Cannot get thread info: %s"), thread_db_err_str (err));
+      return 0;
+    }
+
+  attach_thread (ptid, &th, &ti);
+  return 1;
+}
+
 void
 thread_db_init (struct target_ops *target)
 {
@@ -418,6 +465,9 @@ enable_thread_event (td_thragent_t *thre
   td_notify_t notify;
   td_err_e err;
 
+  /* Access an lwp we know is stopped.  */
+  proc_handle.pid = GET_LWP (inferior_ptid);
+
   /* Get the breakpoint address for thread EVENT.  */
   err = td_ta_event_addr_p (thread_agent, event, &notify);
   if (err != TD_OK)
@@ -761,6 +811,15 @@ check_event (ptid_t ptid)
   if (stop_pc != td_create_bp_addr && stop_pc != td_death_bp_addr)
     return;
 
+  /* Access an lwp we know is stopped.  */
+  proc_handle.pid = GET_LWP (ptid);
+
+  /* If we have only looked at the first thread before libpthread was
+     initialized, we may not know its thread ID yet.  Make sure we do
+     before we add another thread to the list.  */
+  if (!have_threads ())
+    thread_db_find_new_threads ();
+
   /* If we are at a create breakpoint, we do not know what new lwp
      was created and cannot specifically locate the event message for it.
      We have to call td_ta_event_getmsg() to get
@@ -951,11 +1010,27 @@ find_new_threads_callback (const td_thrh
   return 0;
 }
 
+/* Search for new threads, accessing memory through stopped thread
+   PTID.  */
+
 static void
 thread_db_find_new_threads (void)
 {
   td_err_e err;
+  struct lwp_info *lp;
+  ptid_t ptid;
+
+  /* In linux, we can only read memory through a stopped lwp.  */
+  ALL_LWPS (lp, ptid)
+    if (lp->stopped)
+      break;
+
+  if (!lp)
+    /* There is no stopped thread.  Bail out.  */
+    return;
 
+  /* Access an lwp we know is stopped.  */
+  proc_handle.pid = GET_LWP (ptid);
   /* Iterate over all user-space threads to discover new threads.  */
   err = td_ta_thr_iter_p (thread_agent, find_new_threads_callback, NULL,
 			  TD_THR_ANY_STATE, TD_THR_LOWEST_PRIORITY,
Index: src/gdb/linux-fork.c
===================================================================
--- src.orig/gdb/linux-fork.c	2008-07-10 22:14:19.000000000 +0100
+++ src/gdb/linux-fork.c	2008-07-10 22:14:30.000000000 +0100
@@ -337,7 +337,9 @@ linux_fork_killall (void)
     {
       pid = PIDGET (fp->ptid);
       do {
-	ptrace (PT_KILL, pid, 0, 0);
+	/* Use SIGKILL instead of PTRACE_KILL because the former works even
+	   if the thread is running, while the latter doesn't.  */
+	kill (pid, SIGKILL);
 	ret = waitpid (pid, &status, 0);
 	/* We might get a SIGCHLD instead of an exit status.  This is
 	 aggravated by the first kill above - a child has just

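The liveness check in linux_nat_thread_alive relies on sending
signal 0, which performs the existence and permission checks without
delivering anything.  Below is a minimal standalone sketch of the same
idea using plain POSIX kill().  The helper name process_exists is
hypothetical, and unlike the patch (which treats any error as "not
alive") this sketch assumes EPERM should count as "exists".

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Return 1 if a process with id PID exists, 0 otherwise.  Signal 0
   is never delivered; the kernel only checks that the target exists
   and that we are allowed to signal it.  */
static int
process_exists (pid_t pid)
{
  if (kill (pid, 0) == 0)
    return 1;

  /* EPERM means the process is there but we may not signal it.
     Anything else (typically ESRCH) is treated as "gone".  */
  return errno == EPERM;
}
```

Note that a zombie that has not been reaped yet still counts as
existing for this check, so it only answers "is the id still in use",
not "is the thread still making progress".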