This is the mail archive of the mailing list for the GDB project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Fix for bugzilla/16168

The test-case at
demonstrates gdbserver locking up under a race-condition.

The logic here is very hairy, and this is my best estimate as to what
is happening, I could be wrong.

This only happens with gdbserver and remote debugging. Consider two
threads T1 and T2 while debugging in all-stop mode:

The setup:
T1: hits a breakpoint
gdbserver: sends SIGSTOP to T2
gdbserver: waits for T2 to receive the SIGSTOP
T2: Receives SIGPROF from OS and notifies gdbserver
gdbserver: receives notification that T2 stopped with SIGPROF, but it
is expecting a SIGSTOP
gdbserver: knows everyone is stopped now
gdbserver: marks that T2 has pending SIGPROF

The bug:
in linux_resume, gdbserver restarts T2 to handle the SIGPROF, but
believes that it can just report the pending status, and doesn't need
to restart the other threads.

This would be true if T2 was also at the breakpoint, or if T1 needed a
step over. But neither of these is true. T1 therefore never gets
restarted, and eventually T2 has no more work to do. And gdbserver
locks up.

The enclosed patch has two components:

1. More tracing, which I think is likely useful regardless of the rest.

2. An additional check in resume_status_pending_p to see if the thread
was stopped at a breakpoint, that follows some of the logic in
linux_wait_1 to set report_to_gdb

I'm not sure that this is the complete, correct check--likely it needs
even more from linux_wait_1. One could either simplify it or make it
more complex, but what is there works for this test. I would love to
figure out a way to test all different conditions here, but it has
been hard to find a case that exercises both sides at all, but less
each of the sub-components.

I do have such a test-case, and will attach it to the bug. This is all
quite racy, and so a bit hard to test automatically without manually
checking the log. Before I make this a real patch submission, does
anyone have any comments for this approach?


diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 799fcc9..00d29a2 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -2564,6 +2564,8 @@ Check if we're already there.\n",
 	info_p = NULL;
       linux_resume_one_lwp (event_child, event_child->stepping,
 			    WSTOPSIG (w), info_p);
+      if (debug_threads)
+        fprintf (stderr, "retrying...\n");
       goto retry;
@@ -2669,6 +2671,9 @@ Check if we're already there.\n",
       if (ptid_equal (ptid, minus_one_ptid))
 	  event_child->status_pending_p = 1;
+          if (debug_threads)
+            fprintf (stderr, "LWP %ld becomes pending\n",
+                     lwpid_of (event_child));
 	  event_child->status_pending = w;
 	  select_event_lwp (&event_child);
@@ -3149,9 +3154,9 @@ linux_resume_one_lwp (struct lwp_info *lwp,
   current_inferior = get_lwp_thread (lwp);
   if (debug_threads)
-    fprintf (stderr, "Resuming lwp %ld (%s, signal %d, stop %s)\n",
+    fprintf (stderr, "Resuming lwp %ld (%s, signal %d, stop %s, pending %d)\n",
 	     lwpid_of (lwp), step ? "step" : "continue", signal,
-	     lwp->stop_expected ? "expected" : "not expected");
+	     lwp->stop_expected ? "expected" : "not expected", lwp->status_pending_p);
   /* This bit needs some thinking about.  If we get a signal that
      we must report while a single-step reinsert is still pending,
@@ -3385,8 +3390,31 @@ resume_status_pending_p (struct inferior_list_entry *entry, void *flag_p)
   if (lwp->resume == NULL)
     return 0;
+  /* If an LWP is pending, but the signal won't be reported to gdb,
+     allow all threads to continue.  */
   if (lwp->status_pending_p)
-    * (int *) flag_p = 1;
+    {
+      if ((current_inferior->last_resume_kind == resume_step
+           && !lwp_in_step_range (lwp))
+          || lwp->stopped_by_watchpoint
+          || (gdb_breakpoint_here (lwp->stop_pc)
+              && gdb_condition_true_at_breakpoint (lwp->stop_pc)
+              && gdb_no_commands_at_breakpoint (lwp->stop_pc)))
+        {
+          * (int *) flag_p = 1;
+          if (debug_threads)
+            fprintf (stderr, "LWP %ld has status pending\n",
+                     lwpid_of (lwp));
+        }
+      else
+        {
+          if (debug_threads)
+            fprintf (stderr,
+                     "LWP %ld has status pending, but no breakpoint\n",
+                     lwpid_of (lwp));
+        }
+    }
   return 0;

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]