This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC]: fix for recycled thread ids


On Wed, Mar 24, 2004 at 10:51:30AM -0500, Daniel Jacobowitz wrote:
> Fortunately, in <gnu/libc-version.h> there is a function to return the
> runtime version of glibc.  We should be able to use that - and the not
> 100% valid, but generally valid and already assumed by thread_db,
> assumption that a native GDB, when used to debug native programs, is
> debugging the same version of the C library - to enable TD_DEATH when
> it is safe to do so.  This will let us detach the threads when they
> die.
> 
> That has its own risks, since the thread continues to run for a short
> while after the death event is reported.  For instance, in NPTL the
> thread reports the event and then cleans up after itself; in LT I don't
> remember whether the manager or the thread does this, but I think it's
> the same.  I already wrote limited code to handle this, if you search
> for "zombie" in thread-db.c, so it should be OK.  The gist is that we
> remove it from the thread list right away, but do not detach the
> thread.  We resist attaching to zombies.
> 
> At least, all that is how it looks to me.  I'll experiment with
> TD_DEATH before I speculate further.

Here's a proof-of-concept patch.  It is not quite right, in that now I
can hit C-c about four times before I get the error().  It does
illustrate what I was suggesting, though, minus the version checking.

The reason we still get the segfaults is not that surprising.  I spent
a little while coaxing it into producing a core dump, and got this:

(gdb) i threads
  7 process 10303  0x40034531 in __nptl_create_event () from /lib/tls/i686/cmov/libpthread.so.0
  6 process 12467  0xffffe410 in __kernel_vsyscall ()
  5 process 12468  0xffffe410 in __kernel_vsyscall ()
  4 process 12469  0x0804884e in threadfunc (arg=0x25) at lphello.c:64
  3 process 12470  0x40034540 in __nptl_death_event () from /lib/tls/i686/cmov/libpthread.so.0
  2 process 12471  0xffffe410 in __kernel_vsyscall ()
* 1 process 12472  0xffffe410 in __kernel_vsyscall ()

So one thread is running, four are probably sleeping, one is dying...
and one is creating another thread.  I'd bet money that we aren't
attached to the one being created yet.  I can produce all sorts of
other interesting GDB internal errors this way, too.

Does this patch fix the problems you were seeing with less pathological
test cases?  Eclipse, in this case, is a less pathological test case,
since it creates threads that actually do a bit of work.  If it does,
I'll add the necessary glibc version checking.  The only way to fix the
race on the create side is to add PTRACE_EVENT_CLONE support (this _is_
what it was for), but that will require some major surgery.

Index: thread-db.c
===================================================================
RCS file: /cvs/src/src/gdb/thread-db.c,v
retrieving revision 1.37
diff -u -p -r1.37 thread-db.c
--- thread-db.c	29 Feb 2004 02:39:47 -0000	1.37
+++ thread-db.c	24 Mar 2004 16:47:23 -0000
@@ -1,6 +1,6 @@
 /* libthread_db assisted debugging support, generic parts.
 
-   Copyright 1999, 2000, 2001, 2003 Free Software Foundation, Inc.
+   Copyright 1999, 2000, 2001, 2003, 2004 Free Software Foundation, Inc.
 
    This file is part of GDB.
 
@@ -130,6 +130,7 @@ static CORE_ADDR td_death_bp_addr;
 static void thread_db_find_new_threads (void);
 static void attach_thread (ptid_t ptid, const td_thrhandle_t *th_p,
 			   const td_thrinfo_t *ti_p, int verbose);
+static void detach_thread (ptid_t ptid, int verbose);
 
 
 /* Building process ids.  */
@@ -154,6 +155,8 @@ struct private_thread_info
   unsigned int th_valid:1;
   unsigned int ti_valid:1;
 
+  unsigned int dying:1;
+
   td_thrhandle_t th;
   td_thrinfo_t ti;
 };
@@ -501,7 +504,7 @@ enable_thread_event_reporting (void)
   /* Set the process wide mask saying which events we're interested in.  */
   td_event_emptyset (&events);
   td_event_addset (&events, TD_CREATE);
-#if 0
+#if 1
   /* FIXME: kettenis/2000-04-23: The event reporting facility is
      broken for TD_DEATH events in glibc 2.1.3, so don't enable it for
      now.  */
@@ -696,6 +699,20 @@ attach_thread (ptid_t ptid, const td_thr
   struct thread_info *tp;
   td_err_e err;
 
+	  /* We may already know about this thread, for instance when the
+	     user has issued the `info threads' command before the SIGTRAP
+	     for hitting the thread creation breakpoint was reported.  */
+  if (in_thread_list (ptid))
+    {
+      tp = find_thread_pid (ptid);
+      gdb_assert (tp != NULL);
+
+      if (!tp->private->dying)
+        return;
+
+      delete_thread (ptid);
+    }
+
   check_thread_signals ();
 
   /* Add the thread to GDB's thread list.  */
@@ -741,8 +758,16 @@ thread_db_attach (char *args, int from_t
 static void
 detach_thread (ptid_t ptid, int verbose)
 {
+  struct thread_info *thread_info;
+
   if (verbose)
     printf_unfiltered ("[%s exited]\n", target_pid_to_str (ptid));
+
+  /* Don't delete the thread now, because it still reports as active until
+     it has executed a few instructions after the event breakpoint.  */
+  thread_info = find_thread_pid (ptid);
+  gdb_assert (thread_info != NULL);
+  thread_info->private->dying = 1;
 }
 
 static void
@@ -848,11 +873,7 @@ check_event (ptid_t ptid)
 	{
 	case TD_CREATE:
 
-	  /* We may already know about this thread, for instance when the
-	     user has issued the `info threads' command before the SIGTRAP
-	     for hitting the thread creation breakpoint was reported.  */
-	  if (!in_thread_list (ptid))
-	    attach_thread (ptid, msg.th_p, &ti, 1);
+	  attach_thread (ptid, msg.th_p, &ti, 1);
 
 	  break;
 
@@ -1121,8 +1142,7 @@ find_new_threads_callback (const td_thrh
 
   ptid = BUILD_THREAD (ti.ti_tid, GET_PID (inferior_ptid));
 
-  if (!in_thread_list (ptid))
-    attach_thread (ptid, th_p, &ti, 1);
+  attach_thread (ptid, th_p, &ti, 1);
 
   return 0;
 }


-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]