This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.



Re: [RFC]: fix for recycled thread ids


On Wed, Mar 24, 2004 at 07:46:39PM -0500, Jeff Johnston wrote:
> Jeff Johnston wrote:
> >I also tried the TD_DEATH method but gave up after running into 
> >segfaults.  I have to go back and recheck whether those segfaults could 
> >occur without the Ctrl-C being issued.  I can't remember off-hand.
> >
> 
> Just to confirm, I don't see any segfaults just running, but this code is 
> extremely brittle on my RHEL3-U1 system.  It is coming back to me why I 
> abandoned TD_DEATH.
> 
> On the first Ctrl-C I usually get:
> 
> Program received signal SIGINT, Interrupt.
> [Switching to Thread -1226028112 (LWP 16580)]
> 0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) info threads
> [New Thread -1226163280 (LWP 16579)]
> Can't attach LWP 16579: Operation not permitted
> (gdb) info threads
> [New Thread -1226298448 (LWP 16578)]
> Can't attach LWP 16578: Operation not permitted

The usual cause of this message is an attempt to attach to a thread
we're already attached to; I saw this error plenty often on earlier
versions of the patch I posted.  Notice that both of those TIDs and LWP
IDs are already in the thread list, and the threads are trying to die:

> (gdb) info threads
>   158 Thread -1226298448 (LWP 16578)  0xb75ce6b1 in __nptl_death_event ()
>    from /lib/tls/libpthread.so.0
>   157 Thread -1226163280 (LWP 16579)  0xb75ce6b1 in __nptl_death_event ()
>    from /lib/tls/libpthread.so.0
> * 156 Thread -1226028112 (LWP 16580)  0xb75ebc32 in _dl_sysinfo_int80 ()
>    from /lib/ld-linux.so.2
>   1 Thread -1218598080 (LWP 16423)  0xb75ce6a1 in __nptl_create_event ()
>    from /lib/tls/libpthread.so.0
> (gdb) c
> 
> After continuing, the program will run briefly and then die.
> 
> [Thread -1226163280 (zombie) exited]
> [New Thread -1226433616 (LWP 16674)]
> Cannot enable thread event reporting for Thread -1226433616 (LWP 16674): 
> generic error
> 
> My original fix doesn't break nearly as easily on RHEL-U1.  Hitting enough 
> Ctrl-C's does eventually trigger an error (occasionally even a gdb 
> assert), but this may be part of catching the race condition before we know 
> about the thread.
> Now, it is more than likely just a bug in the TD_DEATH event processing 
> because we haven't exercised it.  Would it be worth looking at implementing 
> your original alternative for my patch?

I assume you're using the same test you posted, here.

Hmm, thinking about this some more, it's because you used info threads
after stopping, which I didn't think to try.  We end up removing and
re-adding the thread several times.  At this point we'd really like to
delay removing the thread until we see a new create message for it.  I
see several ways to massage the code to get that effect; it isn't the
clearest fix, but for testing purposes we can revert the last bit of
the patch I sent you.  This bit:

@@ -1121,8 +1142,7 @@ find_new_threads_callback (const td_thrh

   ptid = BUILD_THREAD (ti.ti_tid, GET_PID (inferior_ptid));

-  if (!in_thread_list (ptid))
-    attach_thread (ptid, th_p, &ti, 1);
+  attach_thread (ptid, th_p, &ti, 1);

   return 0;
 }


After that we hit some other problems with GDB's book-keeping, though I
suspect they are related to the create race.  It's hard to tell.  The
process tends to zombie unexpectedly but that's the only error I've
seen with the bit above reverted.
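
To illustrate the "delay removing the thread until we see a new create
message" idea above, here is a minimal standalone sketch in C.  It is
not GDB code: the table, struct thread_entry, find_by_tid,
note_thread_death and note_thread_seen are all hypothetical names made
up for this example.  A death event only marks the entry; a later
create event for the same TID replaces it; a plain list scan never
re-attaches to a thread that is already known.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified thread table; not GDB's real data structures. */
#define MAX_THREADS 16

struct thread_entry
{
  long tid;     /* Thread library id; may be recycled after a death.  */
  long lwp;     /* Kernel LWP id currently backing this TID.  */
  int in_use;   /* Nonzero if this slot holds a thread.  */
  int dying;    /* Death event seen; removal has been deferred.  */
};

static struct thread_entry table[MAX_THREADS];

/* Return the entry for TID, or NULL if it is not in the table.  */
static struct thread_entry *
find_by_tid (long tid)
{
  for (size_t i = 0; i < MAX_THREADS; i++)
    if (table[i].in_use && table[i].tid == tid)
      return &table[i];
  return NULL;
}

/* Death event: only mark the entry, do not remove it yet.  */
static void
note_thread_death (long tid)
{
  struct thread_entry *t = find_by_tid (tid);
  if (t != NULL)
    t->dying = 1;
}

/* Called for a create event (IS_CREATE_EVENT != 0) or for a thread
   found while scanning the thread list (IS_CREATE_EVENT == 0).
   Returns 1 if the caller should attach, 0 if the thread is already
   known, and -1 if the table is full.  */
static int
note_thread_seen (long tid, long lwp, int is_create_event)
{
  struct thread_entry *t = find_by_tid (tid);

  assert (lwp > 0);

  if (t != NULL)
    {
      if (t->dying && is_create_event)
        {
          /* Recycled TID: the old thread is gone, reuse its slot.  */
          t->lwp = lwp;
          t->dying = 0;
          return 1;
        }
      /* Already in the list (possibly dying): do not attach again.  */
      return 0;
    }

  for (size_t i = 0; i < MAX_THREADS; i++)
    if (!table[i].in_use)
      {
        table[i].tid = tid;
        table[i].lwp = lwp;
        table[i].in_use = 1;
        table[i].dying = 0;
        return 1;
      }
  return -1;
}
```

With this shape, an info threads scan that runs between the death
event and the next create event finds the entry still present and does
nothing, which should avoid the repeated remove/re-add cycle.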

Do you have any code for PTRACE_EVENT_CLONE yet, or should I put
something together in the morning to verify whether that's the problem?

> You asked me in a previous note to run with strace on and confirm that I 
> was not getting WIFEXITED for the dying threads.  I confirmed this.  I 
> only get one WIFEXITED and that is for the main thread.

OK, so we can't count on the thread exits.  You might be able to enable
PTRACE_EVENT_EXIT to work around this, if RHEL3 has it, but one needs
to be careful with it or you can really confuse GDB.  Anyway, we don't
need that for now.
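
For reference, here is a rough sketch of how a wait status could be
sorted into "really exited" versus "stopped at a PTRACE_EVENT_EXIT
event" on Linux.  The enum and classify_wait_status are hypothetical
names, not anything in GDB.  The event check follows the encoding
described in ptrace(2), where the event number sits in the bits above
the stop signal, and it assumes a kernel and headers that provide
PTRACE_EVENT_EXIT.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* Hypothetical classification of a waitpid status word.  */
enum stop_kind
{
  KIND_EXITED,       /* WIFEXITED: the LWP really exited.  */
  KIND_SIGNALLED,    /* WIFSIGNALED: killed by a signal.  */
  KIND_EXIT_EVENT,   /* Stopped on the way out (PTRACE_EVENT_EXIT).  */
  KIND_STOPPED,      /* Any other stop, e.g. SIGINT from Ctrl-C.  */
  KIND_OTHER
};

static enum stop_kind
classify_wait_status (int status)
{
  if (WIFEXITED (status))
    return KIND_EXITED;
  if (WIFSIGNALED (status))
    return KIND_SIGNALLED;
  if (WIFSTOPPED (status))
    {
      /* ptrace events are reported as a SIGTRAP stop with the event
         number stored in bits 16 and up of the status word.  */
      if (WSTOPSIG (status) == SIGTRAP
          && (status >> 16) == PTRACE_EVENT_EXIT)
        return KIND_EXIT_EVENT;
      return KIND_STOPPED;
    }
  return KIND_OTHER;
}
```

A caller that has enabled exit events would treat KIND_EXIT_EVENT as
"this thread is about to go away" and still has to resume the LWP so
that it can finish exiting.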

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer

