This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.



Re: [RFC]: fix for recycled thread ids


On Fri, Mar 19, 2004 at 04:32:36PM -0500, Jeff Johnston wrote:
> 
> 
> Daniel Jacobowitz wrote:
> >On Fri, Mar 19, 2004 at 02:35:40PM -0500, Jeff Johnston wrote:
> >
> >>>Conceptually, we attach to LWPs, not to threads.  That suggests to me
> >>>that the correct fix is to ask the LWP layer if the LWP is attached
> >>>rather than looking it up in the thread list in the first place. 
> >>>We've already got an appropriate list of LWPs though we might need a
> >>>new accessor.
> >>>
> >>
> >>I like that idea.  We still have to deal with the bogus thread list 
> >>entry.  The routine prune_threads calls thread_db_alive and it won't 
> >>realize the thread info it has is bogus because it will find the tid is 
> >>valid.
> >
> >
> >Do you think this will be a problem?  My hope is that it will just look
> >as if the thread has 'migrated' to a new LWP.
> >
> 
> It will have invalid state associated with it.  For example, the thread 
> info has a prev_pc field.  As to all the havoc that the state may or may 
> not cause, I think it would be a very good idea to clean it up now.  Who's 
> to say what state will be added to thread_info in the future?

It turns out that this is not only a problem, but the whole problem. 
Could you run this test under GDB, on RHEL, using strace?  Tell me
whether or not you see WIFEXITED results for every thread as it exits. 
I was assuming you did not, but I can reproduce the misbehavior here
even though I do get them.
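
As a rough illustration of what to look for in the strace output (this
is not GDB code; the enum and function names are hypothetical), the
status word that waitpid returns for each LWP can be classified with the
standard <sys/wait.h> macros.  A thread that exits normally should show
up as an "exited" status:

```c
#include <sys/wait.h>

/* Hypothetical helper, not part of GDB: the kinds of event a waitpid()
   status word can describe for an LWP.  */
enum lwp_wait_kind
{
  LWP_EXITED,     /* WIFEXITED: the LWP called exit / returned.  */
  LWP_SIGNALLED,  /* WIFSIGNALED: the LWP was killed by a signal.  */
  LWP_STOPPED,    /* WIFSTOPPED: the LWP stopped (e.g. under ptrace).  */
  LWP_OTHER       /* Anything else, e.g. a continue notification.  */
};

/* Classify STATUS as filled in by waitpid().  */
enum lwp_wait_kind
classify_wait_status (int status)
{
  if (WIFEXITED (status))
    return LWP_EXITED;
  if (WIFSIGNALED (status))
    return LWP_SIGNALLED;
  if (WIFSTOPPED (status))
    return LWP_STOPPED;
  return LWP_OTHER;
}
```

Only the LWP_EXITED case corresponds to the WIFEXITED results asked
about above.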

The problem is that we get the LWP death events, but we treat the
threads and LWPs as completely independent sets.  We never find out
that the threads have died.

We don't enable thread death event reporting, because in glibc 2.1.3
there was a bug in the death reporting which would cause the debugged
program to segfault:

#if 0
  /* FIXME: kettenis/2000-04-23: The event reporting facility is
     broken for TD_DEATH events in glibc 2.1.3, so don't enable it for
     now.  */
  td_event_addset (&events, TD_DEATH);
#endif

Fortunately, <gnu/libc-version.h> declares a function,
gnu_get_libc_version, that returns the runtime version of glibc.
thread_db already assumes that a native GDB, when used to debug native
programs, is debugging the same version of the C library that GDB itself
is running against; that is not 100% valid, but it is generally valid.
Relying on the same assumption, we should be able to use the version to
enable TD_DEATH when it is safe to do so.  This will let us detach the
threads when they die.
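
A minimal sketch of such a check follows.  gnu_get_libc_version is the
real glibc function; the two helper names are hypothetical, and the
threshold "anything newer than 2.1.3" is only an assumption made for
illustration, since which release actually fixed the TD_DEATH bug would
still need to be confirmed:

```c
#include <stdio.h>
#include <gnu/libc-version.h>

/* Hypothetical helper: return nonzero if VERSION (e.g. "2.3.2") is
   strictly newer than MAJOR.MINOR.MICRO.  A missing micro component
   is treated as 0.  */
int
version_newer_than (const char *version, int major, int minor, int micro)
{
  int v[3] = { 0, 0, 0 };

  /* Require at least "major.minor"; otherwise be conservative and
     report "not newer".  */
  if (sscanf (version, "%d.%d.%d", &v[0], &v[1], &v[2]) < 2)
    return 0;

  if (v[0] != major)
    return v[0] > major;
  if (v[1] != minor)
    return v[1] > minor;
  return v[2] > micro;
}

/* Hypothetical helper: decide whether to add TD_DEATH to the event set.
   ASSUMPTION: every glibc newer than 2.1.3 has working death events.  */
int
td_death_reporting_is_safe (void)
{
  return version_newer_than (gnu_get_libc_version (), 2, 1, 3);
}
```

The #if 0 above could then become a runtime test on
td_death_reporting_is_safe () around the td_event_addset call.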

That has its own risks, since the thread continues to run for a short
while after the death event is reported.  For instance, in NPTL the
thread reports the event and then cleans up after itself; in LinuxThreads
I don't remember whether the manager or the thread does this, but I think
it's the same.  I already wrote limited code to handle this, if you search
for "zombie" in thread-db.c, so it should be OK.  The gist is that we
remove it from the thread list right away, but do not detach the
thread.  We resist attaching to zombies.
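
For illustration only, the "resist attaching to zombies" test can be
sketched in terms of the state that libthread_db reports through
td_thr_get_info.  The function name here is hypothetical and this is not
a copy of the code in thread-db.c; it only assumes that a thread whose
ti_state is TD_THR_ZOMBIE or TD_THR_UNKNOWN should not be attached to:

```c
#include <thread_db.h>

/* Hypothetical helper: given thread info already filled in by
   td_thr_get_info, return nonzero if it describes a thread that is
   still worth attaching to.  */
int
thread_is_attachable (const td_thrinfo_t *ti)
{
  /* A zombie has reported its death but not finished cleaning up;
     an unknown state is treated the same way.  */
  return ti->ti_state != TD_THR_ZOMBIE && ti->ti_state != TD_THR_UNKNOWN;
}
```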

At least, all that is how it looks to me.  I'll experiment with
TD_DEATH before I speculate further.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer

