[RFA]: Modified Watchthreads Patch

Mark Kettenis kettenis@gnu.org
Sat Dec 11 19:06:00 GMT 2004

   Date: Sat, 11 Dec 2004 11:36:52 -0500
   From: Daniel Jacobowitz <drow@false.org>

   > Adding hacks around hacks, like we've been doing to support threads on
   > Linux for quite some time now, is definitely not a good idea.

   Mark, would you please stop saying this?  I don't believe it to be true
   any more.  If you think it's still accurate, please point me at
   specific hacks around hacks, and let's see if we can get rid of them.

Sorry Daniel, I know you've done some really good work with regard to
threads in the kernel.  I guess I'm still somewhat frustrated about
the situation back when I wrote the initial Linux LWP layer.  Back
then the attitude of the kernel developers was basically: "We won't
complicate the kernel with support for debuggers, solve everything
from userland!".

That said, there is just too much code in linux-nat.c.  If you compare
the code necessary to implement to_wait and to_resume that's there
with the amount of code in inf-ttrace.c, you see what I mean.  Most of
the code is present because we need to stop each thread individually
by sending it a SIGSTOP.  Things would become so much simpler if the
kernel provided an interface to stop them all in one go that doesn't
interfere with signal delivery...
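
To make the per-thread stopping concrete, here is a minimal sketch.  It
is not the code in linux-nat.c: it assumes plain child processes stand
in for LWPs, and it uses kill() and waitpid() with WUNTRACED instead of
ptrace.  The names stop_each, demo_stop_each and MAX_CHILDREN are
hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16   /* hypothetical limit for this sketch */

/* Stop each of the N processes in PIDS individually: send SIGSTOP,
   then wait for that particular process to report the stop.
   Returns the number of processes seen stopped by SIGSTOP.  */
static int
stop_each (const pid_t *pids, int n)
{
  int stopped = 0;

  for (int i = 0; i < n; i++)
    {
      int status;

      if (kill (pids[i], SIGSTOP) == -1)
        continue;

      /* The stop shows up as an ordinary wait status, so the caller
         has to tell "our" SIGSTOP apart from any other event.  */
      if (waitpid (pids[i], &status, WUNTRACED) == pids[i]
          && WIFSTOPPED (status) && WSTOPSIG (status) == SIGSTOP)
        stopped++;
    }

  return stopped;
}

/* Fork up to N idle children, stop them one by one, then kill and
   reap them.  Returns the number of stops observed.  */
int
demo_stop_each (int n)
{
  pid_t pids[MAX_CHILDREN];
  int stopped;

  assert (n >= 0 && n <= MAX_CHILDREN);

  for (int i = 0; i < n; i++)
    {
      pids[i] = fork ();
      if (pids[i] == -1)
        {
          n = i;                /* only handle children that exist */
          break;
        }
      if (pids[i] == 0)
        for (;;)
          pause ();             /* child: idle until killed */
    }

  stopped = stop_each (pids, n);

  for (int i = 0; i < n; i++)
    {
      kill (pids[i], SIGKILL);
      waitpid (pids[i], NULL, 0);
    }

  return stopped;
}
```

Calling demo_stop_each (3) should report three stops on a system where
fork succeeds.  Even in this toy form there is one signal and one wait
per process, and the stop arrives through the same channel as every
other signal, which is where the bookkeeping comes from.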

   I admit there are some peculiarities related to stopping all threads.
   But most of them are related to very real situations that we want to be
   able to debug: two threads receiving a signal at the same time, hitting
   different breakpoints at the same time, et cetera.  Life with threads
   is just more complicated.  Some platforms do the complicated bits in
   the kernel, and Linux chose to expose an LWP-oriented interface rather
   than a whole-process oriented interface so we have to do the
   complicated bits in userspace.  That is not going to change, because
   the Linux design philosophy for threading is that they are just a
   special kind of process; Linux has no concept of "the whole process"
   and will not be adding one.  This has been discussed from time to time
   on the linux-kernel list.  [There is some correlation to the POSIX
   threading concept of a process, for the purpose of POSIX-compliant
   signal delivery, but that's the extent of it.]

I still think this is wrong.  The very fact that these processes share
a virtual memory space means that they're grouped together.  The
kernel shouldn't deny that.  But even if folks don't want to support
freezing that memory space atomically (at least to the observer), we
really need a way to stop each process individually that doesn't
interfere with signal delivery.  I sincerely believe that we'll keep
seeing thread-related problems if it isn't possible to stop threads
while keeping all signals pending.

   And I'm busily (at work) improving platform support for NPTL; one of my
   goals is to someday rip all the LinuxThreads support code out of GDB. 
   But it's going to be a long time before that's a viable option - at
   least a couple of years from now.  That's no more friendly than
   dropping support for all but the newest kernel.  And we'll need some
   of that code to provide quality support for debugging multiple
   processes simultaneously, or to support debugging applications which
   use clone() directly.

Your work is very much appreciated.

