GDB hangs with simple multi-threaded program on linux

Thiago Jung Bauermann bauerman@br.ibm.com
Thu Jul 15 15:46:00 GMT 2010


Hi,

I'm struggling with an issue which perhaps you already faced or thought
about...

The following testcase locks GDB nearly every time on Linux:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

#define NUM_THREADS     2

pthread_t main_thread;

void *print_hello (void *threadid)
{
   int tid = (int) threadid;

   printf ("Hello world! It's me, thread #%d!\n", tid);

   /* The first thread will wait main terminate.  */
   if (tid == 0)
     pthread_join (main_thread, NULL);

   pthread_exit (NULL);
}

int main (int argc, char *argv[])
{
   int i, rc;
   pthread_t threads[NUM_THREADS];

   main_thread = pthread_self ();

   for (i = 0; i < NUM_THREADS; i++) {
      printf ("In main: creating thread %d\n", i);

      rc = pthread_create (&threads[i], NULL, print_hello, (void *) i);
      if (rc) {
         printf ("ERROR; return code from pthread_create is %d\n", rc);
         exit (-1);
      }
   }

   pthread_exit (NULL);
}

What's special about this testcase is that the main thread exits earlier
than the threads it creates.

What GDB does is that when it is notified about a signal in some thread,
it will send a SIGSTOP to the other threads in the process and then call
waitpid on them to make sure that the threads indeed stopped (at the end
of linux_nat_wait_1, when it call stop_callback and stop_wait_callback
on all LWPs).

Normally this is ok, but what is happening here is that when GDB is
notified about a signal in some thread, the main thread already exited
(but GDB is oblivious to this fact), and GDB sends a SIGSTOP to every
thread in the debuggee (including the zombie main thread) and then when
it goes on to wait on them threads, it hangs while waiting on the main
thread.

I suspect that waitpid interprets the call to wait on the main thread to
actually mean waiting on the whole program instead (since TID == PID in
this case) and hangs because there are other threads in the thread group
(even though they are in the tracing stop state).

So my questions are:

1. Is it true that when the main thread exits but there are other
threads in the thread group, then no SIGCHLD is generated to notify GDB
that it exited (perhaps because such a SIGCHLD could be ambiguous and
mean that the whole process exited)? If so, how can GDB learn when the
main thread exits? This is why GDB still thinks the main thread is still
around. Either that, or GDB missed the SIGCHLD or it is later in the
queue and yet unprocessed.

2. Is there a way for GDB to wait on just the main thread instead of on
the whole process when it waits on a TID which is also the PID?

-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center



More information about the Gdb mailing list