This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.


gdb, pthreads, and sleep

From: Michael Elizabeth Chastain <mec at shout dot net>
To: gdb at sources dot redhat dot com
Date: Mon, 22 Sep 2003 16:47:06 -0400
Subject: gdb, pthreads, and sleep

There's some code in the mi pthreads test that is bugging me.  Every
test that runs passes, but the set of tests that run varies from run to
run, and that causes noise in my test reports.

I'm running on native i686-pc-linux-gnu, red hat 8.0, glibc 2.2.93-5-rh.

Here is the code of the program under test:

  # gdb/testsuite/gdb.mi/pthreads.c

  void *
  routine (void *arg)
  {
    sleep (9);
    printf ("hello thread\n");
  }

  main ()
  {
    ...
    /* Create a few threads */
    for (i = 0; i < 5; i++)
      create_thread ();
    done_making_threads ();
  }

When gdb is not used, "sleep (9)" sleeps for 9 seconds and returns 0.
When gdb is used, "sleep (9)" sleeps for 0 seconds and returns 9.
This causes races and different output on different test runs.

The problem is an interaction between sleep, pthread_create, and gdb.
When gdb is running, pthread_create eventually calls
pthread_restart_new, which sends the pthread_sig_restart signal.  gdb
notices this signal.  But as a side effect, the "sleep (9)" is
interrupted and returns early.

When gdb is used, usually all the threads go all the way to exit, but
sometimes some threads do not (especially the newest thread created).

The test script gdb.mi/mi-pthreads.exp wants to test -thread-select
on the child threads.  The test script lets the threads be created.
Then it asks for thread info, and then it tests -thread-select
on each thread.

The list of threads varies from run to run, so the PASS results vary
from run to run.  On 95% of the runs, however, there are no children at
all, so the test script is not covering -thread-select very well
(it still sees the parent thread and the manager thread).

What can we do about this?

(1) Do nothing.

    This bugs me because I would like to run the gdb test suite twice in
    a row and have it come out the same way each time.  This makes it
    easier for automated testers and for new people, like gcc people, to
    use the test suite.  That's my reason for bringing this up.

    Also, -thread-select is not being tested on child threads very well.

(2) Change the program under test to be more correct:

      int unslept = 9;
      while (unslept)
        unslept = sleep (unslept);

    This is the proper way to call 'sleep' in a program that may
    receive signals.  The return value of 'sleep' is documented in
    Single Unix Spec, v2, so it is portable.  If this code leads to a
    problem, then it means that the test program has found a bug in the
    operating system's implementation of "sleep".  Tickling bugs is
    a *good* thing for a test program.

    The gotcha here is that gdb should work with buggy test programs.
    Currently, pthreads.c is written poorly (ignores the return value of
    'sleep'), but it's natural that people write code like this.

    On the other hand, the point of the pthreads test is to call
    thread-select on a lot of threads.  With the existing code, there is
    a child thread for thread-select less than 10% of the time.  By
    writing the code my way, thread-select is actually called on the
    child threads 100% of the time.  (I've tested this on my test
    bed with 200 runs each way).

(3) Leave the test program alone, but change the test script so that it
    generates one PASS result for the whole thread list instead of one PASS
    per thread.  It would still generate FAILs for each thread that FAILed
    thread-select.  This would make the results reproducible, and leave the
    test program exactly as it is now.
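
For reference, the retry loop from option (2) could be packaged roughly
like this.  It is a sketch under assumptions, not a proposed patch: the
helper name sleep_fully and its return value (how many times the sleep
had to be resumed) are made up for illustration, and 'routine' is only
modeled on the one in pthreads.c.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: keep calling sleep() until no unslept time is
   left, so that signals cannot shorten the total wait.  Returns how
   many times the sleep had to be resumed.  */
unsigned int
sleep_fully (unsigned int seconds)
{
  unsigned int unslept = seconds;
  unsigned int resumed = 0;

  /* sleep() returns 0 once the requested time has elapsed, or the
     number of unslept seconds if a signal handler interrupted it.  */
  while ((unslept = sleep (unslept)) != 0)
    resumed++;
  return resumed;
}

/* Thread body modeled on routine() in pthreads.c, using the helper.  */
void *
routine (void *arg)
{
  (void) arg;
  sleep_fully (9);
  printf ("hello thread\n");
  return NULL;
}
```

Because 'sleep' reports unslept time in whole seconds, the total wait
may differ slightly from the requested time after an interruption.
That should not matter for this test, which only needs the threads to
still exist when the script asks for the thread list.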

My preferences: (2), (3), (1).

What do you think?

Also, I think we need some documentation in the gdb threads section.
gdb makes some threaded programs behave differently because the signals
for gdb are not perfectly transparent.  Watch what happens when I run
gdb.mi/pthreads with and without gdb:

  /* without gdb */
  % ./pthreads
  hello
  hello

  /* with gdb */
  (gdb) run
  Starting program: /berman/home/mgnu/gdb/pthread-select/pthreads
  [New Thread 8192 (LWP 12564)]
  [New Thread 16385 (LWP 12568)]
  [New Thread 8194 (LWP 12569)]
  [New Thread 16387 (LWP 12570)]
  hello thread
  [New Thread 24580 (LWP 12571)]
  hello thread
  [New Thread 32773 (LWP 12572)]
  hello thread
  [New Thread 40966 (LWP 12573)]
  hello thread
  hello
  hello

  Program exited normally.

The 'hello thread' output happens only under the debugger.
I think people will be surprised when their 'sleep' calls actually
sleep without gdb, but return with a lot of unslept time with gdb.
I doubt we can fix this, although maybe the pthreads implementors
can fix it.  But we can at least document it.

Michael C

Follow-Ups:
- Re: gdb, pthreads, and sleep
  - From: Daniel Jacobowitz
