This is the mail archive of the
gdb-patches@sources.redhat.com
mailing list for the GDB project.
Re: [rfa/testsuite] Make pthreads test more robust
- To: Michael Snyder <msnyder at cygnus dot com>
- Subject: Re: [rfa/testsuite] Make pthreads test more robust
- From: Fernando Nasser <fnasser at redhat dot com>
- Date: Mon, 01 Oct 2001 11:12:31 -0400
- CC: Daniel Jacobowitz <drow at mvista dot com>, gdb-patches at sources dot redhat dot com, fnasser at cygnus dot com
- Organization: Red Hat Canada
- References: <20010928114642.A913@nevyn.them.org> <3BB4C175.42072289@cygnus.com>
(please see below)
Michael Snyder wrote:
>
> Daniel Jacobowitz wrote:
> >
> > I've been seeing about 90% failure rate for gdb.threads/pthreads.exp lately.
> > The test which fails is always stopping threads with a control-c.
> > I spent some time debugging this today; it seems that the course of events
> > looks something like this:
> > - send "continue\n"
> > - wait
> > - send "\003"
> > - read back "Continuing."
> > - timeout
> >
> > Note that, among other things, we never see "continue\n" echoed back to us,
> > and yet gdb continues anyway. Obviously something is fishy in timing. We
> > also do no reads between the continue and the \003. My best guess is that
> > gdb does not get scheduled between the two sends; so waiting for the output
> > of continue seems like a good idea.
> >
> > Also, I'm not entirely sure what 'after 1000 [ ... ]' is supposed to do, but
> > it doesn't seem to delay by any measurable amount of time. Adding an
> > additional sleep, so that the process has actually continued before we send
> > the ^C, lets the test pass.
> >
> > Is this OK to commit? With it, the rest of pthreads.exp passes (except for
> > the last test:
> > break common_routine thread 4
> > Breakpoint 6 at 0x8048666: file ../../../../src-hashtest/gdb/testsuite/gdb.threads/pthreads.c, line 50.
> > (gdb) PASS: gdb.threads/pthreads.exp: set break at common_routine in thread 2
> > continue
> > Continuing.
> > Cannot find thread 1024: generic error
> > (gdb) FAIL: gdb.threads/pthreads.exp: continue to bkpt at common_routine in thread 2
> > )
>
> The patch looks sane. I'd like Fernando's blessing, but I'm inclined
> to suggest checking it in and just watching out to see if it breaks
> on any other platform.
>
I thought I had already responded to that, but I can't find the
answer...

I agree with Michael -- let's try. There is definitely a race
condition in there. If GDB does not say "Continuing." within 1 sec,
we send it a Ctrl-C anyway, and I am not so sure what the output will
look like if we send the Ctrl-C before GDB says "Continuing.".

With your change we will be sure that the program is running before
sending the interrupt request. I guess it is the right thing to do.

P.S.: The "after" command schedules something to be done after a
certain time. In this case, after a second (1000 milliseconds), a
"\003" will be sent to GDB. So, we make it run and then interrupt it.
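
One caveat worth checking against the `after 1000 [send_gdb "\003"]`
line in the patch context: in Tcl, square brackets are substituted
before "after" is called, so the bracketed command runs immediately and
only its result is scheduled. That would be consistent with the
observation that the line does not seem to delay at all. The sketch
below is a minimal illustration for a plain tclsh, not the testsuite
code: the send_gdb defined here is a hypothetical stand-in that only
records when it was called, and the `log`, `start` and `done` variables
are names made up for the example.

```tcl
# Stand-in for DejaGnu's send_gdb (assumption: it only records the
# elapsed time of each call, so the sketch runs without GDB).
set start [clock milliseconds]
set log {}
proc send_gdb {text} {
    global start log
    lappend log [expr {[clock milliseconds] - $start}]
    return ""
}

# Brackets: send_gdb runs right now, during substitution; "after" then
# schedules its (empty) result for one second later.
after 1000 [send_gdb "\003"]

# Braces: the script is passed as text and evaluated by the event loop
# roughly 1000 ms later.
after 1000 {send_gdb "\003"; set done 1}

# Timers only fire while the event loop is running.
vwait done
puts "elapsed ms at each send: $log"
```

The first recorded time should be close to zero and the second close to
1000 ms. If a delayed ^C is what the test wants, the braced form (or the
explicit "sleep 1" from the patch) is the one that actually waits.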

P.S.2: I wonder if there isn't a second race condition between the
1 sec to interrupt and the timeout of gdb_expect...

> > 2001-09-28 Daniel Jacobowitz <drow@mvista.com>
> >
> > * gdb.threads/pthreads.exp: Wait for output and delay
> > before sending ^C.
> >
> > Index: pthreads.exp
> > ===================================================================
> > RCS file: /cvs/src/src/gdb/testsuite/gdb.threads/pthreads.exp,v
> > retrieving revision 1.6
> > diff -u -r1.6 pthreads.exp
> > --- pthreads.exp 2001/06/06 18:34:53 1.6
> > +++ pthreads.exp 2001/09/28 15:30:28
> > @@ -248,6 +248,15 @@
> >
> > # Send a continue followed by ^C to the process to stop it.
> > send_gdb "continue\n"
> > + gdb_expect {
> > + -re "Continuing." {
> > + pass "Continue with all threads running"
> > + }
> > + timeout {
> > + fail "Continue with all threads running (timeout)"
> > + }
> > + }
> > + sleep 1
> > set description "Stopped with a ^C"
> > after 1000 [send_gdb "\003"]
> > gdb_expect {
--
Fernando Nasser
Red Hat Canada Ltd. E-Mail: fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario M4P 2C9