[PATCH v8 6/7] Remote fork catch

Fri May 8 10:16:00 GMT 2015

On 05/06/2015 05:10 PM, Don Breazeal wrote:

> I have some concerns about my test results, though.  Over the past
> few weeks I've been seeing more intermittent failures than I expect,
> on both the mainline and my branch.  Sometimes I get clean test runs,
> other times not.  Is this something others have been seeing, or is it
> just me?

Yeah, some tests are still racy.  See the buildbot results
here, for example:

 https://sourceware.org/ml/gdb-testers/2015-q2/msg03532.html
 https://sourceware.org/ml/gdb-testers/2015-q2/msg03533.html

etc.  That list is high volume, but since results are threaded by
commit, it's manageable in a mail client.

>
> I know that some of these tests are problematic (random-signal.exp),
> but others I haven't been aware of.  The failures I've seen include
> (but aren't limited to):
>
> gdb.base/random-signal.exp

The buildbots show:

  FAIL: gdb.base/random-signal.exp: stop with control-c (timeout)

Haven't investigated it.

> gdb.mi/mi-nsmoribund.exp

This one's fixed by:

  https://sourceware.org/ml/gdb-patches/2015-05/msg00117.html

That may fix some others.  Not sure.

> gdb.threads/attach-many-short-lived-threads.exp

The buildbots show this failing randomly too.  As far as I've
seen, most often gdb fails to attach to the process on the first
iteration, saying that the leader thread is a zombie.
It looks like either a gdb or a kernel bug, but I haven't managed to
pinpoint it.

> gdb.threads/interrupted-hand-call.exp
> gdb.threads/thread-unwindonsignal.exp

These are both caused by the same bug:

 PASS -> FAIL: gdb.threads/thread-unwindonsignal.exp: continue until exit
 PASS -> FAIL: gdb.threads/interrupted-hand-call.exp: continue until exit

It's related to:

 https://sourceware.org/ml/gdb-patches/2015-03/msg00597.html

That is, in those tests, a thread other than the main thread exits
the process while gdbserver is still iterating over all threads to
continue them.  gdbserver needs more fixing.

Really the proper way to handle this would be to file bugs in
bugzilla, pasting there the relevant gdb.log bits, and mark the
tests with KFAIL.

> gdb.trace/tspeed.exp

No idea.  Given the test's purpose, I'd suspect a test bug.

>
> I've also seen failures in 'random' tests due to timeouts on the
> qSupported packet, like this:
>
> target remote localhost:3238^M
> Remote debugging using localhost:3238^M
> Ignoring packet error, continuing...^M
> warning: unrecognized item "timeout" in "qSupported" response^M
>
> I was particularly concerned about this since I had made changes to
> qSupported, but I found that it often times out for me on the
> mainline as well.  I tried bumping the timeout up to 5 seconds for
> the qSupported packet and still saw the error.  I haven't identified
> a root cause.

No idea.

>
> Most of these issues I've been able to reproduce on the mainline by
> just running the test over and over in a loop until it fails.
>
> So, all that said, I believe that my changes aren't introducing any
> obvious regressions, but I'd feel a lot better about it if I could
> get clean test runs more reliably.

I'd love to get clean test results too.  Any help would be
much appreciated!
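
For anyone who wants to chase these down, the "run the test in a loop
until it fails" approach mentioned above could be scripted roughly as
below.  This is only a sketch: the helper name run_until_failure and
the max_runs parameter are made up for illustration, and it assumes
the command you pass exits with a non-zero status when the test
fails.

```python
import subprocess


def run_until_failure(cmd, max_runs=100):
    """Run cmd repeatedly until it exits with a non-zero status.

    Returns the 1-based number of the first failing run, or None if
    all max_runs runs succeeded.  (Hypothetical helper, not part of
    the GDB testsuite.)
    """
    for run in range(1, max_runs + 1):
        # Discard output; only the exit status matters here.
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            return run
    return None
```

The command would be whatever you use to run a single .exp file,
wrapped if necessary so that its exit status reflects a FAIL in the
results.  When a run fails, the gdb.log from that run is what is worth
keeping.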

Thanks,
Pedro Alves