racy tests

Sergio Durigan Junior sergiodj@redhat.com
Tue Jul 14 20:26:00 GMT 2015


On Saturday, July 11 2015, Pedro Alves wrote:

>> I can think of various things to do after that.
>> E.g. if any of the additional runs of the test record a PASS then flag
>> the test as RACY, and remember this state for the next run, and rerun
>> the same test multiple times in the next run. If the next time all N
>> runs pass (or all N runs fail) then switch its state to PASS/FAIL.
>> That's not perfect; it's hard to be perfect with racy tests. One can
>> build on that, but there's a pragmatic tradeoff here between being too
>> complex and not doing anything at all.
>> I think we should do something. The above keeps the baseline
>> machine-generated and does minimal work to manage racy tests. A lot of
>> racy tests get exposed during these additional runs for me because I
>> don't run them in parallel and thus the system is under less load, and
>> it's system load that triggers a lot of the raciness.
>> 
>
> One thing that I'd like is for this to be part of the testsuite
> itself, rather than separate machinery the buildbot uses.  That way,
> everyone benefits from it, and so that we all maintain/evolve it.
> I think this is important, because people are often confused when
> they do a test run before a patch, apply the patch, run the tests
> again, and then see new FAILs that their patch can't explain.

I have something implemented for BuildBot that would address the issue
of racy tests, but I agree that having this as part of the official
testsuite is the best approach.  I'm willing to work on this, BTW; I
will see what I can do this week in my spare time.

> E.g., we could have the testsuite machinery itself run the tests
> multiple times, iff they failed.

I would expand this to "have the testsuite machinery itself run the
tests multiple times".  A racy test can just as easily PASS on the
first run, which means that we'd miss some of them if the testsuite
only re-ran those that failed.  This would have side effects, of
course: the testsuite run would take much longer, because we would be
running all tests several times...
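
Roughly speaking, the special mode I'm picturing would just repeat the
whole run a few times and keep each summary file around, something
along these lines (nothing below exists in the testsuite today; the
script, the iteration count and the file names are only placeholders):

  #!/bin/sh
  # Rough sketch only -- not existing testsuite machinery.  Repeat the
  # whole run N times from the testsuite build directory and save each
  # gdb.sum so the results can be compared afterwards.

  N=${RACY_ITERATIONS:-3}

  i=1
  while [ $i -le $N ]; do
    # A failing run is still useful data, so don't stop on failure.
    make check -j1 FORCE_PARALLEL="1" || true
    cp gdb.sum gdb.sum.$i
    i=$((i + 1))
  done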

BuildBot would then be able to use this to update its own xfail files.
I think running this "special test mode" once a week would be enough for
BuildBot to keep things up to date.

> Maybe all tests would be eligible for
> this, or maybe we'd only apply this to those which are explicitly
> marked racy somehow, but that's a separate policy from the framework
> that actually re-runs tests.  On a parallel test run, we run
> each .exp under its own separate runtest invocation, driven from
> the testsuite's Makefile; we could wrap each of those invocations and
> check whether it failed, and if so, rerun that .exp a few times.
>
> That may mean that only parallel mode supports this, but I'd be
> myself fine with that, because we can always do
>
>   make check -j1 FORCE_PARALLEL="1"
>
> or some convenience for that, to get the benefits.

I glanced over the testsuite's Makefile, and I thought about adding an
explicit "FORCE_RACY_PARALLEL" variable (with a do-check-racy-parallel
counterpart target).
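
As a very rough sketch of what the per-.exp wrapping could look like
(illustrative only -- the result parsing is simplified and none of
these names exist in the Makefile today), the re-run step could be
something like:

  #!/bin/sh
  # Hypothetical wrapper: run one .exp file N times and report tests
  # whose result is not stable across the runs.  Assumes it is run
  # from the testsuite build directory, where runtest leaves gdb.sum.

  EXP=$1            # e.g. gdb.base/break.exp
  N=${2:-3}         # how many times to repeat the run

  i=1
  while [ $i -le $N ]; do
    runtest "$EXP" > /dev/null 2>&1
    # Keep only the result lines of this iteration.
    grep -E '^(PASS|FAIL): ' gdb.sum | sort > run.$i.results
    i=$((i + 1))
  done

  # A test is racy if its name shows up with more than one status:
  # strip the status prefix from the union of all result lines and
  # look for duplicated test names.
  sort -u run.*.results | sed -E 's/^(PASS|FAIL): //' \
    | sort | uniq -d | sed 's/^/RACY: /'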

This would be a "special mode" that would take longer to complete (as
explained above), but that would also generate results that reflect the
tests' real status.  I think we'd have to modify dg-extract-results.sh
as well, but I'm not sure yet.
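
Whatever we end up doing there, the comparison itself is simple; purely
as an illustration, given two of the saved summary files, the tests
whose status differs between the runs could be listed with something
like:

  #!/bin/sh
  # Illustrative only: print every result line that is not identical
  # in both runs, i.e. tests whose status (or presence) changed.

  for sum in gdb.sum.1 gdb.sum.2; do
    grep -E '^(PASS|FAIL|XPASS|XFAIL|KPASS|KFAIL): ' "$sum" \
      | sort > "$sum.norm"
  done

  comm -3 gdb.sum.1.norm gdb.sum.2.norm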

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
http://sergiodj.net/


