Re: [PATCH] Improve analysis of racy testcases
- From: Antoine Tremblay <antoine dot tremblay at ericsson dot com>
- To: Sergio Durigan Junior <sergiodj at redhat dot com>
- Cc: GDB Patches <gdb-patches at sourceware dot org>
- Date: Thu, 25 Feb 2016 13:33:56 -0500
- Subject: Re: [PATCH] Improve analysis of racy testcases
- References: <87r3gcgm91 dot fsf at redhat dot com>
Sergio Durigan Junior writes:
> This patch is a proposal to introduce some mechanisms to identify racy
> testcases present in our testsuite. As can be seen in previous
> discussions, racy tests are really bothersome and cause our BuildBot to
> pollute the gdb-testers mailing list with hundreds of false-positive
> messages every month. Hopefully, identifying these racy tests in
> advance (and automatically) will help reduce the noise traffic to
> gdb-testers, maybe to the point where we will be able to send the
> failure messages directly to the authors of the commits.
>
> I spent some time trying to decide the best way to tackle this problem,
> and decided that there is no silver bullet. Racy tests are tricky and
> it is difficult to catch them, so the best solution I could find (for
> now?) is to run our testsuite a number of times in a row, and then
> compare the results (i.e., the gdb.sum files generated during each run).
> The more times you run the tests, the more racy tests you are likely to
> detect (at the expense of waiting longer and longer). You can also run
> the tests in parallel, which makes things faster (and contributes to
> catching more racy tests, because your machine will have fewer resources
> for each test and some of them are likely to fail when this happens). I
> did some tests on my machine (8-core i7, 16GB RAM), and running the
> whole GDB testsuite 5 times using -j6 took 23 minutes. Not bad.
>
> In order to run the racy test machinery, you need to specify the
> RACY_ITER environment variable. You will assign a number to this
> variable, which represents the number of times you want to run the
> tests. So, for example, if you want to run the whole testsuite 3 times
> in parallel (using 2 cores), you will do:
>
> make check RACY_ITER=3 -j2
>
> It is also possible to use the TESTS variable and specify which tests
> you want to run:
>
> make check TESTS='gdb.base/default.exp' RACY_ITER=3 -j2
>
> And so on. The output files will be put in the directory
> gdb/testsuite/racy_outputs/.
>
> After make invokes the necessary rules to run the tests, it finally runs
> a Python script that will analyze the resulting gdb.sum files. This
> Python script will read each file, and construct a series of sets based
> on the results of the tests (one set for FAILs, one for PASSes, one
> for KFAILs, etc.). It will then do some set operations and come up
> with a list of unique, sorted testcases that are racy. The algorithm
> behind this is:
>
> for state in PASS, FAIL, XFAIL, XPASS, ...; do
>   if a test's state in every sumfile is $state; then
>     it is not racy
>   else
>     it is racy
>
> IOW, a test must have the same state in every sumfile.
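>
> In Python terms, the comparison boils down to something like the sketch
> below (a simplified illustration with made-up helper names, not the
> actual code from the patch):
>
>   import re
>   import sys
>
>   STATES = ('PASS', 'FAIL', 'XPASS', 'XFAIL', 'KPASS', 'KFAIL',
>             'UNRESOLVED', 'UNTESTED', 'UNSUPPORTED')
>
>   def read_states(sumfile):
>       """Map each test message in a gdb.sum file to its state."""
>       states = {}
>       with open(sumfile) as f:
>           for line in f:
>               m = re.match(r'(%s): (.*)' % '|'.join(STATES), line)
>               if m:
>                   states[m.group(2)] = m.group(1)
>       return states
>
>   # One dictionary per gdb.sum file given on the command line.
>   runs = [read_states(f) for f in sys.argv[1:]]
>   every_test = set().union(*(r.keys() for r in runs))
>   # A test is racy unless it has the same state in every sumfile
>   # (a missing entry counts as a differing state).
>   racy = sorted(t for t in every_test
>                 if len(set(r.get(t) for r in runs)) > 1)
>   print('\n'.join(racy))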
>
> After processing everything, the script prints the racy tests it could
> identify on stdout. I am redirecting this to a file named racy.sum.
>
> Something else that I wasn't sure how to deal with was non-unique
> messages in our testsuite. I decided to do the same thing I do in our
> BuildBot: include a unique identifier at the end of the message, like:
>
> gdb.base/xyz.exp: non-unique message
> gdb.base/xyz.exp: non-unique message <<2>>
>
> This means that you will have to be careful about them when you use the
> racy.sum file.
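>
> That suffixing can be done with a simple counter while the messages are
> read; a rough sketch (again with made-up names, not the exact code from
> the patch):
>
>   import collections
>
>   def disambiguate(messages):
>       """Append ' <<N>>' to the 2nd, 3rd, ... occurrence of a message."""
>       seen = collections.Counter()
>       result = []
>       for msg in messages:
>           seen[msg] += 1
>           if seen[msg] > 1:
>               msg = '%s <<%d>>' % (msg, seen[msg])
>           result.append(msg)
>       return result
>
> With that, the second occurrence of "non-unique message" becomes
> "non-unique message <<2>>", matching the example above.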
>
> I ran the script several times here, and it did a good job catching some
> well-known racy tests. Overall, I am satisfied with this approach and I
> think it will be helpful to have it upstreamed. I also intend to
> extend our BuildBot and create new, specialized builders that will be
> responsible for detecting the racy tests every X number of days, but
> that will only be done when this patch is accepted.
>
> Comments are welcome, as usual. Thanks.
Thanks for this! This was quite a problem for me while testing on ARM.
I'm testing it now...
One note: maybe it would be nice to output the list of non-racy tests
too, so one could auto-build a list of tests to run from this, since I'm
not sure you can set an exclusion list?
Regards,
Antoine