This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Improve analysis of racy testcases


Sergio Durigan Junior writes:

> This patch is a proposal to introduce some mechanisms to identify racy
> testcases present in our testsuite.  As can be seen in previous
> discussions, racy tests are really bothersome and cause our BuildBot to
> pollute the gdb-testers mailing list with hundreds of false-positive
> messages every month.  Hopefully, identifying these racy tests in
> advance (and automatically) will help reduce the noise traffic to
> gdb-testers, maybe to the point where we will be able to send the
> failure messages directly to the authors of the commits.
>
> I spent some time trying to decide the best way to tackle this problem,
> and decided that there is no silver bullet.  Racy tests are tricky and
> it is difficult to catch them, so the best solution I could find (for
> now?) is to run our testsuite a number of times in a row, and then
> compare the results (i.e., the gdb.sum files generated during each run).
> The more times you run the tests, the more racy tests you are likely to
> detect (at the expense of waiting longer and longer).  You can also run
> the tests in parallel, which makes things faster (and contributes to
> catching more racy tests, because your machine will have fewer resources
> for each test, and some tests are likely to fail under that load).  I
> did some tests on my machine (8-core i7, 16GB RAM), and running the
> whole GDB testsuite 5 times using -j6 took 23 minutes.  Not bad.
>
> In order to run the racy test machinery, you need to specify the
> RACY_ITER environment variable.  You will assign a number to this
> variable, which represents the number of times you want to run the
> tests.  So, for example, if you want to run the whole testsuite 3 times
> in parallel (using 2 cores), you will do:
>
>   make check RACY_ITER=3 -j2
>
> It is also possible to use the TESTS variable and specify which tests
> you want to run:
>
>   make check TESTS='gdb.base/default.exp' RACY_ITER=3 -j2
>
> And so on.  The output files will be put in the directory
> gdb/testsuite/racy_outputs/.
>
> After make invokes the necessary rules to run the tests, it finally runs
> a Python script that will analyze the resulting gdb.sum files.  This
> Python script will read each file, and construct a series of sets based
> on the results of the tests (one set for FAILs, one for PASSes, one
> for KFAILs, etc.).  It will then do some set operations and come up
> with a list of unique, sorted testcases that are racy.  The algorithm
> behind this is:
>
>   for state in PASS, FAIL, XFAIL, XPASS...; do
>     if a test's state in every sumfile is $state; then
>       it is not racy
>     else
>       it is racy
>
> IOW, a test must have the same state in every sumfile.
>
> After processing everything, the script prints the racy tests it could
> identify on stdout.  I am redirecting this to a file named racy.sum.
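
[Editorial note: the set-based comparison described above can be sketched in
Python.  The gdb.sum line format is simplified and the function names are
illustrative assumptions; this is not the actual script from the patch.]

```python
# Sketch: find racy tests by comparing several gdb.sum files.
# A test is considered racy unless it has the same state in every
# sum file (a test missing from some run also counts as racy).
import re

STATES = ("PASS", "FAIL", "XFAIL", "XPASS", "KFAIL", "KPASS",
          "UNTESTED", "UNRESOLVED")
STATE_RE = re.compile(r"(%s): (.*)" % "|".join(STATES))

def read_sum(path):
    """Map each test message to its state from one gdb.sum file."""
    results = {}
    with open(path) as f:
        for line in f:
            m = STATE_RE.match(line)
            if m:
                results[m.group(2).strip()] = m.group(1)
    return results

def racy_tests(sum_files):
    """Return a sorted list of tests whose state differs across runs."""
    runs = [read_sum(p) for p in sum_files]
    all_tests = set().union(*(r.keys() for r in runs))
    racy = set()
    for test in all_tests:
        states = {run.get(test) for run in runs}
        if len(states) != 1 or None in states:
            racy.add(test)
    return sorted(racy)
```

The output could then be redirected to a file such as racy.sum, as the
patch description does.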
>
> Something else that I wasn't sure how to deal with was non-unique
> messages in our testsuite.  I decided to do the same thing I do in our
> BuildBot: include a unique identifier at the end of the message, like:
>
>   gdb.base/xyz.exp: non-unique message
>   gdb.base/xyz.exp: non-unique message <<2>>
>   
> This means that you will have to be careful about them when you use the
> racy.sum file.
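
[Editorial note: the duplicate-message disambiguation described above can be
sketched as follows; the helper name is hypothetical.]

```python
# Sketch: append "<<N>>" to the second and later occurrences of a
# test message so every message becomes unique, matching the
# convention shown above.
from collections import defaultdict

def uniquify(messages):
    seen = defaultdict(int)
    out = []
    for msg in messages:
        seen[msg] += 1
        out.append(msg if seen[msg] == 1 else "%s <<%d>>" % (msg, seen[msg]))
    return out
```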
>
> I ran the script several times here, and it did a good job catching some
> well-known racy tests.  Overall, I am satisfied with this approach and I
> think it will be helpful to have it upstreamed.  I also intend to
> extend our BuildBot and create new, specialized builders that will be
> responsible for detecting the racy tests every X number of days, but
> that will only be done when this patch is accepted.
>
> Comments are welcome, as usual.  Thanks.

Thanks for this!  This was quite a problem for me while testing on ARM.
I'm testing it now...

One note: it might be nice to output the list of non-racy tests too, so
that one could auto-build a list of tests to run from this, since I'm
not sure you can set an exclusion list?

Regards,
Antoine

