[RFC] Proposal for hosting GDB CI builds

Simon Marchi simon.marchi@polymtl.ca
Fri Jul 23 20:17:17 GMT 2021


On 2021-07-20 11:21 a.m., Luis Machado wrote:
> Any other feedback on this?

Sorry, I procrastinated replying to this thread for a while - too many
things to do.

> On 6/30/21 1:46 PM, Luis Machado wrote:
>> Hi,
>>
>> This proposal comes as an attempt to revive the GDB CI builds, given the previous one (buildbot) is no longer being maintained by Sergio (thanks a lot for keeping it working for all these years by the way).
>>
>> CI GDB builds are a great help for spotting regressions without having to do the tedious and time-consuming work of running the GDB testsuite for each git revision, for each architecture and reading through hundreds of lines of summaries. If a regression is spotted, then one needs to bisect to find the culprit. This isn't great, especially for architectures without great availability of desktop hardware.
>>
>> Ideally, for each commit, we'd run full builds to validate the state of the tree, but we're not quite there yet. So meanwhile, having some level of automation to get the builds done without manual intervention sounds like a reasonable step forward.
>>
>> From previous IRC conversations, there seems to be a consensus that availability of processing power is not a problem. It is reasonably easy to find hardware to do some builds. The most difficult resource to find is manpower to set up the CI infrastructure and keep it running.

I agree, it's almost a full time job to babysit a CI.

>> With the above said, I've discussed this internally at Linaro and we can spare some manpower to set up and maintain an isolated Linaro-hosted Jenkins instance for GDB CI builds.
>>
>> Linaro can take care of providing builders and build jobs for ARM. Other architectures would be handled by their respective contributors. Those contributors can write jobs and plug builders as needed.
>>
>> Setting up new jobs doesn't require the use of the web interface. It can be done with yaml files in a git repo. It is reasonably simple.
>>
>> You can see an example of the Linaro CI here: https://ci.linaro.org/
>>
>> Also, a GDB job for aarch64: https://ci.linaro.org/job/tcwg-gdb/
>>
>> And also the summaries for GDB testsuite runs: https://ci.linaro.org/job/tcwg-compare-results/13968/artifact/artifacts/logs/0-report.html

The reports look really good.

When I looked into setting up a CI, the most difficult / annoying part
was analyzing the results.  Not only the racy tests that intermittently
fail, but also some tests that alternate between PASS and KFAIL (I don't
remember which exactly, I think the "many short lived threads" one).

The previous buildbot had a racy test analysis to determine which tests
were flaky, and (I suppose) ignore them in the results.  I don't know
how well that worked.  But if that makes it simpler, I'd be ok with a
hand-maintained list of known flaky/racy tests that should be ignored
when comparing the results.  Ignoring some test results is not great,
but if it can make the CI's results more trustworthy, I say it's better
than letting the racy/flaky tests "contaminate" the results of all the
tests that aren't.
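
To make the hand-maintained list idea concrete, here is a minimal
sketch in Python.  It is only an illustration, not how the old buildbot
or the Linaro jobs do it.  The pattern in KNOWN_FLAKY and the helper
names (parse_sum, is_flaky, compare) are made up for the example, and it
assumes DejaGnu-style "RESULT: test name" lines as found in gdb.sum,
with KFAIL lines possibly carrying a trailing "(PRMS: ...)" reference.

```python
import fnmatch
import re

# Hypothetical hand-maintained ignore list.  Each entry is a glob
# pattern matched against "<file>.exp: <test name>".
KNOWN_FLAKY = [
    "gdb.threads/example-racy.exp: *",
]

# DejaGnu outcome lines, e.g. "PASS: gdb.base/foo.exp: some test".
RESULT_RE = re.compile(
    r"^(PASS|FAIL|XPASS|XFAIL|KPASS|KFAIL|UNRESOLVED|UNTESTED|UNSUPPORTED): (.+)$"
)
# Assumption: known-failure lines may end with a bug reference such as
# " (PRMS: gdb/NNNN)".  Strip it so the name matches the PASS variant.
PRMS_RE = re.compile(r" \(PRMS:? [^)]*\)$")


def parse_sum(text):
    """Map each test name found in a .sum file's text to its outcome."""
    results = {}
    for line in text.splitlines():
        match = RESULT_RE.match(line)
        if match:
            name = PRMS_RE.sub("", match.group(2))
            results[name] = match.group(1)
    return results


def is_flaky(test, patterns=KNOWN_FLAKY):
    """Return True if the test name matches any ignore-list pattern."""
    return any(fnmatch.fnmatchcase(test, pattern) for pattern in patterns)


def compare(before, after, patterns=KNOWN_FLAKY):
    """List (test, old, new) for tests whose outcome changed.

    Tests on the ignore list are skipped.  A test missing from one
    run is reported with None for that side.
    """
    changes = []
    for test in sorted(set(before) | set(after)):
        if is_flaky(test, patterns):
            continue
        old, new = before.get(test), after.get(test)
        if old != new:
            changes.append((test, old, new))
    return changes


if __name__ == "__main__":
    baseline = parse_sum("PASS: gdb.base/a.exp: t1\n")
    current = parse_sum("FAIL: gdb.base/a.exp: t1\n")
    for test, old, new in compare(baseline, current):
        print(f"{test}: {old} -> {new}")
```

With something like this, a PASS/KFAIL flip in a listed test never shows
up in the comparison, so any change that is reported is worth looking
at.  The cost is that a real regression in a listed test goes unnoticed,
which is the trade-off described above.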

In the end, we want a CI with results we can trust.  With the racy/flaky
tests, builds often fail and the reaction is "oh it's just that test
failing, it's fine".  And this is how regressions can slip in.

>> Of course, this effort only makes sense if the community is OK with using Jenkins as the CI mechanism and if it actually sees value in having a system like this in place.

I am of course in favor of it.  And in my opinion, whoever makes the
effort has the final word on how it's done.  So if you are used to
Jenkins, let's go with Jenkins.

Simon

