This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.
Re: [BuildBot] Notifications disabled for Debian-s390x-* and Fedora-ppc64*-* builders
On 12/15/2017 03:53 PM, David Edelsohn wrote:
> On Fri, Dec 15, 2017 at 10:42 AM, Pedro Alves <palves@redhat.com> wrote:
>> On 12/15/2017 03:06 PM, David Edelsohn wrote:
>>
>>> Third, the testsuite summaries that no one from the GDB community
>>> monitored show that the testsuite runtime jumped from a relatively
>>> short amount of time to over 9 hours for each run, which points to a
>>> newly introduced problem in GDB or in the testsuite (timeouts?).
>>
>> That may well be. Can you point at some representative builds,
>> before/after the jump?
>
> The testsuite runs for 6 minutes on RHEL7 s390x buildslave and 9 hours
> on Debian Jessie s390x buildslave.
Those are separate machines. I'd like to see the jump on the same
machine, so we can maybe pinpoint what caused it.
I was really asking for URLs. It looks like there are some here:
https://gdb-build.sergiodj.net/builders/Debian-s390x-native-gdbserver-m64
Here, for example:
https://gdb-build.sergiodj.net/builders/Debian-s390x-native-gdbserver-m64/builds/4351
"test gdb tested GDB failed (9 hrs, 2 mins, 56 secs)"
That's definitely too long.
I downloaded the gdb.log file, and did:
$ grep FAIL gdb.log | grep timeout | sed 's/.exp.*/.exp/g' | sort | uniq -c | sort -n
1 FAIL: gdb.base/watch-cond.exp
1 FAIL: gdb.multi/watchpoint-multi-exit.exp
1 FAIL: gdb.threads/interrupted-hand-call.exp
1 FAIL: gdb.threads/thread-unwindonsignal.exp
2 FAIL: gdb.base/value-double-free.exp
3 FAIL: gdb.mi/mi-async.exp
3 FAIL: gdb.threads/process-dies-while-detaching.exp
4 FAIL: gdb.base/pr11022.exp
10 FAIL: gdb.base/watch-bitfields.exp
15 FAIL: gdb.base/watchpoints.exp
20 FAIL: gdb.threads/interrupt-while-step-over.exp
32 FAIL: gdb.threads/watchpoint-fork.exp
45 FAIL: gdb.threads/step-over-trips-on-watchpoint.exp
46 FAIL: gdb.base/display.exp
51 FAIL: gdb.base/watchpoint.exp
Not _that_ many. Could they explain the long time? I suspect not.
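For anyone who wants to repeat the tally on another log, here is a rough Python sketch of the grep/sed/sort/uniq pipeline above. It was not run on the builder; the function name `count_timeout_fails` is made up for illustration, and the escaped-dot regex is slightly stricter than the sed expression.

```python
import re
from collections import Counter


def count_timeout_fails(log_lines):
    """Tally timeout FAILs per .exp file.

    Roughly mirrors:
      grep FAIL | grep timeout | sed 's/.exp.*/.exp/g' | sort | uniq -c | sort -n
    Returns a list of (count, "FAIL: dir/file.exp") sorted by ascending count.
    """
    counts = Counter()
    for line in log_lines:
        line = line.rstrip("\n")
        # Same two filters as the two grep stages.
        if "FAIL" not in line or "timeout" not in line:
            continue
        # Drop everything after the first ".exp", keeping the test file name.
        name = re.sub(r"\.exp.*", ".exp", line)
        counts[name] += 1
    # Ascending by count, then by name, similar to the final "sort -n".
    return sorted((n, name) for name, n in counts.items())
```

With a downloaded log, something like `count_timeout_fails(open("gdb.log", errors="replace"))` should give the per-file counts, and summing the counts gives the total number of timeouts (235 for the list above).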
We see this:
$ grep "Test run by" gdb.log | head -n 3
Test run by dje on Tue Nov 21 03:23:01 2017
Test run by dje on Tue Nov 21 03:23:01 2017
Test run by dje on Tue Nov 21 03:23:01 2017
$ grep "Test run by" gdb.log | tail -n 3
Test run by dje on Tue Nov 21 03:29:54 2017
Test run by dje on Tue Nov 21 03:29:54 2017
Test run by dje on Tue Nov 21 03:29:54 2017
So most of the testsuite actually ran for 7 minutes. And then
something hung for 9 hours? I have no idea how that
could happen from the existing logs. The tail end of the log has:
~~~
FAIL: gdb.base/watchpoint.exp: delete all breakpoints in delete_breakpoints (timeout)
ERROR: breakpoints not deleted
ERROR: breakpoints not deleted
command timed out: 1200 seconds without output running ['make', '-k', 'check', 'RUNTESTFLAGS=--target_board native-gdbserver', '-j8', 'FORCE_PARALLEL=1'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=32576.210392
~~~
I don't understand how 7 minutes plus 1200 seconds (~20min)
resulted in "elapsedTime=32576.210392" (~9h). Maybe that number
isn't to be trusted.
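To make the mismatch concrete, here is a small Python sketch of the arithmetic, using only the numbers quoted above. The helper names are hypothetical. Note that the "Test run by" timestamps mark when each runtest invocation started, so the span between the first and last one is only a lower bound on how long the tests were actually running.

```python
from datetime import datetime

# Timestamp layout used by the "Test run by" lines in gdb.log.
STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_run_stamp(line):
    """Extract the timestamp from a 'Test run by USER on DATE' line."""
    stamp = line.split(" on ", 1)[1].strip()
    return datetime.strptime(stamp, STAMP_FORMAT)


def split_hms(seconds):
    """Convert a number of seconds to whole (hours, minutes, seconds)."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return hours, minutes, secs


first = parse_run_stamp("Test run by dje on Tue Nov 21 03:23:01 2017")
last = parse_run_stamp("Test run by dje on Tue Nov 21 03:29:54 2017")
span = (last - first).total_seconds()  # time between first and last runtest start

kill_timeout = 1200       # "1200 seconds without output" from the log
elapsed = 32576.210392    # elapsedTime reported at the end of the log

unaccounted = elapsed - (span + kill_timeout)

print("span between first and last runtest start:", split_hms(span))
print("reported elapsedTime:", split_hms(elapsed))
print("time not explained by the log:", split_hms(unaccounted))
```

The reported elapsedTime works out to 9 hours, 2 minutes, 56 seconds, which matches the build page, so the number is at least consistent with what the builder displays. Roughly 8.6 hours of it are not covered by anything visible in gdb.log.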
Anyway, I'm sorry, but I really don't have the time to be
looking at this. Someone with the motivation and access to
the machine could, for example, try running the testsuite
manually, to see how long that takes and where it hangs.
> The Debian Jessie system also runs a Python buildslave without
> problem. The system has 4 virtual cpus and 16GB of memory, which
> should be more than adequately sized.
Thanks,
Pedro Alves