This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [BuildBot] Notifications disabled for Debian-s390x-* and Fedora-ppc64*-* builders
On Fri, Dec 15, 2017 at 12:29 PM, David Edelsohn <dje.gcc@gmail.com> wrote:
> On Fri, Dec 15, 2017 at 11:20 AM, Pedro Alves <palves@redhat.com> wrote:
>> On 12/15/2017 03:53 PM, David Edelsohn wrote:
>>> On Fri, Dec 15, 2017 at 10:42 AM, Pedro Alves <palves@redhat.com> wrote:
>>>> On 12/15/2017 03:06 PM, David Edelsohn wrote:
>>>>
>>>>> Third, the testsuite summaries that no one from the GDB community
>>>>> monitored show that the testsuite runtime jumped from a relatively
>>>>> short amount of time to over 9 hours for each run, which points to a
>>>>> newly introduced problem in GDB or in the testsuite (timeouts?).
>>>>
>>>> That may well be. Can you point at some representative builds,
>>>> before/after the jump?
>>>
>>> The testsuite runs for 6 minutes on RHEL7 s390x buildslave and 9 hours
>>> on Debian Jessie s390x buildslave.
>>
>> Those are separate machines. I'd like to see the jump on the same
>> machine, so we can maybe pinpoint what caused it.
>>
>> I was really asking for URLs. It looks like there are some here:
>>
>> https://gdb-build.sergiodj.net/builders/Debian-s390x-native-gdbserver-m64
>>
>> Here, for example:
>>
>> https://gdb-build.sergiodj.net/builders/Debian-s390x-native-gdbserver-m64/builds/4351
>>
>> "test gdb tested GDB failed (9 hrs, 2 mins, 56 secs)"
>>
>> That's definitely too long.
>>
>> I downloaded the gdb.log file, and did:
>>
>> $ grep FAIL gdb.log | grep timeout | sed 's/.exp.*/.exp/g' | sort | uniq -c | sort -n
>> 1 FAIL: gdb.base/watch-cond.exp
>> 1 FAIL: gdb.multi/watchpoint-multi-exit.exp
>> 1 FAIL: gdb.threads/interrupted-hand-call.exp
>> 1 FAIL: gdb.threads/thread-unwindonsignal.exp
>> 2 FAIL: gdb.base/value-double-free.exp
>> 3 FAIL: gdb.mi/mi-async.exp
>> 3 FAIL: gdb.threads/process-dies-while-detaching.exp
>> 4 FAIL: gdb.base/pr11022.exp
>> 10 FAIL: gdb.base/watch-bitfields.exp
>> 15 FAIL: gdb.base/watchpoints.exp
>> 20 FAIL: gdb.threads/interrupt-while-step-over.exp
>> 32 FAIL: gdb.threads/watchpoint-fork.exp
>> 45 FAIL: gdb.threads/step-over-trips-on-watchpoint.exp
>> 46 FAIL: gdb.base/display.exp
>> 51 FAIL: gdb.base/watchpoint.exp
>>
>> Not _that_ many. Could they explain the long time? I suspect not.
>>
>> We see this:
>>
>> $ grep "Test run by" gdb.log | head -n 3
>> Test run by dje on Tue Nov 21 03:23:01 2017
>> Test run by dje on Tue Nov 21 03:23:01 2017
>> Test run by dje on Tue Nov 21 03:23:01 2017
>>
>> $ grep "Test run by" gdb.log | tail -n 3
>> Test run by dje on Tue Nov 21 03:29:54 2017
>> Test run by dje on Tue Nov 21 03:29:54 2017
>> Test run by dje on Tue Nov 21 03:29:54 2017
>>
>> So most of the testsuite actually ran for 7 minutes. And then
>> something hung for 9 hours? I have no idea how that
>> could happen from the existing logs. The tail end of the log has:
>>
>> ~~~
>> FAIL: gdb.base/watchpoint.exp: delete all breakpoints in delete_breakpoints (timeout)
>> ERROR: breakpoints not deleted
>> ERROR: breakpoints not deleted
>>
>> command timed out: 1200 seconds without output running ['make', '-k', 'check', 'RUNTESTFLAGS=--target_board native-gdbserver', '-j8', 'FORCE_PARALLEL=1'], attempting to kill
>> process killed by signal 9
>> program finished with exit code -1
>> elapsedTime=32576.210392
>> ~~~
>>
>> I don't understand how 7 minutes plus 1200 seconds (~20min)
>> resulted in "elapsedTime=32576.210392" (~9h). Maybe that number
>> isn't to be trusted.
>>
>> Anyway, I'm sorry, but I really don't have the time to be
>> looking at this. Someone with the motivation and access to
>> the machine could try running the testsuite manually,
>> for example, see how long that takes, and where the hang is.
>
> I will try reverting to an older version of the DejaGNU framework.

Older DejaGNU does not seem to have an effect. All of the processes
are stuck in "gdb.threads/process-dies-while-handling-bp".

Thanks, David
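
For anyone repeating the log analysis quoted above, here is a minimal shell sketch of the two checks: counting timed-out FAIL lines per .exp file, and converting the elapsedTime value to hours, minutes, and seconds. The function names count_timeouts and format_elapsed are hypothetical helpers, not part of GDB, DejaGnu, or BuildBot. The filter is a slightly stricter variant of the quoted grep pipeline: it assumes FAIL lines start with "FAIL:" and that timeouts are marked with "(timeout)", as in the log excerpt.

```shell
#!/bin/sh

# Count timed-out FAIL lines per .exp file in a DejaGnu log read from stdin.
# Output is "<count> <test file>", sorted by ascending count.
count_timeouts() {
    grep '^FAIL:' \
        | grep '(timeout)' \
        | sed 's/^FAIL: \([^ ]*\.exp\).*/\1/' \
        | sort | uniq -c | sort -n \
        | awk '{ print $1, $2 }'
}

# Convert an elapsed time in (possibly fractional) seconds to "Nh Nm Ns".
format_elapsed() {
    awk -v s="$1" 'BEGIN {
        s = int(s)
        printf "%dh %dm %ds\n", s / 3600, (s % 3600) / 60, s % 60
    }'
}

# Example usage (assumes a downloaded gdb.log in the current directory):
#   count_timeouts < gdb.log
#   format_elapsed 32576.210392
```

Run on the value from the log, format_elapsed 32576.210392 gives 9h 2m 56s, which matches the "9 hrs, 2 mins, 56 secs" shown on the builder page. So the two figures agree with each other, even though the "Test run by" timestamps only span about 7 minutes.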