This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: x86_64-m32 internal error for multi-thread-step.exp [Re: [PATCH v10 06/28] btrace: change branch trace data structure]
- From: Pedro Alves <palves at redhat dot com>
- To: "Metzger, Markus T" <markus dot t dot metzger at intel dot com>, Jan Kratochvil <jan dot kratochvil at redhat dot com>, Patrick Palka <patrick at parcs dot ath dot cx>
- Cc: "gdb-patches at sourceware dot org" <gdb-patches at sourceware dot org>
- Date: Thu, 22 Jan 2015 13:36:28 +0000
- Subject: Re: x86_64-m32 internal error for multi-thread-step.exp [Re: [PATCH v10 06/28] btrace: change branch trace data structure]
- References: <1389686678-9039-1-git-send-email-markus dot t dot metzger at intel dot com> <1389686678-9039-7-git-send-email-markus dot t dot metzger at intel dot com> <20150108204943 dot GA4851 at host2 dot jankratochvil dot net> <A78C989F6D9628469189715575E55B231E6C3811 at IRSMSX104 dot ger dot corp dot intel dot com> <A78C989F6D9628469189715575E55B231E6C4759 at IRSMSX104 dot ger dot corp dot intel dot com>
On 01/22/2015 12:29 PM, Metzger, Markus T wrote:
>> -----Original Message-----
>> From: Metzger, Markus T
>> Sent: Tuesday, January 20, 2015 4:08 PM
>> To: Jan Kratochvil
>> Cc: palves@redhat.com; gdb-patches@sourceware.org
>
>
>> I can't reproduce this fail; I don't get that far. This test fails for me with
>>
>> FAIL: gdb.btrace/multi-thread-step.exp: continue to breakpoint: cont
>> to multi-thread-step.c:34 (timeout)
>
> This fail seems to be caused by 588dcc3edbde19f90e76de969dbfa7ab3e17951a
> "Consolidate the custom TUI query hook with the default query hook". It is not
> related to btrace.
>
> The failing test program looks like this:
>
> pthread_barrier_wait (&barrier);
> global = 42; /* bp.1 */
> pthread_barrier_wait (&barrier);
>
> There are two threads, both are at bp.1 between the two barriers. When I now
> delete all breakpoints like this:
>
> (gdb) del
> Delete all breakpoints? (y or n) y
>
> and then continue the inferior, only the current thread is resumed. The other
> thread remains at its current location. The resumed thread waits at the barrier
> and the test runs into a timeout.
>
> Here's a complete debug session:
>
> (gdb) b 30
> Breakpoint 1 at 0x400776: file gdb.btrace/multi-thread-step.c, line 30.
> (gdb) r
> Starting program: gdb.btrace/multi-thread-step
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> [New Thread 0x7ffff7fce700 (LWP 22156)]
>
> Breakpoint 1, test (arg=0x0) at gdb.btrace/multi-thread-step.c:30
> 30 global = 42; /* bp.1 */
> (gdb) del
> Delete all breakpoints? (y or n) y
> (gdb) info thr
> Id Target Id Frame
> 2 Thread 0x7ffff7fce700 (LWP 22156) "multi-thread-st" test (arg=0x0) at gdb.btrace/multi-thread-step.c:30
> * 1 Thread 0x7ffff7fcf740 (LWP 22152) "multi-thread-st" test (arg=0x0) at gdb.btrace/multi-thread-step.c:30
> (gdb) c
> Continuing.
> ^C
> Program received signal SIGINT, Interrupt.
> 0x000000384380c20c in pthread_barrier_wait () from /lib64/libpthread.so.0
> (gdb) info thr
> Id Target Id Frame
> 2 Thread 0x7ffff7fce700 (LWP 22156) "multi-thread-st" test (arg=0x0) at gdb.btrace/multi-thread-step.c:30
> * 1 Thread 0x7ffff7fcf740 (LWP 22152) "multi-thread-st" 0x000000384380c20c in pthread_barrier_wait () from /lib64/libpthread.so.0
>
> When I set debug infrun, I get this:
>
> (gdb) del
> Delete all breakpoints? (y or n) y
> (gdb)
> infrun: target_wait (-1, status) =
> infrun: -1 [process -1],
> infrun: status->kind = no-resumed
> infrun: TARGET_WAITKIND_NO_RESUMED (ignoring)
> infrun: prepare_to_wait
>
> I don't see this with the old query behaviour or when I remove breakpoints like this:
>
> (gdb) del 1

Hmm, gdb_readline_wrapper believes the target was async to begin
with. That seems to be an issue with linux_nat_is_async_p. Then
gdb_readline_wrapper_cleanup sets the target async again, which
triggers the target_wait call. It's normal that only one thread is
resumed, because the other thread already has an event pending.
Normally that works because at the very end of linux_nat_resume
we re-enable async, which, if we were sync before, tells the event
loop to poll for target events. But in this case we reach
linux_nat_resume already async, so nothing wakes up the event loop,
and the pending event is never collected and handled by infrun.
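
To illustrate the failure mode described above, here is a minimal toy model
of a target that wakes its event loop through a self-pipe only on the
sync-to-async transition. It is a sketch, not GDB's code: struct toy_target
and every toy_* function are hypothetical names invented for this example,
and the assumption that the wake-up happens only when async is newly
enabled is taken from the description above, not from the linux-nat.c
sources.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-in for a target's async state (not a GDB type). */
struct toy_target {
    int event_pipe[2];   /* self-pipe polled by the event loop; -1 when sync */
    bool async;          /* is the target currently in async mode? */
    bool event_pending;  /* a stopped thread has an event not yet reported */
};

static void toy_init(struct toy_target *t)
{
    t->event_pipe[0] = t->event_pipe[1] = -1;
    t->async = false;
    t->event_pending = false;
}

/* Switch async mode.  Only a real sync -> async transition marks the
   pipe; asking for the mode we are already in does nothing at all.  */
static bool toy_set_async(struct toy_target *t, bool enable)
{
    if (enable == t->async)
        return true;                 /* no transition, no wake-up */
    if (enable) {
        if (pipe(t->event_pipe) != 0)
            return false;
        t->async = true;
        return write(t->event_pipe[1], "+", 1) == 1;  /* wake the loop */
    }
    close(t->event_pipe[0]);
    close(t->event_pipe[1]);
    t->event_pipe[0] = t->event_pipe[1] = -1;
    t->async = false;
    return true;
}

/* Would the event loop wake up and call into the target right now? */
static bool toy_loop_would_wake(const struct toy_target *t)
{
    struct pollfd pfd = { .fd = t->event_pipe[0], .events = POLLIN, .revents = 0 };
    return poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLIN) != 0;
}

/* Model the event loop having already handled an earlier wake-up. */
static void toy_consume_wakeup(struct toy_target *t)
{
    char c;
    if (toy_loop_would_wake(t)) {
        ssize_t n = read(t->event_pipe[0], &c, 1);
        (void) n;
    }
}

/* Resume: a thread with a pending event stays stopped, and the code
   relies on re-enabling async at the very end to get that event polled.  */
static void toy_resume(struct toy_target *t)
{
    bool ok = toy_set_async(t, true);
    assert(ok);
    (void) ok;
}
```

In this model, calling toy_resume on a target that is still sync leaves the
pipe readable, so the loop picks up the pending event. Calling it on a
target that is already async (with the earlier wake-up consumed) leaves the
pipe empty, and event_pending stays set with nothing to trigger its
collection.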

Let me see if I can come up with a fix.

Thanks,
Pedro Alves