This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.
RE: Displaced stepping not always working as expected
> -----Original Message-----
> From: Pedro Alves [mailto:pedro@codesourcery.com]
> Sent: Wednesday, September 21, 2011 6:20 AM
> To: gdb@sourceware.org
> Cc: Marc Khouzam
> Subject: Re: Displaced stepping not always working as expected
>
> On Tuesday 20 September 2011 20:54:24, Marc Khouzam wrote:
> > Hi,
> >
> > I just need a hint on where next to look...
>
> > I've been asked to look into problems with non-stop on
> > a user-mode-linux virtual machine
> > (http://user-mode-linux.sourceforge.net/)
>
> So does this only happen with UML? UML uses ptrace internally for
> its own business, I wouldn't be surprised if there's something
> wonky going on at that level.
Yes, only on UML. In fact, only on a particular installation of UML:
I am not able to reproduce the problem on my own installation of UML.
So, I also believe it is because of the UML. But I'm hoping it might be
something that can be fixed. Which is why I'm trying to pin-point
the cause.
> > On that AMD 64bit machine, I cannot step or resume past a breakpoint
> > when using non-stop with a multi-threaded program _if_ any of the
> > threads is still running. If I interrupt all threads, then displaced
> > stepping works.
>
> I wouldn't be surprised if the UM kernel is reporting a spurious
> SIGTRAP to gdb. Try "set debug lin-lwp 1" as well, but I don't
> think it'll tell you much. Maybe peeking at eflags or the siginfo
> of that SIGTRAP reveals something.
Thanks.
No luck with "set debug lin-lwp 1" but I will try to open up the
siginfo. I've spent the better part of the day getting familiar
with the relevant code.
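
(For reference, a minimal sketch of what decoding the siginfo of that SIGTRAP could look like, assuming Linux with glibc. Inside GDB the same data can be printed with "p $_siginfo" on the stopped thread. The helper name sigtrap_code_name is hypothetical and not part of GDB; it only maps si_code constants from <signal.h> to their names. Roughly, TRAP_BRKPT or TRAP_TRACE point at a breakpoint or single-step trap, SI_KERNEL means the kernel raised the signal, and SI_USER or SI_TKILL mean some process sent it.)

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>

/* Hypothetical helper (not a GDB function): return the symbolic name
   of the si_code values most relevant when inspecting a SIGTRAP. */
const char *sigtrap_code_name(int si_code)
{
    switch (si_code) {
    case TRAP_BRKPT: return "TRAP_BRKPT"; /* process breakpoint */
    case TRAP_TRACE: return "TRAP_TRACE"; /* process trace (single-step) trap */
    case SI_KERNEL:  return "SI_KERNEL";  /* raised by the kernel */
    case SI_USER:    return "SI_USER";    /* sent with kill() */
    case SI_TKILL:   return "SI_TKILL";   /* sent with tkill()/tgkill() */
    default:         return "unknown";
    }
}

/* Print the fields of a siginfo_t that help tell a genuine trap
   from a signal that was sent by another process. */
void print_sigtrap_info(const siginfo_t *info)
{
    printf("si_signo=%d si_code=%d (%s)\n",
           info->si_signo, info->si_code,
           sigtrap_code_name(info->si_code));
}
```

An unexpected code (for example SI_USER on a thread that should have hit a breakpoint) would be one hint that the SIGTRAP is spurious.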
> Try "set debug lin-lwp 1", and see if the resume was preempted and
> for some bizarre reason the core is getting a cached wait status
> instead of really resuming the thread.
>
> Otherwise, this smells like a UML problem.
One bizarre thing I noticed when trying "set debug target 1"
is that more threads get started than there should!
Normally, I get (in summary):
set non-stop on
set target-async on
b 8
r&
Breakpoint 1, thread_exec1 (ptr=0x400888) at multithread.c:8
8 i++;
info thr
Id Target Id Frame
2 Thread 0x40804940 (LWP 946) "multi3" thread_exec1 (ptr=0x400888) at multithread.c:8
* 1 Thread 0x40003800 (LWP 943) "multi3" (running)
If I add "set debug target 1" before 'r&', I get:
info thr
Id Target Id Frame
4 Thread 0x41806940 (LWP 940) "multi3" thread_exec1 (ptr=0x400888) at multithread.c:8
target_core_of_thread (935) = 0
3 Thread 0x41005940 (LWP 939) "multi3" (running)
target_core_of_thread (935) = 0
2 Thread 0x40804940 (LWP 938) "multi3" (running)
target_core_of_thread (935) = 0
* 1 Thread 0x40003800 (LWP 935) "multi3" (running)
What the??? The program (copied below) only starts two threads.
I wonder if libthread is causing some problem when using UML?
When using gdbserver, I do often get a gdbserver warning when attaching:
"PID mismatch! Expected 789, got 791"
where 791 is the LWP of a thread other than the main one.
Anyway, I'll keep digging. Thanks for the pointers.
Marc
My test program
===============
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void *thread_exec1(void *ptr) {
    int i;
    for (i=0;i<500;i++) {
        i++;
        i--;
        printf("in the second thread %d\n", i);
        sleep(1);
    }
    return NULL; // a thread start routine must return a void*
}

int main() {
    pthread_t thread2;
    int iret2 = pthread_create( &thread2, NULL, thread_exec1, (void*) "Thread 2");
    printf("in the first thread\n");
    int i;
    for (i=0;i<30;i++) {
        sleep(2); // while here, non-stop can't step over breakpoints
    }
    printf("ABOUT TO CALL JOIN\n");
    pthread_join(thread2, NULL); // but while here, it can!
    return 0;
}