This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [RFC] mask off is-syscall bit for TRAP_IS_SYSCALL
On Monday 28 September 2009 20:58:48, Pedro Alves wrote:
> On Monday 28 September 2009 20:39:05, Doug Evans wrote:
> > Hi.
> >
> > On one system I use (bi-arch ubuntu-hardy clone),
> > i386-disp-step.exp is failing because the wait status
> > value linux_nat_wait_1 gets when hitting a system call is 0x857f
> > which gets passed to the upper layers which then get confused
> > by a signal of 0x85 (== 0x80 | SIGTRAP).
> >
> > This patch fixes things by masking off the 0x80 bit before
> > passing the signal number up the call chain.
> >
> > Ok to check in?
>
> This seems OK-ish to me, although I see one extra case that
> isn't handled correctly:
>
> stop_wait_callback:
>
>   status = wait_lwp (lp);
>
>   ...
>
>   if (WSTOPSIG (status) != SIGSTOP)
>     {
>       if (WSTOPSIG (status) == SIGTRAP)
>         {
>           ...
>         }
>       else
>         {
>           /* If the lp->status field is still empty, use it to
>              hold this event.  If not, then this event must be
>              returned to the event queue of the LWP.  */
>           if (lp->status)
>             {
>               if (debug_linux_nat)
>                 {
>                   fprintf_unfiltered (gdb_stdlog,
>                                       "SWC: kill %s, %s\n",
>                                       target_pid_to_str (lp->ptid),
>                                       status_to_str ((int) status));
>                 }
>               kill_lwp (GET_LWP (lp->ptid), WSTOPSIG (status)); <<<<<<<<
>             }
>
> It seems we can reach that <<< marked code with a TRAP_IS_SYSCALL, but,
> I doubt that we want to requeue that signal in the kernel (?).
>
In case I wasn't clear, I was alluding to the fact that I think
the filtering is done too late. In fact, things are worse than I
imagined. See here for a simple failing example:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>

void *
thread_function (void *arg)
{
  while (1)
    {
      usleep (1);
    }
}

int
main ()
{
  int res;
  pthread_t thread;
  long i = 0;

  for (i = 0; i < 10; i++)
    {
      res = pthread_create (&thread,
                            NULL,
                            thread_function,
                            NULL);
    }

  thread_function (NULL);
}
(gdb) r
Starting program: /home/pedro/gdb/tests/trap_is_syscall
[New Thread 0x7fe7dee066e0 (LWP 9531)]
[Thread debugging using libthread_db enabled]
[New Thread 0x40800950 (LWP 9538)]
[New Thread 0x41001950 (LWP 9539)]
[New Thread 0x41802950 (LWP 9540)]
[New Thread 0x42003950 (LWP 9541)]
[New Thread 0x42804950 (LWP 9542)]
[New Thread 0x43005950 (LWP 9545)]
[New Thread 0x43806950 (LWP 9547)]
[New Thread 0x44007950 (LWP 9548)]
[New Thread 0x44808950 (LWP 9549)]
[New Thread 0x45009950 (LWP 9550)]
<ctrl-c>
Program received signal SIGINT, Interrupt.
0x00007ffff78ffb81 in nanosleep () from /lib/libc.so.6
(gdb) catch syscall
warning: Could not open "syscalls/amd64-linux.xml"
warning: Could not load the syscall XML file `syscalls/amd64-linux.xml'.
GDB will not be able to display syscall names.
Catchpoint 1 (any syscall)
(gdb) c
Continuing.
[Switching to Thread 0x45009950 (LWP 9550)]
Catchpoint 1 (call to syscall 35), 0x00007ffff78ffb81 in nanosleep () from /lib/libc.so.6
(gdb) c
Continuing.
Program received signal ?, Unknown signal.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Switching to Thread 0x44808950 (LWP 9549)]
0x00007ffff78ffb81 in nanosleep () from /lib/libc.so.6
(gdb)
`?' is due to TRAP_IS_SYSCALL.
Same with "set debug lin-lwp 1":
linux_nat_wait: [process -1]
LLW: Using pending wait status Trace/breakpoint trap (stopped at syscall) for Thread 0x44808950 (LWP 10484).
LLW: Candidate event Trace/breakpoint trap (stopped at syscall) in Thread 0x44808950 (LWP 10484).
Program received signal ?, Unknown signal.
[Switching to Thread 0x44808950 (LWP 10484)]
0x00007ffff78ffb81 in nanosleep () from /lib/libc.so.6
I wouldn't be surprised if syscall events were inverted from here
on (entry/exit).
With a patch like the one below, it's easier to trigger the case
I mentioned before:
---
gdb/linux-nat.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
Index: src/gdb/linux-nat.c
===================================================================
--- src.orig/gdb/linux-nat.c 2009-09-29 00:09:13.000000000 +0100
+++ src/gdb/linux-nat.c 2009-09-29 00:13:59.000000000 +0100
@@ -2400,6 +2400,13 @@ stop_wait_callback (struct lwp_info *lp,
target_pid_to_str (lp->ptid),
errno ? safe_strerror (errno) : "OK");
+ if (WSTOPSIG (status) == TRAP_IS_SYSCALL)
+ {
+ /* Simulate the case of a signal < SIGTRAP and
+ SIGSTOP being delivered before the SIGSTOP. */
+ // kill_lwp (GET_LWP (lp->ptid), SIGINT);
+ }
+
/* Hold this event/waitstatus while we check to see if
there are any more (we still want to get that SIGSTOP). */
stop_wait_callback (lp, NULL);
@@ -2416,7 +2423,9 @@ stop_wait_callback (struct lwp_info *lp,
target_pid_to_str (lp->ptid),
status_to_str ((int) status));
}
- kill_lwp (GET_LWP (lp->ptid), WSTOPSIG (status));
+
+ // gdb_assert (WSTOPSIG (status) != TRAP_IS_SYSCALL);
+ gdb_assert (kill_lwp (GET_LWP (lp->ptid), WSTOPSIG (status)) == 0);
}
else
lp->status = status;
Uncommenting the commented-out kill_lwp call that the patchlet adds makes
the last gdb_assert trigger (kill_lwp can't send signal 0x85), meaning that
a syscall event gets lost.
--
Pedro Alves