This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: RFA [PATCH] Implement 'catch syscall' for gdbserver
- From: Philippe Waroquiers <philippe dot waroquiers at skynet dot be>
- To: Sergio Durigan Junior <sergiodj at redhat dot com>
- Cc: gdb-patches at sourceware dot org
- Date: Fri, 20 Sep 2013 01:30:06 +0200
- Subject: Re: RFA [PATCH] Implement 'catch syscall' for gdbserver
- Authentication-results: sourceware.org; auth=none
- References: <1377876388 dot 3041 dot 3 dot camel at soleil> <m3y56tjs5y dot fsf at redhat dot com>
On Thu, 2013-09-19 at 02:29 -0300, Sergio Durigan Junior wrote:
> Thank you very much for this, Philippe :-). I have some comments.
Thanks for the in-depth review.
I have a few comments for which I have questions/remarks below.
> > 1. GDB native has a bug in the detection of "syscall entry/return",
> > when the catch syscall is disabled when inferior is stopped
> > on a syscall entry, and then catch syscall is re-enabled later:
> Hm, I remember seeing and trying to fix this bug a long time ago. Not
> sure if I managed to do it or not (guess not, since I didn't send the
> patch, but I remember having something close to the fix). My initial
> idea was to record the event (syscall entry or return) inside GDB and
> make it decide based on it.
>
> Anyway, I agree it's a bug and that it deserves a bug report :-). Could
> you please file it?
I stared at the problem when I detected it,
but could not see any easy fix which would work properly
in all circumstances e.g. enable/disable catch at various
moments, including interrupting a process blocked in a syscall,
enable catch syscall and have the syscall restarted
(did not check the behaviour of ptrace, I guess we will directly
have a syscall return event, and so GDB has to guess this is a return
without having seen the entry).
I will file a bug report (will analyse first the various problematic
cases and maybe do a test to reproduce them).
> > RCS file: /cvs/src/src/gdb/remote.c,v
> > +/* If 'QCatchSyscalls' is supported, tell the remote stub
> > + to report syscalls to GDB. */
>
> I generally prefer to explicitly say which method this function is
> implementing. Like, in this case, mention that it is implementing
> "target_set_syscall_catchpoint". It makes it easier to find the
> definitions of each argument.
No problem to do that, however it looks like all other functions
in the same file are not commented like that.
So, doing this would make this new function commented differently
of all others.
Maybe better to keep it consistent ?
(and in another patch add for all functions a
"implements target_xxxxxxxx") ?
>
> > +
> > +static int
> > +remote_set_syscall_catchpoint (int pid, int needed, int any_count,
> > + int table_size, int *table)
> > +{
> > + if (remote_protocol_packets[PACKET_QCatchSyscalls].support != PACKET_DISABLE)
> > + {
> > + char *catch_packet, *p;
> > + enum packet_result result;
> > + int n_sysno = 0;
> > +
> > + if (needed && !any_count)
> > + {
> > + int i;
> > +
> > + for (i = 0; i < table_size; i++)
> > + if (table[i])
> > + n_sysno++;
>
> When would table[i] be 0 in this case? I guess you could use table_size
> directly without worrying about n_sysno.
No, the 0 condition is to be checked.
See following comment in target.h:
TABLE is an array of ints, indexed by syscall number. An element in
this array is nonzero if that syscall should be caught. This argument
only matters if ANY_COUNT is zero.
I added a comment:
/* Count how many syscalls are to be caught (table[sysno] != 0). */
>
> > + }
> > +
> > + if (remote_debug)
> > + fprintf_unfiltered (gdb_stdlog,
> > + "remote_set_syscall_catchpoint "
> > + "pid %d needed %d any_count %d n_sysno %d\n",
> > + pid, needed, any_count, n_sysno);
> > + if (needed)
> > + {
> > + /* Prepare a packet with the sysno list, assuming
> > + max 8+1 characters for a sysno. If the resulting
> > + packet size is too big, fallback on the non
> > + selective packet. */
> > + const char *q1 = "QCatchSyscalls:1";
>
> Really nitpicking but you could avoid calling strlen for q1 and replace
> it by a size_t variable holding the sizeof (q1) - 1. But this is just
> a minor thing.
I cannot use sizeof(q1) but I could possibly do
const size_t q1len = strlen (q1);
and use q1len after ?
(that would however only spare a (very unlikely) second call
to strlen, probably optimised by the compiler in any case).
> > + p = catch_packet;
> > + p += strlen(p);
> > + for (i = 0; i < table_size; i++)
> > + {
> > + if (table[i])
>
> Again, I don't see the reason of this check.
See above.
Added a comment:
/* Add in catch_packet each syscall to be caught (table[i] != 0). */
>
> > + {
> > + xsnprintf(p, catch_packet + maxpktsz - p,
> > + ";%x", i);
> > + p += strlen(p);
> > + }
> > + }
> > + }
> > + if (strlen(catch_packet) > get_remote_packet_size())
> > + {
> > + /* catch_packet too big. Fallback to less efficient
> > + non selective mode, with GDB doing the filtering. */
> > + catch_packet[strlen (q1)] = 0;
> > + }
>
> Hm, can't you do this check before start filling catch_packet? This
> would save some time.
Not easy to do (and saving probably very little cpu time in very
unfrequent cases):
The size of the packet depends on which syscalls
are added, and not only on the nr of syscalls to add in the packet.
E.g. a sysno can give either 2 or 3 or 4 characters (if we have
sysno only below 999).
> > + int syscall;
>
> I know this struct is already messed up, but I would put a comment above
> this field to explain its purpose.
Added the comment:
/* Set to 1 if the stop_reply is for a syscall entry or
return. The field ws.value.syscall_number then identifies
the syscall. */
> > - /* Check if the target supports PTRACE_O_TRACESYSGOOD. */
> > + /* Check if the target supports PTRACE_O_TRACESYSGOOD, keeping
> > + PTRACE_O_TRACEFORK option activated. */
> > ret = ptrace (PTRACE_SETOPTIONS, child_pid, (PTRACE_TYPE_ARG3) 0,
> > - (PTRACE_TYPE_ARG4) PTRACE_O_TRACESYSGOOD);
> > + (PTRACE_TYPE_ARG4) (PTRACE_O_TRACEFORK
> > + | PTRACE_O_TRACESYSGOOD));
>
> Any reason why you are keeping PTRACE_O_TRACEFORK set?
Changed the comment to:
/* Check if the target supports PTRACE_O_TRACESYSGOOD.
We keep PTRACE_O_TRACEFORK option activated as a fork
event notification is expected after my_waitpid below. */
> > RCS file: /cvs/src/src/gdb/common/linux-ptrace.c,v
> > if (ret == 0)
> > current_ptrace_options |= PTRACE_O_TRACESYSGOOD;
> >
> > +#ifdef GDBSERVER
> > + /* gdbserver does not support PTRACE_O_TRACEVFORKDONE yet. */
> > +#else
> > /* Check if the target supports PTRACE_O_TRACEVFORKDONE. */
> > ret = ptrace (PTRACE_SETOPTIONS, child_pid, (PTRACE_TYPE_ARG3) 0,
> > (PTRACE_TYPE_ARG4) (PTRACE_O_TRACEFORK
> > @@ -474,6 +475,11 @@ linux_enable_event_reporting (pid_t pid)
> > static int
> > ptrace_supports_feature (int ptrace_options)
> > {
> > + /* Check if we have initialized the ptrace features for this
> > + target. If not, do it now. */
> > + if (current_ptrace_options == -1)
> > + linux_check_ptrace_features ();
> > +
>
> This seems correct to me, but I think you should send it in a separate
> patch, because it is not specific to catch syscall IMO.
It looks better to me to keep it in this patch, as it was not needed
before, but is now needed due to the "early calls" to
ptrace_supports_feature needed for catch syscalls in gdbserver.
> > ===================================================================
> > RCS file: /cvs/src/src/gdb/gdbserver/linux-low.c,v
> > +static void
> > +get_syscall_trapinfo (struct lwp_info *lwp, int *sysno, int *sysret)
> > +{
> > + struct thread_info *saved_inferior;
> > + struct regcache *regcache;
> > +
> > + if (the_low_target.get_syscall_trapinfo == NULL)
> > + {
> > + *sysno = 0;
> > + *sysret = 0;
>
> Aren't you missing a "return" here?
Yes, added it.
>
> In fact, is there any way for "the_low_target.get_syscall_trapinfo" to
> be NULL? It seems to me that this is a necessary condition for this
> function to be called (see my comment about putting an assert at
> server.c:handle_general_set). So maybe you should put an assert here to
> check that the_low_target.get_syscall_trapinfo != NULL. WDYT?
Good question, I am not sure what to do.
It looks like many other calls to low level target
are protected similarly (which is not a good enough reason).
However, in this case, I suspect that if the GDB users forces
the use of the QCatchSyscalls packet, that we might end up there:
GDB will send such a packet, resulting in server.c setting
catching_syscalls_p to 1, resulting to ptrace "syscall" to be activated
resulting in a syscall trap event resulting in a call to getsyscall
trapinfo.
So, returning sysno 0 looks safe to me, to avoid a GDB user
causing an assert due to forcing the use of a (normally)
auto detected packet.
> > +/* Returns 1 if GDB is interested in the event_child syscall.
> > + Only to be called when stopped reason is SIGTRAP_SYSCALL. */
> > +
> > +static int
> > +gdb_catched_syscall (struct lwp_info *event_child)
>
> s/catched/caught/g ?
>
> Maybe also rename the function to "catch_current_syscall_p"? Or
> something to give the idea that we are checking if this specific syscall
> is interesting for us.
changed to gdb_catch_this_syscall_p.
_
>
> > +{
> > + int i;
> > + int sysno, sysret;
> > +
> > + if (!catch_syscalls_p)
>
> I think you could rename this to "catching_syscalls_p".
Yes, good idea.
>
> > + return 0;
> > +
> > + if (catched_syscalls_size == 0)
> > + return 1;
> > +
> > + get_syscall_trapinfo (event_child, &sysno, &sysret);
> > + for (i = 0; i < catched_syscalls_size; i++)
> > + if (catched_syscalls[i] == sysno)
> > + return 1;
>
> Also, I think the names of the variables are a bit misleading.
> "catched_syscalls" does not represent syscalls that were caught, but
> rather syscalls that are to be caught. Maybe rename them to
> "syscalls_to_be_caught" or something?
Yes.
Renamed to syscalls_to_catch_size and syscalls_to_catch.
grep-ped for frenglish 'catched' to ensure I catched all errors :).
> > Index: gdbserver/remote-utils.c
> > ===================================================================
> > RCS file: /cvs/src/src/gdb/gdbserver/remote-utils.c,v
> > retrieving revision 1.98
> > diff -u -p -r1.98 remote-utils.c
> > --- gdbserver/remote-utils.c 1 Jul 2013 11:19:27 -0000 1.98
> > +++ gdbserver/remote-utils.c 30 Aug 2013 15:18:16 -0000
> > @@ -1319,12 +1319,17 @@ prepare_resume_reply (char *buf, ptid_t
> > switch (status->kind)
> > {
> > case TARGET_WAITKIND_STOPPED:
> > + case TARGET_WAITKIND_SYSCALL_ENTRY:
> > + case TARGET_WAITKIND_SYSCALL_RETURN:
> > {
> > struct thread_info *saved_inferior;
> > const char **regp;
> > struct regcache *regcache;
> >
> > - sprintf (buf, "T%02x", status->value.sig);
> > + if (status->kind == TARGET_WAITKIND_STOPPED)
> > + sprintf (buf, "T%02x", status->value.sig);
> > + else
> > + sprintf (buf, "T%02x", SIGTRAP);
>
> Again, maybe it's just me, but would you mind putting a comment here
> explainig why you're using SIGTRAP directly if status->kind ==
> TARGET_WAITKIND_SYSCALL_{ENTRY,RETURN} ?
This is according to the protocol definition:
* If N is a recognized "stop reason", it describes a more
specific event that stopped the target. The currently
defined stop reasons are listed below. AA should be `05',
the trap signal. At most one stop reason should be present.
> > {
> > Index: gdbserver/server.c
> > ===================================================================
> > RCS file: /cvs/src/src/gdb/gdbserver/server.c,v
> > retrieving revision 1.196
> > diff -u -p -r1.196 server.c
> > --- gdbserver/server.c 28 Aug 2013 17:40:58 -0000 1.196
> > +++ gdbserver/server.c 30 Aug 2013 15:18:17 -0000
> > @@ -71,6 +71,9 @@ int debug_threads;
> > int debug_hw_points;
> >
> > int pass_signals[GDB_SIGNAL_LAST];
> > +int catch_syscalls_p;
> > +int catched_syscalls_size;
>
> size_t? :-)
I guess an int will be ok unless we reach zillions
of different syscalls, and then the protocol packet
decoding code will suffer a lot :).
> I am not very familiar with the remote protocol, but IMHO you could do
> an assert here to check if target_supports_catch_syscall returns 1,
> WDYT? IOW, we shouldn't get here if it returns 0, right?
>
> Or is there some sort of mechanism intended to overwrite the initial
> probing of features and force unsupported features to be enabled? In
> that case, I guess the assert would trigger every time... Not sure what
> to do.
Yes, I suspect this is true (so has not added asserts).
It is however not very clear to me why GDB allows to overwrite
auto-detected packets, as this might effectively give
unexpected results. But it looks better to not crash gdbserver
following a "legal" user action.
> > Index: testsuite/gdb.base/catch-syscall.exp
> >
> > +# This shall be updated whenever QCatchSyscalls packet support is implemented
> > +# on some gdbserver architecture.
> > +if { [is_remote target]
> > + && ![istarget "x86_64-*-linux*"]
> > + && ![istarget "i\[34567\]86-*-linux*"] } {
> > + continue
> > +}
>
> Hm, am I too sleepy or this test does nothing at all? :-).
We both suspect we are too sleepy :).
Looks to me the above is the expected condition, which is to "continue
somewhere else" (whatever that "continue" exactly means in tcl/dejagnu,
I have no idea :).
This condition looks similar to equivalent conditions just before,
that are skipping this test on non matching target.
But my tcl/dejagnu knowledge is close to 0, so I might have missed
something.
> Otherwise, the patch looks pretty good to me! I have tested it here and
> as far as I have seen it is working OK. I hope my comments help you
> with getting the patch approved. I am not a maintainer, so you will
> need to wait for one to review the patch as well :-).
Thanks for the extensive and helpful comments.
I will retest and submit another patch version (I guess this WE).
Any additional comment or feedback welcome in the meantime ...
Philippe