Stone, Joshua I wrote:
On Wednesday, August 16, 2006 1:36 AM, Li Guanglei wrote:
Below are my questions while I am trying to add signal trace hooks
into LKET:
Thanks for reviewing.
1. group_send_sig_info() will check the permissions for sending the
signal before calling __group_send_sig_info(). So probing
__group_send_sig_info() will omit the situation that a signal has been
actually sent out but was rejected due to wrong permission.
But __group_send_sig_info() will be also called by the following
functions besides group_send_sig_info(),:
1> do_notify_parent() when a process exits
2> check_process_timers() for POSIX timers
3> do_notify_parent_cldstop()
4> kill_proc_info_as_uid()
So which one do you prefer to probe, __group_send_sig_info or
group_send_sig_info?
I prefer __group_signal_sig_info, because it catches the additional
signals that you noted. Also, consider the parallel case of
specific_send_sig_info, which is called by sys_tkill and sys_tgkill. In
that case, the permission check happens in the sys_* functions, before
specific_send_sig_info is called. Thus on that path we are only
capturing the signals that will actually be sent (though possibly
ignored).
Yes. Since do_tkill will also call check_kill_permission before calling
specific_send_sig_info(), __group_send_sig_info() looks more like a
parallel of specific_send_sig_info() than group_send_sig_info().
This establishes a pretty consistent view of what signal.send means: if
a process successfully sends a signal, that will trigger signal.send.
Signals that are rejected because of permissions will return a failure
to the sender, and thus shouldn't trigger signal.send. Signals that are
ignored will still report success to the sender, even though the
recipient dropped it on the floor, so it should trigger signal.send.
Yes. From the source codes, sending signal will return success(0) even
if it is ignored or is a duplication of an existing signal in the
pending queue. And it will return failure if failed permission checking.
So I agree with you that __group_send_sig_info() is a more suitable
probe point than group_send_sig_info(). Thanks for your suggestion.
If you want to find signals that are rejected due to permissions, then a
different probe point will be needed, like perhaps a return probe on
check_kill_permissions.
Yes. I am considering adding signal.check_perm. It will also probe the
entry of check_kill_permissions() to get more info about the signals
besides the return value.
2. send_sigqueue() and send_group_sigqueue() is only used for POSIX
timers. If we choose to probe them, then I think it's better to add a
local variable like "send_to_queue=1" to indicate that the signal is
generated by posix timers. Then by looking at "send_to_queue" and
"shared" local variables, we can determine which function actually
triggers the probe handler.
Ok, that's a good suggestion.
* handle_stop_signal: Your comment says "fires when a stop signal is
sent", but this is incorrect. This function is called to _check_
whether this signal is a stop/cont, and if so take special action.
Thus, this will get called for almost every signal, many of which
will not be stop signals.
Yes. You are right. But the comment is still unchanged. :-)
Indeed -- I didn't take it on myself to fix *all* of my gripes. :) I
didn't fix this one because I'm not convinced that it's needed, or what
useful information can be gotten from probing handle_stop_signal. If
you really want, we could make a probe that does what the comment says,
something like this:
Yes. I see. At the time that handle_stop_signal() is called, it doesn't
know whether current signal is STOP/COUNT. So the calling of
handle_stop_signal() doesn't mean that the Kernel is now processing the
STOP/COUNT signal.
probe signal.send_stop = signal.send {
if (sig != SIGSTOP) next;
}
But users can do this themselves without much trouble, and with better
granularity. It doesn't make sense for us to enumerate send_stop,
send_cont, send_int, etc., when the user can just check the sig value
manually.
Yes. The user can just look at the sig variable in signal.send to get
enough information.
* syscall duplication: some of your probes are duplicating points
from the syscall tapset (signal.syskill, etc.). Do we really need
these? If you really want these, it would be preferable to make use
of the syscall tapset, e.g.: probe signal.syskill = syscall.kill
Yes. Agree.
Agree that they're not needed? Or do you want the latter, where we keep
the probepoint, but alias it to the syscall tapset.
I don't have a obvious preference towards whether to keep or remove
them. I just think that keeping and alias them to syscall tapset will be
a little convenient when a user want to trace only the signal activities.
- Guanglei