This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: Using sys_enter sys_exit trace point in place of syscall.*{.return} probes where possible
On 9/20/18 12:07 PM, David Smith wrote:
> On Thu, Sep 20, 2018 at 10:12 AM William Cohen <wcohen@redhat.com> wrote:
>>
>> On 9/19/18 5:14 PM, David Smith wrote:
>>> In testsuite/systemtap.examples/profiling/container_check.stp, you
>>> used _stp_syscall_nr(). I wouldn't do that, I'd use $id. I'm not 100%
>>> sure that _stp_syscall_nr() is going to work on every arch at that
>>> point.
>>
>> Hi David,
>>
>> Here are the raw tracepoints:
>>
>> $ stap -L 'kernel.trace("sys_*")'
>> kernel.trace("raw_syscalls:sys_enter") $regs:struct pt_regs* $id:long int
>> kernel.trace("raw_syscalls:sys_exit") $regs:struct pt_regs* $ret:long int
>>
>> It would have been preferable to use $id for the kernel.trace("sys_exit"), but it doesn't exist there. So it was _stp_syscall_nr() which works on some machines versus $id which doesn't work on any machine. I spent some time Wednesday changing things to have a tapset encapsulate with syscall_any and syscall_any.return probe points to hide details like the _stp_syscall_nr().
>
> Ah, I didn't realize we were talking about syscall returns needing the
> syscall number. I seem to recall that arm64 (and perhaps s390x) had
> some restrictions about when the stuff called by _stp_syscall_nr()
> could be called. You might try testing on those platforms.
>
There are cases where it is nicer and more efficient to make the syscall number/name available in the return probe, so that only one probe point rather than two is needed to collect the data. Examples of this are testsuite/systemtap.examples/lwtools/syscallbypid-nd.stp and testsuite/systemtap.examples/profiling/errno.stp, where we just want to know the syscall name/number and the return value.
Thanks for listing the architectures I should check to make sure that things work properly. I will take a look at those and test on those machines.
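As a rough sketch of the difference (this is not the code from the example scripts; the array names errs_one, errs_two and nr_by_tid are made up for illustration), counting failed syscalls by number and return value could look like either of the two variants below. Variant A assumes _stp_syscall_nr() returns a usable value in the sys_exit tracepoint context, which is exactly what still needs checking on arm64 and s390x. Variant B avoids that assumption at the cost of a second probe and a per-thread map.

====
global errs_one, errs_two, nr_by_tid

# Variant A: one probe. The syscall number comes from _stp_syscall_nr(),
# since sys_exit only provides $regs and $ret.
probe kernel.trace("raw_syscalls:sys_exit") {
  if ($ret < 0)
    errs_one[_stp_syscall_nr(), $ret] <<< 1
}

# Variant B: two probes. Remember $id at entry, keyed by thread id,
# and look it up again at exit.
probe kernel.trace("raw_syscalls:sys_enter") {
  nr_by_tid[tid()] = $id
}

probe kernel.trace("raw_syscalls:sys_exit") {
  t = tid()
  if (t in nr_by_tid) {
    if ($ret < 0)
      errs_two[nr_by_tid[t], $ret] <<< 1
    delete nr_by_tid[t]
  }
}

# Print the variant A counts, most frequent first.
probe end {
  foreach ([nr, ret] in errs_one-)
    printf("%d %d %d\n", nr, ret, @count(errs_one[nr, ret]))
}
====

In a real script only one of the two variants would be kept. Variant B also misses syscalls that were already in progress when the script started, since there is no saved entry for them.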
>>> I also wonder if you shouldn't use the old code as a fallback,
>>> something like the following:
>>>
>>> ====
>>> probe kernel.trace("sys_exit")!, nd_syscall.*.return {
>>> # probe that doesn't do anything with the syscall info
>>> }
>>> ====
>>>
>>> That gets trickier if the probe does something with the syscall info.
>>
>> I considered using the nd_syscall.* and nd_syscall.*.return as fallbacks if the tracepoints were not available. However, the sys_enter and sys_exit tracepoints have been available since 2009. Even the RHEL6 kernel has them. It seemed unlikely that fallbacks on the nd_syscall.* would be needed, so they were omitted.
>
> OK, you talked me out of that one then.
>
Well, I had practice. I talked myself out of it earlier. :) I previously had the probe points written just like you suggested.
-Will