Bug 15913 - on s390x, nd_syscall testsuite failures when accessing arguments 1 and 6
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: tapsets
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Unassigned
Reported: 2013-08-30 20:32 UTC by David Smith
Modified: 2013-10-17 18:52 UTC

Description David Smith 2013-08-30 20:32:25 UTC
When using the 'filename' convenience variable from 'execve', the syscall.execve probe works correctly on s390x:

====
# stap -e 'probe syscall.execve { printf("%s\n", filename) }' -c "sleep 0.2"
/usr/lib64/qt-3.3/bin/sleep
/usr/local/sbin/sleep
/usr/local/bin/sleep
/usr/sbin/sleep
/usr/bin/sleep
====

However, the nd_syscall.execve probe gets a copy fault:

====
# stap -e 'probe nd_syscall.execve { printf("%s\n", filename) }' -c "sleep 0.2"
ERROR: user string copy fault -14 at 0000000028ef1648 near identifier 'user_string_n' at /usr/local/share/systemtap/tapset/uconversions.stp:120:10
WARNING: Number of errors: 1, skipped probes: 1
WARNING: /usr/local/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]
====

(I'm sure this has something to do with the odd s390x argument passing. Normally, if you are in a syscall, the 'orig_gpr2' register has the 1st argument, while in a regular kernel function the 'r2' register has the 1st argument.)
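The register distinction above can be illustrated with a small toy model. This is only a sketch: `PtRegs`, the helper names, and both addresses are made up for illustration and are not systemtap or kernel code. It shows that the same register snapshot yields different "1st arguments" depending on which location is read.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PtRegs:
    """Toy model of the two s390x pt_regs fields mentioned above (hypothetical)."""
    gprs: List[int] = field(default_factory=lambda: [0] * 16)
    orig_gpr2: int = 0


def syscall_first_arg(regs: PtRegs) -> int:
    # Syscall view: the 1st argument is read from orig_gpr2.
    return regs.orig_gpr2


def function_first_arg(regs: PtRegs) -> int:
    # Regular kernel function view: the 1st argument is in r2.
    return regs.gprs[2]


# Hypothetical snapshot taken at a probed kernel function: r2 holds the
# function's 1st argument, while orig_gpr2 holds an unrelated value.
FILENAME_PTR = 0x3FFF_F000_1000  # made-up "good" pointer
UNRELATED = 0xDEAD_0000          # made-up unrelated value

probe_regs = PtRegs(orig_gpr2=UNRELATED)
probe_regs.gprs[2] = FILENAME_PTR
```

In this model, `function_first_arg(probe_regs)` returns the made-up filename pointer, while `syscall_first_arg(probe_regs)` returns the unrelated value. Passing the latter to a user-string read is the kind of mistake that would produce a copy fault.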
Comment 1 David Smith 2013-08-30 20:55:33 UTC
Fixed in commit 1ab99b2.
Comment 2 David Smith 2013-09-03 19:36:27 UTC
After some more testing, the problem isn't just with nd_syscall.execve; it affects all nd_syscall probes. The problem is that the kernel's syscall_get_arguments() function no longer returns the correct value for the 1st argument.

This will need to be fixed upstream in the kernel.
Comment 3 David Smith 2013-10-17 18:51:19 UTC
After some discussion with the kernel folks, this is a systemtap problem. Bug #11763 tried to fix accessing argument 6 on s390x by using the kernel's syscall_get_arguments(). However, that function is only guaranteed to work on the pt_regs structure that gets initialized when a context switch from user space to kernel space happens due to a system call. This pt_regs structure is returned by 'task_pt_regs(current)'.

But, when using int_arg(N) in the nd_syscall tapset, we don't want the syscall's arg N, we want the *current* kernel function's arg N (since the function we're probing could be several calls away from the actual system call).

So, after some investigation, I've rewritten the s390x _stp_get_arg() to handle getting argument 6 (and above) from the stack.
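A rough model of that lookup, for illustration only. This is not the code from the commit. It assumes the usual s390x calling convention, where r2 through r6 carry integer arguments 1 to 5, r15 is the stack pointer, and further arguments sit in 8-byte slots starting 160 bytes above the stack pointer at function entry. The function name, constants, and dictionary-based stack are all hypothetical.

```python
from typing import Dict, List

# Assumptions for this sketch (not taken from the systemtap source):
# r2..r6 hold arguments 1..5, r15 is the stack pointer, and arguments
# 6 and up live in 8-byte slots starting 160 bytes above it.
NUM_REG_ARGS = 5
STACK_ARGS_OFFSET = 160
SLOT_SIZE = 8


def get_arg(gprs: List[int], stack: Dict[int, int], n: int) -> int:
    """Return argument n (1-based) of the current function in this toy model."""
    if n < 1:
        raise ValueError("argument numbers start at 1")
    if n <= NUM_REG_ARGS:
        return gprs[n + 1]  # arg 1 -> r2, ..., arg 5 -> r6
    sp = gprs[15]
    addr = sp + STACK_ARGS_OFFSET + (n - NUM_REG_ARGS - 1) * SLOT_SIZE
    return stack[addr]  # arg 6 -> first stack slot, arg 7 -> next, ...


# Example: a function called with arguments 11, 12, ..., 17.
gprs = [0] * 16
gprs[2:7] = [11, 12, 13, 14, 15]
gprs[15] = 0x1000
stack = {0x1000 + 160: 16, 0x1000 + 168: 17}
```

With this setup, `get_arg(gprs, stack, 1)` reads r2, while `get_arg(gprs, stack, 6)` reads the first stack slot. Neither path consults orig_gpr2, which matches the point that the tapset wants the current function's argument rather than the syscall's.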

I've tested this on RHEL5, RHEL6, and more recent kernels (3.10).

Fixed in commit eefd579.