Bug 11388

Summary: syscall.mmap* probes versus 2.6.33+ kernels
Product: systemtap Reporter: Mark Wielaard <mark>
Component: tapsets Assignee: David Smith <dsmith>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:

Description Mark Wielaard 2010-03-16 16:42:39 UTC
Recent kernels saw some cleanups of the mmap implementation code. The systemtap
syscall tapsets should be updated to match.

The mmap syscalls are traditionally architecture specific because different
architectures used different implementation strategies, so had different entry
points. So look for the tapset aliases under tapset/<arch>/syscall.stp. The main
differences arose because some architectures could not pass all arguments in
registers, so the arguments were passed as a single pointer to a struct. Some
architectures even had both variants of the syscall: i386, for example, has
syscall number 90 (mmap), which takes a struct, and syscall number 192 (mmap2),
which takes 6 arguments.
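
To make the two calling conventions concrete, here is a minimal userspace sketch in C. It is not kernel or tapset code. All type and function names are hypothetical. It assumes the struct variant carries a byte offset while the six-argument variant counts its offset in 4096-byte units (my understanding of mmap2 on i386).

```c
#include <stdbool.h>

/* Assumption: the six-argument variant counts its offset in 4096-byte units. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical argument block for the struct-taking variant. */
struct old_mmap_args {
    unsigned long addr;
    unsigned long len;
    unsigned long prot;
    unsigned long flags;
    unsigned long fd;
    unsigned long offset;            /* byte offset into the file */
};

/* Hypothetical normalized form that both variants can be reduced to. */
struct mmap_request {
    unsigned long addr, len, prot, flags, fd;
    unsigned long long byte_offset;
};

/* Struct variant: a single pointer argument, fields are read from memory. */
static struct mmap_request from_struct_variant(const struct old_mmap_args *a)
{
    struct mmap_request r = { a->addr, a->len, a->prot, a->flags, a->fd,
                              a->offset };
    return r;
}

/* Six-argument variant: each value is its own argument, offset in pages. */
static struct mmap_request from_six_arg_variant(unsigned long addr,
                                                unsigned long len,
                                                unsigned long prot,
                                                unsigned long flags,
                                                unsigned long fd,
                                                unsigned long pgoff)
{
    struct mmap_request r = { addr, len, prot, flags, fd,
                              (unsigned long long)pgoff << SKETCH_PAGE_SHIFT };
    return r;
}

/* Field-by-field comparison of two normalized requests. */
static bool same_request(const struct mmap_request *x,
                         const struct mmap_request *y)
{
    return x->addr == y->addr && x->len == y->len && x->prot == y->prot &&
           x->flags == y->flags && x->fd == y->fd &&
           x->byte_offset == y->byte_offset;
}
```

Under these assumptions, a struct carrying offset 4096 and a six-argument call with a page offset of 1 describe the same mapping. A tapset therefore has to decode its arguments differently depending on which entry point it probes.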

On different architectures either variant might exist and be mapped to different
implementation function names. Recent cleanups (some only in recent git, not yet
released) map them to more consistent names (sys_old_mmap and sys_mmap_pgoff).

The relevant kernel commits are:

unreleased kernel:

commit a4679373cf4ee0e7792dc56205365732b725c2c1
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Mar 10 15:21:15 2010 -0800

    Add generic sys_old_mmap()
    
    Add a generic implementation of the old mmap() syscall, which expects its
    argument in a memory block and switch all architectures over to use it.

2.6.33 kernel:

commit f8b7256096a20436f6d0926747e3ac3d64c81d24
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Mon Nov 30 17:37:04 2009 -0500

    Unify sys_mmap*
    
    New helper - sys_mmap_pgoff(); switch syscalls to using it.
    
    Acked-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Comment 1 David Smith 2010-03-17 16:09:50 UTC
Here's an example of the stap output with these kernel changes (kernel
2.6.32.9-70.fc12.i686.PAE):

# stap -ve 'probe syscall.mmap* { printf("%s\n", argstr) }'
Pass 1: parsed user script and 67 library script(s) using
20428virt/12512res/2048shr kb, in 490usr/30sys/527real ms.
semantic error: probe point mismatch at position 1 (alternatives: accept access
acct add_key adjtimex alarm bdflush bind brk capget capset chdir chmod chown
chown16 chroot clock_getres clock_gettime clock_nanosleep clock_settime close
compat_adjtimex compat_clock_nanosleep compat_execve compat_futex
compat_futimesat compat_getitimer compat_io_setup compat_io_submit
compat_nanosleep compat_ppoll compat_pselect6 compat_pselect7 compat_pselect7a
compat_select compat_setitimer compat_signalfd compat_sys_msgctl
compat_sys_msgrcv compat_sys_msgsnd compat_sys_recvmsg compat_sys_semctl
compat_sys_semtimedop compat_sys_sendmsg compat_sys_shmat compat_sys_shmctl
compat_sys_utimes compat_utime compat_utimensat compat_vmsplice connect creat
delete_module dup dup2 epoll_create epoll_ctl epoll_pwait epoll_wait eventfd
execve exit exit_group faccessat fadvise64 fadvise64_64 fchdir fchmod fchmodat
fchown fchown16 fchownat fcntl fdatasync fgetxattr flistxattr flock fork
fremovexattr fsetxattr fstat fstatat fstatfs fstatfs64 fsync ftruncate
ftruncate64 futex futimesat get_mempolicy get_thread_area getcwd getdents
getegid geteuid getgid getgroups gethostname getitimer getpeername getpgid
getpgrp getpid getppid getpriority getresgid getresuid getrlimit getrusage
getsid getsockname getsockopt gettid gettimeofday getuid getxattr init_module
inotify_add_watch inotify_init inotify_rm_watch io_cancel io_destroy
io_getevents io_setup io_submit ioctl ioperm iopl ioprio_get ioprio_set ipc
kexec_load keyctl kill lchown lchown16 lgetxattr link linkat listen listxattr
llistxattr llseek lookup_dcookie lremovexattr lseek lsetxattr lstat madvise
mbind migrate_pages mincore mkdir mkdirat mknod mknodat mlock mlockall mmap2
modify_ldt mount move_pages mprotect mq_getsetattr mq_notify mq_open
mq_timedreceive mq_timedsend mq_unlink mremap msgctl msgget msgrcv msgsnd msync
munlock munlockall munmap nanosleep nfsservctl ni_syscall nice open openat pause
personality pipe pivot_root poll ppoll prctl pread pselect6 pselect7 ptrace
pwrite pwrite32 quotactl read readahead readdir readlink readlinkat readv reboot
recv recvfrom recvmsg remap_file_pages removexattr rename renameat request_key
restart_syscall rmdir rt_sigaction rt_sigaction32 rt_sigpending rt_sigprocmask
rt_sigqueueinfo rt_sigreturn rt_sigsuspend rt_sigtimedwait
sched_get_priority_max sched_get_priority_min sched_getaffinity sched_getparam
sched_getscheduler sched_rr_get_interval sched_setaffinity sched_setparam
sched_setscheduler sched_yield select semctl semget semop semtimedop send
sendfile sendmsg sendto set_mempolicy set_thread_area set_tid_address
set_zone_reclaim setdomainname setfsgid setfsuid setgid setgroups sethostname
setitimer setpgid setpriority setregid setregid16 setresgid setresgid16
setresuid setresuid16 setreuid setreuid16 setrlimit setsid setsockopt
settimeofday settimeofday32 setuid setxattr sgetmask shmat shmctl shmdt shmget
shutdown sigaction sigaction32 sigaltstack signal signalfd sigpending
sigprocmask sigreturn sigsuspend socket socketpair splice ssetmask stat statfs
statfs64 stime swapoff swapon symlink symlinkat sync sysctl sysfs sysinfo syslog
tee tgkill time timer_create timer_delete timer_getoverrun timer_gettime
timer_settime timerfd times tkill truncate tux umask umount uname unlink
unlinkat unshare uselib ustat ustat32 utime utimensat utimes vhangup vm86
vm86old vmsplice wait4 waitid write writev) didn't find any wildcard matches:
identifier 'mmap*' at <input>:1:15 while resolving probe point syscall.mmap*
        source: probe syscall.mmap* { printf("%s\n", argstr) }
                              ^
Pass 2: analyzed script: 0 probe(s), 0 function(s), 0 embed(s), 0 global(s)
using 172624virt/99344res/73720shr kb, in 1310usr/860sys/8520real ms.
Pass 2: analysis failed.  Try again with another '--vp 01' option.
Comment 2 David Smith 2010-03-22 15:44:16 UTC
Commit 56f3dbb handles the sys_mmap_pgoff() change.
Comment 3 David Smith 2010-03-22 18:15:21 UTC
Commit 1304578 handles the sys_mmap_pgoff() change for the nd_syscall tapset.
Comment 4 David Smith 2010-12-20 22:52:02 UTC
Commits 0c655dc (syscall tapsets) and 701af3b (nd_syscall tapsets) fix the mmap handling for all arches.  For kernels < 2.6.33, the existing code is used.  For kernels >= 2.6.33, the arch-generic sys_mmap_pgoff is used.

Using sys_mmap_pgoff also catches sys_old_mmap() calls.
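
As a rough illustration of why a probe on sys_mmap_pgoff also sees old-style calls, here is a simplified userspace model in C. It is not the kernel source. The names model_old_mmap, model_mmap_pgoff, and pgoff_hits are hypothetical, and the counter only stands in for a probe firing. The alignment check and page shift reflect my understanding of the generic helper and should be read as assumptions.

```c
#include <errno.h>

/* Assumption: 4096-byte pages, as on i386. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Hypothetical memory block handed to the old mmap entry point. */
struct mmap_arg_block {
    unsigned long addr, len, prot, flags, fd, offset;
};

/* Stands in for a probe placed on the common entry point. */
static unsigned long pgoff_hits;

/* Model of the common six-argument entry point. */
static long model_mmap_pgoff(unsigned long addr, unsigned long len,
                             unsigned long prot, unsigned long flags,
                             unsigned long fd, unsigned long pgoff)
{
    (void)addr; (void)len; (void)prot;
    (void)flags; (void)fd; (void)pgoff;
    pgoff_hits++;                    /* the "probe" fires here */
    return 0;
}

/* Model of the old entry point: unpack the block, then forward. */
static long model_old_mmap(const struct mmap_arg_block *arg)
{
    /* A byte offset that is not page aligned is rejected before forwarding. */
    if (arg->offset & (SKETCH_PAGE_SIZE - 1))
        return -EINVAL;

    return model_mmap_pgoff(arg->addr, arg->len, arg->prot, arg->flags,
                            arg->fd, arg->offset >> SKETCH_PAGE_SHIFT);
}
```

In this model, every successful call through either entry point increments pgoff_hits. A call rejected in model_old_mmap never reaches the common function, so a probe there would not see it.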