We need a tapset that uses the new task-finder runtime in order to populate and maintain systemtap script-level globals/functions that allow pid-based data lookup. Specifically:

    function pid2execname:string (pid:long) {}
    function pid2cwdpath:string (pid:long) {}
    ...

and maybe more. These functions could be implemented as embedded-c or script, using globals that are managed by utrace task-finder callback functions notifying the tapset of tasks running chdir()/chroot()/exec(). To initially populate the globals (for preexisting processes), the embedded-c code would need to use possibly sleepy kernel functions to process task_struct*'s.
Similarly, pid2argv:string (pid:long, argn:long) {}, from which command-line argument-based filtering could be built.
Regarding pid2execname, find_task_by_vpid/pid may be a better candidate for implementing the lookup function.
(In reply to comment #2)
> Regarding pid2execname, find_task_by_vpid/pid may be better candidate for
> implementation of the lookup function.

Right, as long as find_task_XXX may be invoked from atomic context.
Created attachment 3771: lookup functions pid2task and pid2execname based on find_task_by_*
I once tried to implement the functions using task_finder. The basic code is like the following:

    #include "task_finder.c"
    static char _stp_taskname[TASK_COMM_LEN]="";
    static int _stp_process_search_cb(struct stap_task_finder_target *tgt,
                                      struct task_struct *tsk,
                                      int register_p, int process_p) {
        if (register_p) { /* found one match */
            if (tsk == NULL) {
                strlcpy(_stp_taskname, "UNKNOWN", MAXSTRINGLEN);
                return 1;
            } else {
                strlcpy(_stp_taskname, tsk->comm, MAXSTRINGLEN);
                return 0;
            }
        }
        return 1;
    }

    function pid2execname2:string(pid:long) %{ /* pure */
        struct stap_task_finder_target tgt;
        tgt.pathname = NULL;
        tgt.pid = (pid_t)(long)THIS->pid;
        tgt.callback = &_stp_process_search_cb;
        tgt.vm_callback = NULL;
        stap_register_task_finder_target(&tgt);
        stap_start_task_finder();
        strlcpy(THIS->__retvalue, _stp_taskname, MAXSTRINGLEN);
        stap_stop_task_finder();
        CATCH_DEREF_FAULT();
    %}

But it can't return the execname of most running processes, such as init, mingetty, or kjournald, because the callback is never invoked. It only works on my forked test process. My box is 2.6.27.9-159.fc10.i686.

Did I miss something or misunderstand the mechanism?
(In reply to comment #5)
> I once tried to implement the functions using task_finder. The basic code is
> like the following
>
> #include "task_finder.c"
> static char _stp_taskname[TASK_COMM_LEN]="";
> static int _stp_process_search_cb(struct stap_task_finder_target *tgt, struct
> task_struct *tsk, int register_p, int process_p) {
> if (register_p) { /* found one match */
> if (tsk == NULL) {
> strlcpy(_stp_taskname, "UNKNOWN", MAXSTRINGLEN);
> return 1;
> } else {
> strlcpy(_stp_taskname, tsk->comm, MAXSTRINGLEN);
> return 0;
> }
> }
> return 1;
> }
>
> function pid2execname2:string(pid:long) %{ /* pure */
> struct stap_task_finder_target tgt;
> tgt.pathname = NULL;
> tgt.pid = (pid_t)(long)THIS->pid;
> tgt.callback = &_stp_process_search_cb;
> tgt.vm_callback = NULL;
> stap_register_task_finder_target(&tgt);
> stap_start_task_finder();
> strlcpy(THIS->__retvalue, _stp_taskname, MAXSTRINGLEN);
> stap_stop_task_finder();
> CATCH_DEREF_FAULT();
> %}
>
> But it can't return the execname of most running processes like init, mingetty,
> kjournald because the callback is never invoked.
> Only works fine on my forked test process. My box is 2.6.27.9-159.fc10.i686.

The task_finder will never be able to return the execname or pathname of kernel threads, like 'init' or 'kjournald'. The 'u' in 'utrace' stands for 'user' - it only works on user threads. I'm unsure of why 'mingetty' couldn't be found.

> Did I miss something or misunderstand the mechanism?

Besides the above problem of returning the execname/pathname for kernel threads, there are other things that won't work well in the code you posted. As currently written, the task_finder isn't designed to register new targets while running (since there is no locking provided) or to be started/stopped more than once.
(In reply to comment #6)
> > But it can't return the execname of most running processes like init, mingetty,
> > kjournald because the callback is never invoked.
> > Only works fine on my forked test process. My box is 2.6.27.9-159.fc10.i686.
>
> The task_finder will never be able to return the execname or pathname of kernel
> threads, like 'init' or 'kjournald'. The 'u' in 'utrace' stands for 'user' - it
> only works on user threads. I'm unsure of why 'mingetty' couldn't be found.
>
> > Did I miss something or misunderstand the mechanism?
>
> Besides the above problem of returning the execname/pathname for kernel threads,
> there are other things that won't work well in the code you posted. As
> currently written, the task_finder isn't designed to register new targets while
> running (since there is no locking provided) or to be started/stopped more than
> once.

Thank you for letting me know. It seems task_finder is not a good candidate for the pid lookup functions. Maybe my patch in comment #4 is an option.
tricky, not really needed
(In reply to Frank Ch. Eigler from comment #0)
> function pid2cwdpath:string (pid:long) {}

I'd find this pretty handy. How about reconsidering this one?
Right now stap can access parts of the procfs. How about declassifying certain userspace information right from there? That might bring systemtap closer to the sysadmins.
(In reply to Martin Cermak from comment #10)
> Right now stap can access parts of the procfs. How about declassifying
> certain userspace information right from there? That might bring systemtap
> closer to the sysadmins.

Unless I'm forgetting something, systemtap can't really access procfs. We can add additional information to procfs (via procfs probes), but we really can't access other parts of procfs.

If procfs were a real filesystem, we might be able to traverse it inside a systemtap module, but even then, once we found the right file, we wouldn't be able to open it and read it.

Until we can think of a good way to accomplish this, I'm going to reclose this one.
David, I think the old idea was to introduce some tapset functions that, based on kprocess.* probes or the like, maintain global data like pid2FOO tables for use by stap. The idea was not to -read- /proc/$PID/foo, but to track some equivalent data within stap globals.
(In reply to Frank Ch. Eigler from comment #12)
> David, I think the old idea was to introduce some tapset functions that,
> based on kprocess.* probes or the like, maintain global data like pid2FOO
> tables for use by stap. The idea was not to -read- /proc/$PID/foo, but to
> track some equivalent data within stap globals.

Yes, that was the old idea. I thought Martin was proposing a new idea of reading /proc/$PID/foo.

With enough work, we could probably maintain some global data that mapped pids to execnames. The unfortunate part would be that it would have to map every single process in the system in order to be able to give you any pid. Mapping pids to cwd paths would be harder, since we don't keep up with that now. The trickiest part of all of this would be finding storage for all this new information, especially paths.
e.g. approximately

    cat > tapset/linux/pid2cmdline.stp
    global __pid2cmdline%
    function pid2cmdline(p) { if (p in __pid2cmdline) return __pid2cmdline[p] }
    probe kprocess.exec_complete {
        __pid2cmdline[tid() /* or pid? */] = cmdline_str()
    }
    ^D

etc.
(In reply to Frank Ch. Eigler from comment #14)
> e.g. approximately
>
> cat > tapset/linux/pid2cmdline.stp
> global __pid2cmdline%
> function pid2cmdline(p) { if (p in __pid2cmdline) return __pid2cmdline[p] }
> probe kprocess.exec_complete {
> __pid2cmdline[tid() /* or pid? */] = cmdline_str()
> }
> ^D
>
> etc.

That's an interesting idea. The tricky part would be initially populating the array with processes that already exist.
> The tricky part would be initially populating
> the array with processes that already exist.

Yup. A gross hack could look thusly:

    function populate_foo () %{ iterate across processes, call into stap array-setting routines %}
    probe begin { populate_foo() }

A nice hack could look thusly:

    probe kprocess.preexisting { pid2(...)=.. }  # new probe type, triggered at STARTING time
                                                 # from a task-work callback added for each
                                                 # preexisting process/thread?

(this pattern could be replicated to provide iteration of preexisting other kernel/userspace objects like network connections, file descriptors, ...)
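To make the storage concern and the populate-then-update pattern concrete, here is a minimal userspace C sketch of a bounded pid-to-cmdline table. It is only a model of the idea, not systemtap runtime code: the names pid2cmdline_set, pid2cmdline_get, TABLE_SLOTS and CMDLINE_MAX are all hypothetical, and the oldest-entry eviction is meant to loosely mirror what a '%' wrapping global array does when it fills up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SLOTS 4   /* hypothetical capacity, tiny so eviction is easy to see */
#define CMDLINE_MAX 64  /* hypothetical per-entry storage limit */

struct pid_entry {
    int used;
    long pid;
    unsigned long stamp;          /* insertion order; smallest is oldest */
    char cmdline[CMDLINE_MAX];
};

static struct pid_entry table[TABLE_SLOTS];
static unsigned long next_stamp = 1;

/* Return the entry for pid, or NULL if the pid is not tracked. */
static struct pid_entry *find_entry(long pid)
{
    size_t i;

    for (i = 0; i < TABLE_SLOTS; i++)
        if (table[i].used && table[i].pid == pid)
            return &table[i];
    return NULL;
}

/*
 * Record the command line for a pid. This models both the initial
 * population pass over preexisting processes and a later exec event.
 * An existing entry is overwritten in place; otherwise a free slot is
 * used, and if none is free the oldest entry is evicted.
 */
void pid2cmdline_set(long pid, const char *cmdline)
{
    struct pid_entry *slot = find_entry(pid);
    size_t i;

    assert(cmdline != NULL);
    if (slot == NULL) {
        for (i = 0; i < TABLE_SLOTS; i++) {
            if (!table[i].used) {
                slot = &table[i];
                break;
            }
            if (slot == NULL || table[i].stamp < slot->stamp)
                slot = &table[i];
        }
        slot->used = 1;
        slot->pid = pid;
        slot->stamp = next_stamp++;
    }
    /* Truncate long command lines rather than overflow the slot. */
    strncpy(slot->cmdline, cmdline, CMDLINE_MAX - 1);
    slot->cmdline[CMDLINE_MAX - 1] = '\0';
}

/* Look up a pid; returns NULL when the pid is unknown or was evicted. */
const char *pid2cmdline_get(long pid)
{
    struct pid_entry *e = find_entry(pid);

    return e ? e->cmdline : NULL;
}
```

In this model, the begin-time pass would call pid2cmdline_set() once per preexisting process, and each exec event would call it again for the affected pid. The fixed slot size shows why paths and long command lines are the awkward part: they either get truncated or need a larger per-entry budget.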
After working on bug #19065 recently, I believe we can actually do this without the task_finder keeping a table of information around just in case we need it. Instead we'd look up this information on the fly as needed.

The current kernel code to do /proc/PID/cwd looks like this:

====
	task_lock(task);
	if (task->fs) {
		get_fs_pwd(task->fs, path);
		result = 0;
	}
	task_unlock(task);
	put_task_struct(task);
====

The current kernel code to do /proc/PID/exe looks like:

====
	mm = get_task_mm(task);
	put_task_struct(task);
	if (!mm)
		return -ENOENT;
	exe_file = get_mm_exe_file(mm);
	mmput(mm);
====

I believe everything called above is either inlined or exported, so it should be reasonable to call from a systemtap module.

Note that this method might be harder to implement on older kernels; I'm unsure.
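For cross-checking whatever such on-the-fly lookup functions return, the same two values can be read from userspace through the /proc/PID/cwd and /proc/PID/exe symlinks. The following is a small sketch under that assumption; proc_readlink is a hypothetical helper name, and it needs a Linux system with /proc mounted and sufficient permission to inspect the target process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the target of /proc/<pid>/<name> (for example "cwd" or "exe")
 * into buf as a NUL-terminated string. Returns 0 on success and -1 on
 * failure. An over-long target is silently truncated to len - 1 bytes.
 */
int proc_readlink(pid_t pid, const char *name, char *buf, size_t len)
{
    char link[64];
    ssize_t n;

    assert(name != NULL && buf != NULL);
    if (len == 0)
        return -1;
    snprintf(link, sizeof(link), "/proc/%ld/%s", (long)pid, name);
    n = readlink(link, buf, len - 1);   /* readlink does not add a NUL */
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}
```

Calling proc_readlink(pid, "cwd", buf, sizeof(buf)) and proc_readlink(pid, "exe", buf, sizeof(buf)) gives reference values to compare against. For kernel threads the exe lookup is expected to fail, which matches the -ENOENT path in the kernel snippet above.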
Fixed in commit 1c70d65. This adds two new functions to get a task's cwd and exe name. Note that we can't get the cwd on kernels older than 2.6.25.