Marker-based top-process listing
Problem
To demonstrate use of static instrumentation markers (see fche's OLS2006Talks), the FC5 kernel was patched with a marker in a sensitive spot (context switching).
Scripts
Here is the kernel patch: as minimal as can be.
--- kernel/sched.c.orig 2006-03-28 14:53:13.000000000 -0500 +++ kernel/sched.c 2006-03-28 14:56:05.000000000 -0500 @@ -51,6 +51,8 @@ #include <linux/acct.h> #include <asm/tlb.h> +#include "...../stapmark.h" + #include <asm/unistd.h> /* @@ -1596,6 +1598,7 @@ } /* Here we just switch the register state and the stack. */ + STAP_MARK_NN(context_switch, prev->pid, next->pid); switch_to(prev, next, prev); return prev;
probe kernel.mark("context_switch") { switches ++ # count number of context switches now = get_cycles() times[$arg1] += now-lasttime # accumulate cycles spent in process execnames[$arg1] = execname() # remember name of pid lasttime = now } probe timer.ms(3000) { # every 3000 ms printf ("\n%5s %20s %10s (%d switches)\n", "pid", "execname", "cycles", switches); foreach ([pid] in times-) # sort in decreasing order of cycle-count printf ("%5d %20s %10d\n", pid, execnames[pid], times[pid]); # clear data for next report delete times switches = 0 } probe begin { lasttime = get_cycles() } global lasttime, times, execnames, switches
Output
During an idle moment on the test laptop, nothing much happened. pid 0, named "swapper" really means "idle". Note though the precision in time spent in the respective tasks: not just two digits for a rough percentage, but almost down to the nanosecond.
# stap mark-top.stp pid execname cycles (1813 switches) 0 swapper 764411819 4473 X 51465833 4538 gnome-terminal 33217978 4745 firefox-bin 24762308 4540 gnome-terminal 4373587 4932 mhz 4293245 4532 xfce4-panel 3111898 4526 xfwm4 2691366 4528 xftaskbar4 2114246 4934 sed 1909229 4933 cat 1901134 4523 xfce-mcs-manage 1641378 4520 xfce4-session 1484422 4931 systemtap/0 1478318 4530 xfdesktop 1254507 4930 udevd 564886 4517 xscreensaver 479184 1110 cpuspeed 339318 4771 firefox-bin 303033 434 udevd 245346 4439 ntpd 119060 4 events/0 99844 4456 ssh-agent 47817 4928 stpd 16487 3 watchdog/0 9797 ^C
Lessons
- A few thousand marker hits per second are not noticable.
- Marker-based probes integrate just fine with other systemtap probes.
- It would be easy to build from/to process-transition matrixes rather than just flat profiles, which could help tune inter-process communication code. With processor numbers mixed in, one can track the movement of a thread between processors.