Marker-based top-process listing
Problem
To demonstrate use of static instrumentation markers (see fche's OLS2006Talks), the FC5 kernel was patched with a marker in a sensitive spot (context switching).
Scripts
Here is the kernel patch: as minimal as can be.
--- kernel/sched.c.orig 2006-03-28 14:53:13.000000000 -0500
+++ kernel/sched.c 2006-03-28 14:56:05.000000000 -0500
@@ -51,6 +51,8 @@
#include <linux/acct.h>
#include <asm/tlb.h>
+#include "...../stapmark.h"
+
#include <asm/unistd.h>
/*
@@ -1596,6 +1598,7 @@
}
/* Here we just switch the register state and the stack. */
+ STAP_MARK_NN(context_switch, prev->pid, next->pid);
switch_to(prev, next, prev);
return prev;probe kernel.mark("context_switch") {
switches ++ # count number of context switches
now = get_cycles()
times[$arg1] += now-lasttime # accumulate cycles spent in process
execnames[$arg1] = execname() # remember name of pid
lasttime = now
}
probe timer.ms(3000) { # every 3000 ms
printf ("\n%5s %20s %10s (%d switches)\n",
"pid", "execname", "cycles", switches);
foreach ([pid] in times-) # sort in decreasing order of cycle-count
printf ("%5d %20s %10d\n", pid, execnames[pid], times[pid]);
# clear data for next report
delete times
switches = 0
}
probe begin {
lasttime = get_cycles()
}
global lasttime, times, execnames, switches
Output
During an idle moment on the test laptop, nothing much happened. pid 0, named "swapper" really means "idle". Note though the precision in time spent in the respective tasks: not just two digits for a rough percentage, but almost down to the nanosecond.
# stap mark-top.stp
pid execname cycles (1813 switches)
0 swapper 764411819
4473 X 51465833
4538 gnome-terminal 33217978
4745 firefox-bin 24762308
4540 gnome-terminal 4373587
4932 mhz 4293245
4532 xfce4-panel 3111898
4526 xfwm4 2691366
4528 xftaskbar4 2114246
4934 sed 1909229
4933 cat 1901134
4523 xfce-mcs-manage 1641378
4520 xfce4-session 1484422
4931 systemtap/0 1478318
4530 xfdesktop 1254507
4930 udevd 564886
4517 xscreensaver 479184
1110 cpuspeed 339318
4771 firefox-bin 303033
434 udevd 245346
4439 ntpd 119060
4 events/0 99844
4456 ssh-agent 47817
4928 stpd 16487
3 watchdog/0 9797
^C
Lessons
- A few thousand marker hits per second are not noticable.
- Marker-based probes integrate just fine with other systemtap probes.
- It would be easy to build from/to process-transition matrixes rather than just flat profiles, which could help tune inter-process communication code. With processor numbers mixed in, one can track the movement of a thread between processors.
