System utilization graphing with Gnuplot
Problem
Sometimes a textual chart of data is not enough: some quick eye candy is needed. The venerable gnuplot program is a lightweight math grapher for the X Window System and other backends. In the UNIX tradition, gnuplot can operate as part of a pipeline, reading commands and data from standard input as well as from files. Systemtap scripts can print exactly such a stream. A match made in heaven!
Scripts
# ------------------------------------------------------------------------
# data collection
# disk I/O stats
probe begin { qnames["ioblock"] ++; qsq_start ("ioblock") }
probe ioblock.request { qs_wait ("ioblock") qs_run("ioblock") }
probe ioblock.end { qs_done ("ioblock") }
# CPU utilization
probe begin { qnames["cpu"] ++; qsq_start ("cpu") }
probe scheduler.cpu_on { if (!idle) {qs_wait ("cpu") qs_run ("cpu") }}
probe scheduler.cpu_off { if (!idle) qs_done ("cpu") }
# ------------------------------------------------------------------------
# utilization history tracking
global N
probe begin { N = 50 }
global qnames, util, histidx
function qsq_util_reset(q) {
  u = qsq_utilization (q, 100)
  qsq_start (q)
  return u
}
probe timer.ms(100) { # collect utilization percentages frequently
  histidx = (histidx + 1) % N # into circular buffer
  foreach (q in qnames)
    util[histidx,q] = qsq_util_reset(q)
}
# ------------------------------------------------------------------------
# general gnuplot graphical report generation
probe timer.ms(1000) {
  # emit gnuplot command to display recent history
  printf ("set yrange [0:100]\n")
  printf ("plot ")
  foreach (q in qnames+)
    {
      if (++nq >= 2) printf (", ")
      printf ("'-' title \"%s\" with lines", q)
    }
  printf ("\n")
  foreach (q in qnames+) {
    for (i = (histidx + 1) % N; i != histidx; i = (i + 1) % N)
      printf("%d\n", util[i,q])
    printf ("e\n")
  }
  printf ("pause 1\n")
}
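To make the shape of the emitted stream concrete, here is a minimal Python sketch that builds the same kind of text the report probe prints: one plot command naming a '-' inline-data source per resource, then one block of values per resource, each terminated by a line containing only "e". The function name gnuplot_commands and the sample values are hypothetical, chosen only for illustration. The sketch uses sorted() to fix the order of the series; the script relies on both foreach loops visiting qnames in the same order, which is the property that matters.

```python
def gnuplot_commands(series):
    """Build one gnuplot report with inline data for each named series.

    `series` maps a title to a list of integer percentages. Both the name
    and the input layout are illustrative assumptions, not part of the
    original script.
    """
    # Any fixed order works, as long as the plot clause and the data
    # blocks use the same one.
    names = sorted(series)
    lines = ["set yrange [0:100]"]
    # One '-' pseudo-file per series; gnuplot then expects that many
    # inline data blocks, in the same order.
    plots = ", ".join("'-' title \"%s\" with lines" % name for name in names)
    lines.append("plot " + plots)
    for name in names:
        lines.extend(str(value) for value in series[name])
        lines.append("e")  # ends the inline data block for this series
    lines.append("pause 1")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up sample values, three points per resource.
    print(gnuplot_commands({"cpu": [12, 40, 35], "ioblock": [0, 5, 80]}), end="")
```

Saving the printed text to a file and feeding it to gnuplot by hand is a convenient way to debug the plot command without running the probes.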
Output
# stap graphs.stp | gnuplot
Here are some gnuplot outputs. Data points in the graphs are 100ms apart. Every second, a new picture is generated from the 50-slot circular history buffer, i.e., about 5 seconds of history.
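The window arithmetic can be checked with a small Python model of the readout loop from the report probe. This is only a sketch: readout_order is a hypothetical helper that mirrors the loop bounds, and the histidx value is arbitrary. It suggests that, as written, the loop visits every slot except the one at histidx, so each picture shows N - 1 points rather than the full N.

```python
def readout_order(histidx, n):
    """Return the buffer slots visited by the report loop, oldest first.

    Mirrors: for (i = (histidx + 1) % N; i != histidx; i = (i + 1) % N)
    """
    order = []
    i = (histidx + 1) % n
    while i != histidx:
        order.append(i)
        i = (i + 1) % n
    return order


N = 50           # buffer size set in the script's begin probe
SAMPLE_MS = 100  # sampling period of the collection timer

if __name__ == "__main__":
    slots = readout_order(histidx=7, n=N)
    print(len(slots))             # 49: the slot at histidx itself is skipped
    print(slots[0], slots[-1])    # 8 6: starts after histidx and wraps around
    print(N * SAMPLE_MS / 1000.0) # 5.0 seconds covered by a full buffer
```

Raising N in the begin probe lengthens the history window proportionally, at the cost of more data printed per report.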
Lessons
- The queueing statistics tapset can be used even in a light-weight manner like this, just to track overall utilization.
- It is not hard to talk to gnuplot.
- It may be nice if systemtap supported a couple of distinct output streams.
- It is not hard to build such scripts in a modular manner. Note how the gnuplot communication code is generic: it is independent of the number and types of resources whose utilization is being monitored.
- We are collecting low-level, raw statistics, not something pre-digested for ordinary user-level monitoring programs. This has the potential for greater precision and specificity.



