System utilization graphing with Gnuplot
Problem
Sometimes a textual chart of data is not enough: some quick eye candy is needed. The venerable gnuplot program is a lightweight math grapher for the X Window System and other backends. In the UNIX tradition, gnuplot can operate as part of a pipeline, receiving both commands and data on its standard input. SystemTap can write to standard output. A match made in heaven!
Scripts
# ------------------------------------------------------------------------
# data collection

# disk I/O stats
probe begin { qnames["ioblock"] ++; qsq_start ("ioblock") }
probe ioblock.request { qs_wait ("ioblock") qs_run ("ioblock") }
probe ioblock.end { qs_done ("ioblock") }

# CPU utilization
probe begin { qnames["cpu"] ++; qsq_start ("cpu") }
probe scheduler.cpu_on { if (!idle) { qs_wait ("cpu") qs_run ("cpu") } }
probe scheduler.cpu_off { if (!idle) qs_done ("cpu") }

# ------------------------------------------------------------------------
# utilization history tracking

global N
probe begin { N = 50 }

global qnames, util, histidx

function qsq_util_reset(q) {
  u = qsq_utilization (q, 100)
  qsq_start (q)
  return u
}

probe timer.ms(100) {  # collect utilization percentages frequently
  histidx = (histidx + 1) % N  # into circular buffer
  foreach (q in qnames)
    util[histidx,q] = qsq_util_reset(q)
}

# ------------------------------------------------------------------------
# general gnuplot graphical report generation

probe timer.ms(1000) {
  # emit gnuplot command to display recent history

  printf ("set yrange [0:100]\n")
  printf ("plot ")
  foreach (q in qnames+) {
    if (++nq >= 2) printf (", ")
    printf ("'-' title \"%s\" with lines", q)
  }
  printf ("\n")

  foreach (q in qnames+) {
    for (i = (histidx + 1) % N; i != histidx; i = (i + 1) % N)
      printf ("%d\n", util[i,q])
    printf ("e\n")
  }

  printf ("pause 1\n")
}
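The order in which the report probe walks the circular buffer is easy to misread, so here is a small sketch of just that index arithmetic in POSIX shell. The function name `ring_order` and its two arguments (buffer size and index of the newest sample) are made up for illustration; they stand in for `N` and `histidx` in the script.

```shell
# Print the buffer indices in the order the report loop visits them:
# start one slot past the newest sample (the oldest slot), wrap around,
# and stop just before reaching the newest slot again.
ring_order() {
    n=$1        # buffer size, plays the role of N
    newest=$2   # slot written most recently, plays the role of histidx
    i=$(( (newest + 1) % n ))
    while [ "$i" -ne "$newest" ]; do
        printf '%d\n' "$i"
        i=$(( (i + 1) % n ))
    done
}
```

For example, `ring_order 5 2` prints 3, 4, 0, 1, one per line. As this reading of the loop suggests, each series is emitted oldest first and the slot at `histidx` itself is not printed, so with N = 50 a frame carries 49 points per resource.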
Output
# stap graphs.stp | gnuplot
Here are some gnuplot outputs. Data points in the graphs are 100 ms apart. Every second, a new picture is drawn from the circular buffer of the most recent 50 samples (N = 50), i.e., about 5 seconds of history.
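To try the gnuplot side on its own, without SystemTap, the text that the report probe prints once per second can be imitated by hand. The following is a minimal sketch in POSIX shell; the function name `emit_frame` and the sample values are invented for illustration and are not measured data. It relies on gnuplot's special file name `'-'`, which reads inline data from the command stream until a line containing only `e`.

```shell
# Emit one gnuplot "frame" in the same shape the report probe prints.
# Series appear in sorted key order (cpu, then ioblock), as with "qnames+".
emit_frame() {
    printf 'set yrange [0:100]\n'
    printf "plot '-' title \"cpu\" with lines, '-' title \"ioblock\" with lines\n"
    # first inline data block (cpu); values are placeholders
    printf '%d\n' 10 40 25
    printf 'e\n'
    # second inline data block (ioblock); values are placeholders
    printf '%d\n' 0 5 60
    printf 'e\n'
    printf 'pause 1\n'
}
```

Piping the result into gnuplot, for instance with `emit_frame | gnuplot -persist`, should draw a single picture with two lines; `-persist` is used here so the window stays open after the input ends.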
Lessons
- The queueing statistics tapset can be used even in a light-weight manner like this, just to track overall utilization.
- It is not hard to talk to gnuplot.
- It may be nice if systemtap supported a couple of distinct output streams.
- It is not hard to build such scripts in a modular manner. Note how the gnuplot communication code is generic: it is independent of the number and types of resources being monitored, since it simply iterates over whatever names appear in qnames.
- We are collecting low-level, raw statistics, not something pre-digested for ordinary user-level monitoring programs. This has the potential for greater precision and specificity.