Adding User Space Probing to an Application (heapsort example)


SystemTap is a powerful Linux tool that allows collection of data from both the Linux kernel and user-space applications. SystemTap includes an extensive library of predefined probes and functions for the kernel (tapsets) and a convenient scripting language to do on-the-fly data reduction. SystemTap's probing capabilities can be extended to user-space applications.

There are two basic approaches to user-space probing. The first is basic symbolic probing in the style

probe process("/bin/foo").function("name") { log($$parms) }
probe process("/bin/foo").statement("*@file.c:443") { log($$vars) }

in which one specifies source-level "co-ordinates" and accesses the variables available in that context. This style does not require any changes to the target application binaries beyond preserving their debugging data. (In the usual SystemTap way, one can package a collection of salient probe points into a tapset script that gives abstract names to given functions/statements.)

Another way is to instrument the application itself with embedded markers, which expose selected names and values only to systemtap scripts. These permit abstraction (easier usage, by exposing only a small number of salient probe points) and sometimes assist reliable provision of local variable values. This makes it unnecessary to maintain a tapset script, and instead involves adding calls to macros from <sys/sdt.h> into your program.
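
As a rough illustration of this second approach, the sketch below places a marker by calling a <sys/sdt.h> macro directly. The function place() and the no-op fallback definition are hypothetical and exist only so that the sketch also compiles on systems without the header; the provider and probe names mirror the ones this example uses later.

```cpp
// Use the real marker macros when <sys/sdt.h> is present.
#if defined(__has_include)
#  if __has_include(<sys/sdt.h>)
#    include <sys/sdt.h>
#    define SKETCH_HAVE_SDT 1
#  endif
#endif

// Assumption for illustration: a no-op stand-in when the header is missing.
#ifndef SKETCH_HAVE_SDT
#  define DTRACE_PROBE2(provider, name, arg1, arg2) do {} while (0)
#endif

// Hypothetical helper: store a value in the heap array and expose the
// position and value to SystemTap through an embedded marker.
int place(int* heap, int position, int value) {
    heap[position] = value;
    DTRACE_PROBE2(heapsort, heap_place, position, value);
    return position;
}
```

The marker costs almost nothing while no SystemTap script is attached, which is why it can stay in production builds.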

There are already some applications in Fedora 12, such as java-1.6.0-openjdk and postgresql, that support probing by SystemTap. This simple example uses a heapsort program to show how SystemTap support can be added to nearly any user-space application and how that support can be included in the RPM.

The user-space probes allow you to investigate the operation of a program without recompiling or restarting it. They also make it very easy to create simple scripts that look at interesting characteristics of program behavior, for example how long the average postgresql query took, when any of the Java virtual machines started garbage collection, and how long Java garbage collection took.

To implement the userspace application probes you will need to have a kernel that supports utrace (Fedora 10 and later kernels) and the following SystemTap rpms installed on the computer:

The process of adding and using the user-space application probes can be broken down into the following steps:

An extremely simple heapsort written in C++ is used as a starting point for this example. It reads an arbitrary number of integers from stdin, terminated by a ctrl-d, sorts the integers using a heapsort algorithm, and then outputs the sorted integers. This README describes the changes made to add the SystemTap probes to the code. All of this is packaged in heapsort-0.5-1.src.rpm.

Modifying Application Source Code

Even though this example is too simple to really need the autoconf and automake machinery, it was provided with autoconf and automake input files to make it more closely match the typical application program. The outline for changing the source code is the following:


The first step is to add tests to the autoconf input file to enable and disable the SystemTap support. The modification to the application code should not prevent the code from compiling in environments that do not have the SystemTap user-space support. The following lines in that file control whether SystemTap support is enabled:

AC_MSG_CHECKING([whether to include systemtap tracing support])
AC_ARG_ENABLE([systemtap],
              [AS_HELP_STRING([--enable-systemtap],
                              [Enable inclusion of systemtap trace support])],
              [ENABLE_SYSTEMTAP="${enableval}"], [ENABLE_SYSTEMTAP='no'])
AC_MSG_RESULT([${ENABLE_SYSTEMTAP}])

if test "x${ENABLE_SYSTEMTAP}" = xyes; then
  # Additional configuration for --enable-systemtap is HERE
fi

When "--enable-systemtap" is used during configuration, the configure script needs to check whether the dtrace script and the sdt.h header are available. The dtrace script generates a header file and a stub object file. Within the if statement for the SystemTap configuration there is the following additional code:

AC_CHECK_PROGS(DTRACE, dtrace)
if test -z "$DTRACE"; then
  AC_MSG_ERROR([dtrace not found])
fi
AC_CHECK_HEADER([sys/sdt.h], [SDT_H_FOUND='yes'],
                [SDT_H_FOUND='no';
                   AC_MSG_ERROR([systemtap support needs sys/sdt.h header])])

If the dtrace script and the sys/sdt.h header are found, then HAVE_SYSTEMTAP is defined in config.h with:

AC_DEFINE([HAVE_SYSTEMTAP], [1], [Define to 1 if using  probes.])

SystemTap has library files called tapsets. The configuration needs to determine where to install them, using the following in the autoconf input file:

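AC_ARG_WITH([tapset-install-dir], [AS_HELP_STRING([--with-tapset-install-dir],
                              [The absolute path where the tapset dir will be installed])],
              [if test "x${withval}" = x; then
                 ABS_TAPSET_DIR="\$(datadir)/systemtap/tapset"
               else ABS_TAPSET_DIR="${withval}"
               fi], [ABS_TAPSET_DIR="\$(datadir)/systemtap/tapset"])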

Tapset Skeleton

The file tapset/heapsort.stp holds the tapset code. It makes the SystemTap probe points in the code easier to use by hiding some of the details about each probe point from the user. Initially, tapset/heapsort.stp can be empty. There is also a very simple automake file in the tapset directory that indicates how to install and remove the tapset file. The automake file in the top-level directory needs to indicate that there is a subdirectory, with the following line:

SUBDIRS = tapset

Probe Point Declaration

The next step is to declare the probe points, and the arguments that they take, in the probes.d file. The probes.d contents listed below will be processed to generate the needed include file (probes.h) and stub object file (probes.o):

provider heapsort {
         probe input_start();
         probe input_done(int);         /* (int number of items) */
         probe buffer_resize_start();
         probe buffer_resize_done();
         probe output_start(int);       /* (int number of items) */
         probe output_done();
         probe heap_place(int, int);    /* (int position, int value) */
         probe heap_build_start();
         probe heap_build_done();
};

Some minor changes are needed in the automake file. First, add probes.d and a very simple wrapper, trace.h, to the SOURCES list and indicate that probes.h is a generated file:

heapsort_SOURCES = heapsort.cxx probes.d trace.h
BUILT_SOURCES = probes.h

The automake file also needs rules to generate probes.h and probes.o as needed:

probes.h: probes.d
        $(DTRACE) -C -h -s $< -o $@

probes.o: probes.d
        $(DTRACE) -C -G -s $< -o $@

heapsort_LDADD += probes.o

The following line in probes.d:

probe heap_place(int, int);    /* (int position, int value) */

generates the following macro in probes.h:

#define HEAPSORT_HEAP_PLACE(arg1,arg2) \
        STAP_PROBE2(heapsort,heap_place,arg1,arg2)

What is inside STAP_PROBE2() is not important. What matters is that macros are now available to instrument the application code.

Adding Probes in Source Code

For each source file with added probes the following include will be needed to provide the macros:

#include "probes.h"

This is implemented in trace.h, a very short include file that conditionally includes probes.h and defines a TRACE macro to conditionally use the tracepoints:

#include "config.h"
#ifdef HAVE_SYSTEMTAP
// include the generated probes header and put markers in code
#include "probes.h"
#define TRACE(probe) probe
#else
// Wrap the probe to allow it to be removed when no systemtap available
#define TRACE(probe)
#endif

The macros can be placed anywhere that executable code can normally be placed. They remain inactive until they are used by SystemTap. The arguments can be used to relay useful state information to SystemTap. For HEAPSORT_HEAP_PLACE, the location in the heap and the value being inserted into the heap are available to SystemTap.
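
To make the placement concrete, here is a minimal sketch of a heapsort with TRACE markers, assuming a sift-down implementation. The functions sift_down() and heap_sort() are hypothetical and are not taken from the packaged heapsort.cxx; the trace.h logic is inlined (without config.h) so the sketch stands alone, and the HEAPSORT_HEAP_BUILD_* macro names are assumed to follow the same pattern as HEAPSORT_HEAP_PLACE.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Inlined equivalent of trace.h: markers vanish without HAVE_SYSTEMTAP.
#ifdef HAVE_SYSTEMTAP
#include "probes.h"
#define TRACE(probe) probe
#else
#define TRACE(probe)
#endif

// Hypothetical helper: restore the max-heap property below `root`.
static void sift_down(std::vector<int>& heap, std::size_t root, std::size_t end) {
    while (2 * root + 1 < end) {
        std::size_t child = 2 * root + 1;
        if (child + 1 < end && heap[child] < heap[child + 1]) {
            ++child;                        // pick the larger child
        }
        if (heap[root] >= heap[child]) {
            break;                          // heap property already holds
        }
        std::swap(heap[root], heap[child]);
        // Marker: report the position and the value now stored there.
        TRACE(HEAPSORT_HEAP_PLACE(static_cast<int>(root), heap[root]));
        root = child;
    }
}

// Hypothetical driver: build the heap, then repeatedly extract the maximum.
void heap_sort(std::vector<int>& data) {
    const std::size_t n = data.size();
    TRACE(HEAPSORT_HEAP_BUILD_START());
    for (std::size_t i = n / 2; i-- > 0;) {
        sift_down(data, i, n);
    }
    TRACE(HEAPSORT_HEAP_BUILD_DONE());
    for (std::size_t end = n; end > 1; --end) {
        std::swap(data[0], data[end - 1]);
        sift_down(data, 0, end - 1);
    }
}
```

Built without HAVE_SYSTEMTAP, every TRACE(...) expands to nothing, so heap_sort() behaves like an uninstrumented sort.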

The raw TRACE(HEAPSORT_HEAP_PLACE()) probe can be accessed with:

The raw probes are not particularly user-friendly. The following section describes how to abstract the interface and hide those details with a tapset.

Adding a Tapset

Tapsets provide an ABI that hides the details of the probe from the user. The tapsets are typically placed in /usr/share/systemtap/tapset. A tapset consists of aliases and local variables for the probes. The tapset can also define SystemTap functions and local variables that make it easier to use the probes.

The following is an example probe alias for the HEAPSORT_HEAP_PLACE() probe used in the source code:

probe heapsort_heap_place = process("heapsort").mark("heap_place")
{
  position = $arg1;
  value = $arg2;
  probestr = sprintf("%s(position=%d, value=%d)", $$name, position, value);
}

Modifying RPM Spec file

Packaging software that has user-space probing as an RPM requires some additional changes. The changes can be broken down into the following steps:

SystemTap Flag

As with the original source code, it should be possible to build the RPMs with or without the SystemTap support enabled. The variable sdt in heapsort.spec controls whether SystemTap support is enabled. The following line indicates that the SystemTap support is enabled by default:

%{!?sdt:%define sdt 1}

The SystemTap support can be turned off with the --define on the following rpmbuild line:

rpmbuild --define "sdt 0" heapsort.spec

The variable sdt will be used in the rest of the spec file to control whether the package is built with SystemTap support.

Build Dependencies

When the package is built with SystemTap support an additional BuildRequires is needed to supply the tools to generate the probes.h header and probes.o stub files:

%if %sdt
BuildRequires: systemtap-sdt-devel
%endif


In the %build section of the heapsort.spec file the configure is extended to:

%configure \
%if %sdt
        --enable-systemtap \
        --with-tapset-install-dir=%tapsetdir \
%endif

The %tapsetdir is set earlier in the spec file with:

%define tapsetdir       /usr/share/systemtap/tapset

The build of the executable stays the same with:

make %{?_smp_mflags}

Files to Install

To make life easier for users, the tapset file should be installed. The following is an addition to the %files section of the spec file:

%if %sdt

Verifying Probe existence

To verify the probes actually exist in the executable, the following stap command can be used:

$ stap -L 'process("heapsort-0.5/heapsort").mark("*")'
process("heapsort").mark("buffer_resize_done") $temp_buffer:struct item_type* $capacity:int $value:int $DEFAULT_CAPACITY:int
process("heapsort").mark("buffer_resize_start") $capacity:int $value:int $success:int $DEFAULT_CAPACITY:int
process("heapsort").mark("heap_build_done") $i:int
process("heapsort").mark("heap_place") $arg1:int $arg2:int
process("heapsort").mark("input_done") $arg1:int
process("heapsort").mark("input_start") $value:int $DEFAULT_CAPACITY:int
process("heapsort").mark("output_start") $arg1:int

Alternatively, readelf can be used to dump the probe data on systems without stap installed.

$ readelf -x.probes heapsort-0.5/heapsort

Hex dump of section '.probes':
  0x00601240 68656170 5f706c61 63650000 50524231 heap_place..PRB1
  0x00601250 40126000 00000000 56074000 00000000 @.`.....V.@.....
  0x00601260 00000000 00000000 68656170 5f627569 ........heap_bui
  0x00601270 6c645f73 74617274 00000000 50524231 ld_start....PRB1
  0x00601280 68126000 00000000 67074000 00000000 h.`.....g.@.....

Using the New Probes

In addition to the regular heapsort RPM, the heapsort-debuginfo RPM will need to be installed. This allows SystemTap to determine the proper type of the arguments used in the probes. One very simple script to run is heap_tap_all.stp, which probes all the probes listed in the tapset and prints out data each time a probe fires:

probe heapsort* {
      printf("%s\n", probestr);
}

This could be run with a specific instance of heapsort and generate something like the following output:

$ stap heap_tap_all.stp -c /usr/bin/heapsort
heap_place(position=1, value=3)

The heap_time_phases.stp tracks statistics about the amount of time spent in the input, heap_build, and output phases for all runs of /usr/bin/heapsort.


This is a simple example of how to incorporate SystemTap probes into an application. For more information check out the SystemTap webpage:

Also feel free to send email to the mailing list or join the IRC channel to discuss issues with systemtap:

Email: IRC #systemtap on