4 Probe points

4.1 General syntax

The general probe point syntax is a dotted-symbol sequence. This divides the event namespace into parts, analogous to the style of the Domain Name System. Each component identifier may be parameterized by a string or number literal, with a syntax analogous to a function call.

The following are all syntactically valid probe points.

     kernel.function("no_such_function") ?

Probes may be broadly classified as synchronous or asynchronous. A synchronous event occurs when any processor executes an instruction matched by the specification. This gives these probes a reference point (instruction address) from which more contextual data may be available. Other families of probe points refer to asynchronous events such as timers, where there is no fixed reference point. Each probe point specification may match multiple locations, such as by using wildcards or aliases, and all are probed. A probe declaration may contain several specifications separated by commas, which are all probed.
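
As a minimal sketch, the declaration below lists three specifications separated by commas: two synchronous points and one asynchronous timer. The handler body is illustrative only, and the syscall aliases assume that the standard tapset is installed.

     probe syscall.read, syscall.write.return, timer.ms(5000)
     {
         # pp() returns the probe point that fired
         printf("%s\n", pp())
     }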

4.1.1 Prefixes

Prefixes specify the probe target, such as kernel, module, timer, and so on.

4.1.2 Suffixes

Suffixes further qualify the point to probe, such as .return for the exit point of a probed function. The absence of a suffix implies the function entry point.

4.1.3 Wildcarded file names, function names

A component may include an asterisk (*) character, which expands to other matching probe points. For example, kernel.function("sys_*") expands to every kernel function whose name begins with sys_.

4.1.4 Optional probe points

A probe point may be followed by a question mark (?) character, to indicate that it is optional, and that no error should result if it fails to expand. This effect passes down through all levels of alias or wildcard expansion.

The following is the general syntax.

     kernel.function("no_such_function") ?

4.1.5 Brace expansion

Brace expansion is a mechanism which allows a list of probe points to be generated. It is very similar to shell expansion. A component may be surrounded by a pair of curly braces to indicate that the comma-separated sequence of one or more subcomponents will each constitute a new probe point. The braces may be arbitrarily nested. The ordering of expanded results is based on product order.

The question mark (?) and exclamation mark (!) indicators, as well as probe point conditions, may not be placed in any expansion that comes before the last component.

The following is an example of brace expansion.

     syscall.{write,read}
     # Expands to
     syscall.write, syscall.read

     {kernel,module("nfs")}.function("nfs*")!
     # Expands to
     kernel.function("nfs*")!, module("nfs").function("nfs*")!

4.2 Built-in probe point types (DWARF probes)

This family of probe points uses symbolic debugging information for the target kernel or module, as may be found in executables that have not been stripped, or in the separate debuginfo packages. They allow logical placement of probes into the execution path of the target by specifying a set of points in the source or object code. When a matching statement executes on any processor, the probe handler is run in that context.

Points in a kernel are identified by module, source file, line number, function name or some combination of these.

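The supported probe point specifications combine a kernel or module(MPATTERN) prefix with a .function(PATTERN) or .statement(PATTERN) component. A .function component may be further qualified with .call, .return, .return.maxactive(VALUE), .inline, .exported, or .label(LPATTERN); kernel.statement(ADDRESS).absolute is also supported.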

The .function variant places a probe near the beginning of the named function, so that parameters are available as context variables.

The .return variant places a probe at the moment of return from the named function, so the return value is available as the $return context variable. The entry parameters are also available, though the function may have changed their values. Return probes may be further qualified with .maxactive, which specifies how many instances of the specified function can be probed simultaneously. You can leave off .maxactive in most cases, as the default (KRETACTIVE) should be sufficient. However, if you notice an excessive number of skipped probes, try setting .maxactive to incrementally higher values to see if the number of skipped probes decreases.

The .inline modifier for .function filters the results to include only instances of inlined functions. The .call modifier selects the opposite subset. The .exported modifier filters the results to include only exported functions. Inline functions do not have an identifiable return point, so .return is not supported on .inline probes.

The .statement variant places a probe at the exact spot, exposing those local variables that are visible there.

In the above probe descriptions, MPATTERN stands for a string literal that identifies the loaded kernel module of interest and LPATTERN stands for a source program label. Both MPATTERN and LPATTERN may include asterisk (*), square bracket ([]), and question mark (?) wildcards.

PATTERN stands for a string literal that identifies a point in the program. It is composed of three parts:

  1. The first part is the name of a function, as would appear in the nm program’s output. This part may use the asterisk and question mark wildcard operators to match multiple names.
  2. The second part is optional, and begins with the at sign (@) character. It is followed by the path to the source file containing the function, which may include a wildcard pattern, such as mm/slab*. In most cases, the path should be relative to the top of the Linux source directory, although an absolute path may be necessary for some kernels. If a relative pathname doesn’t work, try absolute.
  3. The third part is optional if the file name part was given. It identifies the line number in the source file, preceded by a “:” or “+”. The line number is assumed to be an absolute line number if preceded by a “:”, or relative to the entry of the function if preceded by a “+”. All the lines in the function can be matched with “:*”. A range of lines x through y can be matched with “:x-y”.

Alternatively, specify PATTERN as a numeric constant to indicate a relative module address or an absolute kernel address.
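
The three parts of PATTERN can be sketched as follows. The function and file names are the ones used elsewhere in this chapter; whether they resolve depends on the kernel being probed.

     # Function name only:
     kernel.function("sys_read")
     # Any function in source files matching a wildcard:
     kernel.function("*@mm/slab*")
     # Absolute line number within a file:
     kernel.statement("*@kernel/time.c:296")
     # Line number relative to the entry of a function:
     kernel.statement("bio_init@fs/bio.c+3")
     # Every line of a function:
     kernel.statement("bio_init@fs/bio.c:*")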

Some of the source-level variables, such as function parameters, locals, or globals visible in the compilation unit, are visible to probe handlers. Refer to these variables by prefixing their name with a dollar sign within the scripts. In addition, a special syntax allows limited traversal of structures, pointers, arrays, taking the address of a variable or pretty printing a whole structure.

$var refers to an in-scope variable var. If it is a type similar to an integer, it will be cast to a 64-bit integer for script use. Pointers similar to a string (char *) are copied to SystemTap string values by the kernel_string() or user_string() functions.

@var("varname") is an alternative syntax for $varname. It can also be used to access global variables in a particular compile unit (CU). @var("varname@src/file.c") refers to the global (either file local or external) variable varname defined when the file src/file.c was compiled. The CU in which the variable is resolved is the first CU in the module of the probe point which matches the given file name at the end and has the shortest file name path (e.g. given @var("foo@bar/baz.c") and CUs with file name paths src/sub/module/bar/baz.c and src/bar/baz.c the second CU will be chosen to resolve foo).

The notation @var("varname", "/path/to/exe-or-so") is also supported to explicitly specify an executable or library file path in which the global or top-level static variable resides.

$var->field or @var("var@file.c")->field traverses a structure’s field. The indirection operator may be repeated to follow additional levels of pointers.

$var[N] or @var("var@file.c")[N] indexes into an array. The index is given with a literal number.

&$var or &@var("var@file.c") provides the address of a variable as a long. It can also be used in combination with field access or array indexing to provide the address of a particular field or an element in an array with &$var->field, &@var("var@file.c")[N] or a combination of those accessors.

Using a single $ or a double $$ suffix provides a shallow or deep string representation of the variable data type. Using a single $, as in $var$, will provide a string that only includes the values of all basic type values of fields of the variable structure type but not any nested complex type values (which will be represented with {...}). Using a double $$, as in @var("var")$$ will provide a string that also includes all values of nested data types.

$$vars expands to a character string that is equivalent to sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x", $parm1, ..., $parmN, $var1, ..., $varN)

$$locals expands to a character string that is equivalent to sprintf("var1=%x ... varN=%x", $var1, ..., $varN)

$$parms expands to a character string that is equivalent to sprintf("parm1=%x ... parmN=%x", $parm1, ..., $parmN)
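
A minimal sketch of these accessors in one handler follows. It assumes a kernel with debugging information in which vfs_read has parameters named file and count, and in which struct file has an f_flags field; adjust the names for the kernel being probed.

     probe kernel.function("vfs_read")
     {
         # integer parameter, structure field, and address of that field
         printf("count=%d flags=0x%x &flags=0x%x\n",
                $count, $file->f_flags, &$file->f_flags)
         # all parameters as one preformatted string
         printf("%s\n", $$parms)
         # shallow string representation of the structure behind $file
         printf("%s\n", $file$)
     }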

4.2.1 kernel.function, module().function

The .function variant places a probe near the beginning of the named function, so that parameters are available as context variables.

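General syntax: kernel.function(PATTERN) or module(MPATTERN).function(PATTERN). For example:

     # Refers to all kernel functions with "init" or "exit"
     # in the name:
     kernel.function("*init*"), kernel.function("*exit*")
     # Refers to any functions within the "kernel/time.c"
     # file that span line 240:
     kernel.function("*@kernel/time.c:240")
     # Refers to all functions in the ext3 module:
     module("ext3").function("*")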

4.2.2 kernel.statement, module().statement

The .statement variant places a probe at the exact spot, exposing those local variables that are visible there.

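General syntax: kernel.statement(PATTERN) or module(MPATTERN).statement(PATTERN). For example:

     # Refers to the statement at line 296 within the
     # kernel/time.c file:
     kernel.statement("*@kernel/time.c:296")
     # Refers to the statement at line bio_init+3 within the fs/bio.c file:
     kernel.statement("bio_init@fs/bio.c+3")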

4.3 Function return probes

The .return variant places a probe at the moment of return from the named function, so that the return value is available as the $return context variable. The entry parameters are also accessible in the context of the return probe, though their values may have been changed by the function. Inline functions do not have an identifiable return point, so .return is not supported on .inline probes.
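
A short sketch of a return probe follows. The function name vfs_read and the .maxactive value of 100 are illustrative assumptions; in most cases the .maxactive qualifier can be omitted.

     probe kernel.function("vfs_read").return.maxactive(100)
     {
         # $return holds the value returned by the probed function
         printf("%s returned %d\n", probefunc(), $return)
     }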

4.4 DWARF-less probing

In the absence of debugging information, you can still use the kprobe family of probes to examine the entry and exit points of kernel and module functions. You cannot look up the arguments or local variables of a function using these probes. However, you can access the parameters by following this procedure:

When you’re stopped at the entry to a function, you can refer to the function’s arguments by number. For example, when probing the function declared:

     asmlinkage ssize_t sys_read(unsigned int fd, char __user * buf, size_t count)

You can obtain the values of fd, buf, and count, respectively, as uint_arg(1), pointer_arg(2), and ulong_arg(3). In this case, your probe code must first call asmlinkage(), because on some architectures the asmlinkage attribute affects how the function’s arguments are passed.

When you’re in a return probe, $return isn’t supported without DWARF, but you can call returnval() to get the value of the register in which the function value is typically returned, or call returnstr() to get a string version of that value.

And at any code probepoint, you can call register("regname") to get the value of the specified CPU register when the probe point was hit. u_register("regname") is like register("regname"), but interprets the value as an unsigned integer.

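SystemTap supports the constructs kprobe.function(FUNCTION) and kprobe.module(NAME).function(FUNCTION), each of which may take a .return suffix, and kprobe.statement(ADDRESS).absolute.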

Use .function probes for kernel functions and .module probes for probing functions of a specified module. If you know the absolute address of a kernel or module function, you can use .statement probes. Do not use wildcards in function or module names; wildcards cause the probe to not register. Also, statement probes are available only in guru mode.
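
The procedure described in this section can be sketched as follows, assuming a kernel that exports a sys_read symbol with the declaration quoted above. The output format is illustrative.

     probe kprobe.function("sys_read")
     {
         # required before the numbered-argument functions for
         # functions declared asmlinkage
         asmlinkage()
         printf("fd=%d buf=0x%x count=%d\n",
                uint_arg(1), pointer_arg(2), ulong_arg(3))
     }

     probe kprobe.function("sys_read").return
     {
         # $return is unavailable without DWARF; use returnval()
         printf("returned %d\n", returnval())
     }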

4.5 Userspace probing

Userspace probing is supported on kernels that are configured to include the utrace or uprobes extensions.

4.5.1 Begin/end variants



The .begin variant is called when a new process described by PID or PATH is created. If no PID or PATH argument is specified (for example process.begin), the probe fires for any new process being spawned.

The .thread.begin variant is called when a new thread described by PID or PATH is created.

The .end variant is called when a process described by PID or PATH dies.

The .thread.end variant is called when a thread described by PID or PATH dies.
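
A minimal sketch of the process variants, using /bin/ls as an assumed target path:

     probe process("/bin/ls").begin
     {
         printf("%s (%d) started\n", execname(), pid())
     }

     probe process("/bin/ls").end
     {
         printf("%s (%d) exited\n", execname(), pid())
     }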

4.5.2 Syscall variants



The .syscall variant is called when a thread described by PID or PATH makes a system call. The system call number is available in the $syscall context variable. The first six arguments of the system call are available in the $argN context variables, for example $arg1, $arg2, and so on.

The .syscall.return variant is called when a thread described by PID or PATH returns from a system call. The system call number is available in the $syscall context variable. The return value of the system call is available in the $return context variable.
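
The two variants might be used together as in the sketch below. The target path /bin/ls and the output format are assumptions.

     probe process("/bin/ls").syscall
     {
         printf("syscall %d arg1=0x%x\n", $syscall, $arg1)
     }

     probe process("/bin/ls").syscall.return
     {
         printf("syscall %d returned %d\n", $syscall, $return)
     }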

4.5.3 Function/statement variants



Full symbolic source-level probes in userspace programs and shared libraries are supported. These are exactly analogous to the symbolic DWARF-based kernel or module probes described previously and expose similar contextual $-variables. See Section 4.2 for more information.

Here is an example of prototype symbolic userspace probing support:

     # stap -e 'probe process("ls").function("*").call {
                log (probefunc()." ".$$parms)
                }' \
            -c 'ls -l'

To run, this script requires debugging information for the named program and utrace support in the kernel. If you see a "pass 4a-time" build failure, check that your kernel supports utrace.

4.5.4 Absolute variant

A non-symbolic probe point such as process(PID).statement(ADDRESS).absolute is analogous to kernel.statement(ADDRESS).absolute in that both use raw, unverified virtual addresses and provide no $variables. The target PID parameter must identify a running process and ADDRESS must identify a valid instruction address. All threads of the listed process will be probed. This is a guru mode probe.

4.5.5 Process probe paths

For all process probes, PATH names refer to executables that are searched the same way that shells do: a name containing a slash (/) character is taken as an explicit path, resolved relative to the working directory unless it begins with a slash; otherwise $PATH is searched. For example, the following probe syntax:

     probe process("ls").syscall {}
     probe process("./a.out").syscall {}

works the same as:

     probe process("/bin/ls").syscall {}
     probe process("/my/directory/a.out").syscall {}

If a process probe is specified without a PID or PATH parameter, all user threads are probed. However, if systemtap is invoked in target process mode, process probes are restricted to the process hierarchy associated with the target process. If stap is running in --unprivileged mode, only processes owned by the current user are selected.

4.5.6 Target process mode

Target process mode (invoked with stap -c CMD or -x PID) implicitly restricts all process.* probes to the given child process. It does not affect kernel.* or other probe types. The CMD string is normally run directly, rather than from a “/bin/sh -c” sub-shell, since utrace and uprobe probes receive a fairly “clean” event stream. If meta-characters such as redirection operators are present in CMD, “/bin/sh -c CMD” is still used, and utrace and uprobe probes will receive events from the shell. For example:

     % stap -e 'probe process.syscall, process.end {
                printf("%s %d %s\n", execname(), pid(), pp())}' \
            -c ls

Here is the output from this command:

     ls 2323 process.syscall
     ls 2323 process.syscall
     ls 2323 process.end

If PATH names a shared library, all processes that map that shared library can be probed. If DWARF debugging information is installed, try using a command with this syntax:

     probe process("/lib64/libc-2.8.so").function("....") { ... }

This command probes all threads that call into that library. Typing “stap -c CMD” or “stap -x PID” restricts this to the target command and descendants only. You can use $$vars and others. You can provide the location of debug information to the stap command with the -d DIRECTORY option. To qualify a probe point to a location in a library required by a particular process try using a command with this syntax:

     probe process("...").library("...").function("....") { ... }

The library name may use wildcards.

The first syntax in the following will probe the functions in the program linkage table of a particular process. The second syntax will also add the program linkage tables of libraries required by that process. .plt("...") can be specified to match particular plt entries.

     probe process("...").plt { ... }
     probe process("...").plt, process("...").library("...").plt { ... }

4.5.7 Instruction probes



The process().insn and process().insn.block probes inspect the process after each instruction or block of instructions is executed. These probes are not implemented on all architectures. If they are not implemented on your system, you will receive an error message when the script starts.

The .insn probe is called for every single-stepped instruction of the process described by PID or PATH.

The .insn.block probe is called for every block-stepped instruction of the process described by PID or PATH.

To count the total number of instructions that a process executes, type a command similar to:

     $ stap -e 'global steps; probe process("/bin/ls").insn {steps++}
                probe end {printf("Total instructions: %d\n", steps);}' \
            -c /bin/ls

Using this feature will significantly slow process execution.

4.5.8 Static userspace probing

You can probe symbolic static instrumentation compiled into programs and shared libraries with probe points of the form process("PATH").mark("LABEL").

The .mark variant is called from a static probe defined in the application by STAP_PROBE1(handle,LABEL,arg1). STAP_PROBE1 is defined in the sdt.h file. The parameters are:

  Parameter   Definition
  handle      the application handle
  LABEL       corresponds to the .mark argument
  arg1        the argument

Use STAP_PROBE1 for probes with one argument. Use STAP_PROBE2 for probes with 2 arguments, and so on. The arguments of the probe are available in the context variables $arg1, $arg2, and so on.
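
On the script side, a handler for such a mark might look like the sketch below. The path /path/to/app is a placeholder for the instrumented program, LABEL stands for the label passed to STAP_PROBE1, and the argument is assumed to be an integer.

     probe process("/path/to/app").mark("LABEL")
     {
         # $arg1 is the single argument supplied by STAP_PROBE1
         printf("mark reached, arg1=%d\n", $arg1)
     }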

As an alternative to the STAP_PROBE macros, you can use the dtrace script to create custom macros. The sdt.h file also provides dtrace compatible markers through DTRACE_PROBE and an associated python dtrace script. You can use these in builds based on dtrace that need dtrace -h or -G functionality.

4.6 Java probes

Support for probing Java methods is available using Byteman as a backend. Byteman is an instrumentation tool from the JBoss project which systemtap can use to monitor invocations for a specific method or line in a Java program.

Systemtap does so by generating a Byteman script listing the probes to instrument and then invoking the Byteman bminstall utility. A custom option -D OPTION (see the Byteman documentation for more details) can be passed to bminstall by invoking systemtap with option -J OPTION. The systemtap option -j is also provided as a shorthand for -J org.jboss.byteman.compile.to.bytecode.

This Java instrumentation support is currently a prototype feature with major limitations: Java probes attach only to one Java process at a time; other Java processes beyond the first one to be observed are ignored. Moreover, Java probing currently does not work across users; the stap script must run (with appropriate permissions) under the same user as the Java process being probed. (Thus a stap script under root currently cannot probe Java methods in a non-root-user Java process.)

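There are four probe point variants supported by the translator: java("PNAME").class("CLASSNAME").method("PATTERN") and java("PNAME").class("CLASSNAME").method("PATTERN").return, followed by java(PID).class("CLASSNAME").method("PATTERN") and java(PID).class("CLASSNAME").method("PATTERN").return.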

The first two probe points refer to Java processes by the name of the Java process. The PATTERN parameter specifies the signature of the Java method to probe. The signature must consist of the exact name of the method, followed by a bracketed list of the types of the arguments, for instance myMethod(int,double,Foo). Wildcards are not supported.

The probe can be set to trigger at a specific line within the method by appending a line number with colon, just as in other types of probes: myMethod(int,double,Foo):245.

The CLASSNAME parameter identifies the Java class the method belongs to, either with or without the package qualification. By default, the probe only triggers on descendants of the class that do not override the method definition of the original class. However, CLASSNAME can take an optional caret prefix, as in class("^org.my.MyClass"), which specifies that the probe should also trigger on all descendants of MyClass that override the original method. For instance, every method with signature foo(int) in program org.my.MyApp can be probed at once using java("org.my.MyApp").class("^java.lang.Object").method("foo(int)").

The last two probe points work analogously, but refer to Java processes by PID. (PIDs for already running processes can be obtained using the jps utility.)

Context variables defined within java probes include $provider (which identifies the class providing the definition of the triggered method) and $name (which gives the signature of the method). Arguments to the method can be accessed using context variables $arg1 through $arg10, for up to the first 10 arguments of a method.

4.7 PROCFS probes

These probe points allow procfs pseudo-files in /proc/systemtap/MODNAME to be created, read and written. Specify the name of the systemtap module as MODNAME. There are four probe point variants supported by the translator: procfs("PATH").read, procfs("PATH").write, procfs.read, and procfs.write.

PATH is the file name to be created, relative to /proc/systemtap/MODNAME. If no PATH is specified (as in the last two variants in the previous list), PATH defaults to "command".

When a user reads /proc/systemtap/MODNAME/PATH, the corresponding procfs read probe is triggered. Assign the string data to be read to a variable named $value, as follows:

     procfs("PATH").read { $value = "100\n" }

When a user writes into /proc/systemtap/MODNAME/PATH, the corresponding procfs write probe is triggered. The data the user wrote is available in the string variable named $value, as follows:

     procfs("PATH").write { printf("User wrote: %s", $value) }
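
The two handlers can be combined so that a value written to the file is returned by later reads. This is only a sketch; the file name "threshold" and the global variable are hypothetical.

     global threshold = "100\n"

     # Reading the file returns the current setting
     probe procfs("threshold").read  { $value = threshold }
     # Writing to the file replaces the setting
     probe procfs("threshold").write { threshold = $value }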

4.8 Marker probes

This family of probe points connects to static probe markers inserted into the kernel or a module. These markers are special macro calls in the kernel that make probing faster and more reliable than with DWARF-based probes. DWARF debugging information is not required to use probe markers.

Marker probe points begin with a kernel prefix which identifies the source of the symbol table used for finding markers. The suffix names the marker itself: mark("MARK"). The marker name string, which can contain wildcard characters, is matched against the names given to the marker macros when the kernel or module is compiled. Optionally, you can specify format("FORMAT"). Specifying the marker format string allows differentiation between two markers with the same name but different marker format strings.

The handler associated with a marker probe reads any optional parameters specified at the macro call site named $arg1 through $argNN, where NN is the number of parameters supplied by the macro. Number and string parameters are passed in a type-safe manner.

The marker format string associated with a marker is available in $format. The marker name string is available in $name.

The marker probe constructs are kernel.mark("MARK") and kernel.mark("MARK").format("FORMAT").

For more information about marker probes, see http://sourceware.org/systemtap/wiki/UsingMarkers.

4.9 Tracepoints

This family of probe points hooks to static probing tracepoints inserted into the kernel or kernel modules. As with marker probes, these tracepoints are special macro calls inserted by kernel developers to make probing faster and more reliable than with DWARF-based probes. DWARF debugging information is not required to probe tracepoints. Tracepoints have more strongly-typed parameters than marker probes.

Tracepoint probes begin with kernel. The next part names the tracepoint itself: trace("name"). The tracepoint name string, which can contain wildcard characters, is matched against the names defined by the kernel developers in the tracepoint header files.

The handler associated with a tracepoint-based probe can read the optional parameters specified at the macro call site. These parameters are named according to the declaration by the tracepoint author. For example, the tracepoint probe kernel.trace("sched_switch") provides the parameters $rq, $prev, and $next. If the parameter is a complex type such as a struct pointer, then a script can access fields with the same syntax as DWARF $target variables. Tracepoint parameters cannot be modified; however, in guru mode a script can modify fields of parameters.

The name of the tracepoint is available in $$name, and a string of name=value pairs for all parameters of the tracepoint is available in $$vars or $$parms.
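
A minimal sketch of a tracepoint handler follows, using the sched_switch tracepoint named above. It assumes that $prev and $next point to task structures with a pid field.

     probe kernel.trace("sched_switch")
     {
         # field access uses the same syntax as DWARF target variables
         printf("%s: %d -> %d\n", $$name, $prev->pid, $next->pid)
         # all parameters as name=value pairs
         printf("%s\n", $$parms)
     }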

4.10 Syscall probes

The syscall.* aliases define several hundred probes, of the form syscall.NAME and syscall.NAME.return.

Generally, two probes are defined for each normal system call as listed in the syscalls(2) manual page: one for entry and one for return. System calls that never return do not have a corresponding .return probe.

Each probe alias defines a variety of variables. The tapset source code is the most reliable source of variable definitions. Generally, each variable listed in the standard manual page is available as a script-level variable. For example, syscall.open exposes file name, flags, and mode. In addition, a standard suite of variables is available at most aliases: argstr, a pretty-printed form of the entire argument list without parentheses; name, the name of the system call; and retstr, for return probes, a pretty-printed form of the system call result.

Not all probe aliases obey all of these general guidelines. Please report exceptions that you encounter as a bug.
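
As a sketch, the entry and return aliases for one system call can be paired as follows. It uses only the name, argstr, and retstr variables from the standard tapset; the output format is illustrative.

     probe syscall.open
     {
         printf("%s(%s)\n", name, argstr)
     }

     probe syscall.open.return
     {
         printf("%s -> %s\n", name, retstr)
     }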

4.11 Timer probes

You can use intervals defined by the standard kernel jiffies timer to trigger probe handlers asynchronously. A jiffy is a kernel-defined unit of time typically between 1 and 60 msec. Two probe point variants are supported by the translator: timer.jiffies(N) and timer.jiffies(N).randomize(M).

The probe handler runs every N jiffies. If the randomize component is given, a linearly distributed random value in the range [-M … +M] is added to N every time the handler executes. N is restricted to a reasonable range (1 to approximately 1,000,000), and M is restricted to be less than N. There are no target variables provided in either context. Probes can be run concurrently on multiple processors.

Intervals may be specified in units of time. There are two probe point variants similar to the jiffies timer: timer.ms(N) and timer.ms(N).randomize(M).

Here, N and M are specified in milliseconds, but the full options for units are seconds (s or sec), milliseconds (ms or msec), microseconds (us or usec), nanoseconds (ns or nsec), and hertz (hz). Randomization is not supported for hertz timers.

The resolution of the timers depends on the target kernel. For kernels prior to 2.6.17, timers are limited to jiffies resolution, so intervals are rounded up to the nearest jiffies interval. After 2.6.17, the implementation uses hrtimers for greater precision, though the resulting resolution will be dependent upon architecture. In either case, if the randomize component is given, then the random value will be added to the interval before any rounding occurs.

Profiling timers are available to provide probes that execute on all CPUs at each system tick. This probe, timer.profile.tick, takes no parameters.

Full context information of the interrupted process is available, making this probe suitable for implementing a time-based sampling profiler.

It is recommended to use the tapset probe timer.profile rather than timer.profile.tick. This probe point behaves identically to timer.profile.tick when the underlying functionality is available, and falls back to using perf.sw.cpu_clock on some recent kernels which lack the corresponding profile timer facility.
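
A time-based sampling profiler along these lines can be sketched as follows. The array name and the limit of ten entries are arbitrary choices, and the script runs until it is interrupted.

     global samples

     probe timer.profile
     {
         # count one sample for the process that was interrupted
         samples[execname()]++
     }

     probe end
     {
         # print the ten most frequently sampled process names
         foreach (name in samples- limit 10)
             printf("%-16s %d\n", name, samples[name])
     }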

The following is an example of timer usage.

     # Refers to a periodic interrupt, every 1000 jiffies:
     timer.jiffies(1000)
     # Fires every 5 seconds:
     timer.sec(5)
     # Refers to a periodic interrupt, every 1000 +/- 200 jiffies:
     timer.jiffies(1000).randomize(200)

4.12 Special probe points

The probe points begin and end are defined by the translator to refer to the time of session startup and shutdown. There are no target variables available in either context.

4.12.1 begin

The begin probe is the start of the SystemTap session. All begin probe handlers are run during the startup of the session.

4.12.2 end

The end probe is the end of the SystemTap session. All end probes are run during the normal shutdown of a session, such as in the aftermath of a SystemTap exit function call, or an interruption from the user. In the case of a shutdown triggered by error, end probes are not run.

4.12.3 error

The error probe point is similar to the end probe, except the probe handler runs when the session ends if an error occurred. In this case, an end probe is skipped, but each error probe is still attempted. You can use an error probe to clean up or perform a final action on script termination.

Here is a simple example:

     probe error { println ("Oops, errors occurred. Here’s a report anyway.")
                   foreach (coin in mint) { println (coin) } }

4.12.4 begin, end, and error probe sequence

begin, end, and error probes can be specified with an optional sequence number that controls the order in which they are run. If no sequence number is provided, the sequence number defaults to zero and probes are run in the order that they occur in the script file. Sequence numbers may be either positive or negative, and are especially useful for tapset writers who want to do initialization in a begin probe. The following are examples.

     # In a tapset file:
     probe begin(-1000) { ... }
     # In a user script:
     probe begin { ... }

The user script begin probe defaults to sequence number zero, so the tapset begin probe will run first.

4.12.5 never

The never probe point is defined by the translator to mean never. Its statements are analyzed for symbol and type correctness, but its probe handler is never run. This probe point may be useful in conjunction with optional probes. See Section 4.1.4.
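
One plausible way to combine the two is sketched below: the optional point may fail to expand, and never keeps the declaration well-formed in that case. The function name is the placeholder used in Section 4.1.4.

     probe kernel.function("no_such_function") ?, never
     {
         # runs only if the optional probe point expanded
         println("optional probe point reached")
     }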