.\" t
.TH STAPPROBES 3stap
.SH NAME
stapprobes \- systemtap probe points
5
6.\" macros
7.de SAMPLE
8.br
9.RS
10.nf
11.nh
12..
13.de ESAMPLE
14.hy
15.fi
16.RE
17..
18
19.SH DESCRIPTION
20The following sections enumerate the variety of probe points supported
21by the systemtap translator, and some of the additional aliases defined by
22standard tapset scripts. Many are individually documented in the
23.IR 3stap
24manual section, with the
25.IR probe::
26prefix.
ba4a90fd 27.PP
7abecb38 28The general probe point syntax is a dotted-symbol sequence. This
29allows a breakdown of the event namespace into parts, somewhat like
30the Domain Name System does on the Internet. Each component
7abecb38 31identifier may be parametrized by a string or number literal, with a
d898100a 32syntax like a function call. A component may include a "*" character,
33to expand to a set of matching probe points. It may also include "**"
34to match multiple sequential components at once. Probe aliases likewise
35expand to other probe points. Each and every resulting probe point is
36normally resolved to some low-level system instrumentation facility
37(e.g., a kprobe address, marker, or a timer configuration), otherwise
38the elaboration phase will fail.
39.PP
40However, a probe point may be followed by a "?" character, to indicate
41that it is optional, and that no error should result if it fails to
42resolve. Optionalness passes down through all levels of
43alias/wildcard expansion. Alternately, a probe point may be followed
44by a "!" character, to indicate that it is both optional and
37f6433e 45sufficient. (Think vaguely of the Prolog cut operator.) If it does
46resolve, then no further probe points in the same comma-separated list
47will be resolved. Therefore, the "!" sufficiency mark only makes
48sense in a list of probe point alternatives.
.PP
Additionally, a probe point may be followed by an "if (expr)" statement, in
order to enable/disable the probe point on-the-fly.  With the "if" statement,
if the "expr" is false when the probe point is hit, the whole probe body,
including any alias bodies, is skipped.  The condition is stacked up through
all levels of alias/wildcard expansion, so the final condition becomes
the logical-and of the conditions at every level of expansion.
6e3347a9 56
These are all
.B syntactically
valid probe points.  (They may be
.B semantically
invalid, depending on the contents of the tapsets, and the versions of
kernel/user software installed.)
ca88561f 63
64.SAMPLE
65kernel.function("foo").return
e904ad95 66process("/bin/vi").statement(0x2222)
ba4a90fd 67end
729286d8 68syscall.*
649260f3 69sys**open
6e3347a9 70kernel.function("no_such_function") ?
d898100a 71module("awol").function("no_such_function") !
dfd11cc3 72signal.*? if (switch)
94c3c803 73kprobe.function("foo")
74.ESAMPLE
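.PP
As a minimal sketch of how these suffixes combine in a probe declaration
(the function names and the global "enabled" are illustrative
assumptions, not part of any tapset):
.SAMPLE
global enabled = 1

# The first alternative is optional; the second is probed only while
# "enabled" is nonzero.
probe kernel.function("no_such_function") ?,
      kernel.function("vfs_read") if (enabled)
{
  printf("%s\\n", pp())
}
.ESAMPLE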
75
76Probes may be broadly classified into "synchronous" and
77"asynchronous". A "synchronous" event is deemed to occur when any
78processor executes an instruction matched by the specification. This
79gives these probes a reference point (instruction address) from which
80more contextual data may be available. Other families of probe points
81refer to "asynchronous" events such as timers/counters rolling over,
82where there is no fixed reference point that is related. Each probe
83point specification may match multiple locations (for example, using
wildcards or aliases), and all of them are then probed.  A probe
85declaration may also contain several comma-separated specifications,
86all of which are probed.
87
88.SH DWARF DEBUGINFO
89
90Resolving some probe points requires DWARF debuginfo or "debug
91symbols" for the specific part being instrumented. For some others,
92DWARF is automatically synthesized on the fly from source code header
93files. For others, it is not needed at all. Since a systemtap script
94may use any mixture of probe points together, the union of their DWARF
95requirements has to be met on the computer where script compilation
96occurs. (See the \fI\-\-use\-server\fR option and the \fBstap-server\
97(8)\fR man page for information about the remote compilation facility,
98which allows these requirements to be met on a different machine.)
99.PP
The following table lists many of the available probe point families,
to classify them with respect to their need for DWARF debuginfo.
102
.TS
l l l.
\fBDWARF	AUTO-DWARF	NON-DWARF\fP

kernel.function, .statement	kernel.trace	kernel.mark
module.function, .statement		process.mark
process.function, .statement		begin, end, error, never
process.mark \fI(backup)\fP		timer
		perf
		procfs
		kernel.statement.absolute
		kernel.data
		kprobe.function
		process.statement.absolute
		process.begin, .end, .error
.TE
119
120.SH PROBE POINT FAMILIES
121
65aeaea0 122.SS BEGIN/END/ERROR
123
124The probe points
125.IR begin " and " end
126are defined by the translator to refer to the time of session startup
127and shutdown. All "begin" probe handlers are run, in some sequence,
128during the startup of the session. All global variables will have
129been initialized prior to this point. All "end" probes are run, in
130some sequence, during the
131.I normal
132shutdown of a session, such as in the aftermath of an
133.I exit ()
134function call, or an interruption from the user. In the case of an
135error-triggered shutdown, "end" probes are not run. There are no
136target variables available in either context.
137.PP
138If the order of execution among "begin" or "end" probes is significant,
139then an optional sequence number may be provided:
ca88561f 140
141.SAMPLE
142begin(N)
143end(N)
144.ESAMPLE
ca88561f 145
146The number N may be positive or negative. The probe handlers are run in
147increasing order, and the order between handlers with the same sequence
148number is unspecified. When "begin" or "end" are given without a
149sequence, they are effectively sequence zero.
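.PP
For instance, the following sketch (handler bodies are illustrative
only) relies on sequence numbers to order startup and shutdown work:
.SAMPLE
probe begin(\-1) { println("runs before unnumbered begin probes") }
probe begin      { println("sequence zero") }
probe begin(1)   { exit() }
probe end(10)    { println("runs after unnumbered end probes") }
.ESAMPLE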
ba4a90fd 150
151The
152.IR error
153probe point is similar to the
154.IR end
probe, except that each such probe handler is run when the session ends
156after errors have occurred. In such cases, "end" probes are skipped,
37f6433e 157but each "error" probe is still attempted. This kind of probe can be
158used to clean up or emit a "final gasp". It may also be numerically
159parametrized to set a sequence.
65aeaea0 160
161.SS NEVER
162The probe point
163.IR never
164is specially defined by the translator to mean "never". Its probe
165handler is never run, though its statements are analyzed for symbol /
166type correctness as usual. This probe point may be useful in
167conjunction with optional probes.
168
169.SS SYSCALL
170
171The
172.IR syscall.*
aliases define several hundred probes, too many to
summarize here.  They take the forms:
175
176.SAMPLE
177syscall.NAME
178.br
179syscall.NAME.return
180.ESAMPLE
181
182Generally, two probes are defined for each normal system call as listed in the
183.IR syscalls(2)
184manual page, one for entry and one for return. Those system calls that never
185return do not have a corresponding
186.IR .return
187probe.
188.PP
Each probe alias provides a variety of variables.  Looking at the tapset source
code is the most reliable way to find them.  Generally, each variable listed in the standard
191manual page is made available as a script-level variable, so
192.IR syscall.open
193exposes
194.IR filename ", " flags ", and " mode .
195In addition, a standard suite of variables is available at most aliases:
196.TP
197.IR argstr
198A pretty-printed form of the entire argument list, without parentheses.
199.TP
200.IR name
201The name of the system call.
202.TP
203.IR retstr
204For return probes, a pretty-printed form of the system-call result.
205.PP
206As usual for probe aliases, these variables are all simply initialized
207once from the underlying $context variables, so that later changes to
208$context variables are not automatically reflected. Not all probe
209aliases obey all of these general guidelines. Please report any
210bothersome ones you encounter as a bug.
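.PP
A minimal sketch using only the variables named above
(whether syscall.open resolves depends on the installed tapset and kernel):
.SAMPLE
probe syscall.open {
  printf("%s(%s)\\n", name, argstr)
}
probe syscall.open.return {
  printf("%s = %s\\n", name, retstr)
}
.ESAMPLE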
211
212
213.SS TIMERS
214
215Intervals defined by the standard kernel "jiffies" timer may be used
216to trigger probe handlers asynchronously. Two probe point variants
217are supported by the translator:
ca88561f 218
219.SAMPLE
220timer.jiffies(N)
221timer.jiffies(N).randomize(M)
222.ESAMPLE
ca88561f 223
224The probe handler is run every N jiffies (a kernel-defined unit of
225time, typically between 1 and 60 ms). If the "randomize" component is
13d2ecdb 226given, a linearly distributed random value in the range [\-M..+M] is
227added to N every time the handler is run. N is restricted to a
228reasonable range (1 to around a million), and M is restricted to be
229smaller than N. There are no target variables provided in either
230context. It is possible for such probes to be run concurrently on
231a multi-processor computer.
422d1ceb 232.PP
197a4d62 233Alternatively, intervals may be specified in units of time.
422d1ceb 234There are two probe point variants similar to the jiffies timer:
ca88561f 235
236.SAMPLE
237timer.ms(N)
238timer.ms(N).randomize(M)
239.ESAMPLE
ca88561f 240
241Here, N and M are specified in milliseconds, but the full options for units
242are seconds (s/sec), milliseconds (ms/msec), microseconds (us/usec),
243nanoseconds (ns/nsec), and hertz (hz). Randomization is not supported for
244hertz timers.
245
246The actual resolution of the timers depends on the target kernel. For
247kernels prior to 2.6.17, timers are limited to jiffies resolution, so
248intervals are rounded up to the nearest jiffies interval. After 2.6.17,
249the implementation uses hrtimers for tighter precision, though the actual
250resolution will be arch-dependent. In either case, if the "randomize"
251component is given, then the random value will be added to the interval
252before any rounding occurs.
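.PP
A short sketch combining these forms; the intervals and the global
"ticks" are arbitrary choices for illustration:
.SAMPLE
global ticks

# Fires roughly every 100 ms, jittered by up to 20 ms either way.
probe timer.ms(100).randomize(20) { ticks++ }

# Stops the session after five seconds.
probe timer.s(5) {
  printf("%d ticks\\n", ticks)
  exit()
}
.ESAMPLE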
253.PP
254Profiling timers are also available to provide probes that execute on all
255CPUs at the rate of the system tick (CONFIG_HZ).
256This probe takes no parameters.
ca88561f 257
258.SAMPLE
259timer.profile
260.ESAMPLE
ca88561f 261
262Full context information of the interrupted process is available, making
263this probe suitable for a time-based sampling profiler.
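.PP
As a rough sketch of such a profiler (the ten-second duration and the
global "hits" are assumptions made for the example; execname() is a
standard tapset function):
.SAMPLE
global hits

probe timer.profile { hits[execname()]++ }
probe timer.s(10) { exit() }

# Print the ten most frequently sampled process names.
probe end {
  foreach (name in hits\- limit 10)
    printf("%s %d\\n", name, hits[name])
}
.ESAMPLE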
264
265.SS DWARF
266
267This family of probe points uses symbolic debugging information for
268the target kernel/module/program, as may be found in unstripped
269executables, or the separate
270.I debuginfo
271packages. They allow placement of probes logically into the execution
272path of the target program, by specifying a set of points in the
273source or object code. When a matching statement executes on any
274processor, the probe handler is run in that context.
275.PP
Points in a kernel are identified by
module, source file, line number, function name, or some
combination of these.
279.PP
280Here is a list of probe point families currently supported. The
281.B .function
282variant places a probe near the beginning of the named function, so that
283parameters are available as context variables. The
284.B .return
285variant places a probe at the moment
286.B after
287the return from the named function, so the return value is available
288as the "$return" context variable. The
54efe513 289.B .inline
b8da0ad1 290modifier for
54efe513 291.B .function
292filters the results to include only instances of inlined functions.
293The
294.B .call
295modifier selects the opposite subset. Inline functions do not have an
296identifiable return point, so
297.B .return
298is not supported on
299.B .inline
300probes. The
301.B .statement
302variant places a probe at the exact spot, exposing those local variables
303that are visible there.
ca88561f 304
305.SAMPLE
306kernel.function(PATTERN)
307.br
308kernel.function(PATTERN).call
309.br
310kernel.function(PATTERN).return
311.br
b8da0ad1 312kernel.function(PATTERN).inline
54efe513 313.br
314kernel.function(PATTERN).label(LPATTERN)
315.br
316module(MPATTERN).function(PATTERN)
317.br
318module(MPATTERN).function(PATTERN).call
319.br
320module(MPATTERN).function(PATTERN).return
321.br
322module(MPATTERN).function(PATTERN).inline
323.br
324module(MPATTERN).function(PATTERN).label(LPATTERN)
325.br
54efe513 326.br
327kernel.statement(PATTERN)
328.br
329kernel.statement(ADDRESS).absolute
330.br
ba4a90fd 331module(MPATTERN).statement(PATTERN)
332.br
333process("PATH").function("NAME")
334.br
335process("PATH").statement("*@FILE.c:123")
336.br
337process("PATH").library("PATH").function("NAME")
338.br
339process("PATH").library("PATH").statement("*@FILE.c:123")
340.br
341process("PATH").function("*").return
342.br
343process("PATH").function("myfun").label("foo")
344.br
345process(PID).statement(ADDRESS).absolute
ba4a90fd 346.ESAMPLE
ca88561f 347
348(See the USER-SPACE section below for more information on the process
349probes.)
350
ba4a90fd 351In the above list, MPATTERN stands for a string literal that aims to
352identify the loaded kernel module of interest and LPATTERN stands for
353a source program label. Both MPATTERN and LPATTERN may include the "*"
354"[]", and "?" wildcards.
355PATTERN stands for a string literal that
6f05b6ab 356aims to identify a point in the program. It is made up of three
357parts:
358.IP \(bu 4
359The first part is the name of a function, as would appear in the
360.I nm
361program's output. This part may use the "*" and "?" wildcarding
362operators to match multiple names.
363.IP \(bu 4
364The second part is optional and begins with the "@" character.
365It is followed by the path to the source file containing the function,
366which may include a wildcard pattern, such as mm/slab*.
79640c29 367If it does not match as is, an implicit "*/" is optionally added
ea384b8c 368.I before
369the pattern, so that a script need only name the last few components
370of a possibly long source directory path.
ca88561f 371.IP \(bu 4
ba4a90fd 372Finally, the third part is optional if the file name part was given,
373and identifies the line number in the source file preceded by a ":"
374or a "+". The line number is assumed to be an
375absolute line number if preceded by a ":", or relative to the entry of
376the function if preceded by a "+".
377All the lines in the function can be matched with ":*".
f7470174 378A range of lines x through y can be matched with ":x\-y".
ca88561f 379.PP
ba4a90fd 380As an alternative, PATTERN may be a numeric constant, indicating an
381address. Such an address may be found from symbol tables of the
382appropriate kernel / module object file. It is verified against
383known statement code boundaries, and will be relocated for use at
384run time.
385.PP
386In guru mode only, absolute kernel-space addresses may be specified with
387the ".absolute" suffix. Such an address is considered already relocated,
388as if it came from
389.BR /proc/kallsyms ,
390so it cannot be checked against statement/instruction boundaries.
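.PP
The PATTERN forms above might be used as follows.  This is only a
sketch: the function, file, and line choices are assumptions, and
whether they resolve depends on the kernel and its debuginfo.
.SAMPLE
# Function name with a wildcarded source file part.
probe kernel.function("*@mm/slab*").call ? {
  printf("%s\\n", pp())
}

# Every line of one function (name@file:*).
probe kernel.statement("vfs_read@fs/read_write.c:*") ? {
  printf("%s\\n", pp())
}
.ESAMPLE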
391
392.SS CONTEXT VARIABLES
393
ba4a90fd 394.PP
6f017dee 395Many of the source-level context variables, such as function parameters,
396locals, globals visible in the compilation unit, may be visible to
397probe handlers. They may refer to these variables by prefixing their
398name with "$" within the scripts. In addition, a special syntax
399allows limited traversal of structures, pointers, and arrays. More
400syntax allows pretty-printing of individual variables or their groups.
401See also
402.BR @cast .
403
404.TP
405$var
406refers to an in-scope variable "var". If it's an integer-like type,
407it will be cast to a 64-bit int for systemtap script use. String-like
408pointers (char *) may be copied to systemtap string values using the
409.IR kernel_string " or " user_string
410functions.
.TP
$var\->field
traverses a structure's or a pointer's field.  This
413generalized indirection operator may be repeated to follow more
414levels. Note that the
415.IR .
416operator is not used for plain structure
417members, only
418.IR \->
419for both purposes. (This is because "." is reserved for string
420concatenation.)
ba4a90fd 421.TP
422$return
423is available in return probes only for functions that are declared
424with a return value.
425.TP
ba4a90fd 426$var[N]
indexes into an array.  The index may be given as a literal number or even
an arbitrary numeric expression.
429.PP
430A number of operators exist for such basic context variable expressions:
34af38db 431.TP
432$$vars
433expands to a character string that is equivalent to
434.SAMPLE
435sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x",
436 parm1, ..., parmN, var1, ..., varN)
437.ESAMPLE
438for each variable in scope at the probe point. Some values may be
439printed as
440.IR =?
441if their run-time location cannot be found.
442.TP
443$$locals
a43ba433 444expands to a subset of $$vars for only local variables.
445.TP
446$$parms
447expands to a subset of $$vars for only function parameters.
448.TP
449$$return
450is available in return probes only. It expands to a string that
fd574705 451is equivalent to sprintf("return=%x", $return)
a43ba433 452if the probed function has a return value, or else an empty string.
453.TP
454& $EXPR
455expands to the address of the given context variable expression, if it
456is addressable.
457.TP
458@defined($EXPR)
expands to 1 if the given context variable expression is resolvable, or to 0
otherwise, for use in conditionals such as
461.SAMPLE
f7470174 462@defined($foo\->bar) ? $foo\->bar : 0
463.ESAMPLE
464.TP
465$EXPR$
466expands to a string with all of $EXPR's members, equivalent to
467.SAMPLE
468sprintf("{.a=%i, .b=%u, .c={...}, .d=[...]}",
469 $EXPR\->a, $EXPR\->b)
470.ESAMPLE
471.TP
472$EXPR$$
expands to a string with all of $EXPR's members and submembers, equivalent to
474.SAMPLE
475sprintf("{.a=%i, .b=%u, .c={.x=%p, .y=%c}, .d=[%i, ...]}",
476 $EXPR\->a, $EXPR\->b, $EXPR\->c\->x, $EXPR\->c\->y, $EXPR\->d[0])
477.ESAMPLE
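.PP
A brief sketch of these expressions in use, assuming a kernel in which
vfs_read has parameters named "file" and "count" (this varies between
kernel versions):
.SAMPLE
probe kernel.function("vfs_read") {
  printf("%s\\n", $$parms)
  if (@defined($file\->f_flags))
    printf("count=%d flags=%x\\n", $count, $file\->f_flags)
}
.ESAMPLE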
478
479.PP
480For ".return" probes, context variables other than the "$return"
481value itself are only available for the function call parameters.
482The expressions evaluate to the
483.IR entry-time
484values of those variables, since that is when a snapshot is taken.
485Other local variables are not generally accessible, since by the time
486a ".return" probe hits, the probed function will have already returned.
487.PP
488Arbitrary entry-time expressions can also be saved for ".return"
489probes using the
490.IR @entry(expr)
491operator. For example, one can compute the elapsed time of a function:
492.SAMPLE
493probe kernel.function("do_filp_open").return {
494 println( get_timeofday_us() \- @entry(get_timeofday_us()) )
495}
496.ESAMPLE
39e3139a 497
ba4a90fd 498
499.SS DWARFLESS
In the absence of debugging information, entry and exit points of kernel and
module functions can be probed using the "kprobe" family of probes.
However, these do not permit looking up the arguments or local variables
of the function.
The following constructs are supported:
505.SAMPLE
506kprobe.function(FUNCTION)
507kprobe.function(FUNCTION).return
508kprobe.module(NAME).function(FUNCTION)
509kprobe.module(NAME).function(FUNCTION).return
kprobe.statement(ADDRESS).absolute
511.ESAMPLE
512.PP
513Probes of type
514.B function
515are recommended for kernel functions, whereas probes of type
516.B module
517are recommended for probing functions of the specified module.
518In case the absolute address of a kernel or module function is known,
519.B statement
520probes can be utilized.
521.PP
522Note that
523.I FUNCTION
524and
525.I MODULE
526names
527.B must not
528contain wildcards, or the probe will not be registered.
Also, statement probes may be used only in guru mode.
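.PP
For example, a sketch that counts calls without any debuginfo
(the function name is an assumption):
.SAMPLE
global calls

probe kprobe.function("vfs_read") { calls++ }
probe timer.s(5) { exit() }
probe end { printf("vfs_read called %d times\\n", calls) }
.ESAMPLE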
530
531
1ada6f08 532.SS USER-SPACE
533Support for user-space probing is available for kernels
534that are configured with the utrace extensions. See
535.SAMPLE
536http://people.redhat.com/roland/utrace/
537.ESAMPLE
538.PP
539There are several forms. First, a non-symbolic probe point:
540.SAMPLE
541process(PID).statement(ADDRESS).absolute
542.ESAMPLE
543is analogous to
.IR kernel.statement(ADDRESS).absolute
546in that both use raw (unverified) virtual addresses and provide
547no $variables. The target PID parameter must identify a running
548process, and ADDRESS should identify a valid instruction address.
549All threads of that process will be probed.
29cb9b42 550.PP
551Second, non-symbolic user-kernel interface events handled by
552utrace may be probed:
29cb9b42 553.SAMPLE
dd078c96 554process(PID).begin
82f0e81b 555process("FULLPATH").begin
986e98de 556process.begin
dd078c96 557process(PID).thread.begin
82f0e81b 558process("FULLPATH").thread.begin
986e98de 559process.thread.begin
dd078c96 560process(PID).end
82f0e81b 561process("FULLPATH").end
986e98de 562process.end
dd078c96 563process(PID).thread.end
82f0e81b 564process("FULLPATH").thread.end
986e98de 565process.thread.end
29cb9b42 566process(PID).syscall
82f0e81b 567process("FULLPATH").syscall
986e98de 568process.syscall
29cb9b42 569process(PID).syscall.return
82f0e81b 570process("FULLPATH").syscall.return
986e98de 571process.syscall.return
0afb7073 572process(PID).insn
82f0e81b 573process("FULLPATH").insn
0afb7073 574process(PID).insn.block
82f0e81b 575process("FULLPATH").insn.block
576.ESAMPLE
577.PP
578A
dd078c96 579.B .begin
probe gets called when a new process described by PID or FULLPATH gets created.
29cb9b42 581A
dd078c96 582.B .thread.begin
82f0e81b 583probe gets called when a new thread described by PID or FULLPATH gets created.
159cb109 584A
dd078c96 585.B .end
probe gets called when a process described by PID or FULLPATH dies.
587A
588.B .thread.end
82f0e81b 589probe gets called when a thread described by PID or FULLPATH dies.
590A
591.B .syscall
82f0e81b 592probe gets called when a thread described by PID or FULLPATH makes a
593system call. The system call number is available in the
594.BR $syscall
595context variable, and the first 6 arguments of the system call
596are available in the
597.BR $argN
(e.g. $arg1, $arg2, ...) context variables.
599A
600.B .syscall.return
82f0e81b 601probe gets called when a thread described by PID or FULLPATH returns from a
602system call. The system call number is available in the
603.BR $syscall
604context variable, and the return value of the system call is available
605in the
606.BR $return
29cb9b42 607context variable.
a96d1db0 608A
0afb7073 609.B .insn
82f0e81b 610probe gets called for every single-stepped instruction of the process described by PID or FULLPATH.
611A
612.B .insn.block
613probe gets called for every block-stepped instruction of the process described by PID or FULLPATH.
614.PP
615If a process probe is specified without a PID or FULLPATH, all user
616threads will be probed. However, if systemtap was invoked with the
f7470174 617.IR \-c " or " \-x
618options, then process probes are restricted to the process
619hierarchy associated with the target process.
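.PP
A minimal sketch of these event probes; "/bin/ls" is merely an
example path, and pid() is a standard tapset function:
.SAMPLE
probe process("/bin/ls").begin {
  printf("pid %d started\\n", pid())
}
probe process("/bin/ls").syscall {
  printf("pid %d syscall %d\\n", pid(), $syscall)
}
probe process("/bin/ls").syscall.return {
  printf("pid %d syscall %d returned %d\\n", pid(), $syscall, $return)
}
probe process("/bin/ls").end {
  printf("pid %d exited\\n", pid())
}
.ESAMPLE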
620
621.PP
622Third, symbolic static instrumentation compiled into programs and
623shared libraries may be
624probed:
625.SAMPLE
626process("PATH").mark("LABEL")
a794dbeb 627process("PATH").provider("PROVIDER").mark("LABEL")
628.ESAMPLE
629.PP
630A
631.B .mark
632probe gets called via a static probe which is defined in the
application by STAP_PROBE1(PROVIDER,LABEL,arg1), which is defined in
sdt.h.  PROVIDER is an application handle, LABEL corresponds to
the .mark argument, and arg1 is the argument.  STAP_PROBE1 is used for
636probes with 1 argument, STAP_PROBE2 is used for probes with 2
637arguments, and so on. The arguments of the probe are available in the
638context variables $arg1, $arg2, ... An alternative to using the
639STAP_PROBE macros is to use the dtrace script to create custom macros.
640Additionally, the variables $$name and $$provider are available as
641parts of the probe point name.
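.PP
On the script side, such a mark might be probed as in this sketch, where
the path "/usr/bin/myapp" and label "request_start" are hypothetical and
the mark is assumed to pass one integer argument:
.SAMPLE
probe process("/usr/bin/myapp").mark("request_start") {
  printf("%s:%s arg1=%d\\n", $$provider, $$name, $arg1)
}
.ESAMPLE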
0a1c696d 642
29cb9b42 643.PP
644Finally, full symbolic source-level probes in user-space programs
645and shared libraries are supported. These are exactly analogous
646to the symbolic DWARF-based kernel/module probes described above,
f7470174 647and expose similar contextual $variables.
648.SAMPLE
649process("PATH").function("NAME")
650process("PATH").statement("*@FILE.c:123")
651process("PATH").library("PATH").function("NAME")
652process("PATH").library("PATH").statement("*@FILE.c:123")
653process("PATH").function("*").return
654process("PATH").function("myfun").label("foo")
655.ESAMPLE
656
657.PP
658Note that for all process probes,
29cb9b42 659.I PATH
names refer to executables that are searched for the same way shells do: relative
to the working directory if they contain a "/" character, otherwise in
662.BR $PATH .
663If PATH names refer to scripts, the actual interpreters (specified in the
664script in the first line after the #! characters) are probed.
665If PATH is a process component parameter referring to shared libraries
666then all processes that map it at runtime would be selected for
667probing. If PATH is a library component parameter referring to shared
668libraries then the process specified by the process component would be
669selected.
670If the PATH string contains wildcards as in the MPATTERN case, then
671standard globbing is performed to find all matching paths. In this
672case, the
673.BR $PATH
674environment variable is not used.
675
676.PP
677If systemtap was invoked with the
678.IR \-c " or " \-x
679options, then process probes are restricted to the process
680hierarchy associated with the target process.
1ada6f08 681
682.SS PROCFS
683
These probe points allow procfs "files" in
/proc/systemtap/MODNAME to be created, read and written using a
permission that may be modified using the proper umask value.
.RI ( MODNAME
is the name of the systemtap module.)  Default permissions are 0400 for read
probes, and 0200 for write probes.  If both a read and write probe are being
used on the same file, a default permission of 0600 will be used.
Using procfs.umask(0040).read would
result in a 0404 permission set for the file.  The
.I proc
filesystem is a pseudo-filesystem which is used as an interface to
kernel data structures.  There are several probe point variants supported
by the translator:
ca88561f 697
698.SAMPLE
699procfs("PATH").read
c243f608 700procfs("PATH").umask(UMASK).read
38975255 701procfs("PATH").read.maxsize(MAXSIZE)
c243f608 702procfs("PATH").umask(UMASK).maxsize(MAXSIZE)
9cb48751 703procfs("PATH").write
c243f608 704procfs("PATH").umask(UMASK).write
9cb48751 705procfs.read
c243f608 706procfs.umask(UMASK).read
38975255 707procfs.read.maxsize(MAXSIZE)
c243f608 708procfs.umask(UMASK).read.maxsize(MAXSIZE)
9cb48751 709procfs.write
c243f608 710procfs.umask(UMASK).write
9cb48751 711.ESAMPLE
ca88561f 712
713.I PATH
714is the file name (relative to /proc/systemtap/MODNAME) to be created.
715If no
716.I PATH
717is specified (as in the last two variants above),
718.I PATH
719defaults to "command".
720.PP
721When a user reads /proc/systemtap/MODNAME/PATH, the corresponding
722procfs
723.I read
724probe is triggered. The string data to be read should be assigned to
725a variable named
726.IR $value ,
727like this:
ca88561f 728
729.SAMPLE
730procfs("PATH").read { $value = "100\\n" }
731.ESAMPLE
732.PP
733When a user writes into /proc/systemtap/MODNAME/PATH, the
734corresponding procfs
735.I write
736probe is triggered. The data the user wrote is available in the
737string variable named
738.IR $value ,
739like this:
ca88561f 740
741.SAMPLE
742procfs("PATH").write { printf("user wrote: %s", $value) }
743.ESAMPLE
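.PP
The two probe types can be paired on one file.  In this sketch the file
name "level" and the global of the same name are assumptions; strtol()
is a standard tapset function:
.SAMPLE
global level

probe procfs("level").read {
  $value = sprintf("%d\\n", level)
}
probe procfs("level").write {
  level = strtol($value, 10)
}
.ESAMPLE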
744.PP
745.I MAXSIZE
746is the size of the procfs read buffer. Specifying
747.I MAXSIZE
748allows larger procfs output. If no
749.I MAXSIZE
750is specified, the procfs read buffer defaults to
751.I STP_PROCFS_BUFSIZE
752(which defaults to
753.IR MAXSTRINGLEN ,
754the maximum length of a string).
755If setting the procfs read buffers for more than one file is needed,
756it may be easiest to override the
757.I STP_PROCFS_BUFSIZE
758definition.
759Here's an example of using
760.IR MAXSIZE :
761
762.SAMPLE
763procfs.read.maxsize(1024) {
764 $value = "long string..."
765 $value .= "another long string..."
766 $value .= "another long string..."
767 $value .= "another long string..."
768}
769.ESAMPLE
9cb48751 770
771.SS MARKERS
772
773This family of probe points hooks up to static probing markers
774inserted into the kernel or modules. These markers are special macro
775calls inserted by kernel developers to make probing faster and more
776reliable than with DWARF-based probes. Further, DWARF debugging
777information is
778.I not
779required to probe markers.
780
781Marker probe points begin with
782.BR kernel .
783The next part names the marker itself:
784.BR mark("name") .
785The marker name string, which may contain the usual wildcard characters,
786is matched against the names given to the marker macros when the kernel
787and/or module was compiled. Optionally, you can specify
788.BR format("format") .
37f6433e 789Specifying the marker format string allows differentiation between two
eb973c2a 790markers with the same name but different marker format strings.
791
792The handler associated with a marker-based probe may read the
793optional parameters specified at the macro call site. These are
794named
795.BR $arg1 " through " $argNN ,
796where NN is the number of parameters supplied by the macro. Number
797and string parameters are passed in a type-safe manner.
798
799The marker format string associated with a marker is available in
800.BR $format .
The marker name string is also available in
bc54e71c 802.BR $name .
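.PP
For example, this sketch reports every marker that fires, on kernels
that provide markers:
.SAMPLE
probe kernel.mark("*") {
  printf("%s %s\\n", $name, $format)
}
.ESAMPLE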
eb973c2a 803
804.SS TRACEPOINTS
805
806This family of probe points hooks up to static probing tracepoints
807inserted into the kernel or modules. As with markers, these
808tracepoints are special macro calls inserted by kernel developers to
809make probing faster and more reliable than with DWARF-based probes,
810and DWARF debugging information is not required to probe tracepoints.
811Tracepoints have an extra advantage of more strongly-typed parameters
812than markers.
.PP
Tracepoint probes begin with
.BR kernel .
The next part names the tracepoint itself:
.BR trace("name") .
The tracepoint name string, which may contain the usual wildcard
characters, is matched against the names defined by the kernel
developers in the tracepoint header files.
.PP
The handler associated with a tracepoint-based probe may read the
optional parameters specified at the macro call site. These are
named according to the declaration by the tracepoint author. For
example, the tracepoint probe
.BR kernel.trace("sched_switch")
provides the parameters
.BR $rq ", " $prev ", and " $next .
If a parameter is a complex type, such as a struct pointer, then a
script can access its fields with the same syntax as DWARF $target
variables. Tracepoint parameters themselves cannot be modified, but in
guru mode a script may modify fields of parameters.
.PP
The name of the tracepoint is available in
.BR $$name ,
and a string of name=value pairs for all parameters of the tracepoint
is available in
.BR $$vars " or " $$parms .
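.PP
The following sketch shows how these variables might be used together.
It assumes that the running kernel provides the sched_switch tracepoint
with the $prev and $next parameters named above, and that both point to
a task structure with a pid field; parameter and field names can differ
between kernel versions.
.SAMPLE
probe kernel.trace("sched_switch") {
    # Tracepoint name followed by all parameters as name=value pairs.
    println($$name . ": " . $$parms)
    # Assumption: $prev and $next are task_struct pointers.
    println(sprintf("prev pid %d, next pid %d", $prev->pid, $next->pid))
}
.ESAMPLE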
.SS HARDWARE BREAKPOINTS
This family of probes sets hardware watchpoints on a given
(global) kernel symbol. The probes take three components as inputs:
.PP
1. The
.B virtual address
or
.B name
of the kernel symbol to be traced is supplied as the argument to this
class of probes. (Only variables in the data segment can be probed;
local variables of a function cannot.)
.PP
2. The nature of the access to be probed:
.br
a. A
.I .write
probe is triggered when a write happens at the specified address or
symbol name.
.br
b. An
.I .rw
probe is triggered when either a read or a write happens.
.PP
3. An optional
.B .length
component. Users may specify the size of the address interval to be
probed with the "length" construct. The user-specified length is rounded
to the closest length that the architecture can support. If the
specified length exceeds the limits imposed by the architecture, an
error is reported and probe registration fails. When "length" is not
specified, the translator requests a hardware breakpoint probe of
length 1. Note that the "length" construct is not valid with symbol
names.
.PP
The following constructs are supported:
.SAMPLE
probe kernel.data(ADDRESS).write
probe kernel.data(ADDRESS).rw
probe kernel.data(ADDRESS).length(LEN).write
probe kernel.data(ADDRESS).length(LEN).rw
probe kernel.data("SYMBOL_NAME").write
probe kernel.data("SYMBOL_NAME").rw
.ESAMPLE
.PP
This set of probes makes use of the debug registers of the processor,
which are a scarce resource (4 on x86, 1 on powerpc). The translator
issues a warning if a script requests more hardware breakpoint probes
than the architecture supports. For example, a pass-2 warning is issued
when an input script requests 5 hardware breakpoint probes on an x86
system, since the x86 architecture supports a maximum of 4 breakpoints.
Users are cautioned to set probes judiciously.
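.PP
As an illustration, the sketch below reports which process writes to a
kernel variable. The symbol pid_max is only an example of a global
data-segment variable; execname() and pid() are standard tapset
functions that describe the current process.
.SAMPLE
# Uses one debug register; fires on every write to pid_max.
probe kernel.data("pid_max").write {
    println(sprintf("pid_max written by %s (pid %d)", execname(), pid()))
}
.ESAMPLE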
.SH EXAMPLES
.PP
Here are some example probe points, defining the associated events.
.TP
begin, end, end
refers to the startup and normal shutdown of the session. In this
case, the handler would run once during startup and twice during
shutdown.
.TP
timer.jiffies(1000).randomize(200)
refers to a periodic interrupt, every 1000 +/\- 200 jiffies.
.TP
kernel.function("*init*"), kernel.function("*exit*")
refers to all kernel functions with "init" or "exit" in the name.
.TP
kernel.function("*@kernel/sched.c:240")
refers to any functions within the "kernel/sched.c" file that span
line 240.
Note that this is
.B not
a probe at the statement at that line number. Use the
.I kernel.statement
probe instead.
.TP
kernel.mark("getuid")
refers to an STAP_MARK(getuid, ...) macro call in the kernel.
.TP
module("usb*").function("*sync*").return
refers to the moment of return from all functions with "sync" in the
name in any of the USB drivers.
.TP
kernel.statement(0xc0044852)
refers to the first byte of the statement whose compiled instructions
include the given address in the kernel.
.TP
kernel.statement("*@kernel/sched.c:2917")
refers to the statement of line 2917 within "kernel/sched.c".
.TP
kernel.statement("bio_init@fs/bio.c+3")
refers to the statement at line bio_init+3 within "fs/bio.c".
.TP
kernel.data("pid_max").write
refers to a hardware breakpoint of type "write" set on pid_max.
.TP
syscall.*.return
refers to the group of probe aliases with any name in the second position.
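.PP
To show how such probe points are attached to handlers, here is a
minimal sketch built from the first two examples. The handler bodies
and messages are placeholders chosen for illustration.
.SAMPLE
# Runs once at startup and twice at normal shutdown.
probe begin, end, end {
    println("session event")
}

# Runs every 1000 +/- 200 jiffies until the session is interrupted.
probe timer.jiffies(1000).randomize(200) {
    println("tick")
}
.ESAMPLE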
.SS PERF
This
.IR prototype
family of probe points interfaces to the kernel "perf event"
infrastructure for controlling hardware performance counters.
The events being attached to are described by the "type" and
"config" fields of the
.IR perf_event_attr
structure, and are sampled at an interval governed by the
"sample_period" field.
.PP
These fields are made available to systemtap scripts using
the following syntax:
.SAMPLE
probe perf.type(NN).config(MM).sample(XX)
probe perf.type(NN).config(MM)
.ESAMPLE
The range of valid type/config is described by the
.IR perf_event_open (2)
system call, and/or the
.IR linux/perf_event.h
file. Invalid combinations or exhausted hardware counter resources
result in errors during systemtap script startup. Systemtap does
not sanity-check the values: it merely passes them through to
the kernel for error- and safety-checking.
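.PP
As a rough sketch of this syntax, the script below counts samples per
process name. It assumes that type 0 and config 0 select
PERF_TYPE_HARDWARE and PERF_COUNT_HW_CPU_CYCLES, as defined in
.IR linux/perf_event.h ;
the sample period of 1000000 is an arbitrary illustrative value, and
the hardware must actually provide the counter.
.SAMPLE
global samples

# Assumption: type 0 / config 0 is the hardware CPU cycle counter.
probe perf.type(0).config(0).sample(1000000) {
    samples[execname()]++
}

# On shutdown, print the ten process names with the most samples.
probe end {
    foreach (name in samples- limit 10)
        println(sprintf("%-20s %d", name, samples[name]))
}
.ESAMPLE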
.SH SEE ALSO
.IR stap (1),
.IR probe::* (3stap),
.IR tapset::* (3stap)