sourceware.org Git - systemtap.git/blame - stapprobes.3stap
PR11376: a quickie test case for process(PID).statement(NUM).absolute
Commit Line Data
ba4a90fd 1.\" -*- nroff -*-
ec1a2239 2.TH STAPPROBES 3stap
ba4a90fd
FCE
3.SH NAME
4stapprobes \- systemtap probe points
5
6.\" macros
7.de SAMPLE
8.br
9.RS
10.nf
11.nh
12..
13.de ESAMPLE
14.hy
15.fi
16.RE
17..
18
19.SH DESCRIPTION
20The following sections enumerate the variety of probe points supported
89965a32
FCE
21by the systemtap translator, and some of the additional aliases defined by
22standard tapset scripts. Many are individually documented in the
23.IR 3stap
24manual section, with the
25.IR probe::
26prefix.
ba4a90fd 27.PP
7abecb38 28The general probe point syntax is a dotted-symbol sequence. This
ba4a90fd
FCE
29allows a breakdown of the event namespace into parts, somewhat like
30the Domain Name System does on the Internet. Each component
7abecb38 31identifier may be parametrized by a string or number literal, with a
d898100a 32syntax like a function call. A component may include a "*" character,
649260f3
JS
33to expand to a set of matching probe points. It may also include "**"
34to match multiple sequential components at once. Probe aliases likewise
d898100a
FCE
35expand to other probe points. Each and every resulting probe point is
36normally resolved to some low-level system instrumentation facility
37(e.g., a kprobe address, marker, or a timer configuration), otherwise
38the elaboration phase will fail.
39.PP
40However, a probe point may be followed by a "?" character, to indicate
41that it is optional, and that no error should result if it fails to
42resolve. Optionalness passes down through all levels of
43alias/wildcard expansion. Alternately, a probe point may be followed
44by a "!" character, to indicate that it is both optional and
37f6433e 45sufficient. (Think vaguely of the Prolog cut operator.) If it does
d898100a
FCE
46resolve, then no further probe points in the same comma-separated list
47will be resolved. Therefore, the "!" sufficiency mark only makes
48sense in a list of probe point alternatives.
dfd11cc3
MH
49.PP
50Additionally, a probe point may be followed by an "if (expr)" statement, in
51order to enable/disable the probe point on-the-fly. With the "if" statement,
52if the "expr" is false when the probe point is hit, the whole probe body,
53including any alias's body, is skipped. The condition is stacked up through
54all levels of alias/wildcard expansion, so the final condition becomes
55the logical-and of the conditions of all expanded aliases/wildcards.
6e3347a9 56
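.PP
As a minimal sketch of how these suffixes can be combined (the two
function names and the global variable "enabled" are hypothetical,
chosen only for illustration):
.SAMPLE
global enabled = 1

# "!" makes the first alternative sufficient: the second one is
# resolved only if the first one is not.
probe kernel.function("hypothetical_new_name") !,
      kernel.function("hypothetical_old_name") ?
{
  printf("hit %s\\n", pp())
}

# The handler body is skipped whenever "enabled" is zero.
probe timer.s(1) if (enabled)
{
  printf("tick\\n")
}
.ESAMPLE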
e904ad95
FCE
57These are all
58.B syntactically
59valid probe points. (They may nonetheless be
60.B semantically
61invalid, depending on the contents of the tapsets, and the versions of
62kernel/user software installed.)
ca88561f 63
ba4a90fd
FCE
64.SAMPLE
65kernel.function("foo").return
e904ad95 66process("/bin/vi").statement(0x2222)
ba4a90fd 67end
729286d8 68syscall.*
649260f3 69sys**open
6e3347a9 70kernel.function("no_such_function") ?
d898100a 71module("awol").function("no_such_function") !
dfd11cc3 72signal.*? if (switch)
94c3c803 73kprobe.function("foo")
ba4a90fd
FCE
74.ESAMPLE
75
e904ad95 76
6f05b6ab
FCE
77Probes may be broadly classified into "synchronous" and
78"asynchronous". A "synchronous" event is deemed to occur when any
79processor executes an instruction matched by the specification. This
80gives these probes a reference point (instruction address) from which
81more contextual data may be available. Other families of probe points
82refer to "asynchronous" events such as timers/counters rolling over,
83where there is no fixed reference point that is related. Each probe
84point specification may match multiple locations (for example, using
85wildcards or aliases), and all of them are then probed. A probe
86declaration may also contain several comma-separated specifications,
87all of which are probed.
88
65aeaea0 89.SS BEGIN/END/ERROR
ba4a90fd
FCE
90
91The probe points
92.IR begin " and " end
93are defined by the translator to refer to the time of session startup
94and shutdown. All "begin" probe handlers are run, in some sequence,
95during the startup of the session. All global variables will have
96been initialized prior to this point. All "end" probes are run, in
97some sequence, during the
98.I normal
99shutdown of a session, such as in the aftermath of an
100.I exit ()
101function call, or an interruption from the user. In the case of an
102error-triggered shutdown, "end" probes are not run. There are no
103target variables available in either context.
6a256b03
JS
104.PP
105If the order of execution among "begin" or "end" probes is significant,
106then an optional sequence number may be provided:
ca88561f 107
6a256b03
JS
108.SAMPLE
109begin(N)
110end(N)
111.ESAMPLE
ca88561f 112
6a256b03
JS
113The number N may be positive or negative. The probe handlers are run in
114increasing order, and the order between handlers with the same sequence
115number is unspecified. When "begin" or "end" are given without a
116sequence, they are effectively sequence zero.
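.PP
A small sketch of sequencing, using only the probe points described
above (the printed strings are illustrative):
.SAMPLE
probe begin(\-1) { println("runs first") }
probe begin     { println("runs second, at sequence zero") }
probe begin(1)  { println("runs third"); exit() }
probe end       { println("runs at normal shutdown") }
.ESAMPLE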
ba4a90fd 117
65aeaea0
FCE
118The
119.IR error
120probe point is similar to the
121.IR end
d898100a
FCE
122probe, except that each such probe handler is run when the session ends
123after errors have occurred. In such cases, "end" probes are skipped,
37f6433e 124but each "error" probe is still attempted. This kind of probe can be
d898100a
FCE
125used to clean up or emit a "final gasp". It may also be numerically
126parametrized to set a sequence.
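.PP
The following sketch contrasts the two shutdown paths. It assumes the
standard tapset function error() and the default error limit, under
which the first error ends the session:
.SAMPLE
probe begin { error("forced failure") }
probe end   { println("skipped: the session ended after an error") }
probe error { println("final gasp: cleaning up") }
.ESAMPLE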
65aeaea0 127
6e3347a9
FCE
128.SS NEVER
129The probe point
130.IR never
131is specially defined by the translator to mean "never". Its probe
132handler is never run, though its statements are analyzed for symbol /
133type correctness as usual. This probe point may be useful in
134conjunction with optional probes.
135
1027502b
FCE
136.SS SYSCALL
137
138The
139.IR syscall.*
140aliases define several hundred probes, too many to
141summarize here. They take the form:
142
143.SAMPLE
144syscall.NAME
145.br
146syscall.NAME.return
147.ESAMPLE
148
149Generally, two probes are defined for each normal system call as listed in the
150.IR syscalls(2)
151manual page, one for entry and one for return. Those system calls that never
152return do not have a corresponding
153.IR .return
154probe.
155.PP
156Each probe alias defines a variety of variables. Looking at the tapset source
157code is the most reliable way to find them. Generally, each variable listed in the standard
158manual page is made available as a script-level variable, so
159.IR syscall.open
160exposes
161.IR filename ", " flags ", and " mode .
162In addition, a standard suite of variables is available at most aliases:
163.TP
164.IR argstr
165A pretty-printed form of the entire argument list, without parentheses.
166.TP
167.IR name
168The name of the system call.
169.TP
170.IR retstr
171For return probes, a pretty-printed form of the system-call result.
172.PP
173Not all probe aliases obey all of these general guidelines. Please report
174any bothersome ones you encounter as a bug.
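.PP
For illustration, a short sketch that traces one system call with the
standard variables listed above (the output format is arbitrary):
.SAMPLE
probe syscall.open
{
  # name and argstr are set by the alias
  printf("%s(%s)\\n", name, argstr)
}

probe syscall.open.return
{
  # retstr is only meaningful in the .return alias
  printf("%s = %s\\n", name, retstr)
}
.ESAMPLE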
175
176
ba4a90fd
FCE
177.SS TIMERS
178
179Intervals defined by the standard kernel "jiffies" timer may be used
180to trigger probe handlers asynchronously. Two probe point variants
181are supported by the translator:
ca88561f 182
ba4a90fd
FCE
183.SAMPLE
184timer.jiffies(N)
185timer.jiffies(N).randomize(M)
186.ESAMPLE
ca88561f 187
ba4a90fd
FCE
188The probe handler is run every N jiffies (a kernel-defined unit of
189time, typically between 1 and 60 ms). If the "randomize" component is
13d2ecdb 190given, a uniformly distributed random value in the range [\-M..+M] is
ba4a90fd
FCE
191added to N every time the handler is run. N is restricted to a
192reasonable range (1 to around a million), and M is restricted to be
193smaller than N. There are no target variables provided in either
194context. It is possible for such probes to be run concurrently on
195a multi-processor computer.
422d1ceb 196.PP
197a4d62 197Alternatively, intervals may be specified in units of time.
422d1ceb 198There are two probe point variants similar to the jiffies timer:
ca88561f 199
422d1ceb
FCE
200.SAMPLE
201timer.ms(N)
202timer.ms(N).randomize(M)
203.ESAMPLE
ca88561f 204
197a4d62
JS
205Here, N and M are specified in milliseconds, but the full options for units
206are seconds (s/sec), milliseconds (ms/msec), microseconds (us/usec),
207nanoseconds (ns/nsec), and hertz (hz). Randomization is not supported for
208hertz timers.
209
210The actual resolution of the timers depends on the target kernel. For
211kernels prior to 2.6.17, timers are limited to jiffies resolution, so
212intervals are rounded up to the nearest jiffies interval. After 2.6.17,
213the implementation uses hrtimers for tighter precision, though the actual
214resolution will be arch-dependent. In either case, if the "randomize"
215component is given, then the random value will be added to the interval
216before any rounding occurs.
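.PP
A brief sketch of the unit-based timers (the intervals and the global
"ticks" are arbitrary choices for illustration):
.SAMPLE
global ticks

# Fires roughly every 100 ms, jittered by up to 20 ms either way.
probe timer.ms(100).randomize(20) { ticks++ }

# Fires four times per second; hertz timers take no randomize().
probe timer.hz(4) { ticks++ }

# Report and stop after five seconds.
probe timer.s(5)
{
  printf("%d timer ticks\\n", ticks)
  exit()
}
.ESAMPLE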
39e57ce0
FCE
217.PP
218Profiling timers are also available to provide probes that execute on all
3ca1f652
FCE
219CPUs at the rate of the system tick (CONFIG_HZ).
220This probe takes no parameters.
ca88561f 221
39e57ce0
FCE
222.SAMPLE
223timer.profile
224.ESAMPLE
ca88561f 225
39e57ce0
FCE
226Full context information of the interrupted process is available, making
227this probe suitable for a time-based sampling profiler.
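.PP
As a sketch of such a profiler, the script below counts samples per
process name, assuming the standard tapset function execname(); the
ten-second duration and the output format are arbitrary:
.SAMPLE
global hits

probe timer.profile { hits[execname()]++ }

probe timer.s(10) { exit() }

probe end
{
  # Print the ten most frequently sampled process names.
  foreach (n in hits\- limit 10)
    printf("%8d %s\\n", hits[n], n)
}
.ESAMPLE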
ba4a90fd
FCE
228
229.SS DWARF
230
231This family of probe points uses symbolic debugging information for
232the target kernel/module/program, as may be found in unstripped
233executables, or the separate
234.I debuginfo
235packages. They allow placement of probes logically into the execution
236path of the target program, by specifying a set of points in the
237source or object code. When a matching statement executes on any
238processor, the probe handler is run in that context.
239.PP
240Points in a kernel are identified by
ca88561f 241module, source file, line number, function name, or some
6f05b6ab 242combination of these.
ba4a90fd
FCE
243.PP
244Here is a list of probe point families currently supported. The
245.B .function
246variant places a probe near the beginning of the named function, so that
247parameters are available as context variables. The
248.B .return
39e3139a
FCE
249variant places a probe at the moment
250.B after
251the return from the named function, so the return value is available
252as the "$return" context variable. The
54efe513 253.B .inline
b8da0ad1 254modifier for
54efe513 255.B .function
b8da0ad1
FCE
256filters the results to include only instances of inlined functions.
257The
258.B .call
259modifier selects the opposite subset. Inline functions do not have an
260identifiable return point, so
54efe513
GH
261.B .return
262is not supported on
263.B .inline
264probes. The
ba4a90fd
FCE
265.B .statement
266variant places a probe at the exact spot, exposing those local variables
267that are visible there.
ca88561f 268
ba4a90fd
FCE
269.SAMPLE
270kernel.function(PATTERN)
271.br
b8da0ad1
FCE
272kernel.function(PATTERN).call
273.br
ba4a90fd
FCE
274kernel.function(PATTERN).return
275.br
b8da0ad1 276kernel.function(PATTERN).inline
54efe513 277.br
592470cd
SC
278kernel.function(PATTERN).label(LPATTERN)
279.br
ba4a90fd
FCE
280module(MPATTERN).function(PATTERN)
281.br
b8da0ad1
FCE
282module(MPATTERN).function(PATTERN).call
283.br
ba4a90fd
FCE
284module(MPATTERN).function(PATTERN).return
285.br
b8da0ad1
FCE
286module(MPATTERN).function(PATTERN).inline
287.br
54efe513 288.br
ba4a90fd
FCE
289kernel.statement(PATTERN)
290.br
37ebca01
FCE
291kernel.statement(ADDRESS).absolute
292.br
ba4a90fd 293module(MPATTERN).statement(PATTERN)
6f017dee
FCE
294.br
295process("PATH").function("NAME")
296.br
297process("PATH").statement("*@FILE.c:123")
298.br
299process("PATH").function("*").return
300.br
301process("PATH").function("myfun").label("foo")
ba4a90fd 302.ESAMPLE
ca88561f 303
6f017dee
FCE
304(See the USER-SPACE section below for more information on the process
305probes.)
306
ba4a90fd 307In the above list, MPATTERN stands for a string literal that aims to
592470cd
SC
308identify the loaded kernel module of interest and LPATTERN stands for
309a source program label. Both MPATTERN and LPATTERN may include the "*",
310"[]", and "?" wildcards.
311PATTERN stands for a string literal that
6f05b6ab 312aims to identify a point in the program. It is made up of three
ca88561f
MM
313parts:
314.IP \(bu 4
315The first part is the name of a function, as would appear in the
ba4a90fd
FCE
316.I nm
317program's output. This part may use the "*" and "?" wildcarding
ca88561f
MM
318operators to match multiple names.
319.IP \(bu 4
320The second part is optional and begins with the "@" character.
321It is followed by the path to the source file containing the function,
322which may include a wildcard pattern, such as mm/slab*.
79640c29 323If it does not match as is, an implicit "*/" is optionally added
ea384b8c 324.I before
79640c29
FCE
325the pattern, so that a script need only name the last few components
326of a possibly long source directory path.
ca88561f 327.IP \(bu 4
ba4a90fd 328Finally, the third part is optional if the file name part was given,
1bd128a3
SC
329and identifies the line number in the source file preceded by a ":"
330or a "+". The line number is assumed to be an
331absolute line number if preceded by a ":", or relative to the entry of
99a5f9cf
SC
332the function if preceded by a "+".
333All the lines in the function can be matched with ":*".
334A range of lines x through y can be matched with ":x-y".
ca88561f 335.PP
ba4a90fd 336As an alternative, PATTERN may be a numeric constant, indicating an
ea384b8c
FCE
337address. Such an address may be found from symbol tables of the
338appropriate kernel / module object file. It is verified against
339known statement code boundaries, and will be relocated for use at
340run time.
341.PP
342In guru mode only, absolute kernel-space addresses may be specified with
343the ".absolute" suffix. Such an address is considered already relocated,
344as if it came from
345.BR /proc/kallsyms ,
346so it cannot be checked against statement/instruction boundaries.
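.PP
The pattern forms above can be sketched as follows. The patterns are
taken from the EXAMPLES section of this page; whether they resolve
depends on the kernel and debuginfo installed:
.SAMPLE
# Function name with a source-file part and an absolute line.
probe kernel.function("*@kernel/sched.c:240")
{
  printf("%s\\n", pp())
}

# Module and function wildcards, probed at return.
probe module("usb*").function("*sync*").return
{
  printf("%s\\n", pp())
}

# A statement three lines after the entry of bio_init.
probe kernel.statement("bio_init@fs/bio.c+3")
{
  printf("%s\\n", pp())
}
.ESAMPLE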
6f017dee
FCE
347
348.SS CONTEXT VARIABLES
349
ba4a90fd 350.PP
6f017dee 351Many of the source-level context variables, such as function parameters,
ba4a90fd
FCE
352locals, and globals visible in the compilation unit, may be visible to
353probe handlers. They may refer to these variables by prefixing their
354name with "$" within the scripts. In addition, a special syntax
6f017dee
FCE
355allows limited traversal of structures, pointers, and arrays. More
356syntax allows pretty-printing of individual variables or their groups.
357See also
358.BR @cast .
359
ba4a90fd
FCE
360.TP
361$var
362refers to an in-scope variable "var". If it's an integer-like type,
7b9361d5
FCE
363it will be cast to a 64-bit int for systemtap script use. String-like
364pointers (char *) may be copied to systemtap string values using the
365.IR kernel_string " or " user_string
366functions.
ba4a90fd 367.TP
ab5e90c2
FCE
368$var\->field
traverses a structure's or a pointer's field. This
369generalized indirection operator may be repeated to follow more
370levels. Note that the
371.IR .
372operator is not used for plain structure
373members, only
374.IR \->
375for both purposes. (This is because "." is reserved for string
376concatenation.)
ba4a90fd 377.TP
a43ba433
FCE
378$return
379is available in return probes only for functions that are declared
380with a return value.
381.TP
ba4a90fd 382$var[N]
33b081c5
JS
383indexes into an array. The index may be given as a literal number or even
384an arbitrary numeric expression.
6f017dee
FCE
385.PP
386A number of operators exist for such basic context variable expressions:
34af38db 387.TP
2cb3fe26
SC
388$$vars
389expands to a character string that is equivalent to
6f017dee
FCE
390.SAMPLE
391sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x",
392 parm1, ..., parmN, var1, ..., varN)
393.ESAMPLE
394for each variable in scope at the probe point. Some values may be
395printed as
396.IR =?
397if their run-time location cannot be found.
2cb3fe26
SC
398.TP
399$$locals
a43ba433 400expands to a subset of $$vars for only local variables.
2cb3fe26
SC
401.TP
402$$parms
a43ba433
FCE
403expands to a subset of $$vars for only function parameters.
404.TP
405$$return
406is available in return probes only. It expands to a string that
fd574705 407is equivalent to sprintf("return=%x", $return)
a43ba433 408if the probed function has a return value, or else an empty string.
6f017dee
FCE
409.TP
410& $EXPR
411expands to the address of the given context variable expression, if it
412is addressable.
413.TP
414@defined($EXPR)
415expands to 1 if the given context variable expression is resolvable, or 0 otherwise,
416for use in conditionals such as
417.SAMPLE
418@defined($foo->bar) ? $foo->bar : 0
419.ESAMPLE
420.TP
421$EXPR$
422expands to a string with all of $EXPR's members, equivalent to
423.SAMPLE
424sprintf("{.a=%i, .b=%u, .c={...}, .d=[...]}",
425 $EXPR\->a, $EXPR\->b)
426.ESAMPLE
427.TP
428$EXPR$$
429expands to a string with all of $EXPR's members and submembers, equivalent to
430.SAMPLE
431sprintf("{.a=%i, .b=%u, .c={.x=%p, .y=%c}, .d=[%i, ...]}",
432 $EXPR\->a, $EXPR\->b, $EXPR\->c\->x, $EXPR\->c\->y, $EXPR\->d[0])
433.ESAMPLE
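.PP
A sketch that combines several of these forms. It assumes that the
kernel function vfs_read takes parameters named "file" and "count"
and that "file" points to a structure with an "f_flags" field; adjust
the names to the target kernel:
.SAMPLE
probe kernel.function("vfs_read")
{
  # All parameters, pretty-printed in one string.
  printf("%s\\n", $$parms)

  # Guard the field access so the script still translates on
  # kernels where the expression does not resolve.
  flags = @defined($file\->f_flags) ? $file\->f_flags : 0
  printf("f_flags=%x count=%d\\n", flags, $count)
}
.ESAMPLE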
434
39e3139a
FCE
435.PP
436For ".return" probes, context variables other than the "$return"
437value itself are only available for the function call parameters.
438The expressions evaluate to the
439.IR entry-time
440values of those variables, since that is when a snapshot is taken.
441Other local variables are not generally accessible, since by the time
442a ".return" probe hits, the probed function will have already returned.
8cc799a5
JS
443.PP
444Arbitrary entry-time expressions can also be saved for ".return"
445probes using the
446.IR @entry(expr)
447operator. For example, one can compute the elapsed time of a function:
448.SAMPLE
449probe kernel.function("do_filp_open").return {
450 println( get_timeofday_us() \- @entry(get_timeofday_us()) )
451}
452.ESAMPLE
39e3139a 453
ba4a90fd 454
94c3c803
AM
455.SS DWARFLESS
456In the absence of debugging information, entry & exit points of kernel & module
457functions can be probed using the "kprobe" family of probes.
458However, these do not permit looking up the arguments / local variables
459of the function.
460The following constructs are supported:
461.SAMPLE
462kprobe.function(FUNCTION)
463kprobe.function(FUNCTION).return
464kprobe.module(NAME).function(FUNCTION)
465kprobe.module(NAME).function(FUNCTION).return
466kprobe.statement(ADDRESS).absolute
467.ESAMPLE
468.PP
469Probes of type
470.B function
471are recommended for kernel functions, whereas probes of type
472.B module
473are recommended for probing functions of the specified module.
474In case the absolute address of a kernel or module function is known,
475.B statement
476probes can be utilized.
477.PP
478Note that
479.I FUNCTION
480and
481.I MODULE
482names
483.B must not
484contain wildcards, or the probe will not be registered.
485Also, statement probes may only be run in guru mode.
486
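.PP
A minimal sketch of a dwarfless probe pair. The function name
vfs_read is an assumption about the target kernel; no $-variables
are used, since none are available here:
.SAMPLE
global entries, returns

probe kprobe.function("vfs_read")        { entries++ }
probe kprobe.function("vfs_read").return { returns++ }

probe timer.s(5)
{
  printf("entries=%d returns=%d\\n", entries, returns)
  exit()
}
.ESAMPLE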
487
1ada6f08 488.SS USER-SPACE
0a1c696d
FCE
489Support for user-space probing is available for kernels
490that are configured with the utrace extensions. See
491.SAMPLE
492http://people.redhat.com/roland/utrace/
493.ESAMPLE
494.PP
495There are several forms. First, a non-symbolic probe point:
1ada6f08
FCE
496.SAMPLE
497process(PID).statement(ADDRESS).absolute
498.ESAMPLE
499is analogous to
500.IR
501kernel.statement(ADDRESS).absolute
502in that both use raw (unverified) virtual addresses and provide
503no $variables. The target PID parameter must identify a running
504process, and ADDRESS should identify a valid instruction address.
505All threads of that process will be probed.
29cb9b42 506.PP
0a1c696d
FCE
507Second, non-symbolic user-kernel interface events handled by
508utrace may be probed:
29cb9b42 509.SAMPLE
dd078c96 510process(PID).begin
82f0e81b 511process("FULLPATH").begin
986e98de 512process.begin
dd078c96 513process(PID).thread.begin
82f0e81b 514process("FULLPATH").thread.begin
986e98de 515process.thread.begin
dd078c96 516process(PID).end
82f0e81b 517process("FULLPATH").end
986e98de 518process.end
dd078c96 519process(PID).thread.end
82f0e81b 520process("FULLPATH").thread.end
986e98de 521process.thread.end
29cb9b42 522process(PID).syscall
82f0e81b 523process("FULLPATH").syscall
986e98de 524process.syscall
29cb9b42 525process(PID).syscall.return
82f0e81b 526process("FULLPATH").syscall.return
986e98de 527process.syscall.return
0afb7073 528process(PID).insn
82f0e81b 529process("FULLPATH").insn
0afb7073 530process(PID).insn.block
82f0e81b 531process("FULLPATH").insn.block
29cb9b42
DS
532.ESAMPLE
533.PP
534A
dd078c96 535.B .begin
82f0e81b 536probe gets called when a new process described by PID or FULLPATH gets created.
29cb9b42 537A
dd078c96 538.B .thread.begin
82f0e81b 539probe gets called when a new thread described by PID or FULLPATH gets created.
159cb109 540A
dd078c96 541.B .end
82f0e81b 542probe gets called when a process described by PID or FULLPATH dies.
dd078c96
DS
543A
544.B .thread.end
82f0e81b 545probe gets called when a thread described by PID or FULLPATH dies.
29cb9b42
DS
546A
547.B .syscall
82f0e81b 548probe gets called when a thread described by PID or FULLPATH makes a
6270adc1
MH
549system call. The system call number is available in the
550.BR $syscall
551context variable, and the first 6 arguments of the system call
552are available in the
553.BR $argN
554(e.g. $arg1, $arg2, ...) context variables.
29cb9b42
DS
555A
556.B .syscall.return
82f0e81b 557probe gets called when a thread described by PID or FULLPATH returns from a
5d67b47c
MH
558system call. The system call number is available in the
559.BR $syscall
560context variable, and the return value of the system call is available
561in the
562.BR $return
29cb9b42 563context variable.
a96d1db0 564A
0afb7073 565.B .insn
82f0e81b 566probe gets called for every single-stepped instruction of the process described by PID or FULLPATH.
0afb7073
FCE
567A
568.B .insn.block
82f0e81b
FCE
569probe gets called for every block-stepped instruction of the process described by PID or FULLPATH.
570.PP
571If a process probe is specified without a PID or FULLPATH, all user
572threads will be probed. However, if systemtap was invoked with the
573.IR -c " or " -x
574options, then process probes are restricted to the process
575hierarchy associated with the target process.
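.PP
The following sketch traces one program with these probes. The path
"/bin/ls" is only an example target, and execname() and pid() are
assumed to come from the standard tapset:
.SAMPLE
probe process("/bin/ls").begin
{
  printf("%s started, pid %d\\n", execname(), pid())
}

probe process("/bin/ls").syscall
{
  printf("syscall %d, arg1=%x\\n", $syscall, $arg1)
}

probe process("/bin/ls").syscall.return
{
  printf("syscall %d returned %d\\n", $syscall, $return)
}

probe process("/bin/ls").end
{
  printf("%s exited\\n", execname())
}
.ESAMPLE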
0a1c696d
FCE
576
577.PP
578Third, symbolic static instrumentation compiled into programs and
579shared libraries may be
580probed:
581.SAMPLE
582process("PATH").mark("LABEL")
a794dbeb 583process("PATH").provider("PROVIDER").mark("LABEL")
0a1c696d
FCE
584.ESAMPLE
585.PP
f28a8c28
SC
586A
587.B .mark
588probe gets called via a static probe which is defined in the
a794dbeb
FCE
589application by STAP_PROBE1(PROVIDER,LABEL,arg1), which is defined in
590sdt.h. PROVIDER corresponds to the .provider argument, LABEL corresponds to
591the .mark argument, and arg1 is the argument. STAP_PROBE1 is used for
592probes with 1 argument, STAP_PROBE2 is used for probes with 2
593arguments, and so on. The arguments of the probe are available in the
594context variables $arg1, $arg2, ... An alternative to using the
595STAP_PROBE macros is to use the dtrace script to create custom macros.
596Additionally, the variables $$name and $$provider are available as
597parts of the probe point name.
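.PP
On the script side, such a marker might be consumed as sketched below.
The executable path and the mark name "request_start" are hypothetical;
the mark is assumed to pass at least one argument:
.SAMPLE
probe process("/usr/bin/hypothetical_app").mark("request_start")
{
  printf("%s:%s arg1=%x\\n", $$provider, $$name, $arg1)
}
.ESAMPLE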
0a1c696d 598
29cb9b42 599.PP
0a1c696d
FCE
600Finally, full symbolic source-level probes in user-space programs
601and shared libraries are supported. These are exactly analogous
602to the symbolic DWARF-based kernel/module probes described above,
603and expose similar contextual $-variables.
604.SAMPLE
605process("PATH").function("NAME")
606process("PATH").statement("*@FILE.c:123")
607process("PATH").function("*").return
608process("PATH").function("myfun").label("foo")
609.ESAMPLE
610
611.PP
612Note that for all process probes,
29cb9b42 613.I PATH
ea384b8c
FCE
614names refer to executables that are searched for the same way shells do: relative
615to the working directory if they contain a "/" character, otherwise in
616.BR $PATH .
153e7a22
FCE
617PATH may also refer to shared libraries, in which case all processes that
618map it at runtime would be selected for probing.
82f0e81b
FCE
619If the PATH string contains wildcards as in the MPATTERN case, then
620standard globbing is performed to find all matching paths. In this
621case, the
622.BR $PATH
623environment variable is not used.
624
625.PP
153e7a22
FCE
626If systemtap was invoked with the
627.IR \-c " or " \-x
760695db
FCE
628options, then process probes are restricted to the process
629hierarchy associated with the target process.
1ada6f08 630
9cb48751
DS
631.SS PROCFS
632
633These probe points allow procfs "files" in
c243f608
LB
634/proc/systemtap/MODNAME to be created, read and written using a
635permission that may be modified using the proper umask value. Default permissions are 0400 for read
636probes, and 0200 for write probes. If both a read and write probe are being
637used on the same file, a default permission of 0600 will be used.
638Using procfs.umask(0040).read would
639result in a 0404 permission set for the file.
9cb48751
DS
640.RI ( MODNAME
641is the name of the systemtap module). The
642.I proc
643filesystem is a pseudo-filesystem which is used as an interface to
c243f608 644kernel data structures. There are several probe point variants supported
9cb48751 645by the translator:
ca88561f 646
9cb48751
DS
647.SAMPLE
648procfs("PATH").read
c243f608 649procfs("PATH").umask(UMASK).read
38975255 650procfs("PATH").read.maxsize(MAXSIZE)
c243f608 651procfs("PATH").umask(UMASK).read.maxsize(MAXSIZE)
9cb48751 652procfs("PATH").write
c243f608 653procfs("PATH").umask(UMASK).write
9cb48751 654procfs.read
c243f608 655procfs.umask(UMASK).read
38975255 656procfs.read.maxsize(MAXSIZE)
c243f608 657procfs.umask(UMASK).read.maxsize(MAXSIZE)
9cb48751 658procfs.write
c243f608 659procfs.umask(UMASK).write
9cb48751 660.ESAMPLE
ca88561f 661
9cb48751
DS
662.I PATH
663is the file name (relative to /proc/systemtap/MODNAME) to be created.
664If no
665.I PATH
666is specified (as in the last six variants above),
667.I PATH
668defaults to "command".
669.PP
670When a user reads /proc/systemtap/MODNAME/PATH, the corresponding
671procfs
672.I read
673probe is triggered. The string data to be read should be assigned to
674a variable named
675.IR $value ,
676like this:
ca88561f 677
9cb48751
DS
678.SAMPLE
679procfs("PATH").read { $value = "100\\n" }
680.ESAMPLE
681.PP
682When a user writes into /proc/systemtap/MODNAME/PATH, the
683corresponding procfs
684.I write
685probe is triggered. The data the user wrote is available in the
686string variable named
687.IR $value ,
688like this:
ca88561f 689
9cb48751
DS
690.SAMPLE
691procfs("PATH").write { printf("user wrote: %s", $value) }
692.ESAMPLE
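.PP
Read and write probes on the same file can be paired to expose a
tunable, as in this sketch. The file name "limit" and the global of
the same name are illustrative, and strtol() is assumed to be the
standard tapset string-to-number function:
.SAMPLE
global limit = 100

# cat /proc/systemtap/MODNAME/limit
probe procfs("limit").read
{
  $value = sprintf("%d\\n", limit)
}

# echo 250 > /proc/systemtap/MODNAME/limit
probe procfs("limit").write
{
  limit = strtol($value, 10)
}
.ESAMPLE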
38975255
DS
693.PP
694.I MAXSIZE
695is the size of the procfs read buffer. Specifying
696.I MAXSIZE
697allows larger procfs output. If no
698.I MAXSIZE
699is specified, the procfs read buffer defaults to
700.I STP_PROCFS_BUFSIZE
701(which defaults to
702.IR MAXSTRINGLEN ,
703the maximum length of a string).
704If setting the procfs read buffers for more than one file is needed,
705it may be easiest to override the
706.I STP_PROCFS_BUFSIZE
707definition.
708Here's an example of using
709.IR MAXSIZE :
710
711.SAMPLE
712procfs.read.maxsize(1024) {
713 $value = "long string..."
714 $value .= "another long string..."
715 $value .= "another long string..."
716 $value .= "another long string..."
717}
718.ESAMPLE
9cb48751 719
6f05b6ab
FCE
720.SS MARKERS
721
722This family of probe points hooks up to static probing markers
723inserted into the kernel or modules. These markers are special macro
724calls inserted by kernel developers to make probing faster and more
725reliable than with DWARF-based probes. Further, DWARF debugging
726information is
727.I not
728required to probe markers.
729
730Marker probe points begin with
f781f849
DS
731.BR kernel .
732The next part names the marker itself:
6f05b6ab
FCE
733.BR mark("name") .
734The marker name string, which may contain the usual wildcard characters,
735is matched against the names given to the marker macros when the kernel
eb973c2a
DS
736and/or module was compiled. Optionally, you can specify
737.BR format("format") .
37f6433e 738Specifying the marker format string allows differentiation between two
eb973c2a 739markers with the same name but different marker format strings.
6f05b6ab
FCE
740
741The handler associated with a marker-based probe may read the
742optional parameters specified at the macro call site. These are
743named
744.BR $arg1 " through " $argNN ,
745where NN is the number of parameters supplied by the macro. Number
746and string parameters are passed in a type-safe manner.
747
eb973c2a
DS
748The marker format string associated with a marker is available in
749.BR $format .
37f6433e 750The marker name string is also available in
bc54e71c 751.BR $name .
eb973c2a 752
bc724b8b
JS
753.SS TRACEPOINTS
754
755This family of probe points hooks up to static probing tracepoints
756inserted into the kernel or modules. As with markers, these
757tracepoints are special macro calls inserted by kernel developers to
758make probing faster and more reliable than with DWARF-based probes,
759and DWARF debugging information is not required to probe tracepoints.
760Tracepoints have an extra advantage of more strongly-typed parameters
761than markers.
762
763Tracepoint probes begin with
764.BR kernel .
765The next part names the tracepoint itself:
766.BR trace("name") .
767The tracepoint name string, which may contain the usual wildcard
768characters, is matched against the names defined by the kernel
769developers in the tracepoint header files.
770
771The handler associated with a tracepoint-based probe may read the
772optional parameters specified at the macro call site. These are
773named according to the declaration by the tracepoint author. For
774example, the tracepoint probe
775.BR kernel.trace("sched_switch")
776provides the parameters
777.BR $rq ", " $prev ", and " $next .
778If the parameter is a complex type, as in a struct pointer, then a
779script can access fields with the same syntax as DWARF $target
780variables. Also, tracepoint parameters cannot be modified, but in
781guru-mode a script may modify fields of parameters.
782
783The name of the tracepoint is available in
784.BR $$name ,
785and a string of name=value pairs for all parameters of the tracepoint
786is available in
046e7190 787.BR $$vars " or " $$parms .
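.PP
A sketch of a tracepoint handler using the sched_switch parameters
named above. It assumes that $prev and $next point to task structures
with a "pid" field:
.SAMPLE
probe kernel.trace("sched_switch")
{
  printf("%s: pid %d to pid %d\\n", $$name, $prev\->pid, $next\->pid)
}
.ESAMPLE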
bc724b8b 788
dd225250
PS
789.SS HARDWARE BREAKPOINTS
790This family of probes is used to set hardware watchpoints for a given
791(global) kernel symbol. The probes take three components as inputs:
792
7931. The
794.B virtual address / name
795of the kernel symbol to be traced is supplied as argument to this class
796of probes. (Only data-segment variables can be probed; probing
797local variables of a function is not supported.)
798
7992. Nature of access to be probed:
800a.
801.I .write
802probe gets triggered when a write happens at the specified address/symbol
803name.
804b.
805.I .rw
806probe is triggered when either a read or write happens.
807
8083.
809.BR .length
810(optional)
811Users have the option of specifying the address interval to be probed
812using "length" constructs. The user-specified length gets approximated
813to the closest possible address length that the architecture can
814support. If the specified length exceeds the limits imposed by
815architecture, an error message is flagged and probe registration fails.
816Wherever 'length' is not specified, the translator requests a hardware
817breakpoint probe of length 1. It should be noted that the "length"
818construct is not valid with symbol names.
819
820The following constructs are supported:
821.SAMPLE
822probe kernel.data(ADDRESS).write
823probe kernel.data(ADDRESS).rw
824probe kernel.data(ADDRESS).length(LEN).write
825probe kernel.data(ADDRESS).length(LEN).rw
826probe kernel.data("SYMBOL_NAME").write
827probe kernel.data("SYMBOL_NAME").rw
828.ESAMPLE
829
830This set of probes makes use of the debug registers of the processor,
831which are a scarce resource (4 on x86, 1 on powerpc). The script
832translation flags a warning if a user requests more hardware breakpoint probes
833than the limits set by architecture. For example, a pass-2 warning is issued
834when an input script requests 5 hardware breakpoint probes on an x86
835system, while the x86 architecture supports a maximum of 4 breakpoints.
836Users are cautioned to set probes judiciously.
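.PP
As an illustration, the sketch below watches writes to the kernel
variable pid_max, the symbol used in the EXAMPLES section. The
functions execname() and pid() are assumed to come from the standard
tapset:
.SAMPLE
probe kernel.data("pid_max").write
{
  printf("pid_max written by %s (pid %d)\\n", execname(), pid())
}
.ESAMPLE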
837
ba4a90fd
FCE
838.SH EXAMPLES
839.PP
840Here are some example probe points, defining the associated events.
841.TP
842begin, end, end
843refers to the startup and normal shutdown of the session. In this
844case, the handler would run once during startup and twice during
845shutdown.
846.TP
847timer.jiffies(1000).randomize(200)
13d2ecdb 848refers to a periodic interrupt, every 1000 +/\- 200 jiffies.
ba4a90fd
FCE
849.TP
850kernel.function("*init*"), kernel.function("*exit*")
851refers to all kernel functions with "init" or "exit" in the name.
852.TP
853kernel.function("*@kernel/sched.c:240")
854refers to any functions within the "kernel/sched.c" file that span
855line 240.
856.TP
6f05b6ab
FCE
857kernel.mark("getuid")
858refers to an STAP_MARK(getuid, ...) macro call in the kernel.
859.TP
ba4a90fd
FCE
860module("usb*").function("*sync*").return
861refers to the moment of return from all functions with "sync" in the
862name in any of the USB drivers.
863.TP
864kernel.statement(0xc0044852)
865refers to the first byte of the statement whose compiled instructions
866include the given address in the kernel.
b4ceace2 867.TP
a5ae3f3d 868kernel.statement("*@kernel/sched.c:2917")
1bd128a3
SC
869refers to the statement of line 2917 within "kernel/sched.c".
870.TP
871kernel.statement("bio_init@fs/bio.c+3")
872refers to the statement at line bio_init+3 within "fs/bio.c".
a5ae3f3d 873.TP
dd225250
PS
874kernel.data("pid_max").write
875refers to a hardware breakpoint of type "write" set on pid_max.
876.TP
729286d8 877syscall.*.return
b4ceace2 878refers to the group of probe aliases with any name in the second position.
ba4a90fd 879
f33e9151
FCE
880.SS PERF
881
882This
883.IR prototype
884family of probe points interfaces to the kernel "perf event"
885infrastructure for controlling hardware performance counters.
886The events being attached to are described by the "type" and
887"config" fields of the
888.IR perf_event_attr
889structure, and are sampled at an interval governed by the
890"sample_period" field.
891
892These fields are made available to systemtap scripts using
893the following syntax:
894.SAMPLE
bb9fd173 895probe perf.type(NN).config(MM).sample(XX)
f33e9151
FCE
896probe perf.type(NN).config(MM)
897.ESAMPLE
898The range of valid type/config is described by the
899.IR perf_event_open (2)
900system call, and/or the
901.IR linux/perf_event.h
8fb91f5f
FCE
902file. Invalid combinations or exhausted hardware counter resources
903result in errors during systemtap script startup. Systemtap does
f33e9151
FCE
904not sanity-check the values: it merely passes them through to
905the kernel for error- and safety-checking.
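.PP
A sketch of a perf-based sampler. It assumes that type 0 and config 0
select the hardware CPU-cycle counter, as in linux/perf_event.h
(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES); the sample period and
duration are arbitrary:
.SAMPLE
global samples

# One sample per million counted events.
probe perf.type(0).config(0).sample(1000000)
{
  samples[execname()]++
}

probe timer.s(5) { exit() }

probe end
{
  foreach (n in samples\- limit 10)
    printf("%8d %s\\n", samples[n], n)
}
.ESAMPLE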
906
ba4a90fd 907.SH SEE ALSO
78db65bd 908.IR stap (1),
89965a32
FCE
909.IR probe::* (3stap),
910.IR tapset::* (3stap)