sourceware.org Git - systemtap.git/blob - stapprobes.3stap
Document new @var construct in NEWS, langref, beginners and stapprobes.
1 .\" t
2 .TH STAPPROBES 3stap
3 .SH NAME
4 stapprobes \- systemtap probe points
5
6 .\" macros
7 .de SAMPLE
8 .br
9 .RS
10 .nf
11 .nh
12 ..
13 .de ESAMPLE
14 .hy
15 .fi
16 .RE
17 ..
18
19 .SH DESCRIPTION
20 The following sections enumerate the variety of probe points supported
21 by the systemtap translator, and some of the additional aliases defined by
22 standard tapset scripts. Many are individually documented in the
23 .IR 3stap
24 manual section, with the
25 .IR probe::
26 prefix.
27 .PP
28 The general probe point syntax is a dotted-symbol sequence. This
29 allows a breakdown of the event namespace into parts, somewhat like
30 the Domain Name System does on the Internet. Each component
31 identifier may be parametrized by a string or number literal, with a
32 syntax like a function call. A component may include a "*" character,
33 to expand to a set of matching probe points. It may also include "**"
34 to match multiple sequential components at once. Probe aliases likewise
35 expand to other probe points. Each and every resulting probe point is
36 normally resolved to some low-level system instrumentation facility
37 (e.g., a kprobe address, marker, or a timer configuration), otherwise
38 the elaboration phase will fail.
39 .PP
40 However, a probe point may be followed by a "?" character, to indicate
41 that it is optional, and that no error should result if it fails to
42 resolve. Optionalness passes down through all levels of
43 alias/wildcard expansion. Alternately, a probe point may be followed
44 by a "!" character, to indicate that it is both optional and
45 sufficient. (Think vaguely of the Prolog cut operator.) If it does
46 resolve, then no further probe points in the same comma-separated list
47 will be resolved. Therefore, the "!" sufficiency mark only makes
48 sense in a list of probe point alternatives.
49 .PP
50 Additionally, a probe point may be followed by an "if (expr)" statement, in
51 order to enable/disable the probe point on-the-fly. With the "if" statement,
52 if the "expr" is false when the probe point is hit, the whole probe body,
53 including any alias's body, is skipped. The condition is stacked up through
54 all levels of alias/wildcard expansion, so the final condition becomes
55 the logical-and of the conditions of all expanded aliases/wildcards.
56
57 These are all
58 .B syntactically
59 valid probe points. (They are generally
60 .B semantically
61 invalid, depending on the contents of the tapsets, and the versions of
62 kernel/user software installed.)
63
64 .SAMPLE
65 kernel.function("foo").return
66 process("/bin/vi").statement(0x2222)
67 end
68 syscall.*
69 sys**open
70 kernel.function("no_such_function") ?
71 module("awol").function("no_such_function") !
72 signal.*? if (switch)
73 kprobe.function("foo")
74 .ESAMPLE
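.PP
As an illustration of the "?", "!" and "if" suffixes described above,
here is a minimal sketch. The kernel function names and the global
variable name are assumptions chosen for the example; whether they
resolve depends on the installed kernel and tapsets.
.SAMPLE
global tracing = 1

# Flip the condition every five seconds.
probe timer.s(5) { tracing = !tracing }

# "!" makes the first alternative sufficient: if it resolves, the
# second one is not resolved at all. "?" makes the second optional.
probe kernel.function("vfs_read") !,
      kernel.function("sys_read") ?
{
  printf("%s read\\n", execname())
}

# The handler is skipped whenever the condition is false.
probe syscall.open if (tracing)
{
  printf("%s open\\n", execname())
}
.ESAMPLE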
75
76 Probes may be broadly classified into "synchronous" and
77 "asynchronous". A "synchronous" event is deemed to occur when any
78 processor executes an instruction matched by the specification. This
79 gives these probes a reference point (instruction address) from which
80 more contextual data may be available. Other families of probe points
81 refer to "asynchronous" events such as timers/counters rolling over,
82 where there is no fixed reference point that is related. Each probe
83 point specification may match multiple locations (for example, using
84 wildcards or aliases), and all of them are then probed. A probe
85 declaration may also contain several comma-separated specifications,
86 all of which are probed.
87
88 .SH DWARF DEBUGINFO
89
90 Resolving some probe points requires DWARF debuginfo or "debug
91 symbols" for the specific part being instrumented. For some others,
92 DWARF is automatically synthesized on the fly from source code header
93 files. For others, it is not needed at all. Since a systemtap script
94 may use any mixture of probe points together, the union of their DWARF
95 requirements has to be met on the computer where script compilation
96 occurs. (See the \fI\-\-use\-server\fR option and the \fBstap-server\
97 (8)\fR man page for information about the remote compilation facility,
98 which allows these requirements to be met on a different machine.)
99 .PP
100 The following table lists many of the available probe point families,
101 to classify them with respect to their need for DWARF debuginfo.
102
103 .TS
104 l l l.
105 \fBDWARF	NON-DWARF\fP
106
107 kernel.function, .statement	kernel.mark
108 module.function, .statement	process.mark
109 process.function, .statement	begin, end, error, never
110 process.mark \fI(backup)\fP	timer
111 	perf
112 	procfs
113 \fBAUTO-DWARF\fP	kernel.statement.absolute
114 	kernel.data
115 kernel.trace	kprobe.function
116 	process.statement.absolute
117 	process.begin, .end, .error
118 .TE
119
120 .SH PROBE POINT FAMILIES
121
122 .SS BEGIN/END/ERROR
123
124 The probe points
125 .IR begin " and " end
126 are defined by the translator to refer to the time of session startup
127 and shutdown. All "begin" probe handlers are run, in some sequence,
128 during the startup of the session. All global variables will have
129 been initialized prior to this point. All "end" probes are run, in
130 some sequence, during the
131 .I normal
132 shutdown of a session, such as in the aftermath of an
133 .I exit ()
134 function call, or an interruption from the user. In the case of an
135 error-triggered shutdown, "end" probes are not run. There are no
136 target variables available in either context.
137 .PP
138 If the order of execution among "begin" or "end" probes is significant,
139 then an optional sequence number may be provided:
140
141 .SAMPLE
142 begin(N)
143 end(N)
144 .ESAMPLE
145
146 The number N may be positive or negative. The probe handlers are run in
147 increasing order, and the order between handlers with the same sequence
148 number is unspecified. When "begin" or "end" are given without a
149 sequence, they are effectively sequence zero.
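.PP
The ordering rule can be sketched as follows. This is only an
illustration; the printed strings are arbitrary.
.SAMPLE
probe begin(\-1) { println("first") }
probe begin     { println("second (sequence zero)") }
probe begin(1)  { println("third"); exit() }
probe end       { println("shutting down") }
.ESAMPLE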
150
151 The
152 .IR error
153 probe point is similar to the
154 .IR end
155 probe, except that each such probe handler is run only when the session ends
156 after errors have occurred. In such cases, "end" probes are skipped,
157 but each "error" probe is still attempted. This kind of probe can be
158 used to clean up or emit a "final gasp". It may also be numerically
159 parametrized to set a sequence.
160
161 .SS NEVER
162 The probe point
163 .IR never
164 is specially defined by the translator to mean "never". Its probe
165 handler is never run, though its statements are analyzed for symbol /
166 type correctness as usual. This probe point may be useful in
167 conjunction with optional probes.
168
169 .SS SYSCALL
170
171 The
172 .IR syscall.*
173 aliases define several hundred probes, too many to
174 summarize here. They take the form:
175
176 .SAMPLE
177 syscall.NAME
178 .br
179 syscall.NAME.return
180 .ESAMPLE
181
182 Generally, two probes are defined for each normal system call as listed in the
183 .IR syscalls(2)
184 manual page, one for entry and one for return. Those system calls that never
185 return do not have a corresponding
186 .IR .return
187 probe.
188 .PP
189 Each probe alias provides a variety of variables. Looking at the tapset source
190 code is the most reliable way to find them. Generally, each variable listed in the standard
191 manual page is made available as a script-level variable, so
192 .IR syscall.open
193 exposes
194 .IR filename ", " flags ", and " mode .
195 In addition, a standard suite of variables is available at most aliases:
196 .TP
197 .IR argstr
198 A pretty-printed form of the entire argument list, without parentheses.
199 .TP
200 .IR name
201 The name of the system call.
202 .TP
203 .IR retstr
204 For return probes, a pretty-printed form of the system-call result.
205 .PP
206 As usual for probe aliases, these variables are all simply initialized
207 once from the underlying $context variables, so that later changes to
208 $context variables are not automatically reflected. Not all probe
209 aliases obey all of these general guidelines. Please report any
210 bothersome ones you encounter as a bug.
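.PP
A minimal sketch using the standard variables listed above. It assumes
that the installed tapset provides the
.IR syscall.open
alias.
.SAMPLE
probe syscall.open {
  printf("%s(%s) by %s\\n", name, argstr, execname())
}
probe syscall.open.return {
  printf("%s = %s\\n", name, retstr)
}
.ESAMPLE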
211
212
213 .SS TIMERS
214
215 Intervals defined by the standard kernel "jiffies" timer may be used
216 to trigger probe handlers asynchronously. Two probe point variants
217 are supported by the translator:
218
219 .SAMPLE
220 timer.jiffies(N)
221 timer.jiffies(N).randomize(M)
222 .ESAMPLE
223
224 The probe handler is run every N jiffies (a kernel-defined unit of
225 time, typically between 1 and 60 ms). If the "randomize" component is
226 given, a linearly distributed random value in the range [\-M..+M] is
227 added to N every time the handler is run. N is restricted to a
228 reasonable range (1 to around a million), and M is restricted to be
229 smaller than N. There are no target variables provided in either
230 context. It is possible for such probes to be run concurrently on
231 a multi-processor computer.
232 .PP
233 Alternatively, intervals may be specified in units of time.
234 There are two probe point variants similar to the jiffies timer:
235
236 .SAMPLE
237 timer.ms(N)
238 timer.ms(N).randomize(M)
239 .ESAMPLE
240
241 Here, N and M are specified in milliseconds, but the full options for units
242 are seconds (s/sec), milliseconds (ms/msec), microseconds (us/usec),
243 nanoseconds (ns/nsec), and hertz (hz). Randomization is not supported for
244 hertz timers.
245
246 The actual resolution of the timers depends on the target kernel. For
247 kernels prior to 2.6.17, timers are limited to jiffies resolution, so
248 intervals are rounded up to the nearest jiffies interval. After 2.6.17,
249 the implementation uses hrtimers for tighter precision, though the actual
250 resolution will be arch-dependent. In either case, if the "randomize"
251 component is given, then the random value will be added to the interval
252 before any rounding occurs.
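.PP
For illustration, a sketch combining several of the timer variants
above. The interval values and the global variable name are arbitrary
choices for the example.
.SAMPLE
global ticks

# Fires roughly every 100 ms, jittered by up to 20 ms either way.
probe timer.ms(100).randomize(20) { ticks++ }

# Fires once per second and reports the count.
probe timer.s(1) {
  printf("%d ticks in the last second\\n", ticks)
  ticks = 0
}

# Fires twice per second; hertz timers take no randomize component.
probe timer.hz(2) { println("half-second mark") }
.ESAMPLE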
253 .PP
254 Profiling timers are also available to provide probes that execute on all
255 CPUs at the rate of the system tick (CONFIG_HZ).
256 This probe takes no parameters.
257
258 .SAMPLE
259 timer.profile
260 .ESAMPLE
261
262 Full context information of the interrupted process is available, making
263 this probe suitable for a time-based sampling profiler.
264
265 .SS DWARF
266
267 This family of probe points uses symbolic debugging information for
268 the target kernel/module/program, as may be found in unstripped
269 executables, or the separate
270 .I debuginfo
271 packages. They allow placement of probes logically into the execution
272 path of the target program, by specifying a set of points in the
273 source or object code. When a matching statement executes on any
274 processor, the probe handler is run in that context.
275 .PP
276 Probe points in a kernel are identified by
277 module, source file, line number, function name, or some
278 combination of these.
279 .PP
280 Here is a list of probe point families currently supported. The
281 .B .function
282 variant places a probe near the beginning of the named function, so that
283 parameters are available as context variables. The
284 .B .return
285 variant places a probe at the moment
286 .B after
287 the return from the named function, so the return value is available
288 as the "$return" context variable. The
289 .B .inline
290 modifier for
291 .B .function
292 filters the results to include only instances of inlined functions.
293 The
294 .B .call
295 modifier selects the opposite subset. The \fB.exported\fP modifier
296 filters the results to include only exported functions. Inline
297 functions do not have an identifiable return point, so
298 .B .return
299 is not supported on
300 .B .inline
301 probes. The
302 .B .statement
303 variant places a probe at the exact spot, exposing those local variables
304 that are visible there.
305
306 .SAMPLE
307 kernel.function(PATTERN)
308 .br
309 kernel.function(PATTERN).call
310 .br
311 kernel.function(PATTERN).return
312 .br
313 kernel.function(PATTERN).inline
314 .br
315 kernel.function(PATTERN).label(LPATTERN)
316 .br
317 module(MPATTERN).function(PATTERN)
318 .br
319 module(MPATTERN).function(PATTERN).call
320 .br
321 module(MPATTERN).function(PATTERN).return
322 .br
323 module(MPATTERN).function(PATTERN).inline
324 .br
325 module(MPATTERN).function(PATTERN).label(LPATTERN)
326 .br
327 .br
328 kernel.statement(PATTERN)
329 .br
330 kernel.statement(ADDRESS).absolute
331 .br
332 module(MPATTERN).statement(PATTERN)
333 .br
334 process("PATH").function("NAME")
335 .br
336 process("PATH").statement("*@FILE.c:123")
337 .br
338 process("PATH").library("PATH").function("NAME")
339 .br
340 process("PATH").library("PATH").statement("*@FILE.c:123")
341 .br
342 process("PATH").function("*").return
343 .br
344 process("PATH").function("myfun").label("foo")
345 .br
346 process(PID).statement(ADDRESS).absolute
347 .ESAMPLE
348
349 (See the USER-SPACE section below for more information on the process
350 probes.)
351
352 In the above list, MPATTERN stands for a string literal that aims to
353 identify the loaded kernel module of interest and LPATTERN stands for
354 a source program label. Both MPATTERN and LPATTERN may include the "*",
355 "[]", and "?" wildcards.
356 PATTERN stands for a string literal that
357 aims to identify a point in the program. It is made up of three
358 parts:
359 .IP \(bu 4
360 The first part is the name of a function, as would appear in the
361 .I nm
362 program's output. This part may use the "*" and "?" wildcarding
363 operators to match multiple names.
364 .IP \(bu 4
365 The second part is optional and begins with the "@" character.
366 It is followed by the path to the source file containing the function,
367 which may include a wildcard pattern, such as mm/slab*.
368 If it does not match as is, an implicit "*/" is optionally added
369 .I before
370 the pattern, so that a script need only name the last few components
371 of a possibly long source directory path.
372 .IP \(bu 4
373 Finally, the third part is optional if the file name part was given,
374 and identifies the line number in the source file preceded by a ":"
375 or a "+". The line number is assumed to be an
376 absolute line number if preceded by a ":", or relative to the entry of
377 the function if preceded by a "+".
378 All the lines in the function can be matched with ":*".
379 A range of lines x through y can be matched with ":x\-y".
380 .PP
381 As an alternative, PATTERN may be a numeric constant, indicating an
382 address. Such an address may be found from symbol tables of the
383 appropriate kernel / module object file. It is verified against
384 known statement code boundaries, and will be relocated for use at
385 run time.
386 .PP
387 In guru mode only, absolute kernel-space addresses may be specified with
388 the ".absolute" suffix. Such an address is considered already relocated,
389 as if it came from
390 .BR /proc/kallsyms ,
391 so it cannot be checked against statement/instruction boundaries.
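.PP
The PATTERN forms above can be sketched as follows. The function and
file names are assumptions about the target kernel source tree, and
each probe needs matching debuginfo to resolve.
.SAMPLE
# Every non-inlined function defined in a file matching fs/open.c.
probe kernel.function("*@fs/open.c").call {
  printf("%s\\n", pp())
}

# Return from one function; $return holds its result.
probe kernel.function("vfs_read").return {
  printf("vfs_read = %d\\n", $return)
}

# Every probeable line of one function.
probe kernel.statement("vfs_read@fs/read_write.c:*") {
  printf("%s\\n", pp())
}
.ESAMPLE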
392
393 .SS CONTEXT VARIABLES
394
395 .PP
396 Many of the source-level context variables, such as function parameters,
397 locals, globals visible in the compilation unit, may be visible to
398 probe handlers. They may refer to these variables by prefixing their
399 name with "$" within the scripts. In addition, a special syntax
400 allows limited traversal of structures, pointers, and arrays. More
401 syntax allows pretty-printing of individual variables or their groups.
402 See also
403 .BR @cast .
404
405 .TP
406 $var
407 refers to an in-scope variable "var". If it's an integer-like type,
408 it will be cast to a 64-bit int for systemtap script use. String-like
409 pointers (char *) may be copied to systemtap string values using the
410 .IR kernel_string " or " user_string
411 functions.
412 .TP
413 @var("varname")
414 an alternative syntax for
415 .IR $varname .
417 .TP
418 @var("varname@src/file.c")
419 refers to the global (either file local or external) variable
420 .IR varname
421 defined when the file
422 .IR src/file.c
423 was compiled. The CU in which the variable is resolved is the first CU
424 in the module of the probe point which matches the given file name at
425 the end and has the shortest file name path (e.g. given
426 .IR @var("foo@bar/baz.c")
427 and CUs with file name paths
428 .IR src/sub/module/bar/baz.c
429 and
430 .IR src/bar/baz.c
431 the second CU will be chosen to resolve the (file) global variable
432 .IR foo ).
434 .TP
435 $var\->field
436 traversal via a structure's or a pointer's field. This generalized
437 indirection operator may be repeated to follow more levels. Note that the
438 .IR .
439 operator is not used for plain structure
440 members, only
441 .IR \->
442 for both purposes. (This is because "." is reserved for string
443 concatenation.)
444 .TP
445 $return
446 is available in return probes only for functions that are declared
447 with a return value.
448 .TP
449 $var[N]
450 indexes into an array. The index may be given as a literal number or even
451 an arbitrary numeric expression.
452 .PP
453 A number of operators exist for such basic context variable expressions:
454 .TP
455 $$vars
456 expands to a character string that is equivalent to
457 .SAMPLE
458 sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x",
459 parm1, ..., parmN, var1, ..., varN)
460 .ESAMPLE
461 for each variable in scope at the probe point. Some values may be
462 printed as
463 .IR =?
464 if their run-time location cannot be found.
465 .TP
466 $$locals
467 expands to a subset of $$vars for only local variables.
468 .TP
469 $$parms
470 expands to a subset of $$vars for only function parameters.
471 .TP
472 $$return
473 is available in return probes only. It expands to a string that
474 is equivalent to sprintf("return=%x", $return)
475 if the probed function has a return value, or else an empty string.
476 .TP
477 & $EXPR
478 expands to the address of the given context variable expression, if it
479 is addressable.
480 .TP
481 @defined($EXPR)
482 expands to 1 if the given context variable expression is resolvable, or to 0 otherwise,
483 for use in conditionals such as
484 .SAMPLE
485 @defined($foo\->bar) ? $foo\->bar : 0
486 .ESAMPLE
487 .TP
488 $EXPR$
489 expands to a string with all of $EXPR's members, equivalent to
490 .SAMPLE
491 sprintf("{.a=%i, .b=%u, .c={...}, .d=[...]}",
492 $EXPR\->a, $EXPR\->b)
493 .ESAMPLE
494 .TP
495 $EXPR$$
496 expands to a string with all of $EXPR's members and submembers, equivalent to
497 .SAMPLE
498 sprintf("{.a=%i, .b=%u, .c={.x=%p, .y=%c}, .d=[%i, ...]}",
499 $EXPR\->a, $EXPR\->b, $EXPR\->c\->x, $EXPR\->c\->y, $EXPR\->d[0])
500 .ESAMPLE
501
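.PP
A brief sketch of these operators in use. It assumes the probed kernel
function has a parameter named "file" with an "f_flags" member, which
may not hold for every kernel version.
.SAMPLE
probe kernel.function("vfs_read") {
  # All parameters as one name=value string.
  printf("parms: %s\\n", $$parms)

  # Fall back to 0 if the member cannot be resolved.
  flags = @defined($file\->f_flags) ? $file\->f_flags : 0
  printf("f_flags=%x\\n", flags)

  # Pretty-print the members of the structure.
  printf("file=%s\\n", $file$)
}
.ESAMPLE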
502 .PP
503 For ".return" probes, context variables other than the "$return"
504 value itself are only available for the function call parameters.
505 The expressions evaluate to the
506 .IR entry-time
507 values of those variables, since that is when a snapshot is taken.
508 Other local variables are not generally accessible, since by the time
509 a ".return" probe hits, the probed function will have already returned.
510 .PP
511 Arbitrary entry-time expressions can also be saved for ".return"
512 probes using the
513 .IR @entry(expr)
514 operator. For example, one can compute the elapsed time of a function:
515 .SAMPLE
516 probe kernel.function("do_filp_open").return {
517 println( get_timeofday_us() \- @entry(get_timeofday_us()) )
518 }
519 .ESAMPLE
520
521
522 .SS DWARFLESS
523 In the absence of debugging information, entry and exit points of kernel and module
524 functions can be probed using the "kprobe" family of probes.
525 However, these do not permit looking up the arguments / local variables
526 of the function.
527 The following constructs are supported:
528 .SAMPLE
529 kprobe.function(FUNCTION)
530 kprobe.function(FUNCTION).return
531 kprobe.module(NAME).function(FUNCTION)
532 kprobe.module(NAME).function(FUNCTION).return
533 kprobe.statement(ADDRESS).absolute
534 .ESAMPLE
535 .PP
536 Probes of type
537 .B function
538 are recommended for kernel functions, whereas probes of type
539 .B module
540 are recommended for probing functions of the specified module.
541 In case the absolute address of a kernel or module function is known,
542 .B statement
543 probes can be utilized.
544 .PP
545 Note that
546 .I FUNCTION
547 and
548 .I MODULE
549 names
550 .B must not
551 contain wildcards, or the probe will not be registered.
552 Also, statement probes may be run only in guru mode.
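.PP
A minimal sketch of an entry/return pair that needs no debuginfo. The
function name is an assumption about the target kernel.
.SAMPLE
probe kprobe.function("vfs_read") {
  printf("enter %s in %s\\n", pp(), execname())
}
probe kprobe.function("vfs_read").return {
  printf("leave %s\\n", pp())
}
.ESAMPLE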
553
554
555 .SS USER-SPACE
556 Support for user-space probing is available for kernels
557 that are configured with the utrace extensions. See
558 .SAMPLE
559 http://people.redhat.com/roland/utrace/
560 .ESAMPLE
561 .PP
562 There are several forms. First, a non-symbolic probe point:
563 .SAMPLE
564 process(PID).statement(ADDRESS).absolute
565 .ESAMPLE
566 is analogous to
567 .IR kernel.statement(ADDRESS).absolute
569 in that both use raw (unverified) virtual addresses and provide
570 no $variables. The target PID parameter must identify a running
571 process, and ADDRESS should identify a valid instruction address.
572 All threads of that process will be probed.
573 .PP
574 Second, non-symbolic user-kernel interface events handled by
575 utrace may be probed:
576 .SAMPLE
577 process(PID).begin
578 process("FULLPATH").begin
579 process.begin
580 process(PID).thread.begin
581 process("FULLPATH").thread.begin
582 process.thread.begin
583 process(PID).end
584 process("FULLPATH").end
585 process.end
586 process(PID).thread.end
587 process("FULLPATH").thread.end
588 process.thread.end
589 process(PID).syscall
590 process("FULLPATH").syscall
591 process.syscall
592 process(PID).syscall.return
593 process("FULLPATH").syscall.return
594 process.syscall.return
595 process(PID).insn
596 process("FULLPATH").insn
597 process(PID).insn.block
598 process("FULLPATH").insn.block
599 .ESAMPLE
600 .PP
601 A
602 .B .begin
603 probe gets called when a new process described by PID or FULLPATH gets created.
604 A
605 .B .thread.begin
606 probe gets called when a new thread described by PID or FULLPATH gets created.
607 A
608 .B .end
609 probe gets called when a process described by PID or FULLPATH dies.
610 A
611 .B .thread.end
612 probe gets called when a thread described by PID or FULLPATH dies.
613 A
614 .B .syscall
615 probe gets called when a thread described by PID or FULLPATH makes a
616 system call. The system call number is available in the
617 .BR $syscall
618 context variable, and the first 6 arguments of the system call
619 are available in the
620 .BR $argN
621 (e.g. $arg1, $arg2, ...) context variables.
622 A
623 .B .syscall.return
624 probe gets called when a thread described by PID or FULLPATH returns from a
625 system call. The system call number is available in the
626 .BR $syscall
627 context variable, and the return value of the system call is available
628 in the
629 .BR $return
630 context variable.
631 A
632 .B .insn
633 probe gets called for every single-stepped instruction of the process described by PID or FULLPATH.
634 A
635 .B .insn.block
636 probe gets called for every block-stepped instruction of the process described by PID or FULLPATH.
637 .PP
638 If a process probe is specified without a PID or FULLPATH, all user
639 threads will be probed. However, if systemtap was invoked with the
640 .IR \-c " or " \-x
641 options, then process probes are restricted to the process
642 hierarchy associated with the target process. If a process probe is
643 specified without a PID or FULLPATH, but with the
644 .IR \-c
645 option, the PATH of the
646 .IR \-c
647 cmd will be heuristically filled into the process PATH.
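.PP
As a sketch of the utrace-based events above, the following script
traces the system calls of one program. The path "/bin/ls" is only an
example target.
.SAMPLE
probe process("/bin/ls").begin {
  printf("pid %d started\\n", pid())
}
probe process("/bin/ls").syscall {
  printf("syscall %d arg1=0x%x\\n", $syscall, $arg1)
}
probe process("/bin/ls").syscall.return {
  printf("syscall %d returned %d\\n", $syscall, $return)
}
probe process("/bin/ls").end {
  printf("pid %d exited\\n", pid())
}
.ESAMPLE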
648
649 .PP
650 Third, symbolic static instrumentation compiled into programs and
651 shared libraries may be
652 probed:
653 .SAMPLE
654 process("PATH").mark("LABEL")
655 process("PATH").provider("PROVIDER").mark("LABEL")
656 .ESAMPLE
657 .PP
658 A
659 .B .mark
660 probe gets called via a static probe which is defined in the
661 application by STAP_PROBE1(PROVIDER,LABEL,arg1), which is defined in
662 sdt.h. PROVIDER is an application handle, LABEL corresponds to
663 the .mark argument, and arg1 is the argument. STAP_PROBE1 is used for
664 probes with 1 argument, STAP_PROBE2 is used for probes with 2
665 arguments, and so on. The arguments of the probe are available in the
666 context variables $arg1, $arg2, ... An alternative to using the
667 STAP_PROBE macros is to use the dtrace script to create custom macros.
668 Additionally, the variables $$name and $$provider are available as
669 parts of the probe point name.
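.PP
On the script side, such a static probe might be consumed as sketched
here. The program path, provider name and mark name are hypothetical,
and the sketch assumes the application declares a one-argument probe
with an integer argument.
.SAMPLE
probe process("/usr/bin/myapp").provider("myapp").mark("request_start") {
  printf("%s:%s arg1=%d\\n", $$provider, $$name, $arg1)
}
.ESAMPLE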
670
671 .PP
672 Finally, full symbolic source-level probes in user-space programs
673 and shared libraries are supported. These are exactly analogous
674 to the symbolic DWARF-based kernel/module probes described above,
675 and expose similar contextual $variables.
676 .SAMPLE
677 process("PATH").function("NAME")
678 process("PATH").statement("*@FILE.c:123")
679 process("PATH").plt("NAME")
680 process("PATH").library("PATH").plt("NAME")
681 process("PATH").library("PATH").function("NAME")
682 process("PATH").library("PATH").statement("*@FILE.c:123")
683 process("PATH").function("*").return
684 process("PATH").function("myfun").label("foo")
685 .ESAMPLE
686
687 .PP
688 Note that for all process probes,
689 .I PATH
690 names refer to executables that are searched for the same way shells do: relative
691 to the working directory if they contain a "/" character, otherwise in
692 .BR $PATH .
693 If PATH names refer to scripts, the actual interpreters (specified in the
694 script in the first line after the #! characters) are probed.
695 If PATH is a process component parameter referring to shared libraries
696 then all processes that map it at runtime would be selected for
697 probing. If PATH is a library component parameter referring to shared
698 libraries then the process specified by the process component would be
699 selected. A .plt probe will probe functions in the program linkage table
700 corresponding to the rest of the probe point. .plt can be specified
701 as a shorthand for .plt("*").
702 If the PATH string contains wildcards as in the MPATTERN case, then
703 standard globbing is performed to find all matching paths. In this
704 case, the
705 .BR $PATH
706 environment variable is not used.
707
708 .PP
709 If systemtap was invoked with the
710 .IR \-c " or " \-x
711 options, then process probes are restricted to the process
712 hierarchy associated with the target process.
713
714 .SS PROCFS
715
716 These probe points allow procfs "files" in
717 /proc/systemtap/MODNAME to be created, read and written
718 .RI ( MODNAME
719 is the name of the systemtap module). The
720 .I proc
721 filesystem is a pseudo-filesystem which is used as an interface to
722 kernel data structures. Default permissions are 0400 for read
723 probes, and 0200 for write probes. If both a read and write probe are being
724 used on the same file, a default permission of 0600 will be used.
725 The permissions may be modified by supplying a umask value; for example,
726 procfs.umask(0040).read would
727 result in a 0404 permission set for the file.
728 There are several probe point variants supported by the translator:
729
730 .SAMPLE
731 procfs("PATH").read
732 procfs("PATH").umask(UMASK).read
733 procfs("PATH").read.maxsize(MAXSIZE)
734 procfs("PATH").umask(UMASK).read.maxsize(MAXSIZE)
735 procfs("PATH").write
736 procfs("PATH").umask(UMASK).write
737 procfs.read
738 procfs.umask(UMASK).read
739 procfs.read.maxsize(MAXSIZE)
740 procfs.umask(UMASK).read.maxsize(MAXSIZE)
741 procfs.write
742 procfs.umask(UMASK).write
743 .ESAMPLE
744
745 .I PATH
746 is the file name (relative to /proc/systemtap/MODNAME) to be created.
747 If no
748 .I PATH
749 is specified (as in the last six variants above),
750 .I PATH
751 defaults to "command".
752 .PP
753 When a user reads /proc/systemtap/MODNAME/PATH, the corresponding
754 procfs
755 .I read
756 probe is triggered. The string data to be read should be assigned to
757 a variable named
758 .IR $value ,
759 like this:
760
761 .SAMPLE
762 probe procfs("PATH").read { $value = "100\\n" }
763 .ESAMPLE
764 .PP
765 When a user writes into /proc/systemtap/MODNAME/PATH, the
766 corresponding procfs
767 .I write
768 probe is triggered. The data the user wrote is available in the
769 string variable named
770 .IR $value ,
771 like this:
772
773 .SAMPLE
774 probe procfs("PATH").write { printf("user wrote: %s", $value) }
775 .ESAMPLE
776 .PP
777 .I MAXSIZE
778 is the size of the procfs read buffer. Specifying
779 .I MAXSIZE
780 allows larger procfs output. If no
781 .I MAXSIZE
782 is specified, the procfs read buffer defaults to
783 .I STP_PROCFS_BUFSIZE
784 (which defaults to
785 .IR MAXSTRINGLEN ,
786 the maximum length of a string).
787 If setting the procfs read buffers for more than one file is needed,
788 it may be easiest to override the
789 .I STP_PROCFS_BUFSIZE
790 definition.
791 Here's an example of using
792 .IR MAXSIZE :
793
794 .SAMPLE
795 probe procfs.read.maxsize(1024) {
796 $value = "long string..."
797 $value .= "another long string..."
798 $value .= "another long string..."
799 $value .= "another long string..."
800 }
801 .ESAMPLE
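.PP
Read and write probes on the same file can be combined to expose a
tunable value, as in this sketch. The file name "threshold" and the
global variable are hypothetical, and the conversion uses the tapset
function strtol.
.SAMPLE
global threshold = 100

# cat /proc/systemtap/MODNAME/threshold
probe procfs("threshold").read {
  $value = sprintf("%d\\n", threshold)
}

# echo 250 > /proc/systemtap/MODNAME/threshold
probe procfs("threshold").write {
  threshold = strtol($value, 10)
}
.ESAMPLE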
802
803 .SS MARKERS
804
805 This family of probe points hooks up to static probing markers
806 inserted into the kernel or modules. These markers are special macro
807 calls inserted by kernel developers to make probing faster and more
808 reliable than with DWARF-based probes. Further, DWARF debugging
809 information is
810 .I not
811 required to probe markers.
812
813 Marker probe points begin with
814 .BR kernel .
815 The next part names the marker itself:
816 .BR mark("name") .
817 The marker name string, which may contain the usual wildcard characters,
818 is matched against the names given to the marker macros when the kernel
819 and/or module was compiled. Optionally, you can specify
820 .BR format("format") .
821 Specifying the marker format string allows differentiation between two
822 markers with the same name but different marker format strings.
823
824 The handler associated with a marker-based probe may read the
825 optional parameters specified at the macro call site. These are
826 named
827 .BR $arg1 " through " $argNN ,
828 where NN is the number of parameters supplied by the macro. Number
829 and string parameters are passed in a type-safe manner.
830
831 The marker format string associated with a marker is available in
832 .BR $format .
833 The marker name string is also available in
834 .BR $name .
835
836 .SS TRACEPOINTS
837
838 This family of probe points hooks up to static probing tracepoints
839 inserted into the kernel or modules. As with markers, these
840 tracepoints are special macro calls inserted by kernel developers to
841 make probing faster and more reliable than with DWARF-based probes,
842 and DWARF debugging information is not required to probe tracepoints.
843 Tracepoints have an extra advantage of more strongly-typed parameters
844 than markers.
845
846 Tracepoint probes begin with
847 .BR kernel .
848 The next part names the tracepoint itself:
849 .BR trace("name") .
850 The tracepoint name string, which may contain the usual wildcard
851 characters, is matched against the names defined by the kernel
852 developers in the tracepoint header files.
853
854 The handler associated with a tracepoint-based probe may read the
855 optional parameters specified at the macro call site. These are
856 named according to the declaration by the tracepoint author. For
857 example, the tracepoint probe
858 .BR kernel.trace("sched_switch")
859 provides the parameters
860 .BR $rq ", " $prev ", and " $next .
861 If the parameter is a complex type, as in a struct pointer, then a
862 script can access fields with the same syntax as DWARF $target
863 variables. Also, tracepoint parameters cannot be modified, but in
864 guru-mode a script may modify fields of parameters.
865
866 The name of the tracepoint is available in
867 .BR $$name ,
868 and a string of name=value pairs for all parameters of the tracepoint
869 is available in
870 .BR $$vars " or " $$parms .
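.PP
The following sketch illustrates these context variables on the
"sched_switch" tracepoint mentioned above. The "pid" field of the
.BR $next
parameter is an assumption about the kernel's task structure and may
differ between kernel versions.
.SAMPLE
probe kernel.trace("sched_switch") {
  # $$name is the tracepoint name, $$parms the name=value parameter list
  printf("%s: %s\en", $$name, $$parms)
  # Field access uses the same syntax as DWARF $target variables
  # (assumption: the structure pointed to by $next has a "pid" field)
  printf("switching to pid %d\en", $next\->pid)
}
.ESAMPLE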
871
872 .SS HARDWARE BREAKPOINTS
873 This family of probes is used to set hardware watchpoints for a given
874 (global) kernel symbol. The probes take three components as inputs:
875
876 1. The
877 .B "virtual address / name"
878 of the kernel symbol to be traced is supplied as the argument to this class
879 of probes. (Only data segment variables are supported. Local variables
880 of a function cannot be probed.)
881
882 2. Nature of access to be probed:
883 a.
884 .I .write
885 probe is triggered when a write happens at the specified address/symbol
886 name.
887 b.
888 .I .rw
889 probe is triggered when either a read or write happens.
890
891 3.
892 .BR .length
893 (optional)
894 The address interval to be probed may be specified with the
895 "length" construct. The user-specified length is approximated
896 to the closest address length that the architecture can
897 support. If the specified length exceeds the limits imposed by the
898 architecture, an error is reported and probe registration fails.
899 If "length" is not specified, the translator requests a hardware
900 breakpoint probe of length 1. Note that the "length"
901 construct is not valid with symbol names.
902
903 The following constructs are supported:
904 .SAMPLE
905 probe kernel.data(ADDRESS).write
906 probe kernel.data(ADDRESS).rw
907 probe kernel.data(ADDRESS).length(LEN).write
908 probe kernel.data(ADDRESS).length(LEN).rw
909 probe kernel.data("SYMBOL_NAME").write
910 probe kernel.data("SYMBOL_NAME").rw
911 .ESAMPLE
912
913 This set of probes makes use of the debug registers of the processor,
914 which are a scarce resource (4 on x86, 1 on powerpc). The script
915 translator issues a warning if a user requests more hardware breakpoint probes
916 than the architecture supports. For example, a pass-2 warning is issued
917 when an input script requests 5 hardware breakpoint probes on an x86
918 system, since the x86 architecture supports a maximum of 4 breakpoints.
919 Users are cautioned to set probes judiciously.
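.PP
As an illustrative sketch, a single write watchpoint on the "pid_max"
symbol (the same symbol used in the EXAMPLES section) could report
which process modified it. The handler body is an assumption; it uses
the standard tapset functions execname() and pid().
.SAMPLE
# Uses one debug register; fires on each write to pid_max.
probe kernel.data("pid_max").write {
  printf("pid_max written by %s (pid %d)\en", execname(), pid())
}
.ESAMPLE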
920
921 .SH EXAMPLES
922 .PP
923 Here are some example probe points, defining the associated events.
924 .TP
925 begin, end, end
926 refers to the startup and normal shutdown of the session. In this
927 case, the handler would run once during startup and twice during
928 shutdown.
929 .TP
930 timer.jiffies(1000).randomize(200)
931 refers to a periodic interrupt, every 1000 +/\- 200 jiffies.
932 .TP
933 kernel.function("*init*"), kernel.function("*exit*")
934 refers to all kernel functions with "init" or "exit" in the name.
935 .TP
936 kernel.function("*@kernel/time.c:240")
937 refers to any functions within the "kernel/time.c" file that span
938 line 240.
939 .B
940 Note
941 that this is
942 .BR not
943 a probe at the statement at that line number. Use the
944 .I
945 kernel.statement
946 probe instead.
947 .TP
948 kernel.mark("getuid")
949 refers to an STAP_MARK(getuid, ...) macro call in the kernel.
950 .TP
951 module("usb*").function("*sync*").return
952 refers to the moment of return from all functions with "sync" in the
953 name in any of the USB drivers.
954 .TP
955 kernel.statement(0xc0044852)
956 refers to the first byte of the statement whose compiled instructions
957 include the given address in the kernel.
958 .TP
959 kernel.statement("*@kernel/time.c:296")
960 refers to the statement of line 296 within "kernel/time.c".
961 .TP
962 kernel.statement("bio_init@fs/bio.c+3")
963 refers to the statement at line bio_init+3 within "fs/bio.c".
964 .TP
965 kernel.data("pid_max").write
966 refers to a hardware breakpoint of type "write" set on pid_max.
967 .TP
968 syscall.*.return
969 refers to the group of probe aliases with any name in the second position.
970
971 .SS PERF
972
973 This
974 .IR prototype
975 family of probe points interfaces to the kernel "perf event"
976 infrastructure for controlling hardware performance counters.
977 The events being attached to are described by the "type" and
978 "config" fields of the
979 .IR perf_event_attr
980 structure, and are sampled at an interval governed by the
981 "sample_period" field.
982
983 These fields are made available to systemtap scripts using
984 the following syntax:
985 .SAMPLE
986 probe perf.type(NN).config(MM).sample(XX)
987 probe perf.type(NN).config(MM)
988 .ESAMPLE
989 The systemtap probe handler is called once per XX increments
990 of the underlying performance counter. The default sampling
991 count is 1000000.
992 The range of valid type/config is described by the
993 .IR perf_event_open (2)
994 system call, and/or the
995 .IR linux/perf_event.h
996 file. Invalid combinations or exhausted hardware counter resources
997 result in errors during systemtap script startup. Systemtap does
998 not sanity-check the values: it merely passes them through to
999 the kernel for error- and safety-checking.
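.PP
As a hedged sketch, the script below samples CPU cycles and counts
samples per process name. It assumes type 0 and config 0, which
correspond to PERF_TYPE_HARDWARE and PERF_COUNT_HW_CPU_CYCLES in
.IR linux/perf_event.h ,
and it assumes the hardware exposes that counter. The "samples" array
is a name chosen for illustration.
.SAMPLE
global samples

# Called once per 1000000 increments of the cycle counter.
probe perf.type(0).config(0).sample(1000000) {
  samples[execname()]++
}

# At shutdown, print process names sorted by descending sample count.
probe end {
  foreach (name in samples\-)
    printf("%s %d\en", name, samples[name])
}
.ESAMPLE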
1000
1001 .SH SEE ALSO
1002 .IR stap (1),
1003 .IR probe::* (3stap),
1004 .IR tapset::* (3stap)