Commit | Line | Data |
---|---|---|
ba4a90fd | 1 | .\" -*- nroff -*- |
e97c0b29 | 2 | .TH STAPPROBES 3stap @DATE@ "Red Hat" |
ba4a90fd FCE |
3 | .SH NAME |
4 | stapprobes \- systemtap probe points | |
5 | ||
6 | .\" macros | |
7 | .de SAMPLE | |
8 | .br | |
9 | .RS | |
10 | .nf | |
11 | .nh | |
12 | .. | |
13 | .de ESAMPLE | |
14 | .hy | |
15 | .fi | |
16 | .RE | |
17 | .. | |
18 | ||
19 | .SH DESCRIPTION | |
20 | The following sections enumerate the variety of probe points supported | |
21 | by the systemtap translator, and additional aliases defined by | |
22 | standard tapset scripts. | |
23 | .PP | |
7abecb38 | 24 | The general probe point syntax is a dotted-symbol sequence. This |
ba4a90fd FCE |
25 | allows a breakdown of the event namespace into parts, somewhat like |
26 | the Domain Name System does on the Internet. Each component | |
7abecb38 | 27 | identifier may be parametrized by a string or number literal, with a |
d898100a FCE |
28 | syntax like a function call. A component may include a "*" character, |
29 | to expand to a set of matching probe points. Probe aliases likewise | |
30 | expand to other probe points. Each and every resulting probe point is | |
31 | normally resolved to some low-level system instrumentation facility | |
32 | (e.g., a kprobe address, marker, or a timer configuration), otherwise | |
33 | the elaboration phase will fail. | |
34 | .PP | |
35 | However, a probe point may be followed by a "?" character, to indicate | |
36 | that it is optional, and that no error should result if it fails to | |
37 | resolve. Optionalness passes down through all levels of | |
38 | alias/wildcard expansion. Alternately, a probe point may be followed | |
39 | by a "!" character, to indicate that it is both optional and | |
37f6433e | 40 | sufficient. (Think vaguely of the Prolog cut operator.) If it does |
d898100a FCE |
41 | resolve, then no further probe points in the same comma-separated list |
42 | will be resolved. Therefore, the "!" sufficiency mark only makes | |
43 | sense in a list of probe point alternatives. | |
dfd11cc3 MH |
44 | .PP |
45 | Additionally, a probe point may be followed by an "if (expr)" statement, in | |
46 | order to enable/disable the probe point on-the-fly. With the "if" statement, | |
47 | if the "expr" is false when the probe point is hit, the whole probe body, | |
48 | including the bodies of any aliases, is skipped. The condition is stacked up through | |
49 | all levels of alias/wildcard expansion, so the final condition becomes | |
50 | the logical-and of the conditions of all expanded aliases/wildcards. | |
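.PP
As a rough sketch of how these suffixes combine in a script (the function
names "new_name" and "old_name" and the global "enabled" are hypothetical
placeholders, not real kernel symbols):
.SAMPLE
global enabled = 1

# "!" stops at the first alternative that resolves; "?" tolerates failure.
probe kernel.function("new_name") !, kernel.function("old_name") ?
{
  printf("%s\\n", pp())
}

# The handler body is skipped whenever "enabled" is zero.
probe timer.s(1) if (enabled)
{
  printf("tick\\n")
}
.ESAMPLE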
6e3347a9 | 51 | |
e904ad95 FCE |
52 | These are all |
53 | .B syntactically | |
54 | valid probe points. (They may nevertheless be | |
55 | .B semantically | |
56 | invalid, depending on the contents of the tapsets, and the versions of | |
57 | kernel/user software installed.) | |
ca88561f | 58 | |
ba4a90fd FCE |
59 | .SAMPLE |
60 | kernel.function("foo").return | |
e904ad95 | 61 | process("/bin/vi").statement(0x2222) |
ba4a90fd | 62 | end |
729286d8 | 63 | syscall.* |
6e3347a9 | 64 | kernel.function("no_such_function") ? |
d898100a | 65 | module("awol").function("no_such_function") ! |
dfd11cc3 | 66 | signal.*? if (switch) |
94c3c803 | 67 | kprobe.function("foo") |
ba4a90fd FCE |
68 | .ESAMPLE |
69 | ||
e904ad95 | 70 | |
6f05b6ab FCE |
71 | Probes may be broadly classified into "synchronous" and |
72 | "asynchronous". A "synchronous" event is deemed to occur when any | |
73 | processor executes an instruction matched by the specification. This | |
74 | gives these probes a reference point (instruction address) from which | |
75 | more contextual data may be available. Other families of probe points | |
76 | refer to "asynchronous" events such as timers/counters rolling over, | |
77 | where there is no fixed reference point that is related. Each probe | |
78 | point specification may match multiple locations (for example, using | |
79 | wildcards or aliases), and all of them are then probed. A probe | |
80 | declaration may also contain several comma-separated specifications, | |
81 | all of which are probed. | |
82 | ||
65aeaea0 | 83 | .SS BEGIN/END/ERROR |
ba4a90fd FCE |
84 | |
85 | The probe points | |
86 | .IR begin " and " end | |
87 | are defined by the translator to refer to the time of session startup | |
88 | and shutdown. All "begin" probe handlers are run, in some sequence, | |
89 | during the startup of the session. All global variables will have | |
90 | been initialized prior to this point. All "end" probes are run, in | |
91 | some sequence, during the | |
92 | .I normal | |
93 | shutdown of a session, such as in the aftermath of an | |
94 | .I exit () | |
95 | function call, or an interruption from the user. In the case of an | |
96 | error-triggered shutdown, "end" probes are not run. There are no | |
97 | target variables available in either context. | |
6a256b03 JS |
98 | .PP |
99 | If the order of execution among "begin" or "end" probes is significant, | |
100 | then an optional sequence number may be provided: | |
ca88561f | 101 | |
6a256b03 JS |
102 | .SAMPLE |
103 | begin(N) | |
104 | end(N) | |
105 | .ESAMPLE | |
ca88561f | 106 | |
6a256b03 JS |
107 | The number N may be positive or negative. The probe handlers are run in |
108 | increasing order, and the order between handlers with the same sequence | |
109 | number is unspecified. When "begin" or "end" are given without a | |
110 | sequence, they are effectively sequence zero. | |
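.PP
For instance, the following sketch (the message texts are arbitrary) should
print its three startup lines in sequence-number order:
.SAMPLE
probe begin(\-1) { printf("first\\n") }
probe begin     { printf("second, sequence zero\\n") }
probe begin(1)  { printf("third\\n"); exit() }
probe end       { printf("normal shutdown\\n") }
.ESAMPLE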
ba4a90fd | 111 | |
65aeaea0 FCE |
112 | The |
113 | .IR error | |
114 | probe point is similar to the | |
115 | .IR end | |
d898100a FCE |
116 | probe, except that each such probe handler is run when the session ends |
117 | after errors have occurred. In such cases, "end" probes are skipped, | |
37f6433e | 118 | but each "error" probe is still attempted. This kind of probe can be |
d898100a FCE |
119 | used to clean up or emit a "final gasp". It may also be numerically |
120 | parametrized to set a sequence. | |
65aeaea0 | 121 | |
6e3347a9 FCE |
122 | .SS NEVER |
123 | The probe point | |
124 | .IR never | |
125 | is specially defined by the translator to mean "never". Its probe | |
126 | handler is never run, though its statements are analyzed for symbol / | |
127 | type correctness as usual. This probe point may be useful in | |
128 | conjunction with optional probes. | |
129 | ||
1027502b FCE |
130 | .SS SYSCALL |
131 | ||
132 | The | |
133 | .IR syscall.* | |
134 | aliases define several hundred probes, too many to | |
135 | summarize here. They are: | |
136 | ||
137 | .SAMPLE | |
138 | syscall.NAME | |
139 | .br | |
140 | syscall.NAME.return | |
141 | .ESAMPLE | |
142 | ||
143 | Generally, two probes are defined for each normal system call as listed in the | |
144 | .IR syscalls(2) | |
145 | manual page, one for entry and one for return. Those system calls that never | |
146 | return do not have a corresponding | |
147 | .IR .return | |
148 | probe. | |
149 | .PP | |
150 | Each probe alias defines a variety of variables. Looking at the tapset source | |
151 | code is the most reliable way to find them. Generally, each variable listed in the standard | |
152 | manual page is made available as a script-level variable, so | |
153 | .IR syscall.open | |
154 | exposes | |
155 | .IR filename ", " flags ", and " mode . | |
156 | In addition, a standard suite of variables is available at most aliases: | |
157 | .TP | |
158 | .IR argstr | |
159 | A pretty-printed form of the entire argument list, without parentheses. | |
160 | .TP | |
161 | .IR name | |
162 | The name of the system call. | |
163 | .TP | |
164 | .IR retstr | |
165 | For return probes, a pretty-printed form of the system-call result. | |
166 | .PP | |
167 | Not all probe aliases obey all of these general guidelines. Please report | |
168 | any bothersome ones you encounter as a bug. | |
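.PP
A minimal sketch of these variables in use, assuming the
.IR syscall.open
alias resolves on the target kernel:
.SAMPLE
probe syscall.open
{
  printf("%s(%s)\\n", name, argstr)
}
probe syscall.open.return
{
  printf("%s = %s\\n", name, retstr)
}
.ESAMPLE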
169 | ||
170 | ||
ba4a90fd FCE |
171 | .SS TIMERS |
172 | ||
173 | Intervals defined by the standard kernel "jiffies" timer may be used | |
174 | to trigger probe handlers asynchronously. Two probe point variants | |
175 | are supported by the translator: | |
ca88561f | 176 | |
ba4a90fd FCE |
177 | .SAMPLE |
178 | timer.jiffies(N) | |
179 | timer.jiffies(N).randomize(M) | |
180 | .ESAMPLE | |
ca88561f | 181 | |
ba4a90fd FCE |
182 | The probe handler is run every N jiffies (a kernel-defined unit of |
183 | time, typically between 1 and 60 ms). If the "randomize" component is | |
13d2ecdb | 184 | given, a linearly distributed random value in the range [\-M..+M] is |
ba4a90fd FCE |
185 | added to N every time the handler is run. N is restricted to a |
186 | reasonable range (1 to around a million), and M is restricted to be | |
187 | smaller than N. There are no target variables provided in either | |
188 | context. It is possible for such probes to be run concurrently on | |
189 | a multi-processor computer. | |
422d1ceb | 190 | .PP |
197a4d62 | 191 | Alternatively, intervals may be specified in units of time. |
422d1ceb | 192 | There are two probe point variants similar to the jiffies timer: |
ca88561f | 193 | |
422d1ceb FCE |
194 | .SAMPLE |
195 | timer.ms(N) | |
196 | timer.ms(N).randomize(M) | |
197 | .ESAMPLE | |
ca88561f | 198 | |
197a4d62 JS |
199 | Here, N and M are specified in milliseconds, but the full options for units |
200 | are seconds (s/sec), milliseconds (ms/msec), microseconds (us/usec), | |
201 | nanoseconds (ns/nsec), and hertz (hz). Randomization is not supported for | |
202 | hertz timers. | |
203 | ||
204 | The actual resolution of the timers depends on the target kernel. For | |
205 | kernels prior to 2.6.17, timers are limited to jiffies resolution, so | |
206 | intervals are rounded up to the nearest jiffies interval. After 2.6.17, | |
207 | the implementation uses hrtimers for tighter precision, though the actual | |
208 | resolution will be arch-dependent. In either case, if the "randomize" | |
209 | component is given, then the random value will be added to the interval | |
210 | before any rounding occurs. | |
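.PP
As an illustration, a sketch that counts randomized timer firings over five
seconds (the variable name "ticks" and the intervals are arbitrary choices):
.SAMPLE
global ticks

probe timer.ms(100).randomize(20) { ticks++ }

probe timer.s(5)
{
  printf("%d ticks\\n", ticks)
  exit()
}
.ESAMPLE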
39e57ce0 FCE |
211 | .PP |
212 | Profiling timers are also available to provide probes that execute on all | |
3ca1f652 FCE |
213 | CPUs at the rate of the system tick (CONFIG_HZ). |
214 | This probe takes no parameters. | |
ca88561f | 215 | |
39e57ce0 FCE |
216 | .SAMPLE |
217 | timer.profile | |
218 | .ESAMPLE | |
ca88561f | 219 | |
39e57ce0 FCE |
220 | Full context information of the interrupted process is available, making |
221 | this probe suitable for a time-based sampling profiler. | |
ba4a90fd FCE |
222 | |
223 | .SS DWARF | |
224 | ||
225 | This family of probe points uses symbolic debugging information for | |
226 | the target kernel/module/program, as may be found in unstripped | |
227 | executables, or the separate | |
228 | .I debuginfo | |
229 | packages. They allow placement of probes logically into the execution | |
230 | path of the target program, by specifying a set of points in the | |
231 | source or object code. When a matching statement executes on any | |
232 | processor, the probe handler is run in that context. | |
233 | .PP | |
234 | Points in a kernel are identified by | |
ca88561f | 235 | module, source file, line number, function name, or some |
6f05b6ab | 236 | combination of these. |
ba4a90fd FCE |
237 | .PP |
238 | Here is a list of probe point families currently supported. The | |
239 | .B .function | |
240 | variant places a probe near the beginning of the named function, so that | |
241 | parameters are available as context variables. The | |
242 | .B .return | |
39e3139a FCE |
243 | variant places a probe at the moment |
244 | .B after | |
245 | the return from the named function, so the return value is available | |
246 | as the "$return" context variable. The | |
54efe513 | 247 | .B .inline |
b8da0ad1 | 248 | modifier for |
54efe513 | 249 | .B .function |
b8da0ad1 FCE |
250 | filters the results to include only instances of inlined functions. |
251 | The | |
252 | .B .call | |
253 | modifier selects the opposite subset. Inline functions do not have an | |
254 | identifiable return point, so | |
54efe513 GH |
255 | .B .return |
256 | is not supported on | |
257 | .B .inline | |
258 | probes. The | |
ba4a90fd FCE |
259 | .B .statement |
260 | variant places a probe at the exact spot, exposing those local variables | |
261 | that are visible there. | |
ca88561f | 262 | |
ba4a90fd FCE |
263 | .SAMPLE |
264 | kernel.function(PATTERN) | |
265 | .br | |
b8da0ad1 FCE |
266 | kernel.function(PATTERN).call |
267 | .br | |
ba4a90fd FCE |
268 | kernel.function(PATTERN).return |
269 | .br | |
b8da0ad1 | 270 | kernel.function(PATTERN).inline |
54efe513 | 271 | .br |
592470cd SC |
272 | kernel.function(PATTERN).label(LPATTERN) |
273 | .br | |
ba4a90fd FCE |
274 | module(MPATTERN).function(PATTERN) |
275 | .br | |
b8da0ad1 FCE |
276 | module(MPATTERN).function(PATTERN).call |
277 | .br | |
ba4a90fd FCE |
278 | module(MPATTERN).function(PATTERN).return |
279 | .br | |
b8da0ad1 FCE |
280 | module(MPATTERN).function(PATTERN).inline |
281 | .br | |
54efe513 | 282 | .br |
ba4a90fd FCE |
283 | kernel.statement(PATTERN) |
284 | .br | |
37ebca01 FCE |
285 | kernel.statement(ADDRESS).absolute |
286 | .br | |
ba4a90fd FCE |
287 | module(MPATTERN).statement(PATTERN) |
288 | .ESAMPLE | |
ca88561f | 289 | |
ba4a90fd | 290 | In the above list, MPATTERN stands for a string literal that aims to |
592470cd SC |
291 | identify the loaded kernel module of interest and LPATTERN stands for |
292 | a source program label. Both MPATTERN and LPATTERN may include the "*", | |
293 | "[]", and "?" wildcards. | |
294 | PATTERN stands for a string literal that | |
6f05b6ab | 295 | aims to identify a point in the program. It is made up of three |
ca88561f MM |
296 | parts: |
297 | .IP \(bu 4 | |
298 | The first part is the name of a function, as would appear in the | |
ba4a90fd FCE |
299 | .I nm |
300 | program's output. This part may use the "*" and "?" wildcarding | |
ca88561f MM |
301 | operators to match multiple names. |
302 | .IP \(bu 4 | |
303 | The second part is optional and begins with the "@" character. | |
304 | It is followed by the path to the source file containing the function, | |
305 | which may include a wildcard pattern, such as mm/slab*. | |
79640c29 | 306 | If it does not match as is, an implicit "*/" is optionally added |
ea384b8c | 307 | .I before |
79640c29 FCE |
308 | the pattern, so that a script need only name the last few components |
309 | of a possibly long source directory path. | |
ca88561f | 310 | .IP \(bu 4 |
ba4a90fd | 311 | Finally, the third part is optional if the file name part was given, |
1bd128a3 SC |
312 | and identifies the line number in the source file preceded by a ":" |
313 | or a "+". The line number is assumed to be an | |
314 | absolute line number if preceded by a ":", or relative to the entry of | |
99a5f9cf SC |
315 | the function if preceded by a "+". |
316 | All the lines in the function can be matched with ":*". | |
317 | A range of lines x through y can be matched with ":x-y". | |
ca88561f | 318 | .PP |
ba4a90fd | 319 | As an alternative, PATTERN may be a numeric constant, indicating an |
ea384b8c FCE |
320 | address. Such an address may be found from symbol tables of the |
321 | appropriate kernel / module object file. It is verified against | |
322 | known statement code boundaries, and will be relocated for use at | |
323 | run time. | |
324 | .PP | |
325 | In guru mode only, absolute kernel-space addresses may be specified with | |
326 | the ".absolute" suffix. Such an address is considered already relocated, | |
327 | as if it came from | |
328 | .BR /proc/kallsyms , | |
329 | so it cannot be checked against statement/instruction boundaries. | |
ba4a90fd | 330 | .PP |
39e3139a | 331 | Some of the source-level context variables, such as function parameters, |
ba4a90fd FCE |
332 | locals, globals visible in the compilation unit, may be visible to |
333 | probe handlers. They may refer to these variables by prefixing their | |
334 | name with "$" within the scripts. In addition, a special syntax | |
335 | allows limited traversal of structures, pointers, and arrays. | |
336 | .TP | |
337 | $var | |
338 | refers to an in-scope variable "var". If it's an integer-like type, | |
7b9361d5 FCE |
339 | it will be cast to a 64-bit int for systemtap script use. String-like |
340 | pointers (char *) may be copied to systemtap string values using the | |
341 | .IR kernel_string " or " user_string | |
342 | functions. | |
ba4a90fd | 343 | .TP |
13d2ecdb | 344 | $var\->field |
ba4a90fd FCE |
345 | traversal to a structure's field. The indirection operator |
346 | may be repeated to follow more levels of pointers. | |
347 | .TP | |
a43ba433 FCE |
348 | $return |
349 | is available in return probes only for functions that are declared | |
350 | with a return value. | |
351 | .TP | |
ba4a90fd FCE |
353 | $var[N] |
354 | indexes into an array. The index is given with a | |
355 | literal number. | |
2cb3fe26 SC |
356 | .TP |
357 | $$vars | |
358 | expands to a character string that is equivalent to | |
359 | sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x", parm1, ..., parmN, | |
360 | var1, ..., varN) | |
361 | .TP | |
362 | $$locals | |
a43ba433 | 363 | expands to a subset of $$vars for only local variables. |
2cb3fe26 SC |
364 | .TP |
365 | $$parms | |
a43ba433 FCE |
366 | expands to a subset of $$vars for only function parameters. |
367 | .TP | |
368 | $$return | |
369 | is available in return probes only. It expands to a string that | |
fd574705 | 370 | is equivalent to sprintf("return=%x", $return) |
a43ba433 | 371 | if the probed function has a return value, or else an empty string. |
39e3139a FCE |
372 | .PP |
373 | For ".return" probes, context variables other than the "$return" | |
374 | value itself are only available for the function call parameters. | |
375 | The expressions evaluate to the | |
376 | .IR entry-time | |
377 | values of those variables, since that is when a snapshot is taken. | |
378 | Other local variables are not generally accessible, since by the time | |
379 | a ".return" probe hits, the probed function will have already returned. | |
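.PP
A short sketch of entry and return probes on one function. It assumes that
debugging information is installed and that the kernel has a function named
"vfs_read"; substitute any function present on the target system:
.SAMPLE
probe kernel.function("vfs_read")
{
  printf("entry: %s\\n", $$parms)
}
probe kernel.function("vfs_read").return
{
  printf("exit: %s\\n", $$return)
}
.ESAMPLE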
380 | ||
ba4a90fd | 381 | |
94c3c803 AM |
382 | .SS DWARFLESS |
383 | In the absence of debugging information, entry & exit points of kernel & module | |
384 | functions can be probed using the "kprobe" family of probes. | |
385 | However, these do not permit looking up the arguments / local variables | |
386 | of the function. | |
387 | The following constructs are supported: | |
388 | .SAMPLE | |
389 | kprobe.function(FUNCTION) | |
390 | kprobe.function(FUNCTION).return | |
391 | kprobe.module(NAME).function(FUNCTION) | |
392 | kprobe.module(NAME).function(FUNCTION).return | |
393 | kprobe.statement(ADDRESS).absolute | |
394 | .ESAMPLE | |
395 | .PP | |
396 | Probes of type | |
397 | .B function | |
398 | are recommended for kernel functions, whereas probes of type | |
399 | .B module | |
400 | are recommended for probing functions of the specified module. | |
401 | In case the absolute address of a kernel or module function is known, | |
402 | .B statement | |
403 | probes can be utilized. | |
404 | .PP | |
405 | Note that | |
406 | .I FUNCTION | |
407 | and | |
408 | .I MODULE | |
409 | names | |
410 | .B must not | |
411 | contain wildcards, or the probe will not be registered. | |
412 | Also, statement probes may be used only in guru mode. | |
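.PP
For example, a sketch that only counts calls, since no arguments can be
read (the function name "vfs_read" is an assumption about the target kernel):
.SAMPLE
global calls

probe kprobe.function("vfs_read") { calls++ }

probe end { printf("%d calls\\n", calls) }
.ESAMPLE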
413 | ||
414 | ||
1ada6f08 | 415 | .SS USER-SPACE |
0a1c696d FCE |
416 | Support for user-space probing is available for kernels |
417 | that are configured with the utrace extensions. See | |
418 | .SAMPLE | |
419 | http://people.redhat.com/roland/utrace/ | |
420 | .ESAMPLE | |
421 | .PP | |
422 | There are several forms. First, a non-symbolic probe point: | |
1ada6f08 FCE |
423 | .SAMPLE |
424 | process(PID).statement(ADDRESS).absolute | |
425 | .ESAMPLE | |
426 | is analogous to | |
427 | .I | |
428 | kernel.statement(ADDRESS).absolute | |
429 | in that both use raw (unverified) virtual addresses and provide | |
430 | no $variables. The target PID parameter must identify a running | |
431 | process, and ADDRESS should identify a valid instruction address. | |
432 | All threads of that process will be probed. | |
29cb9b42 | 433 | .PP |
0a1c696d FCE |
434 | Second, non-symbolic user-kernel interface events handled by |
435 | utrace may be probed: | |
29cb9b42 | 436 | .SAMPLE |
dd078c96 DS |
437 | process(PID).begin |
438 | process("PATH").begin | |
986e98de | 439 | process.begin |
dd078c96 DS |
440 | process(PID).thread.begin |
441 | process("PATH").thread.begin | |
986e98de | 442 | process.thread.begin |
dd078c96 DS |
443 | process(PID).end |
444 | process("PATH").end | |
986e98de | 445 | process.end |
dd078c96 DS |
446 | process(PID).thread.end |
447 | process("PATH").thread.end | |
986e98de | 448 | process.thread.end |
29cb9b42 DS |
449 | process(PID).syscall |
450 | process("PATH").syscall | |
986e98de | 451 | process.syscall |
29cb9b42 DS |
452 | process(PID).syscall.return |
453 | process("PATH").syscall.return | |
986e98de | 454 | process.syscall.return |
0afb7073 FCE |
455 | process(PID).insn |
456 | process("PATH").insn | |
457 | process(PID).insn.block | |
458 | process("PATH").insn.block | |
29cb9b42 DS |
459 | .ESAMPLE |
460 | .PP | |
461 | A | |
dd078c96 DS |
462 | .B .begin |
463 | probe gets called when a new process described by PID or PATH gets created. | |
29cb9b42 | 464 | A |
dd078c96 DS |
465 | .B .thread.begin |
466 | probe gets called when a new thread described by PID or PATH gets created. | |
159cb109 | 467 | A |
dd078c96 DS |
468 | .B .end |
469 | probe gets called when a process described by PID or PATH dies. | |
470 | A | |
471 | .B .thread.end | |
29cb9b42 DS |
472 | probe gets called when a thread described by PID or PATH dies. |
473 | A | |
474 | .B .syscall | |
475 | probe gets called when a thread described by PID or PATH makes a | |
6270adc1 MH |
476 | system call. The system call number is available in the |
477 | .BR $syscall | |
478 | context variable, and the first 6 arguments of the system call | |
479 | are available in the | |
480 | .BR $argN | |
481 | (e.g. $arg1, $arg2, ...) context variables. | |
29cb9b42 DS |
482 | A |
483 | .B .syscall.return | |
484 | probe gets called when a thread described by PID or PATH returns from a | |
5d67b47c MH |
485 | system call. The system call number is available in the |
486 | .BR $syscall | |
487 | context variable, and the return value of the system call is available | |
488 | in the | |
489 | .BR $return | |
29cb9b42 | 490 | context variable. |
a96d1db0 | 491 | A |
0afb7073 FCE |
492 | .B .insn |
493 | probe gets called for every single-stepped instruction of the process described by PID or PATH. | |
494 | A | |
495 | .B .insn.block | |
496 | probe gets called for every block-stepped instruction of the process described by PID or PATH. | |
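.PP
A sketch of the utrace-based probes, assuming /bin/ls exists on the system
and is run while the script is active:
.SAMPLE
probe process("/bin/ls").begin
{
  printf("pid %d started\\n", pid())
}
probe process("/bin/ls").syscall
{
  printf("syscall %d, first argument %x\\n", $syscall, $arg1)
}
probe process("/bin/ls").syscall.return
{
  printf("syscall %d returned %d\\n", $syscall, $return)
}
.ESAMPLE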
0a1c696d FCE |
497 | |
498 | .PP | |
499 | Third, symbolic static instrumentation compiled into programs and | |
500 | shared libraries may be | |
501 | probed: | |
502 | .SAMPLE | |
503 | process("PATH").mark("LABEL") | |
504 | .ESAMPLE | |
505 | .PP | |
f28a8c28 SC |
506 | A |
507 | .B .mark | |
508 | probe gets called via a static probe which is defined in the | |
509 | application by | |
592470cd | 510 | STAP_PROBE1(handle,LABEL,arg1), which is defined in sdt.h. The handle is an application handle, |
f28a8c28 SC |
511 | LABEL corresponds to the .mark argument, and arg1 is the argument. |
512 | STAP_PROBE1 is used for probes with 1 argument, STAP_PROBE2 is used | |
513 | for probes with 2 arguments, and so on. | |
514 | The arguments of the probe are available in the context variables | |
592470cd SC |
515 | $arg1, $arg2, ... An alternative to using the STAP_PROBE macros is to |
516 | use the dtrace script to create custom macros. | |
0a1c696d | 517 | |
29cb9b42 | 518 | .PP |
0a1c696d FCE |
519 | Finally, full symbolic source-level probes in user-space programs |
520 | and shared libraries are supported. These are exactly analogous | |
521 | to the symbolic DWARF-based kernel/module probes described above, | |
522 | and expose similar contextual $-variables. | |
523 | .SAMPLE | |
524 | process("PATH").function("NAME") | |
525 | process("PATH").statement("*@FILE.c:123") | |
526 | process("PATH").function("*").return | |
527 | process("PATH").function("myfun").label("foo") | |
528 | .ESAMPLE | |
529 | ||
530 | .PP | |
531 | Note that for all process probes, | |
29cb9b42 | 532 | .I PATH |
ea384b8c FCE |
533 | names refer to executables that are searched for the same way shells do: relative |
534 | to the working directory if they contain a "/" character, otherwise in | |
535 | .BR $PATH . | |
986e98de | 536 | If a process probe is specified without a PID or PATH, all user |
0a1c696d FCE |
537 | threads are probed. PATH may sometimes name a shared library |
538 | in which case all processes that map that shared library may be | |
539 | probed. | |
1ada6f08 | 540 | |
9cb48751 DS |
541 | .SS PROCFS |
542 | ||
543 | These probe points allow procfs "files" in | |
544 | /proc/systemtap/MODNAME to be created, read and written | |
545 | .RI ( MODNAME | |
546 | is the name of the systemtap module). The | |
547 | .I proc | |
548 | filesystem is a pseudo-filesystem which is used as an interface to | |
549 | kernel data structures. There are four probe point variants supported | |
550 | by the translator: | |
ca88561f | 551 | |
9cb48751 DS |
552 | .SAMPLE |
553 | procfs("PATH").read | |
554 | procfs("PATH").write | |
555 | procfs.read | |
556 | procfs.write | |
557 | .ESAMPLE | |
ca88561f | 558 | |
9cb48751 DS |
559 | .I PATH |
560 | is the file name (relative to /proc/systemtap/MODNAME) to be created. | |
561 | If no | |
562 | .I PATH | |
563 | is specified (as in the last two variants above), | |
564 | .I PATH | |
565 | defaults to "command". | |
566 | .PP | |
567 | When a user reads /proc/systemtap/MODNAME/PATH, the corresponding | |
568 | procfs | |
569 | .I read | |
570 | probe is triggered. The string data to be read should be assigned to | |
571 | a variable named | |
572 | .IR $value , | |
573 | like this: | |
ca88561f | 574 | |
9cb48751 DS |
575 | .SAMPLE |
576 | procfs("PATH").read { $value = "100\\n" } | |
577 | .ESAMPLE | |
578 | .PP | |
579 | When a user writes into /proc/systemtap/MODNAME/PATH, the | |
580 | corresponding procfs | |
581 | .I write | |
582 | probe is triggered. The data the user wrote is available in the | |
583 | string variable named | |
584 | .IR $value , | |
585 | like this: | |
ca88561f | 586 | |
9cb48751 DS |
587 | .SAMPLE |
588 | procfs("PATH").write { printf("user wrote: %s", $value) } | |
589 | .ESAMPLE | |
590 | ||
6f05b6ab FCE |
591 | .SS MARKERS |
592 | ||
593 | This family of probe points hooks up to static probing markers | |
594 | inserted into the kernel or modules. These markers are special macro | |
595 | calls inserted by kernel developers to make probing faster and more | |
596 | reliable than with DWARF-based probes. Further, DWARF debugging | |
597 | information is | |
598 | .I not | |
599 | required to probe markers. | |
600 | ||
601 | Marker probe points begin with | |
f781f849 DS |
602 | .BR kernel . |
603 | The next part names the marker itself: | |
6f05b6ab FCE |
604 | .BR mark("name") . |
605 | The marker name string, which may contain the usual wildcard characters, | |
606 | is matched against the names given to the marker macros when the kernel | |
eb973c2a DS |
607 | and/or module was compiled. Optionally, you can specify |
608 | .BR format("format") . | |
37f6433e | 609 | Specifying the marker format string allows differentiation between two |
eb973c2a | 610 | markers with the same name but different marker format strings. |
6f05b6ab FCE |
611 | |
612 | The handler associated with a marker-based probe may read the | |
613 | optional parameters specified at the macro call site. These are | |
614 | named | |
615 | .BR $arg1 " through " $argNN , | |
616 | where NN is the number of parameters supplied by the macro. Number | |
617 | and string parameters are passed in a type-safe manner. | |
618 | ||
eb973c2a DS |
619 | The marker format string associated with a marker is available in |
620 | .BR $format . | |
37f6433e | 621 | The marker name string is likewise available in |
bc54e71c | 622 | .BR $name . |
eb973c2a | 623 | |
bc724b8b JS |
624 | .SS TRACEPOINTS |
625 | ||
626 | This family of probe points hooks up to static probing tracepoints | |
627 | inserted into the kernel or modules. As with markers, these | |
628 | tracepoints are special macro calls inserted by kernel developers to | |
629 | make probing faster and more reliable than with DWARF-based probes, | |
630 | and DWARF debugging information is not required to probe tracepoints. | |
631 | Tracepoints have an extra advantage of more strongly-typed parameters | |
632 | than markers. | |
633 | ||
634 | Tracepoint probes begin with | |
635 | .BR kernel . | |
636 | The next part names the tracepoint itself: | |
637 | .BR trace("name") . | |
638 | The tracepoint name string, which may contain the usual wildcard | |
639 | characters, is matched against the names defined by the kernel | |
640 | developers in the tracepoint header files. | |
641 | ||
642 | The handler associated with a tracepoint-based probe may read the | |
643 | optional parameters specified at the macro call site. These are | |
644 | named according to the declaration by the tracepoint author. For | |
645 | example, the tracepoint probe | |
646 | .BR kernel.trace("sched_switch") | |
647 | provides the parameters | |
648 | .BR $rq ", " $prev ", and " $next . | |
649 | If the parameter is a complex type, as in a struct pointer, then a | |
650 | script can access fields with the same syntax as DWARF $target | |
651 | variables. Also, tracepoint parameters cannot be modified, but in | |
652 | guru-mode a script may modify fields of parameters. | |
653 | ||
654 | The name of the tracepoint is available in | |
655 | .BR $$name , | |
656 | and a string of name=value pairs for all parameters of the tracepoint | |
657 | is available in | |
046e7190 | 658 | .BR $$vars " or " $$parms . |
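.PP
For example, a sketch that logs each firing of the tracepoint mentioned
above, assuming the target kernel defines it:
.SAMPLE
probe kernel.trace("sched_switch")
{
  printf("%s: %s\\n", $$name, $$parms)
}
.ESAMPLE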
bc724b8b | 659 | |
47dd066d WC |
660 | .SS PERFORMANCE MONITORING HARDWARE |
661 | ||
662 | The perfmon family of probe points is used to access the performance | |
663 | monitoring hardware available in modern processors. This family of | |
664 | probe points needs perfmon2 support in the kernel to access the | |
665 | performance monitoring hardware. | |
666 | .PP | |
667 | Performance monitoring hardware probe points begin with | |
668 | .BR perfmon ". " | |
669 | The next part names the event being counted: | |
670 | .BR counter("event") . | |
671 | The event names are processor implementation specific with the | |
37f6433e | 672 | exception of the generic |
47dd066d WC |
673 | .BR cycles " and " instructions |
674 | events, which are available on all processors. This sets up a counter | |
37f6433e | 675 | on the processor to count the number of events occurring on the |
47dd066d WC |
676 | processor. For more details on the performance monitoring events |
677 | available on a specific processor, use the perfmon2 command: | |
ca88561f | 678 | |
47dd066d | 679 | .SAMPLE |
7455c074 | 680 | pfmon \-l |
47dd066d WC |
681 | .ESAMPLE |
682 | .TP | |
683 | $counter | |
684 | is a handle used in the body of the probe for operations | |
685 | involving the counter associated with the probe. | |
686 | .TP | |
687 | read_counter | |
688 | is a function that is passed the handle for the perfmon probe and returns | |
689 | the current count for the event. | |
690 | ||
ba4a90fd FCE |
691 | .SH EXAMPLES |
692 | .PP | |
693 | Here are some example probe points, defining the associated events. | |
694 | .TP | |
695 | begin, end, end | |
696 | refers to the startup and normal shutdown of the session. In this | |
697 | case, the handler would run once during startup and twice during | |
698 | shutdown. | |
699 | .TP | |
700 | timer.jiffies(1000).randomize(200) | |
13d2ecdb | 701 | refers to a periodic interrupt, every 1000 +/\- 200 jiffies. |
ba4a90fd FCE |
702 | .TP |
703 | kernel.function("*init*"), kernel.function("*exit*") | |
704 | refers to all kernel functions with "init" or "exit" in the name. | |
705 | .TP | |
706 | kernel.function("*@kernel/sched.c:240") | |
707 | refers to any functions within the "kernel/sched.c" file that span | |
708 | line 240. | |
709 | .TP | |
6f05b6ab FCE |
710 | kernel.mark("getuid") |
711 | refers to an STAP_MARK(getuid, ...) macro call in the kernel. | |
712 | .TP | |
ba4a90fd FCE |
713 | module("usb*").function("*sync*").return |
714 | refers to the moment of return from all functions with "sync" in the | |
715 | name in any of the USB drivers. | |
716 | .TP | |
717 | kernel.statement(0xc0044852) | |
718 | refers to the first byte of the statement whose compiled instructions | |
719 | include the given address in the kernel. | |
b4ceace2 | 720 | .TP |
a5ae3f3d | 721 | kernel.statement("*@kernel/sched.c:2917") |
1bd128a3 SC |
722 | refers to the statement of line 2917 within "kernel/sched.c". |
723 | .TP | |
724 | kernel.statement("bio_init@fs/bio.c+3") | |
725 | refers to the statement at line bio_init+3 within "fs/bio.c". | |
a5ae3f3d | 726 | .TP |
729286d8 | 727 | syscall.*.return |
b4ceace2 | 728 | refers to the group of probe aliases with any name in the second position. |
ba4a90fd FCE |
729 | |
730 | .SH SEE ALSO | |
78db65bd | 731 | .IR stap (1), |
e97c0b29 WC |
732 | .IR stapprobes.iosched (3stap), |
733 | .IR stapprobes.netdev (3stap), | |
734 | .IR stapprobes.nfs (3stap), | |
735 | .IR stapprobes.nfsd (3stap), | |
736 | .IR stapprobes.pagefault (3stap), | |
737 | .IR stapprobes.process (3stap), | |
738 | .IR stapprobes.rpc (3stap), | |
739 | .IR stapprobes.scsi (3stap), | |
740 | .IR stapprobes.signal (3stap), | |
741 | .IR stapprobes.socket (3stap), | |
742 | .IR stapprobes.tcp (3stap), | |
743 | .IR stapprobes.udp (3stap), | |
744 | .IR proc (3stap) |