1 .\" -*- nroff -*-
2 .TH STAP 1 @DATE@ "Red Hat"
3 .SH NAME
4 stap \- systemtap script translator/driver
5
6 .\" macros
7 .de SAMPLE
8 .br
9 .RS
10 .nf
11 .nh
12 ..
13 .de ESAMPLE
14 .hy
15 .fi
16 .RE
17 ..
18
19 .SH SYNOPSIS
20
21 .br
22 .B stap
23 [
24 .I OPTIONS
25 ]
26 .I FILENAME
27 [
28 .I ARGUMENTS
29 ]
30 .br
31 .B stap
32 [
33 .I OPTIONS
34 ]
35 .B \-
36 [
37 .I ARGUMENTS
38 ]
39 .br
40 .B stap
41 [
42 .I OPTIONS
43 ]
44 .BI \-e " SCRIPT"
45 [
46 .I ARGUMENTS
47 ]
48
49 .SH DESCRIPTION
50
51 The
52 .IR stap
53 program is the front-end to the Systemtap tool. It accepts probing
54 instructions (written in a simple scripting language), translates
55 those instructions into C code, compiles this C code, and loads the
56 resulting kernel module into a running Linux kernel to perform the
57 requested system trace/probe functions. You can supply the script in
58 a named file, from standard input, or from the command line. The
59 program runs until it is interrupted by the user, until the script
60 voluntarily invokes the
61 .I exit()
62 function, or until a sufficient number of soft errors has occurred.
63 .PP
64 The language, which is described in a later section, is strictly typed,
65 declaration free, procedural, and inspired by
66 .IR awk .
67 It allows source code points or events in the kernel to be associated
68 with handlers, which are subroutines that are executed synchronously. It is
69 somewhat similar conceptually to "breakpoint command lists" in the
70 .IR gdb
71 debugger.
72 .PP
73 This manual corresponds to version @VERSION@.
74
75 .SH OPTIONS
76 The systemtap translator supports the following options. Any other option
77 prints a list of supported options.
78 .\" undocumented for now:
79 .\" \-t test mode
80 .TP
81 .B \-v
82 Increase verbosity. Produce a larger volume of informative (?) output
83 each time the option is repeated.
84 .TP
85 .B \-h
86 Show help message.
87 .TP
88 .B \-V
89 Show version message.
90 .TP
91 .B \-k
92 Keep the temporary directory after all processing. This may be useful
93 in order to examine the generated C code, or to reuse the compiled
94 kernel object.
95 .TP
96 .B \-g
97 Guru mode. Enable parsing of unsafe expert-level constructs like
98 embedded C.
99 .TP
100 .B \-P
101 Prologue-searching mode. Activate heuristics to work around incorrect
102 debugging information for $target variables.
103 .TP
104 .B \-u
105 Unoptimized mode. Disable unused code elision during elaboration.
106 .TP
107 .B \-w
108 Suppressed warnings mode. Disable warning messages for elided code in user script.
109 .TP
110 .BI \-b
111 Use bulk mode (percpu files) for kernel-to-user data transfer.
112 .TP
113 .B \-t
114 Collect timing information on the number of times each probe executes
115 and the average amount of time spent in each probe.
116 .TP
117 .BI \-s " NUM"
118 Use NUM megabyte buffers for kernel-to-user data transfer. On a
119 multiprocessor in bulk mode, this is a per-processor amount.
120 .TP
121 .BI \-p " NUM"
122 Stop after pass NUM. The passes are numbered 1-5: parse, elaborate,
123 translate, compile, run. See the
124 .B PROCESSING
125 section for details.
126 .TP
127 .BI \-I " DIR"
128 Add the given directory to the tapset search directory. See the
129 description of pass 2 for details.
130 .TP
131 .BI \-D " NAME=VALUE"
132 Add the given C preprocessor directive to the module Makefile. These can
133 be used to override limit parameters described below.
134 .TP
135 .BI \-R " DIR"
136 Look for the systemtap runtime sources in the given directory.
137 .TP
138 .BI \-r " RELEASE"
139 Build for the given kernel release instead of the currently running one.
140 .TP
141 .BI \-m " MODULE"
142 Use the given name for the generated kernel object module, instead
143 of a unique randomized name. The generated kernel object module is
144 copied to the current directory.
145 .TP
146 .BI \-o " FILE"
147 Send standard output to named file. In bulk mode, percpu files will
148 start with FILE_ followed by the cpu number.
149 .TP
150 .BI \-c " CMD"
151 Start the probes, run CMD, and exit when CMD finishes.
152 .TP
153 .BI \-x " PID"
154 Sets target() to PID. This allows scripts to be written that filter on
155 a specific process.
156
157 .SH ARGUMENTS
158
159 Any additional arguments on the command line are passed to the script
160 parser for substitution. See below.
161
162 .SH SCRIPT LANGUAGE
163
164 The systemtap script language resembles
165 .IR awk .
166 There are two main outermost constructs: probes and functions. Within
167 these, statements and expressions use C-like operator syntax and
168 precedence.
169
170 .SS GENERAL SYNTAX
171 Whitespace is ignored. Three forms of comments are supported:
172 .RS
173 .br
174 .BR # " ... shell style, to the end of line, except for $# and @#"
175 .br
176 .BR // " ... C++ style, to the end of line"
177 .br
178 .BR /* " ... C style ... " */
179 .RE
180 Literals are either strings enclosed in double-quotes (passing through
181 the usual C escape codes with backslashes), or integers (in decimal,
182 hexadecimal, or octal, using the same notation as in C). All strings
183 are limited in length to some reasonable value (a few hundred bytes).
184 Integers are 64-bit signed quantities, although the parser also accepts
185 (and wraps around) values above positive 2**63.
186 .PP
187 In addition, script arguments given at the end of the command line may
188 be inserted. Use
189 .B $1 ... $<NN>
190 for insertion unquoted,
191 .B @1 ... @<NN>
192 for insertion as a string literal. The number of arguments may be accessed
193 through
194 .B $#
195 (as an unquoted number) or through
196 .B @#
197 (as a quoted number). These may be used at any place a token may begin,
198 including within the preprocessing stage. Reference to an argument
199 number beyond what was actually given is an error.
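.PP
As a minimal sketch of argument substitution, assume a hypothetical
script file args.stp that is run as "stap args.stp 42 hello". The
first argument is inserted unquoted as a number, and the second is
inserted as a string literal:
.SAMPLE
probe begin {
  printf("%d arguments were given\\n", $#)
  printf("number %d, string %s\\n", $1, @2)
  exit()
}
.ESAMPLE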
200
201 .SS PREPROCESSING
202 A simple conditional preprocessing stage is run as a part of parsing.
203 The general form is similar to the
204 .RB cond " ? " exp1 " : " exp2
205 ternary operator:
206 .SAMPLE
207 .BR %( " CONDITION " %? " TRUE-TOKENS " %)
208 .BR %( " CONDITION " %? " TRUE-TOKENS " %: " FALSE-TOKENS " %)
209 .ESAMPLE
210 The CONDITION is either an expression whose format is determined by its
211 first keyword, or a comparison between two string literals or two
212 numeric literals.
213 .PP
214 If the first part is the identifier
215 .BR kernel_vr " or " kernel_v
216 to refer to the kernel version number, with ("2.6.13\-1.322FC3smp") or
217 without ("2.6.13") the release code suffix, then
218 the second part is one of the six standard numeric comparison operators
219 .BR < ", " <= ", " == ", " != ", " > ", and " >= ,
220 and the third part is a string literal that contains an RPM-style
221 version-release value. The condition is deemed satisfied if the
222 version of the target kernel (as optionally overridden by the
223 .BR \-r
224 option) compares to the given version string. The comparison is
225 performed by the glibc function
226 .BR strverscmp .
227 As a special case, if the operator is for simple equality
228 .RB ( == ),
229 or inequality
230 .RB ( != ),
231 and the third part contains any wildcard characters
232 .RB ( * " or " ? " or " [ "),"
233 then the expression is treated as a wildcard (mis)match as evaluated
234 by
235 .BR fnmatch .
236 .PP
237 If, on the other hand, the first part is the identifier
238 .BR arch
239 to refer to the processor architecture, then the second part
240 is one of the two string comparison operators
241 .BR == " or " != ,
242 and the third part is a string literal for matching it. This
243 comparison is a wildcard (mis)match.
244 .PP
245 Otherwise, the CONDITION is expected to be a comparison between two string
246 literals or two numeric literals. In this case, the arguments are the only
247 variables usable.
248 .PP
249 The TRUE-TOKENS and FALSE-TOKENS are zero or more general parser
250 tokens (possibly including nested preprocessor conditionals), and are
251 pasted into the input stream if the condition is true or false. For
252 example, the following code induces a parse error unless the target
253 kernel version is newer than 2.6.5:
254 .SAMPLE
255 %( kernel_v <= "2.6.5" %? **ERROR** %) # invalid token sequence
256 .ESAMPLE
257 The following code might adapt to hypothetical kernel version drift:
258 .SAMPLE
259 probe kernel.function (
260 %( kernel_v <= "2.6.12" %? "__mm_do_fault" %:
261 %( kernel_vr == "2.6.13*smp" %? "do_page_fault" %:
262 UNSUPPORTED %) %)
263 ) { /* ... */ }
264
265 %( arch == "ia64" %?
266 probe syscall.vliw = kernel.function("vliw_widget") {}
267 %)
268 .ESAMPLE
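.PP
A comparison of numeric literals can also select code based on the
argument count. The following sketch is illustrative only, and the
messages are placeholders:
.SAMPLE
probe begin {
  %( $# > 0 %? println("arguments were given")
            %: println("no arguments were given") %)
  exit()
}
.ESAMPLE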
269
270 .SS VARIABLES
271 Identifiers for variables and functions are an alphanumeric sequence,
272 and may include "_" and "$" characters. They may not start with a
273 plain digit, as in C. Each variable is by default local to the probe
274 or function statement block within which it is mentioned, and therefore
275 its scope and lifetime is limited to a particular probe or function
276 invocation.
277 .\" XXX add statistics type here once it's supported
278 .PP
279 Scalar variables are implicitly typed as either string or integer.
280 Associative arrays also have a string or integer value, and
281 a tuple of strings and/or integers serving as a key. Here are a
282 few basic expressions.
283 .SAMPLE
284 var1 = 5
285 var2 = "bar"
286 array1 [pid()] = "name" # single numeric key
287 array2 ["foo",4,i++] += 5 # vector of string/num/num keys
288 if (["hello",5,4] in array2) println ("yes") # membership test
289 .ESAMPLE
290 .PP
291 The translator performs
292 .I type inference
293 on all identifiers, including array indexes and function parameters.
294 Inconsistent type-related use of identifiers signals an error.
295 .PP
296 Variables may be declared global, so that they are shared amongst all
297 probes and live as long as the entire systemtap session. There is one
298 namespace for all global variables, regardless of which script file
299 they are found within. A global declaration may be written at the
300 outermost level anywhere, not within a block of code. The following
301 declaration marks a few variables as global. The translator will
302 infer for each its value type, and if it is used as an array, its key
303 types. Optionally, scalar globals may be initialized with a string
304 or number literal.
305 .RS
306 .BR global " var1" , " var2" , " var3=4"
307 .RE
308 .PP
309 Arrays are limited in size by the MAXMAPENTRIES variable -- see the
310 .B SAFETY AND SECURITY
311 section for details. Optionally, global arrays may be declared with a
312 maximum size in brackets, overriding MAXMAPENTRIES for that array only.
313 Note that this doesn't indicate the type of keys for the array, just the
314 size.
315 .RS
316 .BR global " tiny_array[10]" , " normal_array" , " big_array[50000]"
317 .RE
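.PP
As an illustrative sketch, the following script combines a scalar
global that has an initializer with a size-limited global array. The
probed function name "sys_read" is borrowed from the alias examples in
this manual and is assumed to exist in the target kernel:
.SAMPLE
global total = 0
global reads[500]   # at most 500 entries for this array only

probe kernel.function("sys_read") {
  total++
  reads[pid()]++
}
probe end {
  printf("%d reads seen\\n", total)
}
.ESAMPLE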
318 .\" XXX add statistics type here once it's supported
319
320 .SS STATEMENTS
321 Statements enable procedural control flow. They may occur within
322 functions and probe handlers. The total number of statements executed
323 in response to any single probe event is limited to some number
324 defined by a macro in the translated C code, and is in the
325 neighbourhood of 1000.
326 .TP
327 EXP
328 Execute the string- or integer-valued expression and throw away
329 the value.
330 .TP
331 .BR { " STMT1 STMT2 ... " }
332 Execute each statement in sequence in this block. Note that
333 separators or terminators are generally not necessary between statements.
334 .TP
335 .BR ;
336 Null statement, do nothing. It is useful as an optional separator between
337 statements to improve syntax-error detection and to handle certain
338 grammar ambiguities.
339 .TP
340 .BR if " (EXP) STMT1 [ " else " STMT2 ]"
341 Compare integer-valued EXP to zero. Execute the first (non-zero)
342 or second STMT (zero).
343 .TP
344 .BR while " (EXP) STMT"
345 While integer-valued EXP evaluates to non-zero, execute STMT.
346 .TP
347 .BR for " (EXP1; EXP2; EXP3) STMT"
348 Execute EXP1 as initialization. While EXP2 is non-zero, execute
349 STMT, then the iteration expression EXP3.
350 .TP
351 .BR foreach " (VAR " in " ARRAY [ "limit " EXP ]) STMT"
352 Loop over each element of the named global array, assigning current
353 key to VAR. The array may not be modified within the statement.
354 By adding a single
355 .BR + " or " \-
356 operator after the VAR or the ARRAY identifier, the iteration will
357 proceed in a sorted order, by ascending or descending index or value.
358 Using the optional
359 .BR limit
360 keyword limits the number of loop iterations to EXP times. EXP is
361 evaluated once at the beginning of the loop.
362 .TP
363 .BR foreach " ([VAR1, VAR2, ...] " in " ARRAY [ "limit " EXP ]) STMT"
364 Same as above, used when the array is indexed with a tuple of keys.
365 A sorting suffix may be used on at most one VAR or ARRAY identifier.
366 .TP
367 .BR break ", " continue
368 Exit or iterate the innermost nesting loop
369 .RB ( while " or " for " or " foreach )
370 statement.
371 .TP
372 .BR return " EXP"
373 Return EXP value from enclosing function. If the function's value is
374 not taken anywhere, then a return statement is not needed, and the
375 function will have a special "unknown" type with no return value.
376 .TP
377 .BR next
378 Return now from enclosing probe handler.
379 .TP
380 .BR delete " ARRAY[INDEX1, INDEX2, ...]"
381 Remove from ARRAY the element specified by the index tuple. The value will no
382 longer be available, and subsequent iterations will not report the element.
383 It is not an error to delete an element that does not exist.
384 .TP
385 .BR delete " ARRAY"
386 Remove all elements from ARRAY.
387 .TP
388 .BR delete " SCALAR"
389 Removes the value of SCALAR. Integers and strings are cleared to 0 and ""
390 respectively, while statistics are reset to the initial empty state.
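.PP
The following sketch, meant only as an illustration, exercises several
of these statements together: a sorted and limited
.BR foreach ,
an element
.BR delete ,
and a membership test. The array name and its contents are arbitrary:
.SAMPLE
global score
probe begin {
  score["a"] = 3
  score["b"] = 7
  score["c"] = 5
  # visit at most two entries, in descending order of value
  foreach (name in score\- limit 2)
    printf("%s=%d\\n", name, score[name])
  delete score["b"]
  if (!("b" in score)) println("b was deleted")
  exit()
}
.ESAMPLE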
391
392 .SS EXPRESSIONS
393 Systemtap supports a number of operators that have the same general syntax,
394 semantics, and precedence as in C and awk. Arithmetic is performed as per
395 typical C rules for signed integers. Division by zero or overflow is
396 detected and results in an error.
397 .TP
398 binary numeric operators
399 .B * / % + \- >> << & ^ | && ||
400 .TP
401 binary string operators
402 .B .
403 (string concatenation)
404 .TP
405 numeric assignment operators
406 .B = *= /= %= += \-= >>= <<= &= ^= |=
407 .TP
408 string assignment operators
409 .B = .=
410 .TP
411 unary numeric operators
412 .B + \- ! ~ ++ \-\-
413 .TP
414 binary numeric or string comparison operators
415 .B < > <= >= == !=
416 .TP
417 ternary operator
418 .RB cond " ? " exp1 " : " exp2
419 .TP
420 grouping operator
421 .BR ( " exp " )
422 .TP
423 function call
424 .RB "fn " ( "[ arg1, arg2, ... ]" )
425 .TP
426 array membership check
427 .RB exp " in " array
428 .br
429 .BR "[" exp1 ", " exp2 ", " ... "] in " array
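.PP
A brief sketch of a few of these operators in use; the variable names
are arbitrary:
.SAMPLE
probe begin {
  s = "foo" . "bar"        # string concatenation
  s .= "baz"               # string append
  n = (1 << 4) | 3         # 16 | 3 = 19
  printf("%s %d %d\\n", s, n, s == "foobarbaz" ? 1 : 0)
  exit()
}
.ESAMPLE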
430
431 .SS PROBES
432 The main construct in the scripting language identifies probes.
433 Probes associate abstract events with a statement block ("probe
434 handler") that is to be executed when any of those events occur. The
435 general syntax is as follows:
436 .SAMPLE
437 .BR probe " PROBEPOINT [" , " PROBEPOINT] " { " [STMT ...] " }
438 .ESAMPLE
439 .PP
440 Events are specified in a special syntax called "probe points". There
441 are several varieties of probe points defined by the translator, and
442 tapset scripts may define further ones using aliases. These are
443 listed in the
444 .IR stapprobes (5)
445 manual pages.
446 .PP
447 The probe handler is interpreted relative to the context of each
448 event. For events associated with kernel code, this context may
449 include
450 .I variables
451 defined in the
452 .I source code
453 at that spot. These "target variables" are presented to the script as
454 variables whose names are prefixed with "$". They may be accessed
455 only if the kernel's compiler preserved them despite optimization.
456 This is the same constraint that a debugger user faces when working
457 with optimized code. Some other events have very little context.
458 .PP
459 New probe points may be defined using "aliases". A probe point alias
460 looks similar to a probe definition, but instead of activating a probe
461 at the given point, it just defines a new probe point name as an alias
462 to an existing one. There are two types of alias, the prologue
463 style and the epilogue style, which are identified by "=" and "+="
464 respectively.
465 .PP
466 For a prologue-style alias, the statement block that follows the alias
467 definition is implicitly added as a prologue to any probe that refers
468 to the alias. For an epilogue-style alias, the statement block
469 that follows the alias definition is implicitly added as an epilogue to
470 any probe that refers to the alias. For example:
471
472 .SAMPLE
473 probe syscall.read = kernel.function("sys_read") {
474 fildes = $fd
475 if (execname() == "init") next # skip rest of probe
476 }
477 .ESAMPLE
478 defines a new probe point
479 .nh
480 .IR syscall.read ,
481 .hy
482 which expands to
483 .nh
484 .IR kernel.function("sys_read") ,
485 .hy
486 with the given statement as a prologue, which is useful to predefine
487 some variables for the alias user and/or to skip probe processing
488 entirely based on some conditions. And
489 .SAMPLE
490 probe syscall.read += kernel.function("sys_read") {
491 if (tracethis) println ($fd)
492 }
493 .ESAMPLE
494 defines a new probe point with the given statement as an epilogue, which
495 is useful to take actions based upon variables set or left over by
496 the alias user.
497
498 An alias is used just like a built-in probe type.
499 .SAMPLE
500 probe syscall.read {
501 printf("reading fd=%d\\n", fildes)
502 if (fildes > 10) tracethis = 1
503 }
504 .ESAMPLE
505
506 .SS FUNCTIONS
507 Systemtap scripts may define subroutines to factor out common work.
508 Functions take any number of scalar (integer or string) arguments, and
509 must return a single scalar (integer or string). An example function
510 declaration looks like this:
511 .SAMPLE
512 function thisfn (arg1, arg2) {
513 return arg1 + arg2
514 }
515 .ESAMPLE
516 Note the general absence of type declarations, which are instead
517 inferred by the translator. However, if desired, a function
518 definition may include explicit type declarations for its return value
519 and/or its arguments. This is especially helpful for embedded-C
520 functions. In the following example, the type inference engine need
521 only infer the type of arg2 (a string).
522 .SAMPLE
523 function thatfn:string (arg1:long, arg2) {
524 return sprint(arg1) . arg2
525 }
526 .ESAMPLE
527 Functions may call others or themselves
528 recursively, up to a fixed nesting limit. This limit is defined by
529 a macro in the translated C code and is in the neighbourhood of 10.
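.PP
As a small sketch of recursion within the nesting limit, the following
hypothetical function computes a factorial. With a limit in the
neighbourhood of 10, a call such as fact(5) nests only five levels
deep:
.SAMPLE
function fact:long (n:long) {
  if (n <= 1) return 1
  return n * fact(n \- 1)
}
probe begin {
  printf("%d\\n", fact(5))
  exit()
}
.ESAMPLE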
530
531 .SS PRINTING
532 There is a set of function names that are specially treated by the
533 translator. They format values for printing to the standard systemtap
534 output stream in a more convenient way. The
535 .IR sprint*
536 variants return the formatted string instead of printing it.
537 .TP
538 .BR print ", " sprint
539 Print one or more values of any type, concatenated directly together.
540 .TP
541 .BR println ", " sprintln
542 Print values like
543 .IR print " and " sprint ,
544 but also append a newline.
545 .TP
546 .BR printd ", " sprintd
547 Take a string delimiter and two or more values of any type, and print the
548 values with the delimiter interposed. The delimiter must be a literal
549 string constant.
550 .TP
551 .BR printdln ", " sprintdln
552 Print values with a delimiter like
553 .IR printd " and " sprintd ,
554 but also append a newline.
555 .TP
556 .BR printf ", " sprintf
557 Take a formatting string and a number of values of corresponding types,
558 and print them all. The format must be a literal string constant.
559 .PP
560 The
561 .IR printf
562 formatting directives are similar to those of C, except that they are
563 fully type-checked by the translator.
564 .SAMPLE
565 x = sprintf("take %d steps forward, %d steps back\\n", 3, 2)
566 printf("take %d steps forward, %d steps back\\n", 3+1, 2*2)
567 bob = "bob"
568 alice = "alice"
569 print(bob)
570 print("hello")
571 print(10)
572 printf("%s phoned %s %.4x times\\n", bob, alice . bob, 3456)
573 printf("%s except after %s\\n",
574 sprintf("%s before %s",
575 sprint(1), sprint(3)),
576 sprint("C"))
577 id[bob] = 1234
578 id[alice] = 5678
579 foreach (name in id)
580 printdln("|", strlen(name), name, id[name])
581 .ESAMPLE
582
583 .SS STATISTICS
584 It is often desirable to collect statistics in a way that avoids the
585 penalties of repeatedly taking exclusive locks on the global variables
586 those numbers are being put into. Systemtap provides a solution using a
587 special operator to accumulate values, and several pseudo-functions to
588 extract the statistical aggregates.
589 .PP
590 The aggregation operator is
591 .IR <<< ,
592 and resembles an assignment, or a C++ output-streaming operation.
593 The left operand specifies a scalar or array-index lvalue, which must
594 be declared global. The right operand is a numeric expression. The
595 meaning is intuitive: add the given number to the pile of numbers to
596 compute statistics of. (The specific list of statistics to gather
597 is given separately, by the extraction functions.)
598 .SAMPLE
599 foo <<< 1
600 stats[pid()] <<< memsize
601 .ESAMPLE
602 .PP
603 The extraction functions are also special. For each appearance of a
604 distinct extraction function operating on a given identifier, the
605 translator arranges to compute a set of statistics that satisfy it.
606 The statistics system is thereby "on-demand". Each execution of
607 an extraction function causes the aggregation to be computed for
608 that moment across all processors.
609 .PP
610 Here is the set of extractor functions. The first argument of each is
611 the same style of lvalue used on the left hand side of the accumulate
612 operation. The
613 .IR @count(v) ", " @sum(v) ", " @min(v) ", " @max(v) ", " @avg(v)
614 extractor functions compute the number/total/minimum/maximum/average
615 of all accumulated values. The resulting values are all simple
616 integers.
617 .PP
618 Histograms are also available, but are more complicated because they
619 have a vector rather than scalar value.
620 .I @hist_linear(v,start,stop,interval)
621 represents a linear histogram from "start" to "stop" by increments
622 of "interval". The interval must be positive. Similarly,
623 .I @hist_log(v)
624 represents a base-2 logarithmic histogram. Printing a histogram
625 with the
626 .I print
627 family of functions renders a histogram object as a tabular
628 "ASCII art" bar chart.
629 .SAMPLE
630 probe foo {
631 x <<< $value
632 }
633 probe end {
634 printf ("avg %d = sum %d / count %d\\n",
635 @avg(x), @sum(x), @count(x))
636 print (@hist_log(x))
637 }
638 .ESAMPLE
639
640 .SS EMBEDDED C
641 When in guru mode, the translator accepts embedded code in the
642 script. Such code is enclosed between
643 .IR %{
644 and
645 .IR %}
646 markers, and is transcribed verbatim, without analysis, in some
647 sequence, into the generated C code. At the outermost level, this may
648 be useful to add
649 .IR #include
650 instructions, and any auxiliary definitions for use by other embedded
651 code.
652 .PP
653 The other place where embedded code is permitted is as a function body.
654 In this case, the script language body is replaced entirely by a piece
655 of C code enclosed again between
656 .IR %{ " and " %}
657 markers.
658 This C code may do anything reasonable and safe. There are a number
659 of undocumented but complex safety constraints on atomicity,
660 concurrency, resource consumption, and run time limits, so this
661 is an advanced technique.
662 .PP
663 The memory locations set aside for input and output values
664 are made available to it using a macro
665 .IR THIS .
666 Here are some examples:
667 .SAMPLE
668 function add_one (val) %{
669 THIS\->__retvalue = THIS\->val + 1;
670 %}
671 function add_one_str (val) %{
672 strlcpy (THIS\->__retvalue, THIS\->val, MAXSTRINGLEN);
673 strlcat (THIS\->__retvalue, "one", MAXSTRINGLEN);
674 %}
675 .ESAMPLE
676 The function argument and return value types have to be inferred by
677 the translator from the call sites in order for this to work. The
678 user should examine C code generated for ordinary script-language
679 functions in order to write compatible embedded-C ones.
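.PP
Explicit type declarations remove the dependence on call-site
inference. The following sketch assumes guru mode and uses a
hypothetical function name:
.SAMPLE
function add_two:long (val:long) %{
  THIS\->__retvalue = THIS\->val + 2;
%}
.ESAMPLE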
680
681 .SS BUILT-INS
682 A set of builtin functions and probe point aliases are provided
683 by the scripts installed under the
684 .nh
685 .IR @prefix@/share/systemtap/tapset
686 .hy
687 directory. These are described in the
688 .IR stapfuncs "(5) and " stapprobes (5)
689 manual pages.
690
691 .SH PROCESSING
692 The translator begins pass 1 by parsing the given input script,
693 and all scripts (files named
694 .IR *.stp )
695 found in a tapset directory. The directories listed
696 with
697 .BR \-I
698 are processed in sequence, each processed in "guru mode". For each
699 directory, a number of subdirectories are also searched. These
700 subdirectories are derived from the selected kernel version (the
701 .BR \-r
702 option),
703 in order to allow more kernel-version-specific scripts to override less
704 specific ones. For example, for a kernel version
705 .IR 2.6.12\-23.FC3
706 the following patterns would be searched, in sequence:
707 .IR 2.6.12\-23.FC3/*.stp ,
708 .IR 2.6.12/*.stp ,
709 .IR 2.6/*.stp ,
710 and finally
711 .IR *.stp .
712 Stopping the translator after pass 1 causes it to print the parse trees.
713
714 .PP
715 In pass 2, the translator analyzes the input script to resolve symbols
716 and types. References to variables, functions, and probe aliases that
717 are unresolved internally are satisfied by searching through the
718 parsed tapset scripts. If any tapset script is selected because it
719 defines an unresolved symbol, then the entirety of that script is
720 added to the translator's resolution queue. This process iterates
721 until all symbols are resolved and a subset of tapset scripts is
722 selected.
723 .PP
724 Next, all probe point descriptions are validated
725 against the wide variety supported by the translator. Probe points that
726 refer to code locations ("synchronous probe points") require the
727 appropriate kernel debugging information to be installed. In the
728 associated probe handlers, target-side variables (whose names begin
729 with "$") are found and have their run-time locations decoded.
730 .PP
731 Next, all probes and functions are analyzed for optimization
732 opportunities, in order to remove variables, expressions, and
733 functions that have no useful value and no side-effect. Embedded-C
734 functions are assumed to have side-effects unless they include the
735 magic string
736 .BR /*\ pure\ */ .
737 Since this optimization can hide latent code errors such as type
738 mismatches or invalid $target variables, it sometimes may be useful
739 to disable the optimizations with the
740 .BR \-u
741 option.
742 .PP
743 Finally, all variable, function, parameter, array, and index types are
744 inferred from context (literals and operators). Stopping the
745 translator after pass 2 causes it to list all the probes, functions,
746 and variables, along with all inferred types. Any inconsistent or
747 unresolved types cause an error.
748
749 .PP
750 In pass 3, the translator writes C code that represents the actions
751 of all selected script files, and creates a
752 .IR Makefile
753 to build that into a kernel object. These files are placed into a
754 temporary directory. Stopping the translator at this point causes
755 it to print the contents of the C file.
756
757 .PP
758 In pass 4, the translator invokes the Linux kernel build system to
759 create the actual kernel object file. This involves running
760 .IR make
761 in the temporary directory, and requires a kernel module build
762 system (headers, config and Makefiles) to be installed in the usual
763 spot
764 .IR /lib/modules/VERSION/build .
765 Stopping the translator after pass 4 is the last chance before
766 running the kernel object. This may be useful if you want to
767 archive the file.
768
769 .PP
770 In pass 5, the translator invokes the systemtap auxiliary program
771 .I staprun
772 for the given kernel object. This program arranges to load
773 the module, then communicates with it, copying trace data from the
774 kernel into temporary files, until the user sends an interrupt signal.
775 Any run-time error encountered by the probe handlers, such as running
776 out of memory, division by zero, exceeding nesting or runtime limits,
777 results in a soft error indication. Soft errors in excess of
778 MAXERRORS block all subsequent probes and terminate the session.
779 Finally,
780 .I staprun
781 unloads the module, and cleans up.
782
783 .SH EXAMPLES
784 See the
785 .IR stapex (5)
786 manual page for a collection of samples.
787
788 .SH CACHING
789 The systemtap translator caches the pass 3 output (the generated C
790 code) and the pass 4 output (the compiled kernel module) if pass 4
791 completes successfully. This cached output is reused if the same
792 script is translated again assuming the same conditions exist (same kernel
793 version, same systemtap version, etc.). Cached files are stored in
794 the
795 .I $SYSTEMTAP_DIR/cache
796 directory, which may be periodically cleaned/erased by the user.
797
798 .SH SAFETY AND SECURITY
799 Systemtap is an administrative tool. It exposes kernel internal data
800 structures and potentially private user information.
801 It requires either root privileges
802 or membership in one of the groups listed below.
803
804 To actually run the kernel objects it builds, a user must be one of
805 the following:
806 .IP \(bu 4
807 the root user;
808 .IP \(bu 4
809 a member of the
810 .I stapdev
811 group; or
812 .IP \(bu 4
813 a member of the
814 .I stapusr
815 group. Members of the
816 .I stapusr
817 group can only use modules located in
818 the /lib/modules/VERSION/systemtap directory. This directory
819 must be owned by root and not be world writable.
820 .PP
821 The kernel modules generated by the
822 .I stap
823 program are run by the
824 .IR staprun
825 program. The latter is a part of the Systemtap package, dedicated to
826 module loading and unloading (but only in the white zone), and
827 kernel-to-user data transfer. Since
828 .IR staprun
829 does not perform any additional security checks on the kernel objects
830 it is given, it would be unwise for a system administrator to add
831 untrusted users to the
832 .I stapdev
833 or
834 .I stapusr
835 groups.
836 .PP
837 The translator asserts certain safety constraints. It aims to ensure
838 that no handler routine can run for very long, allocate memory,
839 perform unsafe operations, or unintentionally interfere with the
840 kernel. Use of script global variables is suitably locked to protect
841 against manipulation by concurrent probe handlers. Use of guru mode
842 constructs such as embedded C can violate these constraints, leading
843 to a kernel crash or data corruption.
844 .PP
845 The resource use limits are set by macros in the generated C code.
846 These may be overridden with the
847 .BR \-D
848 flag. A selection of these is as follows:
849 .TP
850 MAXNESTING
851 Maximum number of recursive function call levels, default 10.
852 .TP
853 MAXSTRINGLEN
854 Maximum length of strings, default 128.
855 .TP
856 MAXTRYLOCK
857 Maximum number of iterations to wait for locks on global variables
858 before declaring possible deadlock and skipping the probe, default 1000.
859 .TP
860 MAXACTION
861 Maximum number of statements to execute during any single probe hit
862 (with interrupts disabled),
863 default 1000.
864 .TP
865 MAXACTION_INTERRUPTIBLE
866 Maximum number of statements to execute during any single probe hit
867 which is executed with interrupts enabled (such as begin/end probes),
868 default (MAXACTION * 10).
869 .TP
870 MAXMAPENTRIES
871 Maximum number of rows in any single global array, default 2048.
872 .TP
873 MAXERRORS
874 Maximum number of soft errors before an exit is triggered, default 0, which
875 means that the first error will exit the script.
876 .TP
877 MAXSKIPPED
878 Maximum number of skipped reentrant probes before an exit is triggered, default 100.
879 .TP
880 MINSTACKSPACE
881 Minimum number of free kernel stack bytes required in order to
882 run a probe handler, default 1024. This number should be large enough
883 for the probe handler's own needs, plus a safety margin.
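.PP
As an illustrative sketch, several of these limits can be raised for one
run by repeating the
.BR \-D
flag on the command line. The script name
.I script.stp
and the chosen values are hypothetical placeholders, not recommendations:
.SAMPLE
stap \-DMAXACTION=5000 \-DMAXMAPENTRIES=10000 script.stp
.ESAMPLE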
884
885 .PP
886 Multiple scripts can write data into a relay buffer concurrently. A host
887 script provides an interface for accessing its relay buffer to guest scripts.
888 Then the output of the guests is merged into the output of the host.
889 To run a script as a host, execute stap with the
890 .BR \-DRELAYHOST[=name]
891 option. The
892 .BR name
893 identifies your host script among several hosts.
894 While running the host, execute stap with
895 .BR \-DRELAYGUEST[=name]
896 to add a guest script to the host.
897 Note that you must unload guests before unloading a host. If any
898 guests are still connected to the host, unloading the host will fail.
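.PP
A minimal sketch of such a session follows. The relay name
.IR trace
and the script names
.IR host.stp " and " guest.stp
are hypothetical. Start the host in one terminal:
.SAMPLE
stap \-DRELAYHOST=trace host.stp
.ESAMPLE
Then, while the host is still running, attach a guest from another terminal:
.SAMPLE
stap \-DRELAYGUEST=trace guest.stp
.ESAMPLE
Stop the guest before stopping the host.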
899
900 .PP
901 In case something goes wrong with
902 .IR stap " or " staprun
903 after a probe has already started running, one may safely kill both
904 user processes, and remove the active probe kernel module with
905 .IR rmmod .
906 Any pending trace messages may be lost.
907
908 .PP
909 In addition to the methods outlined above, the generated kernel module
910 also uses overload processing to make sure that probes can't run for
911 too long. If more than STP_OVERLOAD_THRESHOLD cycles (default
912 500000000) have been spent in all the probes on a single cpu during
913 the last STP_OVERLOAD_INTERVAL cycles (default 1000000000), the probes
914 have overloaded the system and an exit is triggered.
915 .PP
916 By default, overload processing is turned on for all modules. If you
917 would like to disable overload processing, define STP_NO_OVERLOAD.
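.PP
Assuming this macro is passed with the
.BR \-D
flag in the same way as the resource limits, overload processing might be
disabled for a single run as follows (the script name is a hypothetical
placeholder):
.SAMPLE
stap \-DSTP_NO_OVERLOAD script.stp
.ESAMPLE
With the default values, the threshold is half of the interval, so an exit
is triggered once probes consume more than half of a cpu's cycles over the
interval.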
918
919 .SH FILES
920 .\" consider autoconf-substituting these directories
921 .TP
922 ~/.systemtap
923 Systemtap data directory for cached systemtap files, unless overridden
924 by the
925 .I SYSTEMTAP_DIR
926 environment variable.
927 .TP
928 /tmp/stapXXXXXX
929 Temporary directory for systemtap files, including translated C code
930 and kernel object.
931 .TP
932 @prefix@/share/systemtap/tapset
933 The automatic tapset search directory, unless overridden by
934 the
935 .I SYSTEMTAP_TAPSET
936 environment variable.
937 .TP
938 @prefix@/share/systemtap/runtime
939 The runtime sources, unless overridden by the
940 .I SYSTEMTAP_RUNTIME
941 environment variable.
942 .TP
943 /lib/modules/VERSION/build
944 The location of kernel module building infrastructure.
945 .TP
946 @prefix@/lib/debug/lib/modules/VERSION
947 The location of kernel debugging information when packaged into the
948 .IR kernel\-debuginfo
949 RPM, unless overridden by the
950 .I SYSTEMTAP_DEBUGINFO_PATH
951 environment variable. The default value for this variable is
952 .IR \-:.debug:/usr/lib/debug .
953 This path is interpreted by elfutils as a list of base directories
954 whose various subdirectories are searched. The \- at the front
955 means to skip CRC matching for separated debug objects and is a small
956 performance win if no corruption is suspected.
957 .TP
958 @prefix@/bin/staprun
959 The auxiliary program supervising module loading, interaction, and
960 unloading.
961
962 .SH SEE ALSO
963 .IR stapprobes (5),
964 .IR stapfuncs (5),
965 .IR stapex (5),
966 .IR awk (1),
967 .IR gdb (1)
968
969 .SH BUGS
970 Use the Bugzilla link on the project web page, or our mailing list.
971 .nh
972 .BR http://sources.redhat.com/systemtap/ , <systemtap@sources.redhat.com> .
973 .hy