Commit | Line | Data |
---|---|---|
5bb3c2a0 FCE |
1 | The Systemtap Translator - a tour on the inside |
2 | ||
3 | Outline: | |
4 | - general principles | |
5 | - main data structures | |
6 | - pass 1: parsing | |
7 | - pass 2: semantic analysis (parts 1, 2, 3) | |
8 | - pass 3: translation (parts 1, 2) | |
9 | - pass 4: compilation | |
10 | - pass 5: run | |
11 | ||
12 | ------------------------------------------------------------------------ | |
13 | Translator general principles | |
14 | ||
15 | - written in standard C++ | |
16 | - mildly O-O, sparing use of C++ features | |
17 | - uses "visitor" concept for type-dependent (virtual) traversal | |
18 | ||
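The "visitor" idea above can be illustrated with a minimal C++ sketch. This is not the translator's actual code: the node types, the `printer` pass, and the `print_expression` helper are simplified stand-ins chosen for illustration, and the real declarations in <staptree.h> may differ. Each AST node forwards to the visitor method matching its own type, so a pass only implements one method per node type.

```cpp
#include <memory>
#include <string>
#include <utility>

struct literal_number;
struct binary_expression;

// Hypothetical visitor interface: one visit_* method per AST node type.
struct visitor {
  virtual ~visitor() {}
  virtual void visit_literal_number(literal_number* e) = 0;
  virtual void visit_binary_expression(binary_expression* e) = 0;
};

// Base class of all expression nodes in this sketch.
struct expression {
  virtual ~expression() {}
  // Each subtype dispatches to the visitor method for its own type.
  virtual void visit(visitor* v) = 0;
};

struct literal_number : expression {
  long value;
  explicit literal_number(long v) : value(v) {}
  void visit(visitor* v) override { v->visit_literal_number(this); }
};

struct binary_expression : expression {
  std::string op;
  std::unique_ptr<expression> left, right;
  binary_expression(std::string o,
                    std::unique_ptr<expression> l,
                    std::unique_ptr<expression> r)
    : op(std::move(o)), left(std::move(l)), right(std::move(r)) {}
  void visit(visitor* v) override { v->visit_binary_expression(this); }
};

// Example pass (illustrative only): render an expression tree as text,
// fully parenthesized, by recursing through the visit() calls.
struct printer : visitor {
  std::string out;
  void visit_literal_number(literal_number* e) override {
    out += std::to_string(e->value);
  }
  void visit_binary_expression(binary_expression* e) override {
    out += "(";
    e->left->visit(this);
    out += " " + e->op + " ";
    e->right->visit(this);
    out += ")";
  }
};

// Convenience wrapper: run the printer pass over one tree.
std::string print_expression(expression& e) {
  printer p;
  e.visit(&p);
  return p.out;
}
```

A new pass (type resolution, code emission) would be another `visitor` subclass; the node classes themselves stay unchanged.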
19 | ------------------------------------------------------------------------ | |
20 | Main data structures | |
21 | ||
22 | - abstract syntax tree <staptree.h> | |
23 | - family of types and subtypes for language parts: expressions, | |
24 | literals, statements | |
25 | - includes outermost constructs: probes, aliases, functions | |
26 | - an instance of "stapfile" represents an entire script file | |
27 | - each annotated with a token (script source coordinates) | |
28 | - data persists throughout run | |
29 | ||
30 | - session <session.h> | |
31 | - contains run-time parameters from command line | |
32 | - contains all globals | |
33 | - passed by reference to many functions | |
34 | ||
35 | ------------------------------------------------------------------------ | |
36 | Pass 1 - parsing | |
37 | ||
38 | - hand-written recursive-descent <parse.cxx> | |
39 | - language specified in man page <stap.1> | |
40 | - reads user-specified script file | |
41 | - also searches path for all <*.stp> files, parses them too | |
42 | - => syntax errors are caught immediately, throughout tapset | |
43 | - now includes baby preprocessor | |
44 | probe kernel. | |
45 | %( kernel_v == "2.6.9" %? inline("foo") %: function("bar") %) | |
46 | { } | |
47 | - enforces guru mode for embedded code %{ C %} | |
48 | ||
49 | ------------------------------------------------------------------------ | |
50 | Pass 2 - semantic analysis - step 1: resolve symbols | |
51 | ||
52 | - code in <elaborate.cxx> | |
53 | - want to know all global and per-probe/function local variables | |
54 | - one "vardecl" instance interned per variable | |
55 | - fills in "referent" field in AST for nodes that refer to it | |
56 | - collect "needed" probe/global/function list in session variable | |
57 | - loop over file queue, starting with user script "stapfile" | |
58 | - add to "needed" list this file's globals, functions, probes | |
59 | - resolve any symbols used in this file (function calls, variables) | |
60 | against "needed" list | |
61 | - if not resolved, search through all tapset "stapfile" instances; | |
62 | add to file queue if matched | |
63 | - if still not resolved, create as local scalar, or signal an error | |
64 | ||
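The file-queue loop above might look roughly like the following C++ sketch. It is a simplification under stated assumptions: `stapfile_sketch` and `resolve_symbols` are hypothetical names, only function calls are tracked (no globals, probes, or local variables), and an unresolved call is always treated as an error. The real logic lives in <elaborate.cxx> and is more involved.

```cpp
#include <deque>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for a parsed script file.
struct stapfile_sketch {
  std::string name;
  std::set<std::string> defines;   // functions defined in this file
  std::vector<std::string> calls;  // functions called from this file
};

// Returns the names of the files that end up being needed, in the order
// they were taken off the queue. Throws if a call cannot be resolved.
std::vector<std::string> resolve_symbols(
    const stapfile_sketch& user,
    const std::vector<stapfile_sketch>& tapset)
{
  std::set<std::string> needed;   // symbols known to be available
  std::set<std::string> queued;   // files already placed on the queue
  std::deque<const stapfile_sketch*> queue;
  std::vector<std::string> order;

  // Start with the user's script.
  queue.push_back(&user);
  queued.insert(user.name);

  while (!queue.empty()) {
    const stapfile_sketch* f = queue.front();
    queue.pop_front();
    order.push_back(f->name);

    // Add this file's definitions to the "needed" list.
    needed.insert(f->defines.begin(), f->defines.end());

    // Resolve each call against "needed", then against the tapset files.
    for (const std::string& sym : f->calls) {
      if (needed.count(sym)) continue;
      bool found = false;
      for (const stapfile_sketch& t : tapset) {
        if (t.defines.count(sym)) {
          found = true;
          // Pull the defining file in, but only once.
          if (queued.insert(t.name).second) queue.push_back(&t);
          break;
        }
      }
      if (!found) throw std::runtime_error("unresolved function: " + sym);
    }
  }
  return order;
}
```

Note that tapset files which define nothing the script reaches are never queued, so they contribute nothing to later passes.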
65 | ------------------------------------------------------------------------ | |
66 | Pass 2 - semantic analysis - step 2: resolve types | |
67 | ||
68 | - fills in "type" field in AST | |
69 | - iterate over all probes and functions until convergence | |
70 | - infer types of variables from usage context / operators: | |
71 | a = 5 # a is a pe_long | |
72 | b["foo",a]++ # b is a pe_long array with indexes pe_string and pe_long | |
73 | - loop until no further variable types can be inferred | |
74 | - signal error if any still unresolved | |
75 | ||
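The "loop until convergence" approach to type resolution can be sketched in C++ as a fixed-point iteration. This is an illustrative model only: `assignment_sketch` and `infer_types` are hypothetical, the only statement form is a plain assignment, and arrays are left out. The `pe_long` and `pe_string` names follow the examples above; `pe_unknown` is assumed here as the "not yet inferred" state.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

enum exp_type { pe_unknown, pe_long, pe_string };

// Hypothetical statement form: "target = source_var" or "target = literal".
struct assignment_sketch {
  std::string target;
  std::string source_var;  // empty when the right-hand side is a literal
  exp_type literal_type;   // consulted only when source_var is empty
};

std::map<std::string, exp_type> infer_types(
    const std::vector<assignment_sketch>& stmts)
{
  std::map<std::string, exp_type> types;

  // Every variable starts out with an unknown type.
  for (const assignment_sketch& s : stmts) {
    types.insert(std::make_pair(s.target, pe_unknown));
    if (!s.source_var.empty())
      types.insert(std::make_pair(s.source_var, pe_unknown));
  }

  // Sweep over all statements until a full sweep infers nothing new.
  bool changed = true;
  while (changed) {
    changed = false;
    for (const assignment_sketch& s : stmts) {
      exp_type rhs = s.source_var.empty() ? s.literal_type
                                          : types[s.source_var];
      exp_type& lhs = types[s.target];
      if (lhs == pe_unknown && rhs != pe_unknown) {
        lhs = rhs;                       // type flows right to left
        changed = true;
      } else if (rhs == pe_unknown && lhs != pe_unknown
                 && !s.source_var.empty()) {
        types[s.source_var] = lhs;       // type flows left to right
        changed = true;
      } else if (lhs != pe_unknown && rhs != pe_unknown && lhs != rhs) {
        throw std::runtime_error("type mismatch for " + s.target);
      }
    }
  }

  // Anything still unknown after convergence is an error.
  for (const auto& kv : types)
    if (kv.second == pe_unknown)
      throw std::runtime_error("unresolved type for " + kv.first);
  return types;
}
```

Because each sweep can only move a variable from unknown to known, the loop terminates after at most one sweep per variable plus a final sweep that changes nothing.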
76 | ------------------------------------------------------------------------ | |
77 | Pass 2 - semantic analysis - step 3: resolve probes | |
78 | ||
79 | - probe points are turned into "derived_probe" instances by code in <tapsets.cxx> | |
80 | - derived_probes know how to talk to kernel API for registration/callbacks | |
81 | - aliases get expanded at this point | |
82 | - some probe points ("begin", "end", "timer*") are very simple | |
83 | - dwarf ("kernel*", "module*") implementation very complicated | |
84 | - target-variables "$foo" expanded to getter/setter functions | |
85 | with synthesized embedded-C | |
86 | ||
87 | ------------------------------------------------------------------------ | |
88 | Pass 3 - translation - step 1: data | |
89 | ||
90 | - <translate.cxx> | |
91 | - we now know all types, all variables | |
92 | - strings are copied by value everywhere (MAXSTRINGLEN bytes each) | |
93 | - emit data storage mega-struct "context" for all probes/functions | |
94 | - array instantiated per-CPU, per-nesting-level | |
95 | - can be pretty big static data | |
96 | ||
97 | ------------------------------------------------------------------------ | |
98 | Pass 3 - translation - step 2: code | |
99 | ||
100 | - map script functions to C functions taking a context pointer | |
101 | - map probes to two C functions: | |
102 | - one to interface with the probe point infrastructure (kprobes, | |
103 | kernel timer): reserves per-cpu context | |
104 | - one to implement probe body, just like a script function | |
105 | - emit global startup/shutdown routine to manage orderly | |
106 | registration/deregistration of probes | |
107 | - expressions/statements emitted in "natural" evaluation sequence | |
108 | - emit code to enforce activity-count limits, simple safety tests | |
109 | - global variables protected by locks | |
110 | global k | |
111 | function foo () { k ++ } # write lock around increment | |
112 | probe bar { if (k>5) ... } # read lock around read | |
113 | - same thing for arrays, except foreach/sort take longer-duration locks | |
114 | ||
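To make the shape of the emitted code more concrete, here is a user-space C++ sketch of the two-function mapping for a probe, together with a per-CPU context and an activity-count limit. Everything in it is an assumption for illustration: the names (`context_sketch`, `probe_0_enter`, `probe_0_body`), the constants, and the `busy` flag are invented, nesting levels and the locking of globals are left out, and the real translator emits kernel C rather than C++.

```cpp
#include <cstddef>

// Illustrative limits; the real values and names are not taken from the source.
const std::size_t MAXSTRINGLEN_SKETCH = 128;
const int NUM_CPUS_SKETCH = 4;
const int MAXACTION_SKETCH = 1000;

// One "mega-struct" holding the state and locals a probe needs.
struct context_sketch {
  bool busy;                // set while this CPU's context is in use
  int actioncount;          // statements executed so far in this probe hit
  const char* last_error;   // null when no error has occurred
  struct {
    long n;                         // a numeric local
    char s[MAXSTRINGLEN_SKETCH];    // a string local, stored by value
  } locals;
};

// Static storage: one context per CPU (nesting levels omitted here).
context_sketch contexts[NUM_CPUS_SKETCH];

// Stand-in for a script global; a real module would guard it with a lock.
long global_k = 0;

// Function 2: the probe body, written like a script function that takes
// a context pointer. It models: for (i = 0; i < n; i++) k++
void probe_0_body(context_sketch* c) {
  for (long i = 0; i < c->locals.n; i++) {
    // Enforce the activity-count limit on every iteration.
    if (++c->actioncount > MAXACTION_SKETCH) {
      c->last_error = "MAXACTION exceeded";
      return;
    }
    global_k++;
  }
}

// Function 1: the entry point called by the probe infrastructure.
// It reserves the per-CPU context, runs the body, and releases the context.
// Returns true when the body ran to completion without error.
bool probe_0_enter(int cpu, long n) {
  context_sketch* c = &contexts[cpu];
  if (c->busy) return false;   // context already in use: skip this hit
  c->busy = true;
  c->actioncount = 0;
  c->last_error = nullptr;
  c->locals.n = n;
  probe_0_body(c);
  c->busy = false;
  return c->last_error == nullptr;
}
```

Keeping all locals inside a statically allocated context is what makes the storage "pretty big static data": nothing is allocated when a probe fires.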
115 | ------------------------------------------------------------------------ | |
116 | Pass 4 - compilation | |
117 | ||
118 | - <buildrun.cxx> | |
119 | - write out C code in a temporary directory | |
120 | - call into kbuild makefile to build module | |
121 | ||
122 | Pass 5 - running | |
123 | ||
98aab489 | 124 | - run "staprun" |
5bb3c2a0 FCE |
125 | - clean up temporary directory |
126 | ||
127 | - nothing to it! |