The Systemtap Translator - a tour on the inside

Outline:
- general principles
- main data structures
- pass 1: parsing
- pass 2: semantic analysis (parts 1, 2, 3)
- pass 3: translation (parts 1, 2)
- pass 4: compilation
- pass 5: run

------------------------------------------------------------------------
Translator general principles

- written in standard C++
- mildly O-O, sparing use of C++ features
- uses "visitor" concept for type-dependent (virtual) traversal

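The "visitor" idea can be sketched in a few lines. This is an
illustration only, not code from staptree.h: the node types, the
visit_* hook names and the printer class are assumptions chosen to
show the double-dispatch shape.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical miniature node types; the real ones live in staptree.h.
struct literal_number;
struct binary_expression;

// One visit_* hook per concrete node type.
struct visitor {
  virtual ~visitor() = default;
  virtual void visit_literal_number(literal_number* e) = 0;
  virtual void visit_binary_expression(binary_expression* e) = 0;
};

struct expression {
  virtual ~expression() = default;
  // Each node dispatches to the hook matching its own dynamic type.
  virtual void visit(visitor* v) = 0;
};

struct literal_number : expression {
  long value;
  explicit literal_number(long v) : value(v) {}
  void visit(visitor* v) override { v->visit_literal_number(this); }
};

struct binary_expression : expression {
  std::string op;
  std::unique_ptr<expression> left, right;
  binary_expression(std::string o, std::unique_ptr<expression> l,
                    std::unique_ptr<expression> r)
      : op(std::move(o)), left(std::move(l)), right(std::move(r)) {}
  void visit(visitor* v) override { v->visit_binary_expression(this); }
};

// Example traversal: render the tree back into text.
struct printer : visitor {
  std::string out;
  void visit_literal_number(literal_number* e) override {
    out += std::to_string(e->value);
  }
  void visit_binary_expression(binary_expression* e) override {
    assert(e->left && e->right);
    out += "(";
    e->left->visit(this);
    out += " " + e->op + " ";
    e->right->visit(this);
    out += ")";
  }
};

// Build the tree for 1 + (2 * 3) and print it.
std::string print_example() {
  auto product = std::make_unique<binary_expression>(
      "*", std::make_unique<literal_number>(2),
      std::make_unique<literal_number>(3));
  binary_expression sum("+", std::make_unique<literal_number>(1),
                        std::move(product));
  printer p;
  sum.visit(&p);
  return p.out;
}
```

A new traversal (type resolution, translation, ...) is then just
another visitor subclass; no node class needs a cast or a type switch.
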
------------------------------------------------------------------------
Main data structures

- abstract syntax tree <staptree.h>
  - family of types and subtypes for language parts: expressions,
    literals, statements
  - includes outermost constructs: probes, aliases, functions
  - an instance of "stapfile" represents an entire script file
  - each annotated with a token (script source coordinates)
  - data persists throughout run

- session <session.h>
  - contains run-time parameters from command line
  - contains all globals
  - passed by reference to many functions

------------------------------------------------------------------------
Pass 1 - parsing

- hand-written recursive-descent <parse.cxx>
- language specified in man page <stap.1>
- reads user-specified script file
- also searches path for all <*.stp> files, parses them too
- => syntax errors are caught immediately, throughout tapset
- now includes baby preprocessor
    probe kernel.
      %( kernel_v == "2.6.9" %? inline("foo") %: function("bar") %)
      { }
- enforces guru mode for embedded code %{ C %}

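For readers unfamiliar with the term, a hand-written recursive-descent
parser has one function per grammar rule, each calling the functions
for the rules it refers to. The sketch below uses a toy arithmetic
grammar, not the stap language, and evaluates while parsing instead of
building an AST; toy_parser and parse_toy are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Toy grammar, NOT the stap language:
//   expr   := term   ('+' term)*
//   term   := factor ('*' factor)*
//   factor := NUMBER | '(' expr ')'
class toy_parser {
 public:
  explicit toy_parser(std::string text) : input_(std::move(text)) {}

  // Parse the whole input; throw std::runtime_error on a syntax error.
  long parse() {
    long value = parse_expr();
    skip_spaces();
    if (pos_ != input_.size()) fail("unexpected trailing input");
    return value;
  }

 private:
  long parse_expr() {
    long value = parse_term();
    while (accept('+')) value += parse_term();
    return value;
  }

  long parse_term() {
    long value = parse_factor();
    while (accept('*')) value *= parse_factor();
    return value;
  }

  long parse_factor() {
    if (accept('(')) {
      long value = parse_expr();
      if (!accept(')')) fail("expected ')'");
      return value;
    }
    skip_spaces();
    if (pos_ >= input_.size() || !is_digit(input_[pos_]))
      fail("expected number");
    long value = 0;
    while (pos_ < input_.size() && is_digit(input_[pos_]))
      value = value * 10 + (input_[pos_++] - '0');
    return value;
  }

  static bool is_digit(char c) { return c >= '0' && c <= '9'; }

  void skip_spaces() {
    while (pos_ < input_.size() && input_[pos_] == ' ') ++pos_;
    assert(pos_ <= input_.size());
  }

  // Consume c if it is the next non-space character.
  bool accept(char c) {
    skip_spaces();
    if (pos_ < input_.size() && input_[pos_] == c) {
      ++pos_;
      return true;
    }
    return false;
  }

  [[noreturn]] void fail(const std::string& why) const {
    throw std::runtime_error(why + " at offset " + std::to_string(pos_));
  }

  std::string input_;
  std::size_t pos_ = 0;
};

long parse_toy(const std::string& text) { return toy_parser(text).parse(); }
```

Operator precedence falls out of the call structure: parse_expr calls
parse_term, so '*' binds tighter than '+'. Errors surface at the exact
input offset where a rule could not continue.
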
------------------------------------------------------------------------
Pass 2 - semantic analysis - step 1: resolve symbols

- code in <elaborate.cxx>
- want to know all global and per-probe/function local variables
- one "vardecl" instance interned per variable
- fills in "referent" field in AST for nodes that refer to it
- collect "needed" probe/global/function list in session variable
- loop over file queue, starting with user script "stapfile"
  - add to "needed" list this file's globals, functions, probes
  - resolve any symbols used in this file (function calls, variables)
    against "needed" list
  - if not resolved, search through all tapset "stapfile" instances;
    add to file queue if matched
  - if still not resolved, create as local scalar, or signal an error

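The file-queue loop above can be sketched as follows. This is a
simplified model, not the code in elaborate.cxx: toy_stapfile and
toy_session are hypothetical types, symbols are plain strings, and
unresolved names are only recorded, where the real pass would create a
local scalar or signal an error.

```cpp
#include <cassert>
#include <deque>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed script file.
struct toy_stapfile {
  std::string name;
  std::set<std::string> defines;  // globals/functions/probes it provides
  std::set<std::string> uses;     // symbols it refers to
};

// Hypothetical stand-in for the session's bookkeeping.
struct toy_session {
  std::vector<std::string> files;    // files pulled in, in queue order
  std::set<std::string> needed;      // the "needed" list
  std::set<std::string> unresolved;  // left for local-scalar/error handling
};

toy_session resolve_symbols(const toy_stapfile& user_script,
                            const std::vector<toy_stapfile>& tapset) {
  toy_session s;
  std::deque<const toy_stapfile*> queue{&user_script};
  std::set<const toy_stapfile*> queued{&user_script};

  while (!queue.empty()) {
    const toy_stapfile* f = queue.front();
    queue.pop_front();
    assert(f != nullptr);
    s.files.push_back(f->name);

    // Add this file's definitions to the "needed" list.
    s.needed.insert(f->defines.begin(), f->defines.end());

    for (const std::string& sym : f->uses) {
      if (s.needed.count(sym)) continue;  // resolved against "needed"

      // Otherwise search the tapset; queue the defining file once.
      bool matched = false;
      for (const toy_stapfile& lib : tapset) {
        if (lib.defines.count(sym)) {
          matched = true;
          if (queued.insert(&lib).second) queue.push_back(&lib);
          break;
        }
      }
      if (!matched) s.unresolved.insert(sym);
    }
  }
  return s;
}

// Example: the user script needs a.stp, which in turn needs b.stp;
// c.stp is parsed but never pulled in.
toy_session resolve_example() {
  toy_stapfile user{"user.stp", {"probe begin"}, {"log_hit", "missing"}};
  std::vector<toy_stapfile> tapset = {
      {"a.stp", {"log_hit"}, {"fmt"}},
      {"b.stp", {"fmt"}, {}},
      {"c.stp", {"unused"}, {}},
  };
  return resolve_symbols(user, tapset);
}
```

Only tapset files that are reachable from the user script end up in
the session, which keeps unused tapset code out of later passes.
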
------------------------------------------------------------------------
Pass 2 - semantic analysis - step 2: resolve types

- fills in "type" field in AST
- iterate along all probes and functions, until convergence
- infer types of variables from usage context / operators:
    a = 5         # a is a pe_long
    b["foo",a]++  # b is a pe_long array with indexes pe_string and pe_long
- loop until no further variable types can be inferred
- signal error if any still unresolved

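"Iterate until convergence" is a fixed-point loop. The sketch below
shows the idea on a deliberately tiny model: only scalar assignments,
with statements reduced to a hypothetical toy_assignment record. The
pe_* tags follow the names used above; everything else is invented for
illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// pe_unknown marks "not inferred yet".
enum exp_type { pe_unknown, pe_long, pe_string };

// Hypothetical, simplified statement: "target = <literal>" when source
// is empty, otherwise "target = source" between two variables.
struct toy_assignment {
  std::string target;
  std::string source;
  exp_type literal_type;
};

struct inference_result {
  std::map<std::string, exp_type> types;
  std::vector<std::string> unresolved;  // would be reported as errors
  bool mismatch = false;                // conflicting types seen
};

// Push type t into slot; report whether anything new was learned.
static bool unify(exp_type& slot, exp_type t, bool& mismatch) {
  if (t == pe_unknown) return false;
  if (slot == pe_unknown) {
    slot = t;
    return true;
  }
  if (slot != t) mismatch = true;
  return false;
}

inference_result infer_types(const std::vector<toy_assignment>& program) {
  inference_result r;
  for (const auto& a : program) {
    assert(!a.target.empty());
    r.types.emplace(a.target, pe_unknown);
    if (!a.source.empty()) r.types.emplace(a.source, pe_unknown);
  }

  // Sweep over all statements until a full sweep learns nothing new.
  bool changed = true;
  while (changed) {
    changed = false;
    for (const auto& a : program) {
      exp_type& lhs = r.types[a.target];
      if (a.source.empty()) {
        changed |= unify(lhs, a.literal_type, r.mismatch);
      } else {
        exp_type& rhs = r.types[a.source];
        changed |= unify(lhs, rhs, r.mismatch);  // type flows right to left
        changed |= unify(rhs, lhs, r.mismatch);  // and left to right
      }
    }
  }

  for (const auto& entry : r.types)
    if (entry.second == pe_unknown) r.unresolved.push_back(entry.first);
  return r;
}

// Statements are listed "backwards" so that one sweep is not enough.
inference_result infer_example() {
  return infer_types({
      {"c", "b", pe_unknown},  // c = b
      {"b", "a", pe_unknown},  // b = a
      {"a", "", pe_long},      // a = 5
      {"s", "", pe_string},    // s = "x"
      {"d", "e", pe_unknown},  // d = e, never constrained
  });
}
```

In the example, a's type reaches b on the second sweep and c on the
third; d and e never receive a type and end up in the unresolved list.
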
------------------------------------------------------------------------
Pass 2 - semantic analysis - step 3: resolve probes

- probe points turned to "derived_probe" instances by code in <tapsets.cxx>
- derived_probes know how to talk to kernel API for registration/callbacks
- aliases get expanded at this point
- some probe points ("begin", "end", "timer*") are very simple
- dwarf ("kernel*", "module*") implementation very complicated
  - target-variables "$foo" expanded to getter/setter functions
    with synthesized embedded-C

------------------------------------------------------------------------
Pass 3 - translation - step 1: data

- <translate.cxx>
- we now know all types, all variables
- strings are everywhere copied by value (MAXSTRINGLEN bytes)
- emit data storage mega-struct "context" for all probes/functions
- array instantiated per-CPU, per-nesting-level
- can be pretty big static data

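A rough picture of the emitted data layout, written here as C++ rather
than the C the translator emits. All sizes, field names and the
reserve/release helpers are assumptions for illustration; only the
name "context", the MAXSTRINGLEN-sized by-value strings and the
per-CPU, per-nesting-level array come from the notes above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed sizes, chosen only to make the arithmetic visible.
constexpr std::size_t MAXSTRINGLEN = 128;
constexpr std::size_t SKETCH_CPUS = 4;
constexpr std::size_t SKETCH_NESTING = 2;

// One slot holds the locals of whichever probe or function is running.
struct context {
  bool busy;             // set while a handler owns this slot
  unsigned actioncount;  // statements executed so far
  union {
    struct { long hits; char name[MAXSTRINGLEN]; } probe_0;
    struct { char arg[MAXSTRINGLEN]; char retval[MAXSTRINGLEN]; } function_greet;
  } locals;
};

// Static, zero-initialized storage: one slot per CPU per nesting level.
static context contexts[SKETCH_CPUS][SKETCH_NESTING];

// Claim a slot; returns nullptr if it is out of range or already in use.
context* reserve_context(std::size_t cpu, std::size_t nesting) {
  if (cpu >= SKETCH_CPUS || nesting >= SKETCH_NESTING) return nullptr;
  context* c = &contexts[cpu][nesting];
  if (c->busy) return nullptr;
  c->busy = true;
  c->actioncount = 0;
  return c;
}

void release_context(context* c) {
  assert(c != nullptr && c->busy);
  c->busy = false;
}

// Strings are copied by value into fixed-size buffers, truncating if needed.
void copy_string(char (&dst)[MAXSTRINGLEN], const char* src) {
  std::strncpy(dst, src, MAXSTRINGLEN - 1);
  dst[MAXSTRINGLEN - 1] = '\0';
}
```

Since every string local occupies MAXSTRINGLEN bytes in every slot,
sizeof(contexts) grows with CPUs x nesting levels x the largest locals
set, which is why the static data can get pretty big.
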
------------------------------------------------------------------------
Pass 3 - translation - step 2: code

- map script functions to C functions taking a context pointer
- map probes to two C functions:
  - one to interface with the probe point infrastructure (kprobes,
    kernel timer): reserves per-cpu context
  - one to implement probe body, just like a script function
- emit global startup/shutdown routine to manage orderly
  registration/deregistration of probes
- expressions/statements emitted in "natural" evaluation sequence
- emit code to enforce activity-count limits, simple safety tests
- global variables protected by locks
    global k
    function foo () { k ++ }    # write lock around increment
    probe bar { if (k>5) ... }  # read lock around read
- same thing for arrays, except foreach/sort take longer-duration locks

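The locking rule for globals can be illustrated with a user-space
analogy. The generated module is C and runs in the kernel, so the
std::shared_mutex below (C++17) is a stand-in chosen for the sketch,
not what the translator emits; function_foo and probe_bar_condition
mirror the script fragment above.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>

// global k
static long k = 0;
static std::shared_mutex k_lock;

// function foo () { k ++ }   -- write lock around increment
void function_foo() {
  std::unique_lock<std::shared_mutex> write_guard(k_lock);
  ++k;
}

// probe bar { if (k>5) ... } -- read lock around read
bool probe_bar_condition() {
  std::shared_lock<std::shared_mutex> read_guard(k_lock);
  return k > 5;
}

// Drive both paths once, single-threaded, to show the intended effect.
void exercise() {
  assert(!probe_bar_condition());
  for (int i = 0; i < 6; ++i) function_foo();
  assert(probe_bar_condition());
}
```

Readers may hold the lock concurrently, while a writer excludes
everyone else. A foreach or sort over an array would keep its guard
alive for the whole loop rather than for a single access.
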
------------------------------------------------------------------------
Pass 4 - compilation

- <buildrun.cxx>
- write out C code in a temporary directory
- call into kbuild makefile to build module

------------------------------------------------------------------------
Pass 5 - running

- run "staprun"
- clean up temporary directory

- nothing to it!

------------------------------------------------------------------------
Peculiarities

- We tend to use visitor idioms for polymorphic traversals of parse
  trees, in preference to dynamic_cast<> et al.  The former is a
  little more future-proof and harder to break accidentally.
  {reinterpret,static}_cast<> should definitely be avoided.

- We use our interned_string type (a derivative of boost::string_ref)
  to hold shareable references to strings that may be duplicated
  many times.  It can slide in for std::string most of the time.  It
  can save RAM and maybe even CPU, if used judiciously: such as for
  frequently duplicated strings, duplicated strings, duplicated strings,
  duplicated.

  OTOH, it costs CPU (for management of the interned string set, or if
  copied between std::string and interned_string unnecessarily), and
  RAM (2 pointers when empty, vs. 1 for std::string), and its
  instances are not modifiable, so tradeoffs must be confirmed with
  tools like memusage, massif, perf-stat, etc.
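
The interning tradeoff can be seen in a small sketch. This is not the
translator's interned_string: toy_interned_string is a made-up class
that uses std::string_view (C++17) where the real type derives from
boost::string_ref, and a process-wide std::unordered_set as the pool.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_set>

class toy_interned_string {
 public:
  toy_interned_string() = default;
  explicit toy_interned_string(std::string_view text) : view_(intern(text)) {}

  std::string_view view() const { return view_; }

  // Converting back to std::string copies the bytes: the CPU cost
  // the notes above warn about.
  std::string str() const { return std::string(view_); }

  // True when both instances point at the same pooled bytes.
  bool same_storage(const toy_interned_string& other) const {
    return view_.data() == other.view_.data();
  }

  static std::size_t pool_size() { return pool().size(); }

 private:
  // The pool owns one copy of each distinct string.  unordered_set is
  // node-based, so stored strings do not move when the set grows.
  static std::unordered_set<std::string>& pool() {
    static std::unordered_set<std::string> strings;
    return strings;
  }

  // Look up (or add) text in the pool; this is the management cost.
  static std::string_view intern(std::string_view text) {
    auto it = pool().emplace(text).first;
    assert(it != pool().end());
    return *it;
  }

  std::string_view view_;  // pointer + length, read-only
};
```

Two instances built from equal text share one pooled copy, so
duplicates cost a pointer and a length each instead of a full string.
The instances stay read-only, and every construction pays for a pool
lookup.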