The Systemtap Translator - a tour on the inside

Outline:
- general principles
- main data structures
- pass 1: parsing
- pass 2: semantic analysis (parts 1, 2, 3)
- pass 3: translation (parts 1, 2)
- pass 4: compilation
- pass 5: run

------------------------------------------------------------------------
Translator general principles

- written in standard C++
- mildly O-O, sparing use of C++ features
- uses "visitor" concept for type-dependent (virtual) traversal
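
The "visitor" idea can be sketched with a miniature syntax tree. This is
only an illustration: the node and visitor names are hypothetical and are
not copied from <staptree.h>. Each node type forwards to its own visit_*
callback, so one traversal (here, a constant evaluator) is written once
without type switches.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical miniature AST; all names are illustrative only.
struct literal_number;
struct binary_expression;

// One callback per concrete node type.
struct visitor {
  virtual ~visitor() = default;
  virtual void visit_literal_number(literal_number& e) = 0;
  virtual void visit_binary_expression(binary_expression& e) = 0;
};

struct expression {
  virtual ~expression() = default;
  // Virtual dispatch selects the callback matching the node's dynamic type.
  virtual void visit(visitor& v) = 0;
};

struct literal_number : expression {
  long value;
  explicit literal_number(long v) : value(v) {}
  void visit(visitor& v) override { v.visit_literal_number(*this); }
};

struct binary_expression : expression {
  char op;  // '+' or '*' in this sketch
  std::unique_ptr<expression> left, right;
  binary_expression(char o, std::unique_ptr<expression> l,
                    std::unique_ptr<expression> r)
      : op(o), left(std::move(l)), right(std::move(r)) {}
  void visit(visitor& v) override { v.visit_binary_expression(*this); }
};

// One possible traversal: evaluate a constant expression.
struct evaluator : visitor {
  long result = 0;
  void visit_literal_number(literal_number& e) override { result = e.value; }
  void visit_binary_expression(binary_expression& e) override {
    e.left->visit(*this);
    long l = result;
    e.right->visit(*this);
    long r = result;
    result = (e.op == '+') ? l + r : l * r;
  }
};

// Builds (2 + 3) * 4 and evaluates it with the visitor.
long evaluate_example() {
  auto sum = std::make_unique<binary_expression>(
      '+', std::make_unique<literal_number>(2),
      std::make_unique<literal_number>(3));
  binary_expression product('*', std::move(sum),
                            std::make_unique<literal_number>(4));
  evaluator ev;
  product.visit(ev);
  return ev.result;
}
```

A second traversal (say, a pretty-printer) would be another visitor
subclass; the node types would not need to change.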

------------------------------------------------------------------------
Main data structures

- abstract syntax tree <staptree.h>
  - family of types and subtypes for language parts: expressions,
    literals, statements
  - includes outermost constructs: probes, aliases, functions
  - an instance of "stapfile" represents an entire script file
  - each annotated with a token (script source coordinates)
  - data persists throughout run

- session <session.h>
  - contains run-time parameters from command line
  - contains all globals
  - passed by reference to many functions
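
A rough picture of how these pieces relate, as a sketch only. Every type
and field name below is a hypothetical stand-in, not the declaration found
in <staptree.h> or <session.h>: tree nodes carry a token with source
coordinates, a file object owns the outermost constructs, and a single
session object is handed by reference to the code that needs it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Script source coordinates attached to each tree node (hypothetical layout).
struct token {
  std::string file;
  int line;
  int column;
};

struct probe_decl    { token tok; std::string point; };
struct function_decl { token tok; std::string name; };

// Stand-in for one parsed script file and its outermost constructs.
struct script_file {
  std::string name;
  std::vector<probe_decl> probes;
  std::vector<function_decl> functions;
};

// Stand-in for the session: command-line parameters plus global state.
struct session_state {
  bool guru_mode = false;
  int verbose = 0;
  std::vector<script_file> files;
};

// Formats a token for diagnostics, e.g. "a.stp:3:1".
std::string describe(const token& t) {
  return t.file + ":" + std::to_string(t.line) + ":" + std::to_string(t.column);
}

// The session is passed by reference, never copied.
void register_file(session_state& s, script_file f) {
  s.files.push_back(std::move(f));
}

std::size_t count_probes(const session_state& s) {
  std::size_t n = 0;
  for (const auto& f : s.files) n += f.probes.size();
  return n;
}
```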

------------------------------------------------------------------------
Pass 1 - parsing

- hand-written recursive-descent <parse.cxx>
- language specified in man page <stap.1>
- reads user-specified script file
- also searches path for all <*.stp> files, parses them too
- => syntax errors are caught immediately, throughout tapset
- now includes baby preprocessor
    probe kernel.
      %( kernel_v == "2.6.9" %? inline("foo") %: function("bar") %)
      { }
- enforces guru mode for embedded code %{ C %}
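
For readers unfamiliar with the technique, here is a minimal
recursive-descent parser. It handles a toy arithmetic grammar, not the
systemtap language, and the class and function names are made up for this
sketch. The point is the shape: one function per grammar rule, each
consuming input and calling the rules beneath it, with a syntax error
raised at the first token that does not fit.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Toy grammar (assumption, unrelated to the real script grammar):
//   expr   := term   ('+' term)*
//   term   := factor ('*' factor)*
//   factor := NUMBER | '(' expr ')'
class mini_parser {
public:
  explicit mini_parser(std::string text) : input(std::move(text)) {}

  long parse() {
    long v = parse_expr();
    skip_space();
    if (pos != input.size()) fail("unexpected trailing input");
    return v;
  }

private:
  std::string input;
  std::size_t pos = 0;

  [[noreturn]] void fail(const std::string& msg) const {
    throw std::runtime_error("parse error at offset " + std::to_string(pos) +
                             ": " + msg);
  }

  bool at_digit() const {
    return pos < input.size() &&
           std::isdigit(static_cast<unsigned char>(input[pos]));
  }

  void skip_space() {
    while (pos < input.size() &&
           std::isspace(static_cast<unsigned char>(input[pos])))
      ++pos;
  }

  // Consumes the character c if it is next; otherwise leaves input alone.
  bool accept(char c) {
    skip_space();
    if (pos < input.size() && input[pos] == c) { ++pos; return true; }
    return false;
  }

  long parse_expr() {
    long v = parse_term();
    while (accept('+')) v += parse_term();
    return v;
  }

  long parse_term() {
    long v = parse_factor();
    while (accept('*')) v *= parse_factor();
    return v;
  }

  long parse_factor() {
    if (accept('(')) {
      long v = parse_expr();
      if (!accept(')')) fail("expected ')'");
      return v;
    }
    if (!at_digit()) fail("expected number");
    long v = 0;
    while (at_digit()) v = v * 10 + (input[pos++] - '0');
    return v;
  }
};

long parse_arithmetic(const std::string& s) { return mini_parser(s).parse(); }

bool has_syntax_error(const std::string& s) {
  try { parse_arithmetic(s); return false; }
  catch (const std::runtime_error&) { return true; }
}
```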

------------------------------------------------------------------------
Pass 2 - semantic analysis - step 1: resolve symbols

- code in <elaborate.cxx>
- want to know all global and per-probe/function local variables
- one "vardecl" instance interned per variable
- fills in "referent" field in AST for nodes that refer to it
- collect "needed" probe/global/function list in session variable
- loop over file queue, starting with user script "stapfile"
  - add to "needed" list this file's globals, functions, probes
  - resolve any symbols used in this file (function calls, variables)
    against "needed" list
  - if not resolved, search through all tapset "stapfile" instances;
    add to file queue if matched
  - if still not resolved, create as local scalar, or signal an error
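
The file-queue loop above can be sketched as a worklist algorithm. This is
a simplified model under stated assumptions: files are reduced to the set
of symbols they define and the set they use, and all names (script_file,
resolution, resolve_symbols) are hypothetical. Symbols that match nothing
are merely collected; the real translator would then decide between
creating a local scalar and signalling an error.

```cpp
#include <cassert>
#include <deque>
#include <set>
#include <string>
#include <vector>

// Simplified stand-in for a parsed script or tapset file.
struct script_file {
  std::string name;
  std::set<std::string> defines;  // globals, functions, probes it provides
  std::set<std::string> uses;     // symbols it refers to
};

struct resolution {
  std::vector<std::string> needed_files;  // in the order they were pulled in
  std::set<std::string> unresolved;       // matched nowhere
};

resolution resolve_symbols(const script_file& user,
                           const std::vector<script_file>& tapset) {
  resolution out;
  std::set<std::string> needed;  // symbols provided by files processed so far
  std::set<std::string> queued;  // file names already placed on the queue
  std::deque<const script_file*> queue{&user};
  queued.insert(user.name);

  while (!queue.empty()) {
    const script_file* f = queue.front();
    queue.pop_front();
    out.needed_files.push_back(f->name);
    needed.insert(f->defines.begin(), f->defines.end());

    for (const auto& sym : f->uses) {
      if (needed.count(sym)) continue;  // resolved against the "needed" list
      bool found = false;
      for (const auto& t : tapset) {
        if (t.defines.count(sym)) {
          found = true;
          // Queue the providing tapset file once; it is scanned later.
          if (queued.insert(t.name).second) queue.push_back(&t);
          break;
        }
      }
      if (!found) out.unresolved.insert(sym);
    }
  }
  return out;
}
```

Only tapset files that are reachable from the user script end up in
needed_files; an unrelated tapset file is parsed but never pulled in.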

------------------------------------------------------------------------
Pass 2 - semantic analysis - step 2: resolve types

- fills in "type" field in AST
- iterate along all probes and functions, until convergence
- infer types of variables from usage context / operators:
    a = 5          # a is a pe_long
    b["foo",a]++   # b is a pe_long array with indexes pe_string and pe_long
- loop until no further variable types can be inferred
- signal error if any still unresolved
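
The "iterate until convergence" step is a fixed-point computation. The
sketch below models only scalar assignments and is not the translator's
algorithm: pe_long and pe_string come from the text above, while
pe_unknown, the assignment struct and infer_types are assumptions made for
this example. Each sweep propagates whatever is already known; the loop
ends when a sweep changes nothing.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

enum exp_type { pe_unknown, pe_long, pe_string };  // pe_unknown is assumed

// "target = source_var" when source_var is non-empty,
// otherwise "target = <literal of literal_type>".
struct assignment {
  std::string target;
  std::string source_var;
  exp_type literal_type;
};

std::map<std::string, exp_type> infer_types(const std::vector<assignment>& stmts) {
  std::map<std::string, exp_type> types;
  for (const auto& s : stmts) {
    types[s.target];  // value-initialised to pe_unknown
    if (!s.source_var.empty()) types[s.source_var];
  }

  bool changed = true;
  while (changed) {  // repeat until a full sweep infers nothing new
    changed = false;
    for (const auto& s : stmts) {
      exp_type rhs = s.source_var.empty() ? s.literal_type : types[s.source_var];
      exp_type& lhs = types[s.target];
      if (lhs == pe_unknown && rhs != pe_unknown) {
        lhs = rhs;  // type flows from right to left
        changed = true;
      } else if (lhs != pe_unknown && rhs == pe_unknown && !s.source_var.empty()) {
        types[s.source_var] = lhs;  // type flows from left to right
        changed = true;
      } else if (lhs != pe_unknown && rhs != pe_unknown && lhs != rhs) {
        throw std::runtime_error("type mismatch for " + s.target);
      }
    }
  }

  for (const auto& kv : types)
    if (kv.second == pe_unknown)
      throw std::runtime_error("unresolved type for " + kv.first);
  return types;
}
```

With the statements "c = b", "b = a", "a = 5" in that order, the first
sweep types only a, the second types b, the third types c, and a fourth
sweep confirms convergence.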

------------------------------------------------------------------------
Pass 2 - semantic analysis - step 3: resolve probes

- probe points turned to "derived_probe" instances by code in <tapsets.cxx>
- derived_probes know how to talk to kernel API for registration/callbacks
- aliases get expanded at this point
- some probe points ("begin", "end", "timer*") are very simple
- dwarf ("kernel*", "module*") implementation very complicated
  - target-variables "$foo" expanded to getter/setter functions
    with synthesized embedded-C
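
A much-reduced model of this step, assuming probe points are plain strings
and aliases are a name-to-targets table. None of the names below come from
<tapsets.cxx>, and the alias entries used with it are invented. Aliases are
expanded recursively, then each remaining probe point is classified by its
leading component into one of the families listed above.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

enum class probe_kind { begin, end, timer, dwarf };

// Hypothetical stand-in for a derived probe: just its family and its point.
struct derived_probe_sketch {
  probe_kind kind;
  std::string point;
};

using alias_map = std::map<std::string, std::vector<std::string>>;

static bool starts_with(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Expands aliases, then appends one derived probe per concrete probe point.
void derive(const std::string& point, const alias_map& aliases,
            std::vector<derived_probe_sketch>& out, int depth = 0) {
  if (depth > 16)  // arbitrary guard against self-referential aliases
    throw std::runtime_error("alias recursion too deep: " + point);

  auto it = aliases.find(point);
  if (it != aliases.end()) {
    for (const auto& target : it->second) derive(target, aliases, out, depth + 1);
    return;
  }

  if (point == "begin")
    out.push_back({probe_kind::begin, point});
  else if (point == "end")
    out.push_back({probe_kind::end, point});
  else if (starts_with(point, "timer"))
    out.push_back({probe_kind::timer, point});
  else if (starts_with(point, "kernel") || starts_with(point, "module"))
    out.push_back({probe_kind::dwarf, point});
  else
    throw std::runtime_error("unknown probe point: " + point);
}
```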

------------------------------------------------------------------------
Pass 3 - translation - step 1: data

- <translate.cxx>
- we now know all types, all variables
- strings are everywhere copied by value (MAXSTRINGLEN bytes)
- emit data storage mega-struct "context" for all probes/functions
- array instantiated per-CPU, per-nesting-level
- can be pretty big static data
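
To see why the static data can get large, consider this sketch of the
layout. The translator emits C; the example is written in C++ to match the
other examples, and the constants and field names are assumptions (only
MAXSTRINGLEN and "context" are names taken from the text). Every string
local is a full fixed-size buffer, and the whole struct is replicated for
each CPU and nesting level.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t MAXSTRINGLEN = 128;  // assumed value for this sketch
constexpr std::size_t NUM_CPUS = 4;        // assumed
constexpr std::size_t MAX_NESTING = 2;     // assumed

// One struct holding storage for all probes and functions (hypothetical fields).
struct context {
  bool busy;
  struct {
    long count;
    char name[MAXSTRINGLEN];  // string local: a whole buffer, not a pointer
  } probe_locals;
  struct {
    char arg[MAXSTRINGLEN];
    long retval;
  } function_locals;
};

// One instance per CPU and per nesting level, all statically allocated.
static context contexts[NUM_CPUS][MAX_NESTING];

// Strings are copied by value and truncated to fit the fixed buffer.
void assign_string(char (&dst)[MAXSTRINGLEN], const char* src) {
  std::size_t i = 0;
  for (; i + 1 < MAXSTRINGLEN && src[i] != '\0'; ++i) dst[i] = src[i];
  dst[i] = '\0';
}

constexpr std::size_t static_footprint() { return sizeof(contexts); }
```

The footprint grows as NUM_CPUS * MAX_NESTING * sizeof(context), and
sizeof(context) in turn grows by MAXSTRINGLEN bytes for every string
variable in the script.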

------------------------------------------------------------------------
Pass 3 - translation - step 2: code

- map script functions to C functions taking a context pointer
- map probes to two C functions:
  - one to interface with the probe point infrastructure (kprobes,
    kernel timer): reserves per-cpu context
  - one to implement probe body, just like a script function
- emit global startup/shutdown routine to manage orderly
  registration/deregistration of probes
- expressions/statements emitted in "natural" evaluation sequence
- emit code to enforce activity-count limits, simple safety tests
- global variables protected by locks
    global k
    function foo () { k ++ }        # write lock around increment
    probe bar { if (k>5) ... }      # read lock around read
- same thing for arrays, except foreach/sort take longer-duration locks
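
The "global k" example could translate into something shaped like the
sketch below. It is a user-space C++ analogue, not the emitted code: the
generated module is C and would use kernel locking primitives, whereas this
sketch substitutes std::shared_timed_mutex. MAXACTION's value, the context
fields and all function names are assumptions. It shows the entry/body
split for a probe, the context pointer passed to every function, the
action budget, and the read and write locks.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>

constexpr int MAXACTION = 1000;  // assumed per-probe statement budget

struct context {
  bool busy = false;
  int actions_remaining = 0;
  const char* last_error = nullptr;
};

static long global_k = 0;                      // script: global k
static std::shared_timed_mutex global_k_lock;  // reader/writer lock for k
static context cpu_context;                    // stand-in for one per-CPU slot

// script: function foo () { k ++ }
void function_foo(context* c) {
  if (--c->actions_remaining < 0) {  // activity-count limit
    c->last_error = "action limit exceeded";
    return;
  }
  std::unique_lock<std::shared_timed_mutex> write_lock(global_k_lock);
  ++global_k;
}

// script: probe bar { if (k>5) ... }   (the probe body)
bool probe_bar_body(context* c) {
  if (--c->actions_remaining < 0) {
    c->last_error = "action limit exceeded";
    return false;
  }
  std::shared_lock<std::shared_timed_mutex> read_lock(global_k_lock);
  return global_k > 5;
}

// Entry point called by the probe infrastructure: reserves the context,
// runs the body, then releases the context. Returns whether the branch ran.
bool probe_bar_entry() {
  context* c = &cpu_context;
  if (c->busy) return false;  // context in use: skip this probe hit
  c->busy = true;
  c->actions_remaining = MAXACTION;
  c->last_error = nullptr;
  bool taken = probe_bar_body(c);
  c->busy = false;
  return taken;
}
```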

------------------------------------------------------------------------
Pass 4 - compilation

- <buildrun.cxx>
- write out C code in a temporary directory
- call into kbuild makefile to build module
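
For orientation, building an out-of-tree kernel module with kbuild
generally needs a one-line makefile naming the object and a make invocation
pointed at the kernel build tree. The helpers below only assemble those two
strings; they are hypothetical and do not reproduce what <buildrun.cxx>
actually writes or runs. Paths are assumed to contain no spaces or shell
metacharacters.

```cpp
#include <cassert>
#include <string>

// Contents of a minimal kbuild makefile for a single-object module.
std::string kbuild_makefile(const std::string& module_name) {
  return "obj-m := " + module_name + ".o\n";
}

// Command line for an external-module build: -C selects the kernel build
// tree, M= selects the directory holding the generated C file and makefile.
std::string kbuild_command(const std::string& kernel_build_dir,
                           const std::string& source_dir) {
  return "make -C " + kernel_build_dir + " M=" + source_dir + " modules";
}
```

A caller would write the generated C file and this makefile into the
temporary directory, then run the command from there.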

------------------------------------------------------------------------
Pass 5 - running

- run "staprun"
- clean up temporary directory

- nothing to it!