This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.



embedded-C proposal


Hi -

Here's a possible C extension mechanism for systemtap "guru-mode"
scripts.  It's a synthesis of ideas from zanussi, graydon, and others.
There would be just two constructs: two places where a script can
include embedded C code: at the top level, and as a function body.

C code at the top level would be enclosed between "%{" and "%}"
markers, sort of like yacc.  The translator would transcribe, without
analysis, all such blocks to near the beginning of the synthesized C
code.  Such blocks would be suitable for adding #include directives or
defining types / auxiliary functions.  One could look like this:

%{
#include <linux/zoo.h>

animal_t find_the_zebra () {
   /* assumes valid context */
   return current->zoo[ZEBRA];
}
%}
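
To make "transcribe, without analysis" concrete, here is a rough
user-space sketch of that step.  It is not translator code: the name
transcribe_embedded_c and its buffer-based interface are made up for
illustration.  It also copies every "%{" ... "%}" block it finds, without
telling top-level blocks from function bodies, and it ignores markers
that happen to sit inside strings or comments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the text of every "%{ ... %}" block in
   `script` into `out` (capacity `outsz`), verbatim and in order.
   Returns the number of blocks copied, or -1 if a block is left
   unterminated or `out` is too small. */
static int transcribe_embedded_c (const char *script, char *out, size_t outsz)
{
  size_t used = 0;
  int blocks = 0;
  const char *p = script;

  assert (script != NULL && out != NULL);
  if (outsz == 0)
    return -1;
  out[0] = '\0';

  while ((p = strstr (p, "%{")) != NULL)
    {
      const char *start = p + 2;              /* first byte after "%{" */
      const char *end = strstr (start, "%}"); /* matching close marker */
      size_t len;

      if (end == NULL)
        return -1;                            /* missing "%}" */
      len = (size_t) (end - start);
      if (used + len + 1 > outsz)
        return -1;                            /* output buffer too small */
      memcpy (out + used, start, len);        /* no parsing, just copying */
      used += len;
      out[used] = '\0';
      blocks++;
      p = end + 2;                            /* resume after "%}" */
    }
  return blocks;
}
```

The point of the sketch is that nothing between the markers is ever
inspected, so any mistake in the embedded text only shows up when the C
compiler sees the synthesized output.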

The other place for embedded C is as bodies of script-level functions.
So, the developer could use the same "%{" "%}" brackets to indicate a
C body:

function find(z) %{
   /* parameters and the return value are both reached through THIS */
   THIS.__retvalue = (long long) find_the_zebra ();
   printk (KERN_DEBUG "z=%s\n", THIS.z);
%}

where THIS is a macro supplied by the translator, and stands for
   c->locals[c->nesting].<my-function-name>

The macro allows reuse of the exact same calling convention already
used for the translation of normal script functions.  The calling
convention would have to be stoned, or, cast into stone to allow a
library of such embedded-C scripts to be accumulated.
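
As a rough illustration of that calling convention, here is a user-space
mock-up of what the synthesized code around an embedded-C body might
look like.  Everything except THIS, __retvalue, and the
c->locals[c->nesting] expression is an assumption made for the sketch:
struct context, struct function_find_locals, MAXNESTING, and call_find
are invented names, and strlen stands in for a real kernel lookup so
that the example can run outside the kernel.

```c
#include <assert.h>
#include <string.h>

#define MAXNESTING 4   /* assumed limit, for illustration only */

/* Hypothetical frame for the script function "find": its parameter
   and its return value live side by side. */
struct function_find_locals {
  const char *z;          /* script parameter z (a string here) */
  long long __retvalue;   /* value handed back to the caller */
};

/* Hypothetical translator context: a stack of frames indexed by
   the current nesting depth. */
struct context {
  int nesting;
  union {
    struct function_find_locals function_find;
    /* ... one member per script function ... */
  } locals[MAXNESTING];
};

/* What the translator might emit around a transcribed body. */
#define THIS c->locals[c->nesting].function_find
static void function_find (struct context *c)
{
  /* --- transcribed embedded-C body begins --- */
  THIS.__retvalue = (long long) strlen (THIS.z);
  /* --- transcribed embedded-C body ends --- */
}
#undef THIS

/* A caller using the same convention: push a frame, store the
   argument, call, read the result, pop the frame. */
static long long call_find (struct context *c, const char *z)
{
  long long result;

  assert (c->nesting + 1 < MAXNESTING);
  c->nesting++;
  c->locals[c->nesting].function_find.z = z;
  function_find (c);
  result = c->locals[c->nesting].function_find.__retvalue;
  c->nesting--;
  return result;
}
```

Since the body sees only THIS, the same text works whether the function
was written in script or in C, which is why the frame layout would need
to stay fixed once a library of such functions exists.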

As before, the translator would simply transcribe the enclosed text,
without analysis, into the synthesized C output.  Because of this, it
cannot perform type inference on the return/parameter types, so this
information must be inferred from another context (call sites
elsewhere in the script).  

Type or syntax errors would be caught during the compilation pass
(-p4).  Logic errors in the embedded C code, like infinite loops,
resource exhaustion, synchronization bugs, etc., might crash the
machine during the run pass (-p5).


And that's it!  This facility seems sufficient to express unprotected
traversal of target pointers, extraction of special kernel data, and
many other dangerous and exciting things.  And it requires very
little extra theory and nearly no work from the translator (to me,
those are the best parts).

Some notable corollaries.  I don't think we'll need an analogous
C-embedding method for probe handlers.  Let script probe handlers call
into embedded-C functions if they need the help, and pass any
target-side values necessary.

Also, it should be unnecessary for the embedded-C code to interact
with the rest of the translator's output.  In particular, that code
should not play around with the script's global variables.  It could
perhaps make calls into the runtime ... or ideally, it shouldn't, in
order to reduce interface coupling.

Comments?


- FChE

