This is the mail archive of the
systemtap@sources.redhat.com
mailing list for the systemtap project.
RE: Script tapsets.
- From: "Chen, Brad" <brad dot chen at intel dot com>
- To: "Frank Ch. Eigler" <fche at redhat dot com>, "Vara Prasad" <prasadav at us dot ibm dot com>
- Cc: <systemtap at sources dot redhat dot com>
- Date: Mon, 9 May 2005 12:54:40 -0700
- Subject: RE: Script tapsets.
For the record I'm somewhat sympathetic to Vara's
position. I expect Systemtap to be more successful
if we can lower the barriers to tapset developers
by allowing them to work in a familiar environment
and even re-use C code they've already written for
less elegant kernel monitoring infrastructure. I
see three sets of people involved:
- Systemtap developers: us
- Tapset developers: kernel experts
- Systemtap users: performance experts but not
kernel experts
Frank's interests here sometimes confuse me because he
seems to blur the distinction between the tapset developers
and Systemtap users, even though they have very different
perspectives and different expertise. We absolutely should
work towards a rich script interface that makes C unnecessary;
why though should we prevent tapset developers from working
where they feel most comfortable?
>This is part of the burden of championing a two-language
>solution to the problem: ...
This doesn't seem entirely fair. The "one-language solution"
requires investing all our marbles in an unproven language
that doesn't exist today. And the second language is still
there, as the target of translation of the first language
and as the language kernel experts will be thinking in when
they are trying to craft their scripts. Perhaps we should
be thinking of "one-language" as a goal to work towards
rather than a de facto requirement for the first release.
It appears to me from the discussion below that we are
headed towards a reasonable compromise, a fully capable
scripting language and a C interface for folks who prefer
to write in C.
Brad
-----Original Message-----
From: systemtap-owner@sources.redhat.com
[mailto:systemtap-owner@sources.redhat.com] On Behalf Of Frank Ch. Eigler
Sent: Sunday, May 08, 2005 7:46 AM
To: Vara Prasad
Cc: systemtap@sources.redhat.com
Subject: Re: Script tapsets.
Hi -
It seems to me that there is a large philosophical divide between your
and my preferred approaches to writing systemtap extensions. You
expect people (some subset of users or developers) to prefer raw C as
much as possible, and I expect the opposite. You would like a rich C
interface that makes script unnecessary, I would like a rich script
interface that makes C unnecessary. You ask for justification for
doing something in script when also perhaps possible in C, and I vice
versa. You appear interested in systemtap as being a library for
writing kprobes routines in C; I am interested in kprobes as one of a
number of implicit backends for writing scripts.
Luckily, we don't have to agree to make progress.
varap wrote:
> [...]
> You mentioned tapsets are stored in a library used in elaboration
> phase. When you say library does this mean compiled .o or .a form or
> just script sources themselves?
Just the script sources, since there is no "partial compilation"
facility being contemplated for scripts. This makes sense since
the translator needs global program information in order to
perform type inferences, to compute proper declarations for
all the supporting structs, and probably other reasons.
> How about we let folks write the tapset functions in scripts or "C"
> but we will generate the code to "C" form and compile it into a
> module that can be loaded independently [or] at least make them in
> the form of a library [...]
Once a C interface to the translator/runtime is designed, such
packaging options are likely to be supported.
> [...] In your write up you mentioned "The following script defines
> a new "event" and supplies some variables for use by its handlers" I
> am thinking "event" in the above statement means a "probe point", is
> that right.
I was referring to the event of a probe point being fired.
> [...]
> victim_tgid = $tsk->tgid;
> [...]
> The main interesting piece of code in the above is code generated to
> get local variables tsk and address.
> If these local variables are made available let us say through a
> function or macro call writing the above code in C is trivial as well.
As you agree later, this is far from trivial. An introspection
library for C is beyond what systemtap needs to offer to script users.
> The problems with this script based approach that i can see are
> 1) These scripts are going to live outside the kernel code hence
> maintenance is a major problem.
It would be good to gather data to support and quantify this
hypothesis.
> The problem is even more severe as we access data structures directly
> not through an advertised API.
True, but at the same time some advertised APIs are not suitable for
traversal from within contexts such as from interrupt handlers.
> 2) It is difficult, if not impossible, to convince kernel developers to
> learn a new scripting language for dynamic tracing when they have the
> luxury of rebuilding the kernel at will.
And yet I expect even they would prefer to avoid a rebuild/reboot,
other things being equal.
> I personally think without the help of kernel developers we can not
> come up with good tapsets in all the areas as we are not experts in
> each subsystem.
The group of "kernel developers" is too amorphous to agree or disagree
with this. The set of experts for any given area may or may not match
the set of people who might refuse to use script, or who may be
willing to maintain instrumentation code in their area.
> 3) Another problem is if the variables needed in the probe handlers are
> declared local to the "C" files, script based tapsets can not be used.
> [...]
Why do you think so? Consider the model of a debugger supervising a
stopped program, not another C program linking to another. A debugger
can make references to "local" (static?) variables. For systemtap, we
just need a syntax for making references to symbols outside the
default lookup algorithm.
> 4) If you take the above example it is not clear to me how we are going
> to figure out which header file has the definition of struct task and
> what are all the dependent header files that we have to include in order
> to compile the above generated code.
To access "$globalptr->field", the translator need emit *no* #includes
for the declarations of typeof(globalptr). That's because this
dereference operation would be expanded to the same sort of dwarf
walking expression already shown to access function parameters and
locals. Field names and types would be resolved within the
translator, and would show up only as machine level
pointer/offset/dereference operations.
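
To illustrate the point above, here is a minimal sketch of what "pointer/offset/dereference" access could look like once field names and types have been resolved ahead of time. Everything named here (struct demo_task, struct field_ref, fetch_field) is hypothetical and is not taken from the systemtap sources. In a real translator the offset and size would come from the debugging information; the sketch uses offsetof() only to build test data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kernel structure. The accessor below never
 * refers to this declaration; it exists only so a test object can be built. */
struct demo_task {
    int  pid;
    int  tgid;
    long flags;
};

/* Hypothetical record of one resolved field: where it lives inside the
 * object and how wide it is. A translator would fill this in from the
 * debugging information rather than from a C header. */
struct field_ref {
    size_t offset;
    size_t size;
};

/* Read an integer field using only a base address, an offset and a size.
 * No declaration of the pointed-to type is needed at this point. */
static long fetch_field(const void *base, struct field_ref f)
{
    const char *addr = (const char *)base + f.offset;

    if (f.size == sizeof(int)) {
        int v;
        memcpy(&v, addr, sizeof v);
        return v;
    }

    /* This sketch only handles int-sized and long-sized fields. */
    assert(f.size == sizeof(long));
    long v;
    memcpy(&v, addr, sizeof v);
    return v;
}
```

A caller would pass the object's address together with a field_ref describing, say, the tgid field. The accessor itself compiles without any #include that declares struct demo_task, which is the property being described.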
> 5) The example you have provided is simple enough hence it doesn't
> really matter if we write in "C" or script but if we have a complicated
> one where one might have to do some locking and traverse a list and
> compute some values etc., i am not sure it is easy to express that in
> systemtap limited language.
Yes, these operations are still missing. They may end up with some
respectable expression in the script language, or else may force
descent into C.
> If you look from an existing kprobes users point of view, as they are
> potential tapset writers, what they want out of systemtap is
> 1) A Convenient way to access local variables and arguments any where
> in the function. Function entry is achieved through jprobes now.
> 2) An enhancement to Kprobes API so that they can specify the probe
> point location in a more portable fashion than the current hex address
> format.
> [...]
... and yet neither of these is practical without access to the
debugging information. Is this the sort of person for whom dprobes
was written?
> [...]
> One way to solve the first problem is let systemtap consult debug data
> and pass the required variable to the handlers. The above example would
> look like the following.
>
> dopgflt_outofmem_handler (struct pt_regs *regs, struct task *tsk,
>                           unsigned long addr, void *buf, int bufsize) {
>
>     if (tsk->uid != 0)
>     {
>         /* copy relevant variables to buf */
>     }
> }
Your examples need more meat. If this probe handler was written in C,
how is the translator supposed to know what variables it might like to
have extracted; how it is supposed to decode the "buf" contents; how
the script code may call to it, or be called by it; ...
This is part of the burden of championing a two-language solution to
the problem: you must work out how it should look from both ends, what
information is available on each side, how concepts map, what the
build implications are, and so on.
> [...] Looking at Dtrace papers it makes me feel fairly certain that
> Dtrace providers are written in C but i don't know how they solved
> this problem of accessing local variables [...]
They do not access general local variables. They can access certain
specially designated variables: function arguments and return values
(since the ABI fixes their location), and others identified by a
static instrumentation macro. Look up the dtrace "probe site"
mechanism that uses the DTRACE_PROBE* family of macros. A variant of
this can be supported by systemtap, even without the extra
"provider { ... }" declarations on the script side.
- FChE
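
As an illustration of the "static instrumentation macro" idea mentioned in the last reply, the following is a minimal sketch of a probe-site macro that hands specially designated values to an optional handler. The names (DEMO_PROBE2, demo_probe_handler, demo_record, demo_do_work) are hypothetical. This is not the DTRACE_PROBE* implementation, and it is not a systemtap interface; it only shows the general shape of marking a probe site and its arguments in the instrumented source.

```c
#include <stddef.h>

/* Hypothetical handler type: receives the probe name and two arguments. */
typedef void (*demo_probe_fn)(const char *name,
                              unsigned long a0, unsigned long a1);

/* No handler is attached by default, so a probe site does nothing. */
static demo_probe_fn demo_probe_handler = NULL;

/* Hypothetical probe-site macro. The instrumented code names the values
 * it wants to expose, so the handler never has to locate locals itself. */
#define DEMO_PROBE2(name, a0, a1)                                   \
    do {                                                            \
        if (demo_probe_handler)                                     \
            demo_probe_handler(#name, (unsigned long)(a0),          \
                               (unsigned long)(a1));                \
    } while (0)

/* Example handler that just records what it was given. */
static int demo_hits;
static unsigned long demo_last_a0, demo_last_a1;

static void demo_record(const char *name,
                        unsigned long a0, unsigned long a1)
{
    (void)name;            /* name is unused in this sketch */
    demo_hits++;
    demo_last_a0 = a0;
    demo_last_a1 = a1;
}

/* Instrumented function: the probe site designates its two arguments. */
static int demo_do_work(int request, int bytes)
{
    DEMO_PROBE2(work__start, request, bytes);
    return request + bytes;
}
```

With no handler attached, demo_do_work() runs as if the macro were absent apart from one pointer test. Assigning demo_record to demo_probe_handler makes each call deliver the two designated values to the handler.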