Bug 10204 - Place userspace markers in systemtap itself
Summary: Place userspace markers in systemtap itself
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: tapsets (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-05-26 21:22 UTC by Josh Stone
Modified: 2009-08-05 19:23 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
patch adding static probes to stap, staprun, stapio (2.71 KB, patch)
2009-06-12 21:01 UTC, Kent Sebastian
Details | Diff
updated patch for static probes (2.96 KB, patch)
2009-07-24 20:57 UTC, Kent Sebastian
Details | Diff
tapset for stap/staprun/stapio static probes (1.70 KB, text/plain)
2009-07-24 21:00 UTC, Kent Sebastian
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Josh Stone 2009-05-26 21:22:55 UTC
Perhaps we should eat our own dogfood, and start placing userspace markers in
our own programs.  Some simple candidates off the top of my head:

= stap
  - start/end of each phase
  - launching other processes [stap_system()]
  - various cache activity

= staprun
  - loading/unloading the module

= stapio
  - send/recv control messages
  - recv data messages

Not all of these would necessarily need markers, but they could all go into a
stap.* tapset in whatever form makes sense.
Comment 1 Kent Sebastian 2009-06-12 21:01:11 UTC
Created attachment 3992 [details]
patch adding static probes to stap, staprun, stapio

This patch adds static probes in the following places, with the variables
available:

stap
-main.cxx
--start/end of pass0 (systemtap_session s)
--start of pass1 user script parse (systemtap_session s)
--start of pass1 library script parse (systemtap_session s)
--end of pass1 (systemtap_session s)
--start/end of pass2,3,4,5,6 (systemtap_session s)
-util.cxx
--start of stap_system (char* command)
--just after posix_spawn in stap_system (int ret, spawned_pid)
-cache.cxx
--start of add_to_cache (systemtap_session s)
--start of get_from_cache (systemtap_session s)
--start of clean_cache (systemtap_session s)

where the pass start is after the gettimeofday for that pass, and the end is
before a failure point for that pass

staprun
-staprun_funcs.c
--just before init_module in insert_module (no vars)
-staprun.c
--just before delete_module in remove_module (no vars)

stapio
-common.c
--start of send_request (void* data, int len)
-mainloop.c
--just after a received message (int type, void* data, ssize_t nb)

A quick test and they look to be working. Comments on the number of the probes?
Placement? Variables? Something else I'm doing wrong? :)
Comment 2 Frank Ch. Eigler 2009-06-16 08:46:34 UTC
Kent, thanks, good start.

A test case that attaches to the markers (while a target stap -p5
runs hello-world) would be useful, so that you can inspect the results.

The marker parameters need to be chosen so that based upon those values
alone one can perform basic analysis.  For example, we don't just want
to know that a module was inserted/removed by staprun -- which module?

The stap-side markers are probably not so interesting - except as a demo.
Maybe it can be used as a supplement to "stap -v" in order to trace
known trouble spots, such as major elfutils interactions (focus_on_module?).

But staprun-side ones are perhaps useful in themselves for auditing.
So events that a sysadmin may be interested in should be listed.  So again,
adding/removing modules, forking system() processes, plus important log 
buffering events, signature verification (if any code is in there already),
attach/detach, and so on.

Comment 3 Josh Stone 2009-06-17 02:33:29 UTC
(In reply to comment #1)
> Created an attachment (id=3992)
> patch adding static probes to stap, staprun, stapio
> 
> This patch adds static probes in the following places, with the variables
> available:
> 
> stap
> -main.cxx
> --start/end of pass0 (systemtap_session s)
> --start of pass1 user script parse (systemtap_session s)
> --start of pass1 library script parse (systemtap_session s)
> --end of pass1 (systemtap_session s)
> --start/end of pass2,3,4,5,6 (systemtap_session s)

These are good.

> -util.cxx
> --start of stap_system (char* command)
> --just after posix_spawn in stap_system (int ret, spawned_pid)

How about one more for the after command has completed?

> -cache.cxx
> --start of add_to_cache (systemtap_session s)
> --start of get_from_cache (systemtap_session s)
> --start of clean_cache (systemtap_session s)

In each of these, I would be interested to know what files are affected, not
just that we called the function.

> staprun
> -staprun_funcs.c
> --just before init_module in insert_module (no vars)
> -staprun.c
> --just before delete_module in remove_module (no vars)

As Frank said, the module path would be useful.

> stapio
> -common.c
> --start of send_request (void* data, int len)
> -mainloop.c
> --just after a received message (int type, void* data, ssize_t nb)

Can't you provide the type for send too?

> A quick test and they look to be working. Comments on the number of the probes?
> Placement? Variables? Something else I'm doing wrong? :)

Generally these look good, but I'm seeing a lot of skipped probes when I try to
trace even a simple hello world.  That's probably not your fault, but we may
need to robustify the uprobes a bit.  If nothing else, we now have a
semi-complicated test case. :)
Comment 4 Kent Sebastian 2009-07-24 20:57:13 UTC
Created attachment 4083 [details]
updated patch for static probes

Changes are as follows:

* cache__add split into: cache__add__module, cache__add__nss,
cache__add__source with each one having the source and destination path of the
file in question
* cache__get has the full path to the .c and .ko it grabbed
* cache__clean has the full path to the entry about to be removed, and probe
point moved to right before the call to unlink
* send__ctlmsg includes the type of message 
* remove__module has the name of the module to be removed
* insert__module has the full path to the module about to be inserted
* stap__system__complete placed after waitpid() with the return code 

For the send/recv control messages, a pointer to the data blob is provided --
would it be more useful to copy the bytes for the user to use or assume that if
the user wants the data they can do it themselves?

Also, is there a use for a probe in remove_all_modules()?

I'll attach a tapscript with all these aliased to nice names and with
appropriate var names so people can play with it and suggest improvements :)
Comment 5 Kent Sebastian 2009-07-24 21:00:57 UTC
Created attachment 4084 [details]
tapset for stap/staprun/stapio static probes

Another question: should the path be /usr/local/bin or /usr/bin, or both?
Currently it's /usr/local..
Comment 6 Mark Wielaard 2009-07-26 11:16:28 UTC
(In reply to comment #5)
> Another question: should the path be /usr/local/bin or /usr/bin, or both?
> Currently it's /usr/local..

Personally I install in /usr/local/systemtap, so neither would work for me.
Would it be possible to make this depend on the given --prefix?

Something like renaming it to stap_staticprobes.stp.in, replacing the hardcoded
paths with @prefix@ and adding it to AC_CONFIG_FILES in configure.ac?
Comment 7 Frank Ch. Eigler 2009-07-26 11:28:15 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > Another question: should the path be /usr/local/bin or /usr/bin, or both?
> > Currently it's /usr/local..

We have $PATH searching in process probes.  Just use process("stap").

Comment 8 Josh Stone 2009-07-28 18:14:56 UTC
(In reply to comment #5)
> Created an attachment (id=4084)
> tapset for stap/staprun/stapio static probes
> 
> probe staprun.send_control_message =
process("/usr/local/bin/staprun").mark("send__ctlmsg") {

Isn't this probe really in stapio?

(In reply to comment #7)
> We have $PATH searching in process probes.  Just use process("stap").

That will work for stap and staprun, but we keep stapio in a private libexec. 
Since that one will need @prefix@ munging, we might as do it for stap and
staprun too just to be precise.
Comment 9 Kent Sebastian 2009-07-31 20:56:16 UTC
Committed as of 67a31a9f57db3f8108e31522b3cf5a6d7d604f22. After talking it over
a bit with fche, due to how the tapsets are processed (at make install time vs
configure time) I commented out the stapio probe in the tapset for now. The
corresponding static marker is still committed however.

I'll leave the bug open but reassign it.

(In reply to comment #8)
> > probe staprun.send_control_message =
> process("/usr/local/bin/staprun").mark("send__ctlmsg") {
> 
> Isn't this probe really in stapio?

The marker is in the send_request function, which is in
runtime/staprun/common.c, so I guess not :)
Comment 10 Josh Stone 2009-07-31 22:18:27 UTC
> (In reply to comment #8)
> > > probe staprun.send_control_message =
> > process("/usr/local/bin/staprun").mark("send__ctlmsg") {
> > 
> > Isn't this probe really in stapio?
> 
> The marker is in the send_request function, which is in
> runtime/staprun/common.c, so I guess not :)

Here "common" means shared by both stapio and staprun, so the mark should
actually manifest in both.  staprun's use of send_request is only for loading
kernel module relocations; stapio's messages are more interesting IMO.
Comment 11 Frank Ch. Eigler 2009-08-05 19:23:28 UTC
committed