This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.

1st draft - Tapset Writer's Guide

From: Mike Mason <mmlnx at us dot ibm dot com>
To: systemtap at sources dot redhat dot com
Date: Wed, 04 Apr 2007 13:47:21 -0700
Subject: 1st draft - Tapset Writer's Guide

Some time ago I said I'd write a Tapset Writer's Guide. I started out trying to do something quite ambitious, but was later encouraged to trim it way down so someone might actually read it :-) Here's the first draft of the trimmed down version. The intent is to give the basic info someone needs to get started writing a tapset. Please look it over and let me know what additions or changes you'd like made.

Thanks,
Mike Mason

+++++++++++++++++++++++++++++++++++

TAPSET WRITER'S GUIDE
---------------------

Tapsets encapsulate knowledge about a kernel subsystem into pre-written probes and functions that can be used by user scripts. Tapsets are analogous to libraries for C programs. They hide the underlying details of a kernel area while exposing the key information needed to manage and monitor that aspect of the kernel. They are typically developed by kernel subject-matter experts.

NOTE: Tapsets are currently implemented as SystemTap scripts and distributed with SystemTap. We hope to convert them to straight C modules at some point and include them in the mainline kernel to make maintenance easier.

This document assumes you are already familiar with SystemTap and the basics of writing a SystemTap script.

The REFERENCE MATERIAL section below lists other sources of SystemTap information. At a minimum, you should study the SystemTap Tutorial, the HACKING file and some of the existing tapsets before attempting to write a tapset yourself.

WHAT SHOULD A TAPSET CONTAIN?

A tapset should expose the high-level data and state transitions of a subsystem. Assume the audience knows little to nothing about the subsystem's low-level details and probably doesn't care. Users who need low-level data typically bypass the tapsets and write custom scripts targeted to a specific problem.

The first step is to create a simple model of your subject area. For example, a model of the process subsystem might include the following:

Key data:

* process ID
* parent process ID
* thread group ID

State transitions:

* created
* running
* stopped
* terminated

NOTE: This is a simple example and not meant to be a complete list.

Use your subsystem expertise to find probe points (function entries and exits) that expose the elements of the model, then define probe aliases for those points. Try to place probes on stable interfaces whenever possible (i.e., functions that are unlikely to change at the interface level). This makes it less likely that the tapset will break due to changes in the kernel. Where kernel version or architecture dependencies are unavoidable, use preprocessing conditionals (see the stap(1) man page for details).
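
As a rough sketch of such a conditional, modeled on the form described in the stap(1) man page: the alias name, both function names and the version cutoff below are hypothetical placeholders, not real kernel history. Substitute the values that apply to your subsystem.

```systemtap
# Hypothetical: assume the probed function was renamed after 2.6.12.
# The conditional is evaluated while the script is parsed, so only one
# of the two function names ever reaches the probe point.
probe example.fault = kernel.function(
    %( kernel_v <= "2.6.12" %? "example_old_fault_fn"
                            %: "example_new_fault_fn" %)
) {
    # probe body goes here
}
```

Scripts that use example.fault do not need to know which kernel they are running on; the version dependency stays inside the tapset.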

For example, process creation can be tracked by probing the copy_process() function. The following defines a probe alias called process.create that inserts a probe at the end of copy_process():

probe process.create = kernel.function("copy_process").return {
   < probe body >
}

This probe point has access to the entry parameters and the return value from copy_process().

In some cases, the same state transition may occur at more than one probe point. For example, data sent through a socket goes through either the sock_sendmsg() function or the do_sock_write() function. A probe to track sends on sockets could be defined as follows:

probe socket.send = kernel.function("sock_sendmsg"),
                    kernel.function("do_sock_write") {
   < probe body >
}

Fill in the probe bodies with the key data available at the probe points. Convert the data into meaningful forms where appropriate (e.g., bytes to kilobytes, state values to strings, etc.). You may need to use auxiliary functions to access or convert some of the data. Auxiliary functions often use embedded C to do things that cannot be done in the SystemTap language, like access structure fields in some contexts, follow linked lists, etc. You can use auxiliary functions defined in other tapsets or write your own.

In the example below, copy_process() returns a pointer to the task_struct for the new process. The process ID of the new process is retrieved by calling task_pid() with that pointer. Here the auxiliary function is an embedded C function that's defined in the task tapset (task.stp).

probe process.create = kernel.function("copy_process").return {
   task = $return
   new_pid = task_pid(task)
}
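
Auxiliary functions do not always need embedded C. As a sketch of the data conversions described above, the following helpers are written in the SystemTap language itself. The function names and the state codes are hypothetical and are not taken from any existing tapset.

```systemtap
# Hypothetical helpers for a tapset named "example".

# Convert a byte count to whole kilobytes (rounds down).
function _example_bytes_to_kb:long (bytes:long) {
    return bytes / 1024
}

# Map an illustrative numeric state code to a readable string.
# The codes 0 and 1 are made up for this sketch.
function _example_state_str:string (state:long) {
    if (state == 0) return "running"
    if (state == 1) return "stopped"
    return "unknown"
}
```

A probe body could then export kb = _example_bytes_to_kb(size) rather than the raw byte count, assuming size is a value available at that probe point.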

Avoid the temptation to write probes for every function. Most SystemTap users won't need or understand them. Keep your tapset simple and high-level.

ELEMENTS OF A TAPSET

Tapset files
------------
Tapset files are stored in src/tapset. Most are kept at that level. If you have code that only works on a specific architecture or kernel version, you may choose to put it in the corresponding subdirectory.

Namespace
---------
Probe alias names should take the form <tapset_name>.<probe_name>. For example, the probe for sending a signal could be named "signal.send".

Global symbol names (probes, functions and variables) should be unique across all tapsets. This helps avoid namespace collisions in scripts that use multiple tapsets. To ensure this, use tapset-specific prefixes in your global symbols.

Internal symbol names should be prefixed with "_".
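
One way these naming rules might look in practice is sketched below for a hypothetical tapset called "example". All of the symbol names are invented for illustration; only sock_sendmsg() comes from the earlier socket example.

```systemtap
# Internal global: leading underscore plus the tapset name.
global _example_send_count

# Internal helper, not meant to be called from user scripts.
function _example_bump_send_count () {
    _example_send_count++
}

# Public function, prefixed with the tapset name.
function example_send_count:long () {
    return _example_send_count
}

# Public probe alias in <tapset_name>.<probe_name> form.
probe example.send = kernel.function("sock_sendmsg") {
    _example_bump_send_count()
}
```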

Comments
--------
All probes and functions should include comment blocks that describe their purpose, the data they provide and the context in which they run (e.g., interrupt, process, etc.). Also use comments in areas where your intent may not be clear from reading the code.
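
A comment block along these lines would cover purpose, data and context for the process creation probe used earlier. The layout is only a suggestion, the alias is renamed example.create to mark it as hypothetical, and the context line reflects the assumption that copy_process() is called by the parent during fork.

```systemtap
/*
 * probe example.create
 *
 * Fires when copy_process() returns, that is, when a new task has
 * been created.
 *
 * Context:
 *   The parent process, in process context (assumed).
 *
 * Variables:
 *   task     pointer to the task_struct of the new process
 *   new_pid  process ID of the new process
 */
probe example.create = kernel.function("copy_process").return {
    task = $return
    new_pid = task_pid(task)
}
```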

Documentation
-------------
Every tapset should have its own man page called stapprobes.<tapset>(5). See src/man for examples. In addition, the SEE ALSO section in the stapprobes(5) man page should be updated to refer to your tapset's man page.

External functions defined in your tapset should be added to the stapfuncs(5) man page.

Config & Makefiles
------------------
Add your tapset man page to the AC_CONFIG_FILES line in src/configure.ac, then regenerate the src/configure script by running autoconf.

Add your tapset man page to the dist_man_MANS line in src/Makefile.am, then regenerate src/Makefile.in by running automake.

Update other Makefiles as necessary.

Test cases
----------
All tapsets should be accompanied by test scripts. The tests are kept in src/testsuite in CVS and are based on dejagnu. You must have dejagnu and expect installed on your system to run the tests.

Your tests should validate that:

- the tapset can be parsed and built
- all probes and functions work as expected
- all potential errors are handled appropriately

See the "test suites" section of the HACKING file and the existing tests for details.

Example Scripts
---------------
Provide at least one example script that uses the probe aliases and functions in your tapset. This serves two purposes. First, it shows script writers how you envisioned the tapset being used. Second, and most important, it validates that the tapset can actually be used for something useful. If you can't write a script that uses the tapset in a meaningful way, perhaps you should rethink what the tapset provides.

Example scripts are stored in src/examples in CVS.
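
A small example script for the process creation probe might look like the sketch below. It assumes that a tapset on the search path defines the process.create alias and exports new_pid, as in the example earlier in this guide; the ten-second run time is arbitrary.

```systemtap
# Report each process creation for ten seconds, then exit.
# execname() and pid() describe the process that is running when the
# probe fires, which here is assumed to be the parent.
probe process.create {
    printf("%s (pid %d) created pid %d\n", execname(), pid(), new_pid)
}

probe timer.s(10) {
    exit()
}
```

Note that the script never mentions copy_process(). If it had to, that would suggest the tapset is not hiding enough of the kernel detail.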

Change Logs
-----------
Update the appropriate ChangeLog files with a brief description of your additions and changes.

Note that the change description you enter during a "cvs commit" does not get added to the ChangeLog files. You must edit the ChangeLog files directly and commit them as well.

EMBEDDED C & SAFETY

As mentioned previously, you can use embedded C (raw C code) to do things not supported by the SystemTap language. Please do this carefully and sparingly. Embedded C bypasses all the safety features built into SystemTap. Be especially careful when dereferencing pointers. Use the kread() macro to dereference any pointers that could potentially be invalid. If you're not sure, err on the side of caution. The cost of using kread() is small compared to the cost of your tapset inadvertently crashing a system!
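
The pattern might look roughly like the sketch below. The function name is hypothetical. The argument and return value are accessed through the STAP_ARG_ and STAP_RETVALUE macros used by newer SystemTap releases; older releases spell these THIS->task and THIS->__retvalue, so check which form your version expects.

```systemtap
%{
#include <linux/sched.h>
%}

# Hypothetical name; the kread() pattern is what matters here.
function _example_task_tgid:long (task:long) %{ /* pure */
    struct task_struct *t = (struct task_struct *)(long)STAP_ARG_task;
    /* kread() checks the access; if t is not a valid address, control
       jumps to the handler set up by CATCH_DEREF_FAULT() instead of
       faulting inside the kernel. */
    STAP_RETVALUE = kread(&(t->tgid));
    CATCH_DEREF_FAULT();
%}
```

Reading t->tgid directly would be slightly shorter, but a bad pointer passed in by a script could then take down the system.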

REVIEW & SUBMISSION

All new tapsets and major changes should be reviewed "early and often" by the SystemTap community. You can sign up for the systemtap mailing list at http://sources.redhat.com/systemtap/getinvolved.html. The mailing list archive is found at http://sources.redhat.com/ml/systemtap/. The systemtap-cvs mailing list archive is at http://sources.redhat.com/ml/systemtap-cvs/.

You can request CVS write access at
http://sources.redhat.com/cgi-bin/pdw/ps_form.cgi.

REFERENCE MATERIAL

The following documents, web sites and mailing lists will familiarize you with SystemTap:

- SystemTap Tutorial. A good introduction to SystemTap. (HTML format: http://sourceware.org/systemtap/tutorial/, PDF format: http://sourceware.org/systemtap/tutorial.pdf)

- SystemTap project home page (http://sourceware.org/systemtap/index.html)

- SystemTap mailing lists, IRC channels and CVS instructions (http://sourceware.org/systemtap/getinvolved.html)

- CVS repository (http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/?cvsroot=systemtap)

- HACKING file in the source directory. This file outlines what's expected of project contributors.

- SystemTap Wiki. Contains papers and presentations, setup instructions for various distributions, and a growing set of example scripts. (http://sourceware.org/systemtap/wiki)

- Existing tapsets. On systems with SystemTap installed, tapsets are stored in /usr/share/systemtap/tapset/ or /usr/local/share/systemtap/tapset. In the CVS source tree, tapsets are in src/tapset.

- SystemTap Language Reference (in development, will be added to the wiki when released)

- SystemTap Man Pages (use "apropos stap" to print a list)

Follow-Ups:
- Re: 1st draft - Tapset Writer's Guide
  - From: Sébastien Dugué
