Revision 20 as of 2008-01-11 18:54:22 (editor: DavidSmith)

Using Markers

What are markers?

Here is some text taken from the kernel documentation that describes markers:

  • A marker placed in code provides a hook to call a function (probe) that you can provide at runtime. A marker can be "on" (a probe is connected to it) or "off" (no probe is attached). When a marker is "off" it has no effect, except for adding a tiny time penalty (checking a condition for a branch) and space penalty (adding a few bytes for the function call at the end of the instrumented function and adds a data structure in a separate section). When a marker is "on", the function you provide is called each time the marker is executed, in the execution context of the caller. When the function provided ends its execution, it returns to the caller (continuing from the marker site).
  • You can put markers at important locations in the code. Markers are lightweight hooks that can pass an arbitrary number of parameters, described in a printk-like format string, to the attached probe function.
  • They can be used for tracing and performance accounting.
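The on/off behaviour described above can be modelled in ordinary user-space C. The following is only a sketch under stated assumptions: the names struct marker, marker_attach(), marker_detach(), marker_fire() and counting_probe() are hypothetical and are not the kernel's marker API. It shows a marker that costs a single branch when no probe is attached and calls the attached probe in the caller's context otherwise.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space model of a marker; NOT the kernel implementation. */
typedef void (*probe_fn)(const char *fmt, va_list args);

struct marker {
        const char *name;       /* e.g. "kfunc_entry" */
        const char *format;     /* printk-like description of the arguments */
        probe_fn probe;         /* NULL means the marker is "off" */
};

/* Number of times the sample probe below has run. */
static int probe_calls;

/* Sample probe: counts the hit and prints the marker arguments. */
static void counting_probe(const char *fmt, va_list args)
{
        probe_calls++;
        vprintf(fmt, args);
        putchar('\n');
}

/* Turn the marker "on" by connecting a probe to it. */
static void marker_attach(struct marker *m, probe_fn fn)
{
        assert(m != NULL);
        m->probe = fn;
}

/* Turn the marker "off" again. */
static void marker_detach(struct marker *m)
{
        assert(m != NULL);
        m->probe = NULL;
}

/* The marker site: one branch when "off", a probe call when "on". */
static void marker_fire(struct marker *m, ...)
{
        va_list args;

        assert(m != NULL);
        if (m->probe == NULL)
                return;         /* "off": only the cost of this check */

        va_start(args, m);
        m->probe(m->format, args);
        va_end(args);
        /* control returns to the caller, continuing after the marker site */
}
```

In this model, calling marker_fire() on a marker with no probe attached does nothing; after marker_attach(&m, counting_probe), each call runs the probe once and then returns to the caller.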

What do markers in kernel code look like?

#include <linux/marker.h>
//...
int kfunc(struct inode *i, int op)
{
        int rc = 0;        // return code
        trace_mark(kfunc_entry, "inode %p op %d", i, op);

        //... bulk of kfunc() here...

        trace_mark(kfunc_exit, "rc %d", rc);
        return(rc);
}

This mythical function, 'kfunc', contains 2 markers. The first has a subsystem_event of "kfunc_entry" and the second has a subsystem_event of "kfunc_exit". The kernel documentation suggests treating the first argument of trace_mark() as having 2 parts: a 'subsystem' and an 'event'. In our example, the name of the function is used as the subsystem, and 'entry' and 'exit' are used as the events. The second argument to trace_mark() is a format string (similar to the one used by printk()), and the remaining arguments are the values that the format string describes.

How do I turn on marker support in my kernel?

The marker subsystem and the kernel markers themselves must be compiled into your kernel. Initial marker support is present in 2.6.24, but at this time you must also add 3 patches from the -mm tree to get full marker functionality.

linux-kernel-markers-support-multiple-probes.patch: adds support for multiple probes per marker.

linux-kernel-markers-support-multiple-probes-update.patch: updates the previous patch.

linux-kernel-markers-create-modpost-file.patch: adds support for creating a file called Module.markers, which lists all markers present in a kernel and its modules (similar to the Module.symvers file).

After those patches have been applied to your kernel source, when running "make menuconfig", besides the normal options needed by systemtap, you'll also need to enable markers.

Instrumentation Support  --->
    [*] Activate markers

Selecting this option turns on the CONFIG_MARKERS define.

How do I use markers in systemtap?

To hook up a systemtap probe to a kernel marker, your systemtap script would use the 'kernel.mark("NAME")' facility. Using the example from above:

probe kernel.mark("kfunc_entry") { printf("kfunc_entry marker hit\n") }
probe kernel.mark("kfunc_exit") { printf("kfunc_exit marker hit\n") }

When kfunc() gets called by the kernel, both systemtap probes would be hit and you would see the appropriate output.

How do I access marker arguments?

The handler associated with a marker-based probe may read the optional parameters specified at the marker call site. These are named $arg1 through $argNN, where 'NN' is the number of parameters supplied by the marker. Number and string parameters are handled in a type-safe manner.

Using the example from above:

probe kernel.mark("kfunc_entry") { printf("kfunc_entry marker hit: %p, %d\n", $arg1, $arg2) }
probe kernel.mark("kfunc_exit") { printf("kfunc_exit marker hit: %d\n", $arg1) }
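The highest $argNN available in a probe corresponds to the number of arguments described by the marker's format string. As a rough illustration only, the user-space C sketch below counts the conversions in a printk-like format string. The function name count_marker_args() is hypothetical, and this is not how systemtap itself parses marker formats; it is a simplification that ignores details such as '*' field widths.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the conversions in a printk-like format
 * string. "%%" is a literal percent sign and consumes no argument.
 * Simplified: does not account for '*' widths, which take an extra argument.
 */
static int count_marker_args(const char *fmt)
{
        int n = 0;

        assert(fmt != NULL);
        for (const char *p = fmt; *p != '\0'; p++) {
                if (*p != '%')
                        continue;
                if (p[1] == '%') {      /* escaped percent sign */
                        p++;
                        continue;
                }
                if (p[1] == '\0')       /* stray '%' at end of string */
                        break;
                n++;                    /* one conversion, one argument */
        }
        return n;
}
```

For the kfunc_entry marker above, the format "inode %p op %d" contains two conversions, which matches the use of $arg1 and $arg2 in its probe; "rc %d" contains one, matching $arg1 in the kfunc_exit probe.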

How are marker arguments handled that are structure pointers?

Not well. Because systemtap has no DWARF information for marker arguments, it really doesn't know what type they are. In a DWARF-based probe, you could write something like this:

probe kernel.function("kfunc") { printf("inode number: %d\n", $i->i_ino) }

However, if you try something similar with a marker-based probe, you'll get an error, because systemtap knows neither that $arg1 is a pointer nor what it points to. To work around this problem, you'll have to write an access function in embedded C. Note that scripts containing embedded C must be run in guru mode ('stap -g').

function get_inode_i_ino:long (i:long) %{ /* pure */
        struct inode *inode = (struct inode *)(long)THIS->i;
        THIS->__retvalue = kread(&(inode->i_ino));
        CATCH_DEREF_FAULT();
%}
probe kernel.mark("kfunc_entry") { printf("inode number: %d\n", get_inode_i_ino($arg1)) }

None: UsingMarkers (last edited 2009-02-21 01:51:00 by JoshStone)