Introduction

This describes the sections created by the <sys/sdt.h> macros, for use by tool writers. Only version 3 is described here. Users of <sys/sdt.h> should read AddingUserSpaceProbingToApps instead.

This document assumes you generally understand ELF and in particular how ELF notes are laid out.

Overview

SDT probes are designed to have a tiny runtime code and data footprint and no dynamic relocations. They are usable from assembly, C, and C++. <sys/sdt.h> can be included and used on any architecture as long as it provides a nop assembler instruction. Some background is available in the v3 feature request.

Each SDT probe in the source expands to a single nop in the generated code. There may be additional overhead for computing the probe arguments, but this can be avoided by choosing arguments whose values are already live at that location.

Each SDT probe also expands into a non-allocated ELF note. You can find this by looking at SHT_NOTE sections and decoding the format; see below for details. Because the note is non-allocated, it has no runtime cost. It is also preserved in both stripped files and .debug files.

However, because the note is non-allocated, prelink won't adjust the note's contents for address offsets. Instead, the adjustment is derived from the .stapsdt.base section. This is a special section that is added to the text. There will only ever be one of these sections in a final link, and it will only ever be one byte long. Nothing about the section's contents matters; it is used purely as a marker to detect prelink address adjustments.

Each probe note records the link-time address of the .stapsdt.base section alongside the probe PC address. The decoder compares the base address stored in the note with the .stapsdt.base section's sh_addr. Initially these are the same, but the section header will be adjusted by prelink. So the decoder applies the difference to the probe PC address to get the correct prelinked PC address; the same adjustment is applied to the semaphore address, if any.
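The adjustment described above amounts to one subtraction and one addition. The following is a minimal sketch, not taken from any existing tool; the function and parameter names are hypothetical.

```python
def adjust_for_prelink(note_addr, note_base, base_sh_addr):
    """Shift an address stored in an SDT note by the prelink offset.

    note_addr    -- probe PC (or semaphore address) as stored in the note
    note_base    -- .stapsdt.base address as stored in the same note
    base_sh_addr -- sh_addr of the .stapsdt.base section header in the file
    """
    # The offset is zero for a file that prelink has not touched.
    offset = base_sh_addr - note_base
    return note_addr + offset
```

For a file that has not been prelinked, the two base values are equal and the address comes back unchanged.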

SDT Notes

An SDT note is given the vendor string "stapsdt" with type==3 (for "sdt v3").
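As a rough illustration, the sketch below walks the raw bytes of one SHT_NOTE section and keeps the descriptors of notes named "stapsdt" with type 3. It assumes the standard ELF note header (three 32-bit words: n_namesz, n_descsz, n_type) with the name and descriptor each padded to a 4-byte boundary. The function names are hypothetical, and obtaining the section bytes from the ELF file is left to the caller.

```python
import struct

def iter_notes(data, little_endian=True):
    """Yield (name, type, desc) for each note in a note section's bytes."""
    fmt = "<III" if little_endian else ">III"
    pos = 0
    while pos + 12 <= len(data):
        namesz, descsz, ntype = struct.unpack_from(fmt, data, pos)
        pos += 12
        # n_namesz counts the terminating NUL, so strip it for comparison.
        name = data[pos:pos + namesz].rstrip(b"\0")
        pos += (namesz + 3) & ~3   # assumed 4-byte padding after the name
        desc = data[pos:pos + descsz]
        pos += (descsz + 3) & ~3   # assumed 4-byte padding after the descriptor
        yield name, ntype, desc

def sdt_v3_descs(data, little_endian=True):
    """Return the descriptors of all "stapsdt" type-3 notes."""
    return [desc for name, ntype, desc in iter_notes(data, little_endian)
            if name == b"stapsdt" and ntype == 3]
```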

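In the below, the "address size" is either 4 or 8 bytes, as determined by the ELF class in the ELF header. After the note header, the n_descsz bytes are, in order:

Probe PC address (address size)
Link-time address of the .stapsdt.base section (address size)
Link-time address of the probe's semaphore, or 0 if there is none (address size)
Provider name (NUL-terminated string)
Probe name (NUL-terminated string)
Argument format (NUL-terminated string)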
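A decoder for the descriptor might look like the sketch below. It assumes the descriptor holds three address-sized words (probe PC, .stapsdt.base address, semaphore address) followed by three NUL-terminated strings (provider, probe name, argument format). The function name and the returned dictionary keys are hypothetical.

```python
import struct

def decode_sdt_desc(desc, addr_size=8, little_endian=True):
    """Decode one SDT v3 note descriptor into a dict (assumed layout)."""
    endian = "<" if little_endian else ">"
    word = "Q" if addr_size == 8 else "I"
    pc, base, semaphore = struct.unpack_from(endian + 3 * word, desc, 0)
    # The rest is three NUL-terminated strings, one after another.
    strings = desc[3 * addr_size:].split(b"\0")
    provider, name, args = (s.decode("ascii") for s in strings[:3])
    return {"pc": pc, "base": base, "semaphore": semaphore,
            "provider": provider, "name": name, "args": args}
```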

Argument Format

The argument format describes how to find the arguments to the probe, at the probe point.

If the format is the empty string, or the string ":", then there are no arguments.

Otherwise, the format consists of a sequence of arguments.

For compiler-generated code, each argument will be of the form Nf@OP. For hand-written assembly, or for inline assembly in C or C++, the initial Nf@ may be missing.

If N is present, it describes the size of the argument. It will be one of:

 1    8 bits unsigned
-1    8 bits signed
 2    16 bits unsigned
-2    16 bits signed
 4    32 bits unsigned
-4    32 bits signed
 8    64 bits unsigned
-8    64 bits signed

This may be extended with other values, for example for integer vector types.

If N is omitted, the argument size is the natural size of the operand; usually this is the size of the register or the word size of the machine. In this case, the signedness is ambiguous.

If f is specified, the argument is considered to be a floating point value. If f is omitted, the argument is considered to be an integer type.
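The rules for N and f can be sketched as a small parser for a single argument. This is an illustrative sketch with hypothetical names; it treats a missing Nf@ prefix as "size and signedness unknown" and does not interpret OP.

```python
import re

# Optional sign, digits, optional "f", then "@": the Nf@ prefix.
_ARG_PREFIX = re.compile(r"^(-?\d+)(f?)@")

def parse_arg(arg):
    """Split one argument into its size, signedness, float flag, and operand."""
    m = _ARG_PREFIX.match(arg)
    if m is None:
        # No prefix: natural operand size, signedness ambiguous.
        return {"size": None, "signed": None, "float": False, "op": arg}
    n = int(m.group(1))
    return {
        "size": abs(n),        # size in bytes: 1, 2, 4, or 8
        "signed": n < 0,       # only meaningful for integer arguments
        "float": m.group(2) == "f",
        "op": arg[m.end():],   # the assembly operand, left uninterpreted
    }
```

An operand such as -4(%rbp) without a prefix is not mistaken for a size, because the pattern requires the @ separator.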

OP is the actual assembly operand. Any operand accepted by gas may appear here; in particular, gas integer expressions may appear. On x86, the gas syntax distinguishes integer literals with a $ prefix. PowerPC does not, so we use %I0%0 in the template to generate "i3" for integer constant 3 and "3" for register r3.

If the assembly operand is generated by GCC, it is probably constrained by the character codes in the STAP_SDT_ARG_CONSTRAINT macro, which a developer may override at compile time. Overriding may be necessary if GCC emits unusable expressions for some operands, such as references to local .Lnnnn labels, which gas drops by default (see also the gas --keep-locals option). Alternatively, pass the value through a register local variable: register int param asm("a5") = value; STAP_PROBE(... param);

Each argument is separated by a single space, but some care is required when parsing the arguments because the OP may itself include whitespace.
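One possible heuristic, sketched below, is to split only at spaces that are immediately followed by an Nf@ prefix. This assumes every argument carries the prefix, which holds for compiler-generated code but not necessarily for hand-written assembly, so treat it as a starting point rather than a complete parser. The function name is hypothetical.

```python
import re

# A space counts as a separator only if an Nf@ prefix follows it.
_NEXT_ARG = re.compile(r" (?=-?\d+f?@)")

def split_args(fmt):
    """Split an argument format string into individual arguments."""
    if fmt in ("", ":"):
        return []              # both forms mean "no arguments"
    return _NEXT_ARG.split(fmt)
```

Whitespace inside an operand is left alone, since it is not followed by a size prefix.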

These bugs are useful reading if you are implementing your own probe argument parser:

Semaphore Handling

If a semaphore is associated with a probe, it will be of type unsigned short. A semaphore may gate invocations of a probe; it must be set to a non-zero value to guarantee that the probe will be hit. The semaphore is treated as a counter: your tool should increment it to enable the probe, and decrement it when finished. Overflow is currently ignored, so a large enough number of simultaneous attaches could wrap the counter back to zero and disable the probe. This is not likely to happen in practice.
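The counter protocol can be sketched as follows. Here a bytearray stands in for the traced process's memory; a real tool would read and write the target process instead, and would first translate the semaphore address from the note into a runtime address. The function name and parameters are hypothetical.

```python
import struct

def bump_semaphore(mem, addr, delta, little_endian=True):
    """Add delta to the 16-bit semaphore at addr and return the new count."""
    fmt = "<H" if little_endian else ">H"
    (count,) = struct.unpack_from(fmt, mem, addr)
    # unsigned short arithmetic: overflow wraps silently, as noted above.
    count = (count + delta) & 0xFFFF
    struct.pack_into(fmt, mem, addr, count)
    return count
```

A tool would call this with delta=+1 when attaching and delta=-1 when detaching; the probe is guaranteed to fire only while the stored count is non-zero.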

None: UserSpaceProbeImplementation (last edited 2021-10-17 15:54:43 by StanCox)