This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.

Re: [ltt-dev] LTTng-UST vs SystemTap userspace tracing benchmarks

From: Roland McGrath <roland at redhat dot com>
To: Mark Wielaard <mjw at redhat dot com>
Cc: Stefan Hajnoczi <stefanha at gmail dot com>, "Frank Ch. Eigler" <fche at redhat dot com>, Julien Desfossez <julien dot desfossez at polymtl dot ca>, dominique dot toupin at ericsson dot com, ltt-dev at lists dot casi dot polymtl dot ca, Mathieu Desnoyers <mathieu dot desnoyers at efficios dot com>, systemtap at sources dot redhat dot com
Date: Wed, 16 Feb 2011 10:50:56 -0800 (PST)
Subject: Re: [ltt-dev] LTTng-UST vs SystemTap userspace tracing benchmarks
References: <4D5AA164.1050607@polymtl.ca> <y0mvd0ltgba.fsf@fche.csb> <AANLkTi=Nsy6fXE9=Njxs9LPHuohHzf=q5kD+fK765Rht@mail.gmail.com> <1297853778.3224.90.camel@springer.wildebeest.org>

Stefan was referring to #4 in your taxonomy.

It's indeed the case that what UST uses today is an always-there normal
C code sequence that loads global variables to decide whether to make
indirect function calls.  I don't recall offhand how many layers of
function calls into the libust DSO and such there are in either the
disabled or enabled cases.  At best, there is always the overhead of
several instructions and at least one load in the hot code path, and the
i-cache pollution that goes with that.
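
To make the shape of that always-present sequence concrete, here is a minimal C sketch. It is not the libust API: `struct marker`, `request_marker`, `marker_enable`, `marker_disable`, `count_probe` and `handle_request` are all hypothetical names chosen for illustration. It only shows the pattern described above, a global load on every pass through the hot path followed by an indirect call when the marker is enabled.

```c
#include <stddef.h>

/* Hypothetical types and names; this is NOT the libust API. */
typedef void (*probe_fn)(void *data, int value);

struct marker {
    volatile int enabled;  /* global state loaded on every pass through the hot path */
    probe_fn     call;     /* indirect call target, set when a tracer attaches */
    void        *data;     /* opaque pointer handed back to the probe */
};

static struct marker request_marker = { 0, NULL, NULL };

/* Example probe: counts how many times the marker fired. */
static int hits;
static void count_probe(void *data, int value)
{
    (void)value;
    ++*(int *)data;
}

static void marker_enable(struct marker *m, probe_fn fn, void *data)
{
    m->call = fn;
    m->data = data;
    m->enabled = 1;        /* set last so the call target is valid first */
}

static void marker_disable(struct marker *m)
{
    m->enabled = 0;
}

/* Hot path: the load and the test are paid even when tracing is off. */
static int handle_request(int x)
{
    if (request_marker.enabled)
        request_marker.call(request_marker.data, x);
    return x * 2;
}
```

Calling `marker_enable(&request_marker, count_probe, &hits)` makes each later `handle_request()` call bump `hits`. With the marker disabled, the function still executes the load and the branch, which is the fixed cost in question.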

It's indeed the case that what Systemtap uses today is a
sometimes-inserted normal breakpoint instruction, which is indeed a
software interrupt that requires kernel mediation.  When disabled, there
is as close to zero overhead as you can have, being a tiny placeholder
instruction sequence (currently just one nop), so the runtime overhead
is under a cycle and the i-cache pollution is the smallest possible unit
(one instruction, being just one byte on x86).
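
As a rough illustration of how small that placeholder is, the sketch below puts a single `nop` in a hot function using GCC/Clang inline assembly. The function name is hypothetical, and this is not the real sdt macro: the actual macros also record where each site lives and what its arguments are so the tool can find it later, and that bookkeeping is left out here.

```c
/* Hypothetical hot-path function; assumes a GCC- or Clang-compatible compiler. */
static int hot_path_with_placeholder(int x)
{
    /* Placeholder probe site: one nop, which is a single byte (0x90) on x86.
     * A tracer would later overwrite it with a breakpoint instruction. */
    __asm__ volatile ("nop");
    return x + 1;
}
```

Until something patches the site, the function behaves exactly as if the placeholder were not there.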

The "sweet spot" between the two is to have overhead close to
Systemtap's epsilon for a disabled probe, while having overhead close to
UST's pure-user method when a probe is enabled.  In the in-kernel
context, this is what the Linux kernel's latest code (still being hashed
out, but mostly done) has for kernel tracepoints using the so-called
"jump label" method.  That is also possible for sdt markers with some
careful consideration and attention to machine-specific details for each
machine architecture of concern.  It entails making the placeholder in
the hot code path slightly larger (at least for x86, it has to be a
"long nop", which probably adds negligibly more runtime overhead and a few
bytes more i-cache pollution), and adding some additional static code
outside the hot path.  The work to enable or disable a probe becomes
just as costly as the current Systemtap method, since it involves
modifying the program text in place (inserting jump instructions rather
than breakpoint ones).  Once enabled, the runtime work of the probes
firing can be very much like what UST does today.
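
The enable and disable steps can be sketched as byte-level patching. The C below is a simulation under stated assumptions, not the kernel's jump label code or anything from Systemtap: it works on an ordinary byte buffer standing in for program text, and the names `probe_site_enable`, `probe_site_disable` and `probe_site_is_enabled` are hypothetical. It uses the 5-byte x86 long nop (`0F 1F 44 00 00`) as the placeholder and the 5-byte `jmp rel32` (`E9` plus a little-endian displacement) as the enabled form. Patching a live process would also need writable text pages and synchronization with other threads, neither of which is shown.

```c
#include <stdint.h>
#include <string.h>

#define SITE_LEN 5  /* both the long nop and jmp rel32 are 5 bytes on x86 */

/* 5-byte x86 long nop: nopl 0x0(%rax,%rax,1) */
static const uint8_t LONG_NOP5[SITE_LEN] = { 0x0F, 0x1F, 0x44, 0x00, 0x00 };

/* Enable: overwrite the placeholder at offset `site` with "jmp rel32" to
 * offset `target`.  The displacement is relative to the end of the jump
 * instruction; unsigned wraparound yields the two's complement encoding
 * for backward jumps. */
static void probe_site_enable(uint8_t *text, uint32_t site, uint32_t target)
{
    uint32_t rel = target - (site + SITE_LEN);

    text[site]     = 0xE9;                       /* jmp rel32 opcode */
    text[site + 1] = (uint8_t)(rel & 0xFF);      /* little-endian rel32 */
    text[site + 2] = (uint8_t)((rel >> 8) & 0xFF);
    text[site + 3] = (uint8_t)((rel >> 16) & 0xFF);
    text[site + 4] = (uint8_t)((rel >> 24) & 0xFF);
}

/* Disable: restore the long nop so the hot path falls straight through. */
static void probe_site_disable(uint8_t *text, uint32_t site)
{
    memcpy(text + site, LONG_NOP5, SITE_LEN);
}

static int probe_site_is_enabled(const uint8_t *text, uint32_t site)
{
    return text[site] == 0xE9;
}
```

For example, enabling a site at offset 8 with the out-of-line probe code at offset 40 writes `E9 1B 00 00 00`, since 40 - (8 + 5) = 27 = 0x1B.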


Thanks,
Roland

Follow-Ups:
- Re: [ltt-dev] LTTng-UST vs SystemTap userspace tracing benchmarks
  - From: Stefan Hajnoczi

References:
- LTTng-UST vs SystemTap userspace tracing benchmarks
  - From: Julien Desfossez
- Re: LTTng-UST vs SystemTap userspace tracing benchmarks
  - From: Frank Ch. Eigler
- Re: [ltt-dev] LTTng-UST vs SystemTap userspace tracing benchmarks
  - From: Stefan Hajnoczi
- Re: [ltt-dev] LTTng-UST vs SystemTap userspace tracing benchmarks
  - From: Mark Wielaard
