This is the mail archive of the
mailing list for the systemtap project.
Systemtap vs Dtrace web page corrections.
- From: "James Dickens" <jamesd dot wi at gmail dot com>
- To: SystemTAP <systemtap at sources dot redhat dot com>
- Date: Tue, 12 Sep 2006 22:24:32 -0500
- Subject: Systemtap vs Dtrace web page corrections.
This is in response to this web page posted
I have gone line by line and commented on its content and inaccuracies.
Going line by line...
Licenses: are correct.
Not sure if Apple or the FreeBSD guys ported to PPC or not, but if
Apple doesn't support it, the Polaris project will support DTrace on
its PPC OpenSolaris port.
Of course it should be pointed out that Systemtap only supports Linux,
whereas DTrace now supports Solaris and OSX, with the FreeBSD DTrace
port rapidly proceeding; in fact, last I checked they were further
along in the process than Systemtap is.
Kernel Lock in:
DTrace kernel lock step is highly limited when compared to Systemtap.
Remember that Linux doesn't believe in API/ABI stability. I just ran a
complex DTrace script I wrote over 2 years ago on a modern Solaris
express release and it worked just as it always has.
Systemtap isn't even stable enough to run a complex script written 6
months ago, even if you were on the same kernel. Of course you have to
update your elfutils quite often as well. I don't really see anything
that is stable about Systemtap. So on this line, it would be that
Systemtap has the greater amount of lock-in, not just in the kernel
but in Systemtap itself and possibly even the tools used to compile the
The majority of core developers and active contributors on the
Systemtap mailing list are IBM and RedHat employees.
DTrace is currently developed by a number of people outside of Sun,
including the people that have ported DTrace to OSX, as well as the
team that is working on porting it to FreeBSD. There are also many
users of DTrace that submit bug reports and static probes for
scripting languages and applications.
Are you accounting for kprobes, the base of Systemtap, which was
developed for a while before Systemtap started?
DTrace development began in October 2001, and it was first released in
September 2003 along with submission of the USENIX paper.
Yes, Systemtap's development is rapid, but at the same time they claim
to have no kernel lock-in? Exactly which is it? We have Linux, which
has no guaranteed API/ABI (rapidly changing), including the kprobes
kernel module that is still under active development, and a project
that is rapidly changing, so how can it have little kernel lock-in? Of
course Systemtap needs to have rapid evolution to try and match where DTrace
DTrace, on the other hand, has met most of its goals and is now adding
features and fixing bugs. The active development is in adding static
probes to languages and apps, making things easier for end users and
developers. DTrace has currently accomplished 99% of the goals it
stated at the beginning of the project.
Systemtap seems to have missed its target audience; its current
audience is kernel developers. I know of no users that use Systemtap
on a daily basis, and the same holds true for sysadmins as well. There
are hardly any pre-made Systemtap scripts at the moment, so your only
users are kernel coders. I've not heard of any userland developers
using Systemtap to solve problems, since userland probes are not
included. Nor is support for any other application or scripting
DTrace is targeted at developers, including ones that specialize in
kernel and userland, system administrators, and end users. DTrace is
so stable that scripts providing performance data are included as part
of the operating system and are used daily on production systems
without fear. Brendan Gregg's DTrace toolkit provides ready-made
scripts so that even newbies who can't program a line can benefit
from DTrace. DTrace is also working to integrate static probes in
scripting languages and applications including Java, Ruby, PHP, Perl,
Python, Postgres, Apache, and Xserver. These, together with one-line
DTrace scripts, allow end users to figure out performance bottlenecks
in applications even though they can't code in either DTrace or the
target language.
DTrace is targeted at debugging; well, if it wasn't, it sure does a
great job at it anyway. Please see
http://uadmin.blogspot.com/2006/05/what-is-dtrace.html for examples of
bugs being solved by DTrace. These bugs are not limited to Solaris
applications; DTrace has been used to solve new and long-standing bugs
in userland applications including NTP, Gnome, Java Applets, Mozilla,
and Star Office/Open Office. This is not a complete list, just the
ones I dug up on the internet a while ago.
Systemtap seems limited to solving Linux Kernel problems since it has
no userland probes, and no non-kernel developers or system
administrators using it on a daily basis.
Since when did the C language turn into a scripting language? Your
scripts are basically C code modified to make it compatible with a C
compiler. If one turns on guru mode, you are writing 100% C code;
there is no way to consider it anything but C.
Systemtap has full control structures as stated, but if a bug happens
in the Systemtap script it can cause the box to crash.
DTrace doesn't have functions or loops per se, but you can work
around this with a little thought. In return, even if a DTrace user
makes a horrendous mistake, it doesn't take down the box. Seems like a
fair trade, doesn't it? It does to me. You should ask your target
users: are any of them perfect, in that they never make a mistake? Are
they willing to crash their production box because of a mistake in
their Systemtap script?
Systemtap's use of implicit type control seems more of a limitation
than a feature.
By implicitly setting the type of the variable you are now locked into
that type; if the code you are probing changes a variable type, your
script no longer works. This eliminates the possibility of providing
ready-made scripts long term. This is made even worse since Linux
doesn't have a stable API or ABI, so the script developed today will
have to be recompiled and possibly ported with the next kernel release
or the next release of a program.
When Systemtap gets userland probes, its inferred variable typing will
become even more of a hindrance. Lots of programs don't ship with
debugging information embedded, so if you don't have the source code
to recompile with debugging information, Systemtap is useless. DTrace
will try to make guesses at the data structures and include files,
and allows the user to typecast variables as needed. Systemtap does
not have the native ability to process include files or to handle data
of unknown types.
If Systemtap's report generating ability is so great why is there work
done on a dashboard that is designed to make the reports look better?
What limitations are you seeing in DTrace's report generating capability?
Systemtap: seems to have added these as an afterthought, as shown by
the need for a disclaimer.
DTrace: this is a built-in feature, not an afterthought.
Systemtap: you can't just wish this requirement away. In a complex
system you can't judge whether or not you need a piece of data when
the first event occurs; you have to store it until possibly many other
events have occurred. So unless your script can predict the future, it
really should provide a way to abandon information it collected that
it turned out not to care about. Your users may not feel confident
enough with the framework to write the complex scripts that need this
functionality, and that is where this is a problem.
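The store-now, decide-later pattern described above can be sketched
in a few lines. This is only a toy model in Python, not DTrace or
Systemtap code; the class name, the request keys, and the event
strings are all hypothetical. (DTrace exposes this idea through its
speculation(), speculate(), commit() and discard() actions.)

```python
class SpeculativeBuffer:
    """Toy model: hold records tentatively until their fate is known."""

    def __init__(self):
        self._pending = {}    # key -> records collected but not yet kept
        self.committed = []   # records we decided were worth keeping

    def speculate(self, key, record):
        # Record data without knowing yet whether it will matter.
        self._pending.setdefault(key, []).append(record)

    def commit(self, key):
        # A later event showed the data is interesting: keep it.
        self.committed.extend(self._pending.pop(key, []))

    def discard(self, key):
        # A later event showed the data is uninteresting: drop it.
        self._pending.pop(key, None)


buf = SpeculativeBuffer()
# Two hypothetical requests, both traced from their first event onward.
buf.speculate("req-1", "open entry")
buf.speculate("req-2", "open entry")
buf.speculate("req-1", "open returned error")
buf.commit("req-1")    # the error made this one worth reporting
buf.discard("req-2")   # nothing went wrong, so abandon what was stored
print(buf.committed)
```

Only the records for the request that later turned out to matter
survive; everything stored for the other one is thrown away without
ever reaching the output.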
This seems vague at best, and an unimportant implementation detail
that the end user really won't care about.
For Systemtap it's compiled C code, especially in guru mode.
Number of probe points:
Systemtap has too much bloat to have a truly unlimited number of
probes. The overhead of each probe, last I checked, was at minimum 64
bytes of code per script code segment attached to a probe, plus
additional storage requirements for each probe, so it does have a
finite number of probes: 1 million active probes requires 64MB of
kernel space allocated. Since each Systemtap module is independent of
all others, you get multiple copies of kernel code that is shared in
What is the highest number of Systemtap probes ever activated in an
active script without the machine falling over? In my tests of
DTrace it is over 500,000. The reason for stopping at 500,000 wasn't
that I hit any limit in the system; I just decided any more was
pointless. I could easily run multiple copies of the same script
without a problem. By the way, the limit of 50,000 pre-defined probe
points in DTrace is just for the kernel; you can probe any line of
userland code, without limit. Really, for all but hard-core kernel
coders it is an unlimited number of probes, because 99.99% of your
users will not need to probe a specific line in the kernel.
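The arithmetic behind the 64MB figure above is easy to reproduce. The
following is a back-of-the-envelope sketch in Python, not a
measurement: the 64-byte value is the per-probe minimum recalled in
this message, and the function name and the "copies" parameter
(modelling several independent modules loaded at once) are
assumptions made for illustration.

```python
# Per-probe minimum quoted in this message ("last I checked"); an
# assumption for the estimate, not a measured value.
BYTES_PER_PROBE = 64


def probe_overhead_bytes(active_probes, copies=1):
    """Lower bound on kernel memory used by probe code segments alone.

    copies is a hypothetical multiplier for running several independent
    modules at once, since nothing is shared between them.
    """
    return active_probes * BYTES_PER_PROBE * copies


one_million = probe_overhead_bytes(1_000_000)
print(one_million / 10**6, "MB (decimal)")
print(one_million / 2**20, "MiB (binary)")
```

This counts only the per-probe code; the additional per-probe storage
mentioned above would come on top of it.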
Of course a good question is: what kind of kernel coder can't debug a
problem when he knows what functions are being called, and by what
function, how often and how much time it's taking, and what functions
it calls, with complete userland and kernel stack tracing? It seems
pretty silly to do all that extra work and risk system stability in
the name of probing every line in the kernel.
Probe arbitrary points in code:
DTrace can probe entry and exit points of any function whether it is
in userland or kernel space, and you can also define static probe
points in both userland and kernel code. Arbitrary probe points in
kernel code really seem to be of very limited importance to 99.9999%
of coders, as explained above. DTrace can probe arbitrary points in
userspace applications should it be required.
Can Systemtap still probe arbitrary points in code if you don't have a
binary with debugging information intact?
Dynamic loaded kernel objects
Yes, DTrace can probe dynamically loaded kernel modules, as well as
userland libs that are loaded dynamically.
Concurrent probes on multiprocessors:
DTrace can probe multiple processors, multiple tasks and multiple
threads regardless of whether they are in userland or kernel space.
You can even run multiple copies of the same script without problems;
the last time I tried this simple test on Systemtap, it failed.
Extract arbitrary data at probe point:
DTrace can read any location in memory, be it in kernel space or
userland, during probe execution. DTrace also handles any traps that
occur as the result of the attempt to read the data on all of its
platforms; last I heard, this feat has not been accomplished on all of
SystemTap's platforms. DTrace also understands the C struct construct,
so it can also read data stored in structs and use pointers stored in
structs to access other data even if the program isn't compiled with
debugging information. Systemtap requires debugging information to
access data in structures even if the user knows the layout of the
Systemtap is at the same state as DTrace. A Systemtap script is a
compiled program requiring a compiler on the target system; there is
no way for a user to extend the probe library with precompiled scripts
that are distributed without a compiler installed.
Hardware performance counters probing:
DTrace is working on adding access to performance counters that are
available in the processors.
The safety category is mostly correct on the DTrace side.
The Systemtap side talks a good game, but the proof is in the
pudding. Even given Systemtap's limited use by the general public,
when was the last week that no bug report was filed against Systemtap
that involved the system falling over? In the last 2 years, DTrace has
perhaps had a handful of such reports.
Has anyone used Systemtap on a daily basis to solve problems on
production systems without fear of it falling over?