This is the mail archive of the
mailing list for the systemtap project.
Re: Systemtap for NFSv4
- From: Vara Prasad <prasadav at us dot ibm dot com>
- To: Gerrit Huizenga <gh at us dot ibm dot com>
- Cc: Trond Myklebust <trond dot myklebust at fys dot uio dot no>, Tony Reix <tony dot reix at bull dot net>, Li Guanglei <guanglei at cn dot ibm dot com>, Vara Prasad <varap at us dot ibm dot com>, Jose Santos <jrs at us dot ibm dot com>, "systemtap at sourceware dot org" <systemtap at sourceware dot org>, xuepengl at cn dot ibm dot com
- Date: Wed, 23 Aug 2006 15:18:12 -0700
- Subject: Re: Systemtap for NFSv4
- References: <E1GFvn3email@example.com>
Gerrit Huizenga wrote:
On Wed, 23 Aug 2006 12:10:55 EDT, Trond Myklebust wrote:[...]
On Wed, 2006-08-23 at 08:43 -0700, Gerrit Huizenga wrote:
:) Ah, but you appear to be missing an important point. Tony was asking if
we wanted to _replace_ the current set of dprintk()s with systemtap. I'm
just saying that I'm not going to start throwing away dprintks until we have
a viable replacement. SystemTap is not (yet?) convincing as a candidate
for that role.
I'm not convinced that I'm missing the point. I think this is a
good question to start thinking about. Ultimately, if you can do
the same thing (and more) with SystemTap than you can do with static
probe points, and all dprintk()'s are macro replaced with SystemTap
static probe points, you might have a try-before-you-buy and general
validation approach. Not sure if that is completely feasible today
but might be a cheap option.
With SystemTap you don't need to place explicit markers at function
entry and exit points, but as a maintainer, if you want to control
what information can be collected at these points, you can do that
through static probe markers (more about that below). If you want
to collect information somewhere in the middle of a function, there are
ways to specify that in the SystemTap language, but they are not portable;
the suggested method is to use static markers.
Static markers: We have a couple of proposals on how one can go about
specifying a very low overhead static marker. Here are the links to the
Trond, you can help us define these macros so that it is easy
to replace your dprintk()s with them. The goal is to let you get
the debug information you need, even on production systems,
without paying much of a performance penalty, and to gather it more
efficiently using SystemTap features such as aggregation. I guess it
will be a win-win.
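
To make the marker idea concrete, here is a minimal user-space sketch in C of what a low-overhead macro standing in for a dprintk() call site might look like. It is not either of the proposals mentioned above: struct nfs_marker, nfs_marker_fire(), NFS_MARK() and nfs_demo_lookup() are hypothetical names invented for illustration, and writing to stderr stands in for whatever handler a tracing tool would attach. The point it tries to show is that a disabled marker costs a single flag test, while an enabled one can both count hits (aggregation) and emit the formatted message.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical per-marker state; not a kernel or SystemTap structure. */
struct nfs_marker {
    const char *name;    /* label printed in front of each message */
    int enabled;         /* 0: the call site only tests this flag */
    unsigned long hits;  /* running count of times the marker fired */
};

/* Slow path, reached only when the marker is enabled. */
static void nfs_marker_fire(struct nfs_marker *m, const char *fmt, ...)
{
    va_list ap;

    m->hits++;
    fprintf(stderr, "%s: ", m->name);
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
}

/* Hypothetical drop-in for dprintk(): one branch when disabled. */
#define NFS_MARK(m, ...)                          \
    do {                                          \
        if ((m)->enabled)                         \
            nfs_marker_fire((m), __VA_ARGS__);    \
    } while (0)

/* Example call site: a made-up lookup routine reporting its status. */
static int nfs_demo_lookup(struct nfs_marker *m, const char *name, int status)
{
    NFS_MARK(m, "lookup(%s) status=%d", name, status);
    return status;
}
```

With the flag cleared, nfs_demo_lookup() neither prints nor counts; once it is set, every call increments hits and writes one line. A real marker would hand its arguments to a probe handler instead of formatting them in place.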
Debugging tools are user-space utilities, so I am not sure what you
mean by saying they need to be in Linus's tree. All the kernel pieces,
such as kprobes and relayfs, are already in mainline. If you are talking
about maintaining the tools: as with all user-space tools, we have
a committed set of people maintaining them, and we will continue to do
so going forward.
Note: if we do want to replace dprintks with SystemTap, then I think
that another precondition would be that we include the main systemtap
debugging scripts in Linus' tree.
That has also been discussed - and I think that is a valid proposal.
Not sure what the current status is but if the SystemTap folks agree,
that is an easy discussion to start on LKML or with akpm.
I see several conditions that would need to be fulfilled before we even
start to consider that:
1) SystemTap would have to go into most (all?) distributions.
Most distributions already carry it: SLES, RHEL, Gentoo. There is
an older version in Debian. If there is any other distro you would like
this to be adopted into, please send us contact info and we will help
Getting there. Why wait. Just like all other things open-source,
get in on the development, be one of the first to use cool toys,
and YES, systemtap is heading there. And, if you find a distro that
doesn't have it, ask here and suggest they include it. This is OPEN
source, after all. ;-)
2) There must be support for _all_ processor architectures: I'm not
sure that they all have kprobes support for instance.
The kernel portion is already supported on x86, x86_64, PPC64, S390,
IA64 and Sparc. SystemTap is also supported on all but Sparc from the
above list. What other mainstream platforms are important?
Oh, you know better than that! All 38 processor architectures (okay,
I forgot the actual number)? Again, when you find one that doesn't
work, let the arch maintainers know. kprobes supports most of the
mainstream archs today, nothing fundamental standing in the way of
SystemTap supporting all of those. When you find one that you care
about that doesn't work, let this list or the arch maintainer know.
3) The NFS code would have to stabilise considerably: if the code to be
debugged keeps moving around, then maintaining a parallel set of
SystemTap scripts would be a nightmare.
If you want to probe some part of NFS that is still evolving, we
can use static probes, which lead developers to update them along with
the code. As I said, if the probe points are at well-known function
boundaries, we don't need markers in the code and it is easy to
maintain. The interfaces we would like to start instrumenting, as
proposed in the NFS tapset patch, are very stable interfaces on both
the server and the client side. I don't think those
interfaces (at least in NFSv2 and NFSv3) have changed much in the last 2
or 3 years. Having instrumentation that tracks the basic protocol
operations between the server and the client, and that can be used in a
PRODUCTION environment, seems likely to help track down problems better
than what is available through network traces. SystemTap gives you the
ability to filter the data based on connection etc., which is difficult
to do with other network tracing tools without some post-processing. In
SystemTap you can even get a backtrace of all the callers of a function,
which is not possible with dprintk()s and network traces. In other words,
SystemTap gives you one tool to consolidate information that today
requires multiple tools and debug statements. I could
go on with other features of SystemTap, like the flight recorder, but I
guess this gives you an idea of SystemTap's capabilities.
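
As a rough illustration of the "filter by connection, then aggregate" idea, here is a small C sketch. In SystemTap this would be written in the script language using its own aggregates; the C below is only a stand-in, and every name in it (struct nfs_event, struct op_stats, aggregate_for_client(), NFS_OP_MAX, the numeric op codes) is hypothetical and not taken from the NFS tapset patch. It keeps only the events from one client and folds them into a per-operation count, total and maximum latency, instead of emitting one log line per event.

```c
#include <stddef.h>
#include <string.h>

enum { NFS_OP_MAX = 4 };  /* hypothetical number of tracked operations */

/* One traced protocol operation (illustrative fields only). */
struct nfs_event {
    const char *client;   /* peer address, used as the filter key */
    int op;               /* operation index, 0 .. NFS_OP_MAX-1 */
    long latency_us;      /* time taken by the operation */
};

/* Per-operation summary kept instead of individual log lines. */
struct op_stats {
    unsigned long count;
    long total_us;
    long max_us;
};

/*
 * Summarise the events that belong to one client.
 * Events from other clients, or with an out-of-range op, are skipped.
 * Returns the number of events that were counted.
 */
static size_t aggregate_for_client(const struct nfs_event *ev, size_t n,
                                   const char *client,
                                   struct op_stats stats[NFS_OP_MAX])
{
    size_t matched = 0;

    memset(stats, 0, sizeof(stats[0]) * NFS_OP_MAX);
    for (size_t i = 0; i < n; i++) {
        struct op_stats *s;

        if (strcmp(ev[i].client, client) != 0)
            continue;                      /* filter: wrong connection */
        if (ev[i].op < 0 || ev[i].op >= NFS_OP_MAX)
            continue;                      /* ignore unknown operations */

        s = &stats[ev[i].op];
        s->count++;
        s->total_us += ev[i].latency_us;
        if (ev[i].latency_us > s->max_us)
            s->max_us = ev[i].latency_us;
        matched++;
    }
    return matched;
}
```

Doing the filtering and summarising where the event is generated is what removes the separate post-processing pass that a raw network trace would need.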
I can't agree more here.
SystemTap can help stabilize the code! It gives end users the ability
to help debug things without having your genius downloaded into their
Treo. You'll get more intelligent bug reports and can ask for more
detail as generated by SystemTap. Win-Win, I say!