This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Re: Evaluating SystemTap for Network Response Times

From: fche at redhat dot com (Frank Ch. Eigler)
To: Nathan DeBardeleben <ndebard at lanl dot gov>
Cc: systemtap at sources dot redhat dot com
Date: 31 Jan 2006 12:21:29 -0500
Subject: Re: Evaluating SystemTap for Network Response Times
References: <43DF95CB.8070201@lanl.gov>

Nathan DeBardeleben <ndebard@lanl.gov> writes:

> [...] Specifically, we want to time the point
> when a socket send operation leaves user space, entering kernel space,
> down to the point where the kernel says "it's done, sent".  [...]
> 
> Initially this looks just like the kind of thing I could do with
> SystemTap but I worry that the scripting language will be too
> restrictive to allow me to allocate these types of data structures
> to do record keeping.

I hope this is exactly the kind of complex instrumentation with which
systemtap can show its prowess.  I would like to help you make it
work.

> When it comes down to it - I want to observe a system and recognize
> outliers ("hey, this operation took 20 times longer than the rest")
> through statistical means.

Expressing that condition should be no problem at all.  If, for example,
you elect to use a statistics variable to store elapsed times

     times <<< time  /* or an array indexed however necessary */

then a probe can compare the current average to a new value like this:

     if (@avg(times) > EXPR) { /* process further */ }
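
Putting the two fragments together, a complete script might look roughly
like the sketch below.  Several things in it are my assumptions and not
part of the fragments above: the probe point (sock_sendmsg, which may be
inlined or named differently on a given kernel), the factor of 20 taken
from the quoted "20 times longer", and the arbitrary 100-sample warm-up.
It also only times entry to return of that one function, which is not
necessarily the moment the data has left the machine.

```systemtap
global start   /* entry timestamp (us), indexed by thread id */
global times   /* statistics variable holding elapsed times (us) */

/* Assumed probe point; pick whatever function marks "send begins". */
probe kernel.function("sock_sendmsg") {
    start[tid()] = gettimeofday_us()
}

probe kernel.function("sock_sendmsg").return {
    t0 = start[tid()]
    if (t0) {
        elapsed = gettimeofday_us() - t0
        delete start[tid()]

        /* Wait for some samples before judging; @avg of an empty
           statistic is an error, so test @count first. */
        if (@count(times) > 100 && elapsed > 20 * @avg(times))
            printf("%s[%d]: send took %d us (avg %d us)\n",
                   execname(), pid(), elapsed, @avg(times))

        times <<< elapsed
    }
}

/* Summary when the script exits. */
probe end {
    if (@count(times)) {
        printf("n=%d min=%d avg=%d max=%d (us)\n",
               @count(times), @min(times), @avg(times), @max(times))
        print(@hist_log(times))
    }
}
```

Saved under a hypothetical name such as send-times.stp, this would be
run as root with "stap send-times.stp" and stopped with ^C.  Reading
@avg inside the same probe that updates the statistic costs more than
only appending to it, so for high-rate probes it may be better to do
the comparison less often.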

Over time, I foresee the variety of statistical calculations growing
to include goodies like standard deviations, random sampling, and
whatever else can be efficiently computed per-CPU and then aggregated
across CPUs.

> [...]  I hope I can add some value to the SystemTap community by
> testing it out in these environments.  If this first step goes well,
> I will be looking at using SystemTap for monitoring parallel file
> systems and studying potential performance bottlenecks.

That all sounds great.

- FChE

Follow-Ups:
- Re: Evaluating SystemTap for Network Response Times
  - From: Nathan DeBardeleben

References:
- Evaluating SystemTap for Network Response Times
  - From: Nathan DeBardeleben
