This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



RE: Measure the Accept Queueing Time


For 80% of the performance problem cases in a production system, you just need to identify the device or queue with the longest service times; that gives a first-order approximation of where you need to improve the system. The service and queue times you get from a steady-state system are accurate enough to use in any modeling effort to predict future performance.

Yes, for engineering-level benchmarks you may need to collect the individual response times and look at boxplots of them, but that is only needed in a smaller subset of cases. As someone who has done both types of performance measurement and tuning, getting the first part right has a much bigger impact than delaying or destabilizing the product to do the second.

Regards,
Peter

 -------------- Original message ----------------------
From: "Ken Robson" <ken@robson.net>
> Whilst I agree that the execution path dissection is useful, I do not agree
> that averages alone are entirely useful; to put them into context you need
> some indication of the standard deviation during the sample period and a
> median calculation to give some indication of outliers.
> 
> There have been some hints (unless I missed something more concrete) about
> aggregations. Statistical functions that seem useful but do not aggregate
> well are mode, median and standard deviation; each of these would seem to
> require retaining the entire data set until the function is applied, which
> reduces some of the value of the aggregations, particularly when doing long
> runs.
> 
> What built-in data 'visualisation' functions do people envisage, and how
> efficient (in terms of shipping data to user space or retaining data in the
> kernel) do people think they will be? Or have I misunderstood the issue
> altogether?
> 
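
As an aside on the aggregation question above: SystemTap's statistic aggregates keep only a small fixed-size summary per recorded value, which is why count, sum, average, min, max and log/linear histograms are cheap to collect, while an exact median or mode genuinely needs the raw data (an approximate median can be read off a histogram, and a running sum of squares is sufficient for standard deviation). Below is a minimal sketch of the cheap case, timing read() latency in 5-second windows; the probe points, interval, and variable names are arbitrary choices for illustration, using the syscall tapset aliases and extractors found in current SystemTap:

  global start, lat

  probe syscall.read {
    start[tid()] = gettimeofday_us()
  }

  probe syscall.read.return {
    if (tid() in start) {
      # the <<< operator only updates a fixed-size per-CPU summary
      lat <<< gettimeofday_us() - start[tid()]
      delete start[tid()]
    }
  }

  probe timer.s(5) {
    if (@count(lat)) {
      printf("reads: n=%d avg=%d min=%d max=%d (us)\n",
             @count(lat), @avg(lat), @min(lat), @max(lat))
      # a log2 histogram is enough to eyeball the median and the outlier tail
      print(@hist_log(lat))
    }
    delete lat
  }

Nothing but the printed summaries ever leaves the kernel, so the cost stays roughly constant no matter how long the run is.
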
> -----Original Message-----
> From: systemtap-owner@sourceware.org [mailto:systemtap-owner@sourceware.org]
> On Behalf Of Frank Ch. Eigler
> Sent: 14 February 2006 22:28
> To: Peter Bach
> Cc: systemtap@sources.redhat.com
> Subject: Re: Measure the Accept Queueing Time
> 
> 
> peterzbach wrote:
> 
> > [...]
> > Most everything inside the kernel is a queue or system of queues, but
> > as Frank noted, the trick is to find the proper probe points to
> > measure them.
> 
> > As for queues, the only three useful measurements are average
> > service time, average wait time, and average queue length. [...]
> > The most useful way to measure these values in a lightweight way was
> > detailed by Adrian Cockcroft about a decade ago [...]
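
For reference, the lightweight derivation alluded to here needs only a few counters per queue, sampled over an interval. Writing T for the interval length, C for the completions during it, B for the total busy (service) time, and R for the total residence time summed over requests (notation mine, not the tapset's), the standard operational identities give:

  throughput           X  = C / T
  utilization          U  = B / T
  avg service time     S  = B / C
  avg residence time   W  = R / C
  avg queue length     N  = R / T      (Little's law: N = X * W)
  avg wait time        Wq = W - S

The demo output below appears to report exactly these quantities: ops/s ~ X, qlen ~ N, await ~ W, svctm ~ S, and the utilization percentage ~ U.
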
> 
> Thanks for the reference.  I put a simplified first sketch of this
> into a new "queue_stats.stp" tapset.  It comes with an accompanying
> pass-5 test case / demo that simulates six concurrent threads modeling
> traffic to two separate queues.
> 
> % stap ../tests/testsuite/systemtap.samples/queue_demo.stp
> block-read: 9 ops/s, 1.191 qlen, 207754 await, 81453 svctm, 74% wait, 51% util
> block-write: 8 ops/s, 0.785 qlen, 224397 await, 132798 svctm, 56% wait, 72% util
> block-read: 9 ops/s, 0.882 qlen, 215635 await, 122062 svctm, 63% wait, 75% util
> block-write: 8 ops/s, 1.010 qlen, 237468 await, 119600 svctm, 68% wait, 67% util
> block-read: 10 ops/s, 0.741 qlen, 158623 await, 88998 svctm, 54% wait, 66% util
> block-write: 9 ops/s, 0.997 qlen, 188266 await, 88399 svctm, 76% wait, 68% util
> 
> 
> The new code provides these generic calculations only.  The client
> code must still plop probes into the appropriate place (say, in the
> block I/O stack, or the scheduler's run queue, or ...?) and call the
> provided queue state-change tracking functions.
> 
> 
> - FChE
> 
> 
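
To make that last point concrete, here is a hedged sketch of what such client code could look like, treating the scheduler's run queue as the queue being measured. The qs_wait/qs_run/qs_done/qsq_start/qsq_print names follow the queue_stats.stp tapset as it later shipped with SystemTap, and the scheduler.* probe aliases likewise postdate this thread, so treat both as assumptions and substitute whatever the tapset in your tree actually provides:

  # Model the CPU run queue as a single queue named "runqueue":
  # becoming runnable = entering the wait queue, getting the CPU = start of
  # service, being switched off the CPU = end of service.

  probe begin             { qsq_start("runqueue") }   # zero the counters

  probe scheduler.wakeup  { qs_wait("runqueue") }     # task becomes runnable
  probe scheduler.cpu_on  { qs_run("runqueue") }      # task starts executing
  probe scheduler.cpu_off { qs_done("runqueue") }     # task leaves the CPU

  probe timer.s(5) {
    qsq_print("runqueue")     # one summary line, like the demo output above
    qsq_start("runqueue")     # restart the sampling interval
  }

The accounting is deliberately rough (tasks already runnable when the script starts are never seen by qs_wait, and the idle task is counted like any other), but it shows the division of labour described above: the probes only mark queue state changes, and the tapset turns those marks into the throughput, queue length, await, svctm, and utilization figures.
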


