This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.
RE: tapset taxonomy
- From: "Chen, Brad" <brad dot chen at intel dot com>
- To: "William Cohen" <wcohen at redhat dot com>
- Cc: "Frank Ch. Eigler" <fche at redhat dot com>, <systemtap at sources dot redhat dot com>
- Date: Sun, 17 Apr 2005 07:19:21 -0700
- Subject: RE: tapset taxonomy
This is great feedback; thanks. Let me try to respond to a few
of your specific thoughts:
>Hardware counters have been around for a while. There have been various
>mechanisms incorporated into the kernel to use them, e.g. perfmon,
>perfctr, and oprofile. What ways do you see the performance monitoring
>hardware being used? There are a number of different modes of
>operation, e.g. stop watch (caliper mode), sampling, and event logging.
I'm assuming that Systemtap will support time-based statistical call
graph profiling with a stack-walk facility and associative arrays
comparable to those of DTrace. A natural way to support this is by
counting and interrupting after some number of clock cycles. I'd like
this to be generalized to other hardware events as well, such as
instructions executed, cache misses, etc. The stop watch mode would
probably also be useful. I imagine enabling script writers to implement
logging themselves.
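To make this concrete, here is an untested sketch of what such time-based sampling might look like as a script, keyed by sampled stack in a DTrace-style associative array. The probe name and functions are illustrative, not a committed design:

```systemtap
# Untested sketch: fire on the profiling timer and count how often
# each sampled kernel stack appears, aggregation-style.
global hits

probe timer.profile {
    hits[backtrace()]++          # key the array by the sampled stack
}

probe end {
    # print the ten hottest stacks, highest sample count first
    foreach (stack in hits- limit 10)
        printf("%d samples:\n%s\n", hits[stack], stack)
}
```

Generalizing the trigger from the clock tick to other hardware events (instructions retired, cache misses) would ideally change only the probe point, not the aggregation logic.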
There is a related and messy problem of dealing with the heterogeneity
of event counting hardware. I'm not sure how much influence we can have
on that in the short term, but there is growing support at Intel for
establishing open Linux drivers, APIs and data standards for the
software part. This is important, but somewhat peripheral to Systemtap.
Nonetheless, I'm hoping Systemtap will be an effective driver for
progress in this area.
>Brad, how do you see power management hardware affecting the
>instrumentation? Doing things like realizing the memory bus is the
>limiting factor and the processor clock rate can be reduced while still
>getting the same level of performance? Or looking at the power state
>transitions and trying to schedule processes to reduce the number of
>transitions between different power states?
Without carefully thinking through specific use cases, I'd like to
expose the basic power transitions and other events in the hardware and
OS so that script writers can use them. I guess I'm just trusting that
if we expose the basic info that's there, tool-builders can decide what
to do with it. Someday, if Systemtap is very successful, it might be
nice to support a facility for explicit control of power states from a
script, rather than relying on the OS to figure out the right thing to
do.
>Virtualization is currently available in systems like Xen.
>Virtualization of the special processor hardware such as performance
>monitoring and debugging hardware is an issue there. Should users be
>able to use these processor resources independently of other virtual
>machines? Another issue is getting an overall view of what is going on
>across the physical machine and the logical processors.
Yes to the PMU virtualization and physical machine monitoring.
I've also wondered about scripts that might do something useful
monitoring multiple VMs and the physical machine simultaneously.
Virtualization of the PMU is a messy problem, and there will be
some smart guys joining us this week who might help us think
about what would make sense here.
>Brad, what did you mean by "strange new hardware to support multicore
>architectures"? Are these things like shared caches or other shared
>resources on the die? Or special mailbox instructions to reduce
>interprocess communication?
The only thing that seems certain enough to build use cases around is
shared caches. Other things being discussed are too early to build firm
plans around, but it seems clear to me that things are going to get
pretty weird pretty fast.
Brad
-----Original Message-----
From: William Cohen [mailto:wcohen@redhat.com]
Sent: Friday, April 15, 2005 7:54 AM
To: Chen, Brad
Cc: Frank Ch. Eigler; systemtap@sources.redhat.com
Subject: Re: tapset taxonomy
Chen, Brad wrote:
>>
>>In order to let us decide how many of these extensibility points we
>>need to support, we need ... drumroll ... more detailed examples. My
>>intuition says that the basic library and translator points I marked
>>above with [***] are sufficient for the forseeable future.
>>notion of privileged script tapsets located in a library
Here are some typical low-level questions I see for tapset support:
-How many times did that section of code get executed?
-What is the context this code is being run in?
-Which process is using this resource?
-Where in the user code was use of this resource triggered (e.g. the
syscall)?
-How long did it take to complete an action?
These questions would be mapped by the tapset writer onto things that
the developer could do something about, e.g. generation of ethernet
traffic.
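Several of the questions above map naturally onto entry/return probe pairs. An untested sketch, using a made-up probe point purely for illustration:

```systemtap
# Untested sketch: count executions per process and time each action.
# The kernel function name here is hypothetical, not a real probe point.
global counts, start, durations

probe kernel.function("some_resource_op") {
    counts[execname(), pid()]++            # which process uses the resource
    start[tid()] = gettimeofday_us()       # when the action began
}

probe kernel.function("some_resource_op").return {
    if (tid() in start) {
        durations <<< gettimeofday_us() - start[tid()]  # how long it took
        delete start[tid()]
    }
}

probe end {
    foreach ([name, p] in counts)
        printf("%s (pid %d): %d calls\n", name, p, counts[name, p])
    if (@count(durations))
        printf("average duration: %d us\n", @avg(durations))
}
```

A tapset would hide the raw probe points behind names meaningful to the developer, so the script writer asks about "ethernet traffic" rather than a specific kernel function.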
>
> I think the list of mine you refer to above is this one:
> - hardware performance counters
> - power management hardware
> - virtualization technologies
> - strange new hardware to support multicore architectures
> Unfortunately, this is not the future; the first three are happening
> now, and the fourth will be with us within a year or so. Dealing with
> some of these can require special libraries, linking conventions or
> compilation options. It's easy to imagine the compiler support for
> such options might be somewhat temperamental and inflexible at first.
> I don't think we will understand all the implications before we commit
> to a plan for tapset authoring.
I would like to discuss how the technologies listed above are going
to influence the instrumentation and how they might be used in data
collection and performance analysis.
Hardware counters have been around for a while. There have been various
mechanisms incorporated into the kernel to use them, e.g. perfmon,
perfctr, and oprofile. What ways do you see the performance monitoring
hardware being used? There are a number of different modes of
operation, e.g. stop watch (caliper mode), sampling, and event logging.
Using the performance monitoring hardware can provide really good
insight into why there is a performance problem. However, the
performance monitoring hardware is very processor-specific; developing
instrumentation for it limits the hardware that it can be run on. In
other cases it is difficult to map the collected data back to something
the programmer has some control over. One example of this problem is
that hyperthreaded P4 processors do not distinguish which thread a
floating point instruction came from, making it impossible to figure out
which process to assign the sample to when doing sample-based profiling.
Brad, how do you see power management hardware affecting the
instrumentation? Doing things like realizing the memory bus is the
limiting factor and the processor clock rate can be reduced while still
getting the same level of performance? Or looking at the power state
transitions and trying to schedule processes to reduce the number of
transitions between different power states?
Virtualization is currently available in systems like Xen.
Virtualization of the special processor hardware such as performance
monitoring and debugging hardware is an issue there. Should users be
able to use these processor resources independently of other virtual
machines? Another issue is getting an overall view of what is going on
across the physical machine and the logical processors.
Brad, what did you mean by "strange new hardware to support multicore
architectures"? Are these things like shared caches or other shared
resources on the die? Or special mailbox instructions to reduce
interprocess communication?
-Will