This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.
RE: tapset taxonomy
- From: "Chen, Brad" <brad dot chen at intel dot com>
- To: "William Cohen" <wcohen at redhat dot com>
- Cc: "Frank Ch. Eigler" <fche at redhat dot com>, <systemtap at sources dot redhat dot com>
- Date: Sun, 17 Apr 2005 07:19:21 -0700
- Subject: RE: tapset taxonomy
This is great feedback; thanks. Let me try to respond to a few
of your specific thoughts:
>Hardware counters have been around for a while. There have been various
>mechanisms incorporated into the kernel to use them, e.g. perfmon,
>perfctr, and oprofile. What ways do you see the performance monitoring
>hardware being used? There are a number of different modes of
>operation, e.g. stop watch (caliper mode), sampling, and event logging.
I'm assuming that Systemtap will support time-based statistical call
graph profiling with a stack-walk facility and associative arrays
comparable to those of DTrace. A natural way to support this is by
counting and interrupting after some number of clock cycles. I'd like
this to be generalized to other hardware events as well, such as
instructions executed, cache misses, etc. The stop watch mode would
probably also be useful. I imagine enabling script writers to implement
logging themselves.
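To make this concrete, here is an untested sketch of what such time-based sampling might look like as a script, keyed by sampled stack in a DTrace-style associative array. The probe name and functions are illustrative, not a committed design:

```systemtap
# Untested sketch: fire on the profiling timer and count how often
# each sampled kernel stack appears, aggregation-style.
global hits

probe timer.profile {
    hits[backtrace()]++          # key the array by the sampled stack
}

probe end {
    # print the ten hottest stacks, highest sample count first
    foreach (stack in hits- limit 10)
        printf("%d samples:\n%s\n", hits[stack], stack)
}
```

Generalizing the trigger from the clock tick to other hardware events (instructions retired, cache misses) would ideally change only the probe point, not the aggregation logic.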
There is a related and messy problem of dealing with the heterogeneity
of event counting hardware. I'm not sure how much influence we can have
on that in the short term, but there is growing support at Intel for
establishing open Linux drivers, APIs and data standards for the
software part. This is important, but somewhat peripheral to Systemtap.
Nonetheless, I'm hoping Systemtap will be an effective driver for
progress in this area.
>Brad, how do you see power management hardware affecting the
>instrumentation? Doing things like realizing the memory bus is the
>limiting factor and the processor clock rate can be reduced while still
>getting the same level of performance? Or looking at the power state
>transitions and trying to schedule processes to reduce the number of
>transitions between different power states?
Without carefully thinking through specific use cases, I'd like to
expose the basic power transitions and other events in the hardware and
OS so that script writers can use them. I guess I'm just trusting that
if we expose the basic info that's there, tool-builders can decide what
to do with it. Someday, if Systemtap is very successful, it might be
nice to support a facility for explicit control of power states from a
script, rather than relying on the OS to figure out the right thing to
do.
>Virtualization is currently available in systems like Xen.
>Virtualization of the special processor hardware such as performance
>monitoring and debugging hardware is an issue there. Should users be
>able to use these processor resources independently of other virtual
>machines? Another issue is getting an overall view of what is going on
>across the physical machine and the logical processors.
Yes to the PMU virtualization and physical machine monitoring.
I've also wondered about scripts that might do something useful
monitoring multiple VMs and the physical machine simultaneously.
Virtualization of the PMU is a messy problem, and there will be
some smart guys joining us this week who might help us think
about what would make sense here.
>Brad, what did you mean by "strange new hardware to support multicore
>architectures"? Are these things like shared caches or other shared
>resources on the die? Or special mailbox instructions to reduce
>interprocess communication?
The only thing that seems certain enough to build use cases around is
shared caches. Other things being discussed are too early to build firm
plans around, but it seems clear to me that things are going to get
pretty weird pretty fast.
Brad
-----Original Message-----
From: William Cohen [mailto:wcohen@redhat.com]
Sent: Friday, April 15, 2005 7:54 AM
To: Chen, Brad
Cc: Frank Ch. Eigler; systemtap@sources.redhat.com
Subject: Re: tapset taxonomy
Chen, Brad wrote:
>>
>>In order to let us decide how many of these extensibility points we
>>need to support, we need ... drumroll ... more detailed examples. My
>>intuition says that the basic library and translator points I marked
>>above with [***] are sufficient for the forseeable future.
>>notion of privileged script tapsets located in a library
Here are some typical low-level questions I see for tapset support:
-How many times did that section of code get executed?
-What is the context this code is being run in?
-Which process is using this resource?
-Where in the user code was use of this resource triggered (e.g. the
syscall)?
-How long did it take to complete an action?
These questions would be mapped by the tapset writer onto things that
the developer could do something about, e.g. generation of ethernet
traffic.
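Several of the questions above map naturally onto entry/return probe pairs. An untested sketch, using a made-up probe point purely for illustration:

```systemtap
# Untested sketch: count executions per process and time each action.
# The kernel function name here is hypothetical, not a real probe point.
global counts, start, durations

probe kernel.function("some_resource_op") {
    counts[execname(), pid()]++            # which process uses the resource
    start[tid()] = gettimeofday_us()       # when the action began
}

probe kernel.function("some_resource_op").return {
    if (tid() in start) {
        durations <<< gettimeofday_us() - start[tid()]  # how long it took
        delete start[tid()]
    }
}

probe end {
    foreach ([name, p] in counts)
        printf("%s (pid %d): %d calls\n", name, p, counts[name, p])
    if (@count(durations))
        printf("average duration: %d us\n", @avg(durations))
}
```

A tapset would hide the raw probe points behind names meaningful to the developer, so the script writer asks about "ethernet traffic" rather than a specific kernel function.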
>
> I think the list of mine you refer to above is this one:
> - hardware performance counters
> - power management hardware
> - virtualization technologies
> - strange new hardware to support multicore architectures
> Unfortunately, this is not the future; the first three are happening
> now, and the fourth will be with us within a year or so. Dealing with
> some of these can require special libraries, linking conventions or
> compilation options. It's easy to imagine the compiler support for
> such options might be somewhat temperamental and inflexible at first.
> I don't think we will understand all the implications before we commit
> to a plan for tapset authoring.
I would like to discuss how the technologies listed above are going
to influence the instrumentation and how they might be used in data
collection and performance analysis.
Hardware counters have been around for a while. There have been various
mechanisms incorporated into the kernel to use them, e.g. perfmon,
perfctr, and oprofile. What ways do you see the performance monitoring
hardware being used? There are a number of different modes of
operation, e.g. stop watch (caliper mode), sampling, and event logging.
Using the performance monitoring hardware can provide really good
insight into why there is a performance problem. However, the
performance monitoring hardware is very processor-specific; developing
instrumentation for it limits the hardware that it can be run on. In
other cases it is difficult to map the collected data back to something
the programmer has some control over. One example of this problem is
that hyperthreaded P4 processors do not distinguish which thread a
floating point instruction came from, making it impossible to figure out
which process to assign the sample to when doing sample-based profiling.
Brad, how do you see power management hardware affecting the
instrumentation? Doing things like realizing the memory bus is the
limiting factor and the processor clock rate can be reduced while still
getting the same level of performance? Or looking at the power state
transitions and trying to schedule processes to reduce the number of
transitions between different power states?
Virtualization is currently available in systems like Xen.
Virtualization of the special processor hardware such as performance
monitoring and debugging hardware is an issue there. Should users be
able to use these processor resources independently of other virtual
machines? Another issue is getting an overall view of what is going on
across the physical machine and the logical processors.
Brad, what did you mean by "strange new hardware to support multicore
architectures"? Are these things like shared caches or other shared
resources on the die? Or special mailbox instructions to reduce
interprocess communication?
-Will