Bug 2293 - confirm preemption/interrupt blocking in systemtap probes
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: translator
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Unassigned
 
Reported: 2006-02-07 17:14 UTC by Frank Ch. Eigler
Modified: 2006-05-01 14:20 UTC

Description Frank Ch. Eigler 2006-02-07 17:14:49 UTC
For each probe type supported in tapsets.cxx, confirm that the probe handlers
are run in an atomic manner: preemption + interrupts disabled.

This could tie in with bug #1884, by detecting whether we are already in an
interrupt context, and if so, reducing the MAXACTION budget for this probe hit.
Comment 1 Frank Ch. Eigler 2006-02-22 15:05:59 UTC
Added a local_irq_save/restore into every probe handler prologue/epilogue.  The
runtime should no longer need to do this sort of thing for functions only called
from within probe handlers.
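
A minimal user-space sketch of the prologue/epilogue pattern described in this comment. The kernel primitives are local_irq_save() and local_irq_restore(); everything else here (MODEL_IRQ_SAVE, MODEL_IRQ_RESTORE, model_irqs_enabled, run_probe_handler, observe_irq_state) is a hypothetical stand-in so the example can compile outside the kernel. It is not the code the translator emits.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the CPU's interrupt-enable flag (1 = enabled). */
static int model_irqs_enabled = 1;

/* Models local_irq_save(): remember the current state, then disable. */
#define MODEL_IRQ_SAVE(flags) \
    do { (flags) = model_irqs_enabled; model_irqs_enabled = 0; } while (0)

/* Models local_irq_restore(): put back whatever state was saved. */
#define MODEL_IRQ_RESTORE(flags) \
    do { model_irqs_enabled = (flags); } while (0)

typedef int (*probe_body_fn)(void);

/* Hypothetical generated handler: prologue, probe body, epilogue. */
static int run_probe_handler(probe_body_fn body)
{
    int flags;
    int result;

    assert(body != NULL);
    MODEL_IRQ_SAVE(flags);      /* prologue: interrupts off */
    result = body();            /* probe body runs with interrupts off */
    MODEL_IRQ_RESTORE(flags);   /* epilogue: previous state restored */
    return result;
}

/* Example probe body: reports the interrupt state it observes. */
static int observe_irq_state(void)
{
    return model_irqs_enabled;
}
```

Saving and restoring the previous state, rather than unconditionally re-enabling, matters when a probe fires in a context that already had interrupts disabled: the epilogue must leave them disabled.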
Comment 2 Martin Hunt 2006-02-22 20:07:51 UTC
(In reply to comment #1)
> The runtime should no longer need to do this sort of thing for functions only
> called from within probe handlers.

The runtime functions callable by the translator never disable interrupts. 

Comment 3 Frank Ch. Eigler 2006-02-23 20:42:04 UTC
Other runtime protective measures still exist though: get_cpu/put_cpu can simply
be smp_processor_id.
Comment 4 Martin Hunt 2006-02-27 17:32:49 UTC
(In reply to comment #3)
> Other runtime protective measures still exist though: get_cpu/put_cpu can simply
> be smp_processor_id.

Yeah, I forgot about those.

I've been trying to solve a problem created by this change. The problem is with
scripts that collect a large amount of data and then dump it all at probe end.
You have now disabled interrupts, so the data fills the buffers but stpd cannot
empty them. So we have effectively limited systemtap output to whatever space
remains in the buffer when "probe end" is hit, which might not be much.

That's fine for little test scripts, but a real solution is needed. Ideas?



Comment 5 Martin Hunt 2006-03-29 07:06:02 UTC
Still trying to figure out the transport problem. It appears to not be solvable;
if you disable interrupts, stpd will not be processing any output and output
from "probe end" will get truncated if the user tries to write too much.

However, simply removing the local_irq_save() for "probe begin" and "probe end"
would still leave us vulnerable to the possibility of preemption causing the cpu
to change during the probe execution. One possibility would be to replace all
smp_processor_id() calls with _stp_processor_id() which would do

if (STAP_SESSION_STARTING or STAP_SESSION_STOPPING)
    return 0
else
    return smp_processor_id()
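
The pseudocode above could be sketched in C roughly as follows. This is a user-space model under stated assumptions: the session_state variable, the STAP_SESSION_RUNNING value, and model_smp_processor_id() (standing in for the kernel's smp_processor_id()) are hypothetical, and the wrapper is named stp_processor_id_model rather than _stp_processor_id to avoid a reserved identifier in ordinary C.

```c
#include <assert.h>

/* Hypothetical session states; the first and last echo the pseudocode. */
enum stap_session_state {
    STAP_SESSION_STARTING,
    STAP_SESSION_RUNNING,
    STAP_SESSION_STOPPING
};

static enum stap_session_state session_state = STAP_SESSION_STARTING;

/* Stand-in for the CPU the caller is currently running on. */
static int model_current_cpu = 0;

/* Stand-in for the kernel's smp_processor_id(). */
static int model_smp_processor_id(void)
{
    assert(model_current_cpu >= 0);
    return model_current_cpu;
}

/*
 * While the session is starting or stopping, always report CPU 0 so that
 * begin/end probes see a stable id even if they migrate between CPUs.
 * Otherwise report the real CPU.
 */
static int stp_processor_id_model(void)
{
    if (session_state == STAP_SESSION_STARTING ||
        session_state == STAP_SESSION_STOPPING)
        return 0;
    return model_smp_processor_id();
}
```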


Comment 6 Frank Ch. Eigler 2006-03-29 11:54:58 UTC
I am skeptical that games with the processor ID are appropriate, or that
allowing end probes to block is appropriate.

I recommend opening a new bug against the runtime, addressing specifically the
issue of I/O buffering near the time of shutdown.  I recall suggesting looking
into whether stpd and the kernel-side runtime message handler can work together
to drain the buffers before starting the module_exit process, to provide the
maximum static space to the end probes.  (That space amount would
uncoincidentally match the command line option "-s NUM" to the initial
compilation stage, and thus make some intuitive sense to the user.)  Did you try
that?
Comment 7 Martin Hunt 2006-03-29 18:05:29 UTC
Subject: Re: confirm preemption/interrupt blocking in systemtap probes

On Wed, 2006-03-29 at 11:54 +0000, fche at redhat dot com wrote:

> I recommend opening a new bug against the runtime, addressing specifically the
> issue of I/O buffering near the time of shutdown.  I recall suggesting looking
> into whether stpd and the kernel-side runtime message handler can work together
> to drain the buffers before starting the module_exit process, to provide the
> maximum static space to the end probes.  (That space amount would
> uncoincidentally match the command line option "-s NUM" to the initial
> compilation stage, and thus make some intuitive sense to the user.)  Did you try
> that?

I think I originally suggested it, and I have thought further about it.
I hoped to find a better solution than imposing another limit users have
to compute. For collecting large amounts of data, MAXMAPENTRIES needs
raised and then you have to calculate how much space that data will take
up when "printed" into the output buffers.  Another problem is that for
relayfs the buffer is divided into per-cpu sub-buffers. So the maximum
data that can be sent is NUM/cpus. 

Martin
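
The NUM/cpus limit mentioned above can be illustrated with a small calculation. This is only a sketch: it assumes the total buffer is split evenly across CPUs and that an end probe writes from a single CPU, and the function names and the use of bytes as the unit are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Space one CPU can write before its sub-buffer is full (even split assumed). */
static size_t per_cpu_capacity(size_t total_bytes, unsigned int ncpus)
{
    assert(ncpus > 0);
    return total_bytes / ncpus;
}

/* Returns 1 if `needed` bytes written from a single CPU would fit, else 0. */
static int end_probe_output_fits(size_t needed, size_t total_bytes,
                                 unsigned int ncpus)
{
    return needed <= per_cpu_capacity(total_bytes, ncpus);
}
```

For example, with a 4 MiB total buffer on a 4-CPU machine, an end probe under these assumptions could emit at most 1 MiB before output is truncated.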

Comment 8 Frank Ch. Eigler 2006-05-01 14:20:42 UTC
Please open a new bug for more work along the lines of
<http://sourceware.org/ml/systemtap/2006-q1/msg00653.html>
Comment 9 prasadav@us.ibm.com 2006-05-05 16:09:50 UTC
Subject: Re: confirm preemption/interrupt blocking in systemtap probes

Martin Hunt wrote:

> On Wed, 2006-03-29 at 11:54 +0000, fche at redhat dot com wrote:
>
>> I recommend opening a new bug against the runtime, addressing specifically the
>> issue of I/O buffering near the time of shutdown.  I recall suggesting looking
>> into whether stpd and the kernel-side runtime message handler can work together
>> to drain the buffers before starting the module_exit process, to provide the
>> maximum static space to the end probes.  (That space amount would
>> uncoincidentally match the command line option "-s NUM" to the initial
>> compilation stage, and thus make some intuitive sense to the user.)  Did you try
>> that?
>
> I think I originally suggested it, and I have thought further about it.
> I hoped to find a better solution than imposing another limit users have
> to compute. For collecting large amounts of data, MAXMAPENTRIES needs
> raised and then you have to calculate how much space that data will take
> up when "printed" into the output buffers.  Another problem is that for
> relayfs the buffer is divided into per-cpu sub-buffers. So the maximum
> data that can be sent is NUM/cpus.
>
> Martin

There is a generic problem that we have to solve in SystemTap to support
long-running scripts or large numbers of probes. The common problem in these
scenarios is that they generate far more data than the maps can hold. There
are two solutions I can think of to help address this:
1) We should be able to truncate a map, keeping only the top "n" entries
based on the key. If one is looking for general trends, the top few entries
are more than enough, so this solution could be useful.
2) We could periodically dump the maps to userspace, and stpd could then
reconstruct the full maps from the dumps during post-processing. I have not
yet checked whether any maps have mathematical properties that do not lend
themselves to post-aggregation; we have to look into that aspect.
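
To make the post-aggregation question in option 2 concrete, here is a sketch of merging partial summaries from successive dumps. It assumes each map value is a summary holding count, sum, min, and max; the struct and function names are hypothetical and do not describe the runtime's actual data layout. These four quantities merge exactly, and an average can be derived from the merged sum and count. Summaries such as an exact median cannot be rebuilt from partial results this way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-key summary as it might appear in one periodic dump. */
struct stat_summary {
    int64_t count;
    int64_t sum;
    int64_t min;
    int64_t max;
};

static void stat_init(struct stat_summary *s)
{
    assert(s != 0);
    s->count = 0;
    s->sum = 0;
    s->min = 0;
    s->max = 0;
}

/* Record one sample into a summary. */
static void stat_add(struct stat_summary *s, int64_t v)
{
    assert(s != 0);
    if (s->count == 0 || v < s->min)
        s->min = v;
    if (s->count == 0 || v > s->max)
        s->max = v;
    s->count++;
    s->sum += v;
}

/* Fold the summary from one dump into the running total kept in userspace. */
static void stat_merge(struct stat_summary *dst, const struct stat_summary *src)
{
    assert(dst != 0 && src != 0);
    if (src->count == 0)
        return;                     /* nothing to merge */
    if (dst->count == 0 || src->min < dst->min)
        dst->min = src->min;
    if (dst->count == 0 || src->max > dst->max)
        dst->max = src->max;
    dst->count += src->count;
    dst->sum += src->sum;
}

/* The average is derived after merging, never merged directly. */
static double stat_avg(const struct stat_summary *s)
{
    assert(s != 0 && s->count > 0);
    return (double)s->sum / (double)s->count;
}
```

Averaging the per-dump averages would give the wrong answer whenever the dumps hold different numbers of samples, which is why the sketch carries sum and count instead.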

Coming to relayfs, I believe stpd has an option to specify the overall
buffer size but not the number of sub-buffers or the size of each. As Frank
mentioned, I think it may be a good idea to be able to specify the number of
sub-buffers along with the overall buffer size. I think script writers are
likely to have a better idea of how much data their script collects than the
person running the script. That makes me think we should also give the script
writer an option to specify the type of transport used, procfs or relayfs,
and, if it is relayfs, the number of sub-buffers and the total buffer size.
If the size is specified both on the command line and in the script, I think
we should use the larger of the two.

bye,
Vara Prasad