Bug 1564

Summary: need safe IO that works in all possible contexts
Product: systemtap Reporter: Frank Ch. Eigler <fche>
Component: runtime Assignee: Martin Hunt <hunt>
Status: RESOLVED FIXED    
Severity: critical CC: wcohen
Priority: P1    
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:

Description Frank Ch. Eigler 2005-10-26 21:03:20 UTC
See bug #908 comment #7.  The problem appears to be the runtime's eagerness to
wake up stpd regardless of context.  It needs to protect itself against being
invoked from such places, for example by deferring such signalling until it's
safe to do so (say, using a background timer).

Generally, you should review each kernel call made from within the runtime to
assess whether it is universally safe, or can/should be detected otherwise. 
Those functions that are only called during module init/exit need only a lesser
degree of care.  But those callable from probe handlers need to be paranoid.
Comment 1 Martin Hunt 2005-10-26 22:55:46 UTC
Looks nasty.  From within __switch_to() we cannot do printk(), log() or
schedule anything to happen later, AFAICT. That's OK, because if we can detect
unsafe conditions, we can always put the data in a buffer and let the next IO
trigger the wake_up().

What exactly do you think should have been detected to determine that IO was
unsafe from within this function? Because I am not sure.
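The buffering idea in this comment can be sketched in user-space C. This is only an illustration under stated assumptions, not the runtime's actual code: `struct deferred_io`, `deferred_write()` and `fake_wake_up()` are hypothetical names, and the `context_is_safe` flag is simply supplied by the caller, since how to detect an unsafe context is exactly the open question here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 256

/* Hypothetical state; the names are illustrative, not the runtime's. */
struct deferred_io {
    char   buf[BUF_SIZE];
    size_t used;          /* bytes currently buffered */
    bool   wake_pending;  /* data is buffered but the reader was not signalled */
    int    wakeups;       /* number of simulated wake_up() calls */
};

/* Stand-in for the kernel's wake_up(); here it only counts calls. */
static void fake_wake_up(struct deferred_io *io)
{
    io->wakeups++;
    io->wake_pending = false;
}

/*
 * Always buffer the data. Signal the reader only when the caller reports
 * that the context is safe; otherwise leave the wake-up pending so that
 * the next safe write delivers it. Returns the number of bytes stored.
 */
static size_t deferred_write(struct deferred_io *io, const char *data,
                             size_t len, bool context_is_safe)
{
    size_t room = BUF_SIZE - io->used;

    if (len > room)
        len = room;  /* assumption: excess data is dropped */
    memcpy(io->buf + io->used, data, len);
    io->used += len;
    assert(io->used <= BUF_SIZE);

    if (len > 0)
        io->wake_pending = true;
    if (context_is_safe && io->wake_pending)
        fake_wake_up(io);
    return len;
}
```

A write from an unsafe context leaves `wake_pending` set and performs no wake-up; the next write from a safe context signals the reader once for all buffered data.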
Comment 2 Frank Ch. Eigler 2005-10-27 00:30:54 UTC
I don't know if there is a runtime test for safety.  The most pessimistic
approach is to always buffer, and use a background task of some sort to do all I/O.

We might be able to do some blacklisty thing with the assistance of the
translator.  The embedded-C code can tell where the probe point was inserted.  A
refined form would be able to test whether sensitive files like *sched.* were
involved in a kprobe.  That could trigger special behavior.
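A blacklist test of this kind might look roughly like the following sketch. Everything in it is an assumption made for illustration: the function name `probe_needs_deferred_io()`, the idea that the translator passes the probe point's source file as a string, and the interpretation of `*sched.*` as "file name contains `sched.`".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of substrings that mark sensitive source files. */
static const char *const sensitive_patterns[] = { "sched." };

/*
 * Return true when the probe point's source file matches a sensitive
 * pattern, so the caller can switch to special (deferred) behavior.
 * Only the final path component is examined.
 */
static bool probe_needs_deferred_io(const char *source_file)
{
    assert(source_file != NULL);

    const char *base = strrchr(source_file, '/');
    base = base ? base + 1 : source_file;

    size_t count = sizeof sensitive_patterns / sizeof sensitive_patterns[0];
    for (size_t i = 0; i < count; i++) {
        if (strstr(base, sensitive_patterns[i]) != NULL)
            return true;
    }
    return false;
}
```

With this sketch, a probe in `kernel/sched.c` would be flagged while one in `fs/open.c` would not.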
Comment 3 Martin Hunt 2005-11-10 17:28:23 UTC
This is being worked on.
Comment 4 Frank Ch. Eigler 2005-11-12 12:25:25 UTC
The problem may not be limited to I/O.  I believe the entire runtime needs to be
audited to enumerate and analyze **all** kernel functions used from within probe
context.  Any of these that contain critical sections, or call sensitive
subsidiary functions, need to be avoided if at all possible.  This avoidance can
include replication of kernel code if needed (since we will guarantee that no
introspective probe can be placed on a systemtap probe module).
Comment 5 Frank Ch. Eigler 2005-11-12 12:32:35 UTC
*** Bug 1837 has been marked as a duplicate of this bug. ***
Comment 6 Frank Ch. Eigler 2005-11-12 12:55:19 UTC
Another method of ameliorating the problem is consistent use of auxiliary worker
threads to carry out any kernel services that cannot be safely called and thus
need to be deferred.  Signalling between the probes and these worker threads (or
just one) would of course have to be simple and not itself involve kernel services.
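One way to keep the probe-to-worker signalling free of kernel services is a single atomic counter that the probe increments and the worker drains. The sketch below uses C11 atomics in user space purely for illustration; `probe_request_service()`, `worker_poll_once()` and the counters are hypothetical names, and a real worker would run the poll in its own thread rather than being called directly.

```c
#include <assert.h>
#include <stdatomic.h>

/* Requests posted by probes and not yet seen by the worker. */
static atomic_uint pending_requests;

/* Count of deferred services the worker has carried out. */
static unsigned services_run;

/*
 * Called from probe context: one atomic increment, with no locks,
 * no sleeping, and no calls into other subsystems.
 */
static void probe_request_service(void)
{
    atomic_fetch_add_explicit(&pending_requests, 1, memory_order_release);
}

/*
 * One polling pass of the worker: atomically take all pending requests
 * and perform the deferred work for each. Returns the number handled.
 */
static unsigned worker_poll_once(void)
{
    unsigned n = atomic_exchange_explicit(&pending_requests, 0,
                                          memory_order_acquire);

    for (unsigned i = 0; i < n; i++)
        services_run++;  /* stand-in for the deferred kernel service */

    assert(services_run >= n);
    return n;
}
```

Because the worker polls instead of being woken, the probe side never has to call into the scheduler; the trade-off is added latency bounded by the polling interval.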
Comment 7 Frank Ch. Eigler 2005-11-24 21:15:01 UTC
*** Bug 1919 has been marked as a duplicate of this bug. ***
Comment 8 Martin Hunt 2005-11-30 11:15:30 UTC
*** Bug 1594 has been marked as a duplicate of this bug. ***
Comment 9 Martin Hunt 2005-11-30 11:21:18 UTC
Checked in fix. Needs better testing, of course.
Comment 10 Frank Ch. Eigler 2005-11-30 21:35:38 UTC
Please construct some tests that aim to stress this new code.  For example, some
systemtap script to probe & log printk, hw interrupt handlers, and the like.