| Summary: | need safe IO that works in all possible contexts | | |
|---|---|---|---|
| Product: | systemtap | Reporter: | Frank Ch. Eigler <fche> |
| Component: | runtime | Assignee: | Martin Hunt <hunt> |
| Status: | RESOLVED FIXED | | |
| Severity: | critical | CC: | wcohen |
| Priority: | P1 | | |
| Version: | unspecified | | |
| Target Milestone: | --- | | |
| Host: | | Target: | |
| Build: | | Last reconfirmed: | |
Description
Frank Ch. Eigler
2005-10-26 21:03:20 UTC
Looks nasty. From within __switch_to() we cannot do printk(), log() or schedule anything to happen later, AFAICT. That's OK, because if we can detect unsafe conditions, we can always put the data in a buffer and let the next IO trigger the wake_up().

What exactly do you think should have been detected to determine that IO was unsafe from within this function? Because I am not sure.

I don't know if there is a runtime test for safety. The most pessimistic approach is to always buffer, and use a background task of some sort to do all I/O. We might be able to do some blacklisty thing with the assistance of the translator. The embedded-C code can tell where the probe point was inserted. A refined form would be able to test whether sensitive files like *sched.* were involved in a kprobe. That could trigger special behavior.

This is being worked on.

The problem may not be limited to I/O. I believe the entire runtime needs to be audited to enumerate and analyze **all** kernel functions used from within probe context. Any of these that contain critical sections, or call sensitive subsidiary functions, need to be avoided if at all possible. This avoidance can include replication of kernel code if needed (since we will guarantee that no introspective probe can be placed on a systemtap probe module).

*** Bug 1837 has been marked as a duplicate of this bug. ***

Another method of ameliorating the problem is consistent use of auxiliary worker threads to carry out any kernel services that cannot be safely called and thus need to be deferred. Signalling between the probes and these worker threads (or just one) would of course have to be simple and not itself involve kernel services.

*** Bug 1919 has been marked as a duplicate of this bug. ***

*** Bug 1594 has been marked as a duplicate of this bug. ***

Checked in fix. Needs better testing, of course.

Please construct some tests that aim to stress this new code. For example, some systemtap script to probe & log printk, hw interrupt handlers, and the like.
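
For reference, the "always buffer, let a worker do the I/O" idea discussed above can be sketched in plain user-space C. This is only an illustration, not the checked-in fix: the names `log_ring`, `ring_put` and `ring_drain` are hypothetical, it assumes a single producer and a single consumer (a real runtime would more likely need one buffer per CPU), and it uses C11 atomics rather than kernel primitives. The point is that the producer side only copies bytes and publishes an index, so it never sleeps, takes a lock, or wakes anyone.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is systemtap runtime API. */
#define RING_SIZE 4096u /* must be a power of two so index masking works */
static_assert((RING_SIZE & (RING_SIZE - 1u)) == 0,
              "RING_SIZE must be a power of two");

struct log_ring {
    char data[RING_SIZE];
    atomic_uint head; /* total bytes written; advanced only by the producer */
    atomic_uint tail; /* total bytes consumed; advanced only by the consumer */
};

/* Producer side (stands in for probe context): copy and return.
 * Never blocks and never signals. Returns 0 on success, or -1 if the
 * message does not fit, in which case the caller would count a drop. */
static int ring_put(struct log_ring *r, const char *msg, size_t len)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (len > RING_SIZE - (head - tail))
        return -1;
    for (size_t i = 0; i < len; i++)
        r->data[(head + i) & (RING_SIZE - 1u)] = msg[i];
    /* Publish the new bytes only after they have been written. */
    atomic_store_explicit(&r->head, head + (unsigned)len,
                          memory_order_release);
    return 0;
}

/* Consumer side (stands in for the worker thread): copy up to cap
 * buffered bytes into out and release the space. Returns bytes copied. */
static size_t ring_drain(struct log_ring *r, char *out, size_t cap)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
    size_t avail = head - tail;
    size_t n = avail < cap ? avail : cap;

    for (size_t i = 0; i < n; i++)
        out[i] = r->data[(tail + i) & (RING_SIZE - 1u)];
    atomic_store_explicit(&r->tail, tail + (unsigned)n,
                          memory_order_release);
    return n;
}
```

In this sketch the worker would call `ring_drain()` periodically (polling) and do the real output from a context where that is safe. Polling is a deliberate choice here: it keeps the probe side free of any wake-up call, at the cost of some output latency.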