This is the mail archive of the
mailing list for the systemtap project.
Re: whitelist for safe-mode probes (or just a better blacklist?)
David Smith wrote:
Martin Hunt wrote:
I would like to chime in.
On Wed, 2006-09-20 at 11:14 -0400, Frank Ch. Eigler wrote:
Martin Hunt <email@example.com> writes:
[...] To guarantee a probe will not crash the kernel it is going to
be necessary to generate a whitelist of probe points.
Sure, except that this guarantee is only as good as the method used to
generate the whitelist.
[...] How would this all work? The whitelist and blacklist would be
files distributed with Systemtap. They would be updated
automatically with a test script. [...]
How do you imagine this test script working? Could it generate a list
roughly matching the "in-our-experience-so-far-safe" set in a
reasonable timeframe? (It would not be very helpful if it took months
to run, or resulted in a small list.)
I imagine this would be a list that would be checked into CVS of
functions that have been tested and never caused problems. The only
reason to use a whitelist instead of a blacklist is that we should be
paranoid and not assume, as we do now, that new functions added to the
kernel are safely probeable.
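
To make the difference concrete, here is a minimal sketch of the two policies. It is not how systemtap implements either list; the function names and list contents are hypothetical placeholders.

```python
def allowed_by_blacklist(function, blacklist):
    """Default-allow: anything not known to be bad may be probed."""
    return function not in blacklist


def allowed_by_whitelist(function, whitelist):
    """Default-deny: only functions that have been tested may be probed."""
    return function in whitelist


# Hypothetical list contents, for illustration only.
blacklist = {"known_bad_function"}
whitelist = {"tested_function_a", "tested_function_b"}

for fn in ("tested_function_a", "known_bad_function", "new_untested_function"):
    print(fn,
          "blacklist allows:", allowed_by_blacklist(fn, blacklist),
          "whitelist allows:", allowed_by_whitelist(fn, whitelist))
```

The two policies only disagree on a function that is on neither list: the blacklist lets it through, the whitelist rejects it until someone has tested it.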
Writing a script to do this testing is not difficult, except for the
problems with lockups which require a way to remotely reboot a system.
This requires we assume the existence of special hardware or that the
test system is running on a specific virtualization system. This needs
to be done regardless of what we decide about the need for a whitelist. I
hoped to provoke some discussion about this. We've talked about it, but
has anyone actually written any test scripts to test all the kernel
functions this way?
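
One possible shape for such a script, as a rough sketch only: record which function is about to be probed before probing it, so that after a lockup and a remote reboot the script can blame that function and carry on. The `try_probe` callable, the state file layout and the name `run_sweep` are assumptions made for illustration; nothing here calls systemtap itself, and the remote reboot is assumed to happen outside the script.

```python
import json
import os


def _save(state, path):
    """Write the state to disk and force it out before returning."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def run_sweep(functions, try_probe, state_path):
    """Probe each function once, keeping progress on disk across reboots.

    try_probe is a caller-supplied (hypothetical) callable that returns
    True if probing the function completed without trouble.
    """
    state = {"safe": [], "unsafe": [], "in_flight": None}
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)

    # A function still marked in flight means the previous run never
    # returned from probing it, so treat it as the cause of the lockup.
    if state["in_flight"] is not None:
        state["unsafe"].append(state["in_flight"])
        state["in_flight"] = None
        _save(state, state_path)

    done = set(state["safe"]) | set(state["unsafe"])
    for fn in functions:
        if fn in done:
            continue
        state["in_flight"] = fn
        _save(state, state_path)  # persist before the risky step
        ok = try_probe(fn)
        state["in_flight"] = None
        state["safe" if ok else "unsafe"].append(fn)
        _save(state, state_path)
    return state
```

Run from an init script after every boot, a loop like this would eventually sort every function into the "safe" or "unsafe" list, at the cost of one reboot per function that locks the machine up.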
I can tell you that looking into the problems probing
'kernel.function("*")' on x86 over the last couple of days I've
rebooted my test system (what seems like) countless times. I
certainly agree with you that we'll need special hardware (perhaps x10
could be a simple start) or virtualization to get this going using a
script. I do think that this testing would be extremely useful, even
without a whitelist feature.
I wonder if we really might need various levels of "whitelists" to
satisfy customer concerns. Something like anyone in group A can only
probe syscalls, users in group B can probe syscalls + exported kernel
Let us think of a white list not as a tool to increase systemtap
stability but as a tool to decrease tap script debug time.
If I were a system manager in an environment where my next house payment
depended on system uptime, I would never run any tap script that I
had not fully tested, or that was not supplied by my ldp. Therefore the
white list only helps me in a test environment by speeding up the testing
of scripts to be used later in production. In other words, the white list
helps keep me from falling into pitfalls by using untested tap points. But
it won't eliminate finding new pitfalls during my testing.
But thinking about it now, that is the same thing the black list is
Testing is a good thing, but we should match the effort with the correct
paradigm and work on maintaining just the black list.
IBM Linux Technology Center
Beaverton, Oregon, USA