Systemtap High Level Test Plan
Brad Chen
brad dot chen at intel dot com
May 2005

1. OVERVIEW

Testing is going to be really important in this project.

1.1 Prioritizing Test Activities

Safety is a critical success factor for Systemtap; many people will
criticize or reject the system if it is conspicuously less safe than
DTrace. Quality is a key concern for safety, and that makes testing a
high priority for this project. It must be improbable, ideally
impossible, that a system crash occurs because of a Systemtap bug.
This suggests giving highest priority to tests and bugs related to
potential system corruption.

1.2 Recruiting Collaborators

It would be incredibly helpful if existing kernel and performance
tools test teams would consider incorporating Systemtap tests into
their existing test infrastructure. Who might we work with on this?
What kind of bribes would it require?

Candidates:
- Distribution Vendors (RedHat, ...)
- Big OEMs (IBM, SGI, HP, ...)
- ???

1.3 Setting Realistic Goals

The sad truth is we probably won't be able to do all the testing we'd
like to. How can we prioritize tests effectively so our limited
resources go to doing the most important testing first?

2. HIGH LEVEL TEST PLAN

2.1 Unit Tests

Purpose: Confirm that each component functions correctly in isolation.
Find as many bugs as possible during software development rather than
during end-to-end testing.

2.2 Kernel Tests

Purpose: Make sure the kernel builds, boots, runs the standard kernel
test suite correctly, and is crashproof. This stage should also
include the unit tests and regression tests that require installing
Systemtap and booting the kernel.

2.3 Integration Test

Purpose: Comprehensive test of the correct functioning of Systemtap.

- Installation/config testing: confirm that the system can be
  installed and configured on all platforms.
- Functional test matrix, defined by platforms x scripts x workloads
- Scalability testing, defined by platforms x (script, workload) pairs
- Error handling test matrix, defined by platforms x error cases

Includes portability, scalability, and error case testing.

2.4 Stress Test

Stress test matrix, defined by platforms x scripts x workloads.

2.5 Platforms

All tests are assumed to be executed independently on each platform by
the platform owner.

- Processor Architectures: x86, x86-64, Itanium, PPC
- Operating Systems: RedHat EL4 U2; other Linux releases to follow...
- MP configurations: uniprocessors, dual-core, HT, multiprocessors up
  to 8-way

Responsibilities:
- IA32: RedHat, IBM, Intel to test
- x86-64: RedHat, IBM, Intel to test
- PPC: IBM to test
- Itanium: Intel to test

3. UNIT TESTS

* All major components will have unit testing infrastructure,
  maintained by the component owner.
* Unit tests will be executable as a part of the build process.
  Suggested target: "make unittest"
* Kernel components should provide test jigs to allow user-level
  testing whenever possible.
* Unit testing will include a regression suite. The regression suite
  should cover all fixed bugs for the component.
* Unit tests must cover both functionality for correct scripts and
  error handling.

Each of the following components should create and maintain its own
unit tests and unit testing plan:

3.1 parser
3.2 translator
3.3 runtime
3.4 kprobes
3.5 validator
3.6 tapset infrastructure
3.7 tapsets

4. KERNEL TEST

4.1 Build, boot, and Systemtap unit tests

4.2 Pass on a subset of ltp

Suggestion: ltp-20050405
  http://ltp.sourceforge.net
  http://prdownloads.sourceforge.net/ltp/ltp-full-20050405.tgz?download

4.3 Systemtap small

Pass on the ltp subset with each of the following Systemtap scripts:
- systemcall profiler
- scheduler profiler
- statistical call-graph

5. INTEGRATION TEST

5.1 Installation testing

5.2 Functional test matrix

5.2.1 Test Scripts

- System call profiler
    - systemwide
    - by process
    - other variants
- Statistical call graph profiler
    - time-based
    - event-based
- Scheduler activity tracer
- Lock activity tracer
- Arbitrary simultaneous combinations of the above

5.2.2 Workloads

Subset of ltp.

5.3 Scalability testing

Test MP platforms, largest first, on the following script/workload
pairs:
- TBD...

5.4 Error case testing

A growing list of error cases to check:
- infinite recursion: direct, indirect
- infinite loop
- division by zero
- underflow, overflow
- stack overflow
- excessively large array
- invalid pointers, array bounds errors (hopefully impossible to test)
- memory read/write restrictions
- control flow restrictions
- bogus runtime
- illegal instrumentation (in Systemtap runtime, etc.)
- kernel stack overflow
- large number of threads
- frequent thread affinity changes
- ...

6. STRESS TESTING

Stress testing uses the same platforms and test scripts as the
Integration tests, with some additional test scripts and a different
set of workloads.

6.1 Scenarios for Stress testing

- Increase interrupt frequency until failure, single timer-based probe
- Increase interrupt frequency until failure, all kernel functions
  (plus timer-based probe if necessary)
- Instrument all possible taps and then run the LTP subset.
- Find the worst-case signal-stack kernel code (probably a driver),
  and instrument it with kprobes/systemtap to see what happens.
- max number of probes
- max probe frequency: one probe
- max probe frequency: many probes

======================================================================
NOTES

Test workloads
- Oracle
- PostgreSQL
- MySQL
- kernel stress tests

References:
  http://ltp.sourceforge.net
  http://ltp.sourceforge.net/25_testplan.php
  http://ltp.sourceforge.net/execmatrix.php
  http://prdownloads.sourceforge.net/ltp/ltp-full-20050405.tgz?download
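
Sketch: enumerating the functional test matrix

The functional test matrix in sections 2.3 and 5.2 is defined as
platforms x scripts x workloads. The following is a minimal sketch of
how that matrix might be enumerated and split up by platform owner. It
is not part of any Systemtap tooling: the function names
(functional_matrix, cells_for_platform) are hypothetical, and the list
contents are simply copied from sections 2.5 and 5.2 of this plan, with
"ltp subset" standing in for whatever workload subset is eventually
chosen.

```python
from itertools import product

# Values copied from sections 2.5 and 5.2 of this plan. The variable and
# function names are illustrative assumptions, not existing tooling.
PLATFORMS = ["x86", "x86-64", "Itanium", "PPC"]
SCRIPTS = [
    "system call profiler",
    "statistical call graph profiler",
    "scheduler activity tracer",
    "lock activity tracer",
]
WORKLOADS = ["ltp subset"]  # placeholder until the subset is defined


def functional_matrix(platforms, scripts, workloads):
    """Return every (platform, script, workload) cell of the matrix."""
    return list(product(platforms, scripts, workloads))


def cells_for_platform(matrix, platform):
    """Select the cells that a single platform owner has to run."""
    return [cell for cell in matrix if cell[0] == platform]


if __name__ == "__main__":
    matrix = functional_matrix(PLATFORMS, SCRIPTS, WORKLOADS)
    print(f"{len(matrix)} cells in total")
    for platform, script, workload in cells_for_platform(matrix, "PPC"):
        print(f"{platform}: run '{script}' against '{workload}'")
```

Listing the cells explicitly makes the cost of each added platform,
script, or workload visible, which may help with the prioritization
question raised in section 1.3.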