This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: Updating from 0303 to 0505 has problems
On Wed, May 09, 2007 at 04:41:33PM -0500, Quentin Barnes wrote:
Here's a run that works:
================
Pass 5: starting run.
systemtap starting probe
PASS: systemtap.base/be_order.stp startup
PASS: systemtap.base/be_order.stp load generation
systemtap ending probe
systemtap test success
systemtap test success
PASS: systemtap.base/be_order.stp shutdown and output
Pass 5: run completed in 400usr/1280sys/2862real ms.
metric: systemtap.base/be_order.stp 8240 8680 16931 180 70
250 0 0 0 0 0 0 400 1280 2862
testcase /usr/src/systemtap-20070505/testsuite/systemtap.base/be_order.exp
completed in 22 seconds
================
Here's the same test, but from a run when it fails:
================
Pass 5: starting run.
systemtap starting probe
PASS: systemtap.base/be_order.stp startup
PASS: systemtap.base/be_order.stp load generation
poll warning: Interrupted system call
systemtap ending probe
systemtap test success
systemtap test success
Pass 5: run completed in 440usr/1520sys/3159real ms.^M
FAIL: systemtap.base/be_order.stp shutdown (eof)
testcase /usr/src/systemtap-20070505/testsuite/systemtap.base/be_order.exp
completed in 22 seconds
================
Looking at stap_run.exp, it's looking for "systemtap ending probe\n
systemtap test success". When I run stap directly on the .stp file,
it always outputs the right string. Is this some sort of race condition
where sometimes "expect" is seeing the end-of-file condition on the file
descriptor before expect has a chance to match the pattern? Any ideas
to try out?
I tracked down the problem and came up with a fix/workaround that
resolves most of the "eof" failures. (Other, similar failures in the
testsuite remain unaddressed, but they are not caused by this
particular problem.)
The problem is the "poll warning: Interrupted system call"
diagnostic message from staprun. It lands just ahead of "systemtap
ending probe", so the pattern in stap_run.exp, which is anchored at
the start of the output, never matches and expect runs into eof
instead. Below is a patch that handles this particular message.
Whether it counts as a fix or a workaround, I don't know. There are
two possible fixes: 1) change stap_run.exp to account for the extra
diagnostic, 2) change staprun's reader_thread() not to generate the
diagnostic in the first place. I chose the former for this patch.
Which approach would be the better fix?
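For comparison, option 2 might look roughly like the Python sketch below. It is only an illustration of the idea (retry quietly when a poll-style call is interrupted by a signal) and does not reflect staprun's actual C code. The names poll_retrying_on_eintr, poll_once, and warn are hypothetical, and whether a silent retry is the right behaviour for reader_thread() during shutdown is an open question.

```python
import errno


def poll_retrying_on_eintr(poll_once, warn):
    """Call poll_once() until it completes, staying silent on EINTR.

    poll_once and warn are hypothetical callables supplied by the caller;
    they stand in for the real poll() call and the diagnostic printer.
    """
    while True:
        try:
            return poll_once()
        except OSError as exc:
            if exc.errno == errno.EINTR:
                # Interrupted by a signal: not a real failure, so retry
                # without emitting any diagnostic.
                continue
            # Any other error is still reported and propagated.
            warn("poll error: %s" % exc.strerror)
            raise
```

With this shape, an interrupted call produces no extra line on the output, so the pattern in stap_run.exp would not need to change.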
Index: testsuite/lib/stap_run.exp
===================================================================
--- testsuite/lib/stap_run.exp (revision 155)
+++ testsuite/lib/stap_run.exp (working copy)
@@ -50,7 +50,7 @@ proc stap_run { TEST_NAME {LOAD_GEN_FUNC
send "\003"
# check the output to see if it is sane
- set output "^systemtap ending probe\r\n$OUTPUT_CHECK_STRING$"
+ set output "^(poll warning: \[^\r\]*\r\n)?systemtap ending probe\r\n$OUTPUT_CHECK_STRING$"
expect {
-re $output {
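
To show why the optional group helps, here is a small Python model of the old and new patterns. It is a simplified sketch, not the testsuite code: Python's re module stands in for Tcl's regexp engine, expect's buffering is not modelled, and OUTPUT_CHECK is an assumed value for $OUTPUT_CHECK_STRING based on the log output above.

```python
import re

# Assumption: stands in for $OUTPUT_CHECK_STRING; the value is taken from
# the two "systemtap test success" lines in the logs above.
OUTPUT_CHECK = "systemtap test success\r\nsystemtap test success\r\n"

# Old pattern: anchored directly on "systemtap ending probe".
OLD = re.compile(r"^systemtap ending probe\r\n" + OUTPUT_CHECK + r"\Z")

# New pattern: tolerates one optional "poll warning: ..." line first.
NEW = re.compile(
    r"^(poll warning: [^\r]*\r\n)?systemtap ending probe\r\n"
    + OUTPUT_CHECK
    + r"\Z"
)

# Output of a clean run and of a run where staprun printed the warning.
clean_run = "systemtap ending probe\r\n" + OUTPUT_CHECK
warned_run = "poll warning: Interrupted system call\r\n" + clean_run


def matches(pattern, text):
    """Return True if the compiled pattern matches the given output."""
    return pattern.search(text) is not None


for name, pattern in (("old", OLD), ("new", NEW)):
    print(name, "clean:", matches(pattern, clean_run),
          "warned:", matches(pattern, warned_run))
```

In this model the old pattern accepts only the clean output, while the new one accepts both, which is consistent with the intermittent "shutdown (eof)" failures going away.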
Quentin