This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: testsuite and hardcoded timeouts


On Tue, May 15, 2007 at 06:47:34PM -0400, Frank Ch. Eigler wrote:
Quentin Barnes <qbarnes@urbana.css.mot.com> writes:

[...]
Ah, maybe there is some middle ground here.  Instead of putting the
effort into figuring out some portable method for dynamic timeouts,
just change the behavior for a timeout to be user-settable [...]

It can be even easier than that. dejagnu's "timeout" tcl variable is exactly the default timeout duration in seconds. The .exp files under testsuite/config or even testsuite/lib could set this global variable based on the "ishost" predicate - leave it for i686, double it for s390x, dedicule (!) it for arm. Then we just need to police the test cases to avoid messing with this value.

Ah, ha! I _finally_ cracked why I was having so many erratic timeout problems with the test suite on ARM! It is because of existing use of "set timeout" in the test suite scripts.

I kept having timeout failures, so I kept bumping up the numbers,
and rerunning that subsection.  The problems would often go away.
Then they'd come back. Then I'd bump the timeout values up higher,
and they'd go away again.  However, when I looked at the timing
information from the tests when they'd finally pass, the numbers I
had bumped them up to made no sense at all.

What was going on is that tests using the TCL procedures "stap_run"
and "stap_run2" would get sporadic timeouts based on what the global
variable "timeout" happened to be set to last.  It is set in the
stap_run2.exp file, but the variable is a global setting.  The next
time it is set by any script anywhere, that becomes the new global
setting that stap_run and stap_run2 would just happen to use, not
the value set in the stap_run2.exp file.

Once I figured out this was the source of the erratic timeout
failures, I was able to tweak timeout values back to something much
more sane and still get tests to pass on ARM.

What I would recommend is that every expect statement that uses a
"timeout" clause inside a TCL procedure _always_ specifies its own
"-timeout" parameter never relying on the global setting.  Would
this be a reasonable change?

Expect statements at the global level outside of procedures can
continue to use the global timeout setting, but even then I'd
recommend against it.

Now that the timeouts for ARM are coming back down to reasonable
values, I think other platforms could live with them too without too
much trouble.  I'm still looking into this though.

I've included an experimental patch that makes the changes I've
recommended with adding "-timeout" parameters (and fixing the $pass4
problem mentioned below).  I have not adjusted any of these values
for ARM though, but kept the timeout's original value for comparison
to the current code base.  Sometimes there was no clear indication
of what the timeout should be, so I put one in that is probably
reasonable.

I also found and fixed what appeared to be an intermittant problem
where tests run individually would occasionally blow up with the
error:
=============
ERROR: can't read "pass4": no such variable
   while executing
"verbose -log "metric:\t$TEST_NAME $pass1$pass2$pass3$pass4$pass5""
=============

I figured out this would happen when the first test run happened to
be in the cache.  Any other time a test with a cached module would
be encountered, it would just happen to work since $pass4 was still
set to some old value.

This bug is because stap_run and stap_run2 rule for detecting cached
references was broken.  When "Pass 3: using cached ..." was matched,
the rule would swallow the "Pass 4: using cached ..." output as well
leaving the variable unset.  This was also a problem in overload.exp.

Note that these changes are very fresh and I couldn't test them
directly on ARM since they are untweaked.  I wouldn't recommend
integrating them into the mainline without testing them on a couple
of different platforms first.


Comments?


- FChE

Quentin



diff -uprN systemtap-20070512-ref/testsuite/lib/stap_run2.exp systemtap-20070512/testsuite/lib/stap_run2.exp --- systemtap-20070512-ref/testsuite/lib/stap_run2.exp 2007-05-07 13:31:59.000000000 -0500 +++ systemtap-20070512/testsuite/lib/stap_run2.exp 2007-05-17 18:45:55.000000000 -0500 @@ -8,8 +8,6 @@ # # global result_string must be set to the expected output

-set timeout 20
-
proc stap_run2 { TEST_NAME args } {
  # zap the srcdir prefix
  set test_file_name $TEST_NAME
@@ -28,9 +26,10 @@ proc stap_run2 { TEST_NAME args } {
  eval spawn $cmd

  expect {
+    -timeout 20
    -re {^Pass\ ([1234]):[^\r]*\ in\ ([0-9]+)usr/([0-9]+)sys/([0-9]+)real\ ms.\r\n}
    {set pass$expect_out(1,string) "\t$expect_out(2,string)\t$expect_out(3,string)\t$expect_out(4,string)"; exp_continue}
-    -re {^Pass\ ([34]): using cached .+\r\n}
+    -re {^Pass\ ([34]): using cached [^\r]+\r\n}
    {set pass$expect_out(1,string) "\t0\t0\t0"; exp_continue}
    -re {^Pass 5: starting run.\r\n} {exp_continue}
    -ex $output {
diff -uprN systemtap-20070512-ref/testsuite/lib/stap_run.exp systemtap-20070512/testsuite/lib/stap_run.exp
--- systemtap-20070512-ref/testsuite/lib/stap_run.exp	2007-05-07 13:31:59.000000000 -0500
+++ systemtap-20070512/testsuite/lib/stap_run.exp	2007-05-17 18:52:10.000000000 -0500
@@ -31,12 +31,13 @@ proc stap_run { TEST_NAME {LOAD_GEN_FUNC
    }
    eval spawn $cmd
    expect {
+	-timeout 30
	-re {^Pass\ ([1234]):[^\r]*\ in\ ([0-9]+)usr/([0-9]+)sys/([0-9]+)real\ ms.\r\n}
	{set pass$expect_out(1,string) "\t$expect_out(2,string)\t$expect_out(3,string)\t$expect_out(4,string)"; exp_continue}
-	-re {^Pass\ ([34]): using cached .+\r\n}
+	-re {^Pass\ ([34]): using cached [^\r]+\r\n}
	{set pass$expect_out(1,string) "\t0\t0\t0"; exp_continue}
	-re {^Pass 5: starting run.\r\n} {exp_continue}
-	-timeout 30 -re "^systemtap starting probe\r\n" {
+	-re "^systemtap starting probe\r\n" {
	    pass "$TEST_NAME startup"
	    if {$LOAD_GEN_FUNCTION != ""} then {
		#run the interesting test here
@@ -53,6 +54,7 @@ proc stap_run { TEST_NAME {LOAD_GEN_FUNC
	    set output "^systemtap ending probe\r\n$OUTPUT_CHECK_STRING$"

	    expect {
+		-timeout 20
		-re  $output {
		    pass "$TEST_NAME shutdown and output"
		    expect {
diff -uprN systemtap-20070512-ref/testsuite/lib/systemtap.exp systemtap-20070512/testsuite/lib/systemtap.exp
--- systemtap-20070512-ref/testsuite/lib/systemtap.exp	2007-05-07 13:31:59.000000000 -0500
+++ systemtap-20070512/testsuite/lib/systemtap.exp	2007-05-17 18:48:21.000000000 -0500
@@ -82,6 +82,7 @@ proc stap_run_batch {args} {
    }

expect { + -timeout -1
-re {[^\r]*\r} { verbose -log $expect_out(0,string); exp_continue } eof { }
timeout { exp_continue } diff -uprN systemtap-20070512-ref/testsuite/systemtap.base/alternatives.exp systemtap-20070512/testsuite/systemtap.base/alternatives.exp
--- systemtap-20070512-ref/testsuite/systemtap.base/alternatives.exp 2007-05-07 13:31:59.000000000 -0500
+++ systemtap-20070512/testsuite/systemtap.base/alternatives.exp 2007-05-17 18:57:04.000000000 -0500
@@ -26,6 +26,7 @@ proc stap_run_alternatives {args} {
verbose -log "starting $args"
eval spawn $args
expect { + -timeout 20
-re {semantic error: .+ \(alternatives: [a-zA-Z_]}
{ set alternatives_found 1 }
-re {[^\r]*\r} { verbose -log $expect_out(0,string); exp_continue } diff -uprN systemtap-20070512-ref/testsuite/systemtap.base/overload.exp systemtap-20070512/testsuite/systemtap.base/overload.exp
--- systemtap-20070512-ref/testsuite/systemtap.base/overload.exp 2007-05-07 13:31:59.000000000 -0500
+++ systemtap-20070512/testsuite/systemtap.base/overload.exp 2007-05-17 19:18:30.000000000 -0500
@@ -25,8 +25,9 @@ proc stap_run_overload { TEST_NAME EXPEC
set cmd [concat {stap -v} $args]
eval spawn $cmd
expect {
- -re {^Pass\ [1234]: .+real\ ms.\r\n} {exp_continue}
- -re {^Pass\ ([34]): using cached .+\r\n} {exp_continue}
+ -timeout 30
+ -re {^Pass\ [1234]: [^\r]+real\ ms.\r\n} {exp_continue}
+ -re {^Pass\ ([34]): using cached [^\r]+\r\n} {exp_continue}
-re {^Pass 5: starting run.\r\n} {exp_continue}
-re {ERROR: probe overhead exceeded threshold\r\n} {
if {$EXPECT_OVERLOAD} {
@@ -35,7 +36,7 @@ proc stap_run_overload { TEST_NAME EXPEC
fail "$TEST_NAME unexpected overload"
}
}
- -timeout 30 -re "^systemtap starting probe\r\n" {
+ -re "^systemtap starting probe\r\n" {
send "\003"


expect {
diff -uprN systemtap-20070512-ref/testsuite/systemtap.maps/foreach_fail.exp systemtap-20070512/testsuite/systemtap.maps/foreach_fail.exp
--- systemtap-20070512-ref/testsuite/systemtap.maps/foreach_fail.exp 2006-10-17 17:10:58.000000000 -0500
+++ systemtap-20070512/testsuite/systemtap.maps/foreach_fail.exp 2007-05-17 18:59:10.000000000 -0500
@@ -7,6 +7,7 @@ if {![installtest_p]} { untested $test;


spawn stap $srcdir/$subdir/$test.stp
expect {
+ -timeout 20
timeout { fail "$test timed out" }
eof { diff -uprN systemtap-20070512-ref/testsuite/systemtap.stress/conversions.exp systemtap-20070512/testsuite/systemtap.stress/conversions.exp
--- systemtap-20070512-ref/testsuite/systemtap.stress/conversions.exp 2007-05-07 13:31:59.000000000 -0500
+++ systemtap-20070512/testsuite/systemtap.stress/conversions.exp 2007-05-17 19:03:17.000000000 -0500
@@ -9,6 +9,7 @@ foreach value {0 0xffffffff 0xffffffffff
set errs 0
verbose -log "exp $test $errs"
expect {
+ -timeout 20
-re {ERROR[^\r\n]*\r\n} { incr errs; exp_continue }
-re {WARNING[^\r\n]*\r\n} { incr errs; exp_continue }
eof { }
diff -uprN systemtap-20070512-ref/testsuite/systemtap.stress/whitelist.exp systemtap-20070512/testsuite/systemtap.stress/whitelist.exp
--- systemtap-20070512-ref/testsuite/systemtap.stress/whitelist.exp 2007-05-07 13:31:59.000000000 -0500
+++ systemtap-20070512/testsuite/systemtap.stress/whitelist.exp 2007-05-17 19:32:15.000000000 -0500
@@ -301,8 +301,8 @@ proc whitelist_run { TEST_NAME {LOAD_GEN
catch {eval spawn $cmd}
set stap_id $spawn_id
set failed 1
- set timeout 600
expect {
+ -timeout 1800
-i $stap_id -re {^Pass\ ([1234]):\ [^\r]*\r\n} {
set error_msg "pass$expect_out(1,string)";
exp_continue
@@ -313,7 +313,7 @@ proc whitelist_run { TEST_NAME {LOAD_GEN
send -i $stap_id "\003"
exp_continue
}
- -timeout 1800 -re {Pass\ 5:\ run\ completed} {
+ -re {Pass\ 5:\ run\ completed} {
set failed 0
}
-re {parse\ error|semantic\ error} { set detail "$expect_out(0,string)" }
@@ -327,7 +327,6 @@ proc whitelist_run { TEST_NAME {LOAD_GEN


proc runbenchs {} {
    global benchs
-    set timeout 900
    set runningcount 0

    foreach bench $benchs {
@@ -346,6 +345,7 @@ proc runbenchs {} {

    while {$runningcount > 0} {
    	expect {
+		-timeout 900
    		-i $idlist -re {LTP\ Version:\ LTP-([0-9])+\r\n$} {
    			set from $expect_out(spawn_id)
    			lappend benchres($from) $expect_out(buffer)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]