


Re: testsuite and hardcoded timeouts


William Cohen wrote:

David Wilder wrote:


I ran into this issue on s390. When a timeout occurs, the test could simply produce a warning message and then restart the timer, allowing the timeout to be restarted, say, 4 or 5 times before finally reporting a failure. If something really breaks, the test will still report a failure, while on a slower system it would still pass. And if a system/test that normally passes with one or two restarts of the timer suddenly starts taking 3 or 4, we know that investigation is needed.




You might luck out, with caching helping the later attempts skip some phases of the translator and avoid that cost on the later runs. However, restarting 4 or 5 times is probably not going to help much if the time required to generate the module is far larger than the timeout.


I was not thinking that expiration of the timeout would restart generating the module, just that it would warn the user that the test is taking longer than expected. The purpose of the timer is simply to print "hey, I am taking too long". The real timeout that causes the test to fail happens only after 4 or 5 warning messages have been printed. This way the user gets a heads up that something may be wrong, rather than waiting out a single timeout long enough for even the slowest system to normally complete the test.
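
Roughly, in the expect-based harness that could look like the sketch below. This is just an illustration, not existing testsuite code: the 60-second interval, the limit of five expirations, the success pattern, and $test are made-up placeholders, and pass/fail/verbose are assumed to be the usual DejaGnu procs.

    set timeout 60                ;# warning interval in seconds (placeholder)
    set warnings 0
    expect {
        -re {systemtap test success} {
            # placeholder success pattern
            pass "$test"
        }
        timeout {
            incr warnings
            if {$warnings < 5} {
                verbose -log "$test: still running after [expr {$warnings * $timeout}] seconds"
                # exp_continue re-arms the timeout timer by default
                exp_continue
            } else {
                fail "$test (no result after [expr {$warnings * $timeout}] seconds)"
            }
        }
    }

The one piece of real behavior this leans on is that expect's exp_continue re-arms the timeout timer by default, so each warning buys another full interval.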


The timeout is there to make sure that forward progress is made on the testing. We would prefer to have the test fail in a reasonable amount of time than to have it hang for an unreasonable amount of time and produce no results at all. The translator internals are pretty much a black box to the testing harness, so the timer is all the harness has to judge whether a test is still making forward progress. Too bad there couldn't be an equivalent of a watchdog for the testing harness: as long as the test is making forward progress, leave it be.
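
If the translator emitted periodic output that the harness could watch, something close to a watchdog could be approximated in expect by treating each complete line of output as forward progress and re-arming the timer, so only genuine silence trips the failure. Again, only a hypothetical sketch with placeholder patterns and values:

    set timeout 120               ;# fail only after 120 silent seconds (placeholder)
    expect {
        -re {systemtap ending probe} {
            # placeholder success pattern, checked before the catch-all below
            pass "$test"
        }
        -re {[^\n]*\n} {
            # any complete line of output counts as forward progress;
            # exp_continue re-arms the timer and keeps waiting
            exp_continue
        }
        timeout { fail "$test (no output for ${timeout} seconds)" }
        eof { fail "$test (unexpected end of output)" }
    }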


-Will



--
David Wilder
IBM Linux Technology Center
Beaverton, Oregon, USA dwilder@us.ibm.com
(503)578-3789


