Bug 10791 - parallelize systemtap testsuite
Summary: parallelize systemtap testsuite
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: testsuite
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Martin Cermak
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-10-16 20:18 UTC by Frank Ch. Eigler
Modified: 2016-04-13 13:35 UTC
CC: 2 users

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
list of slow tests (5+ mins) (12.90 KB, application/gzip)
2016-02-29 09:43 UTC, Martin Cermak
list of files written by systemtap testcases (38.11 KB, application/gzip)
2016-02-29 10:22 UTC, Martin Cermak
working version of a patch (7.36 KB, patch)
2016-03-01 21:25 UTC, Martin Cermak
list of files written by systemtap testcases (945.35 KB, application/x-xz)
2016-03-02 14:41 UTC, Martin Cermak

Description Frank Ch. Eigler 2009-10-16 20:18:11 UTC
See gdb/testsuite/Makefile.in for the hacky but effective way
of running testsuite/{tool}.* tests in parallel, then catting the
individual .log/.sum files together.
Comment 1 Martin Cermak 2016-02-25 10:03:39 UTC
My recent stats for systemtap's make installcheck are as follows:
------------------------------------------------------------------------------
rhel6    i386    MAX: 11:40:36         MEAN:  3:45:04         MEDIAN:  3:21:01
rhel6    x86_64  MAX:  6:44:16         MEAN:  3:19:03         MEDIAN:  3:00:46
rhel6    ppc64   MAX: 16:20:03         MEAN:  5:42:08         MEDIAN:  4:02:35
rhel6    s390x   MAX:  3:37:31         MEAN:  2:10:49         MEDIAN:  2:09:37
rhel7    x86_64  MAX:  9:37:39         MEAN:  4:48:23         MEDIAN:  4:15:52
rhel7    ppc64   MAX: 25:05:10         MEAN:  6:10:03         MEDIAN:  4:23:52
rhel7    ppc64le MAX:  4:53:26         MEAN:  3:23:12         MEDIAN:  3:36:14
rhel7    s390x   MAX:  4:09:53         MEAN:  2:14:52         MEDIAN:  2:24:33
rhel7    aarch64 MAX:  8:31:14         MEAN:  5:39:20         MEDIAN:  6:17:33

Shortening the run time (and not breaking the test results due to possible races) would be beneficial.  First I tried something as simple as this:

> 
> =======
> # cat paralleltest.sh 
> #!/bin/bash
>
> for dir in `find . -type d -name 'systemtap.*'`
> do
>     make installcheck RUNTESTFLAGS="--outdir=$dir $dir/*exp >& $dir/output.log" &
> done
>
> wait
>
> ./dg-extract-results.sh systemtap.*/systemtap.sum > systemtap.sum
> ./dg-extract-results.sh -L systemtap.*/systemtap.log > systemtap.log
> #
> =======
>
> Where dg-extract-results.sh is taken from https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gdb/testsuite/dg-extract-results.sh

That works: the GDB testsuite's test-log merger can seemingly be used as-is for the systemtap testsuite.  But the GDB testsuite's Makefile trick is nice, too!  The (slightly outdated) rhel7 gdb testsuite seems to group tests by directory (similarly to the above), whereas the current upstream version tends to run each testcase separately (with some exceptions for long-running tests).  Related thoughts are in commit 81208388898d0d561dc0c167b8bb7b2d77e40381 by Tom Tromey.  Using the GDB upstream approach in the systemtap testsuite's Makefile (under the installcheck target):

=======
blah: site.exp                                                                  
        -rm -rf cache outputs temp                                              
        $(MAKE) -k do-check-parallel; \                                         
        result=$$?; \                                                           
        $(SHELL) $(srcdir)/dg-extract-results.sh \                              
          `find outputs -name systemtap.sum -print` > systemtap.sum; \          
        $(SHELL) $(srcdir)/dg-extract-results.sh -L \                           
          `find outputs -name systemtap.log -print` > systemtap.log; \          
        sed -n '/=== systemtap Summary ===/,$$ p' systemtap.sum; \              
        exit $$result                                                           
                                                                                
all_tests := $(shell cd $(srcdir) && find systemtap.* -name '*.exp' -print)     
TEST_TARGETS := $(addprefix check/,$(all_tests))                                
                                                                                
do-check-parallel: $(TEST_TARGETS)                                              
        @:                                                                      
                                                                                
check/%.exp:                                                                    
        -mkdir -p outputs/$*                                                    
        -$(MAKE) $(AM_MAKEFLAGS) check-DEJAGNU RUNTESTFLAGS="--outdir=outputs/$* $*.exp $(RUNTESTFLAGS) --tool_opts \'install $(TOOL_OPTS)\'"
=======

Using a random (probably not representative) subset of testcases, I see:

make installcheck: 1m30                                                         
time make -j 1 blah: 2m32                                                          
time make -j 2 blah: 1m22                                                          
time make -j 3 blah: 1m2                                                           
time make -j 4 blah: 1m1                                                           
time make -j 5 blah: 1m3                                                           
time make -j 6 blah: 1m5                                                           
time make -j 7 blah: 0m44, 1m9, 1m10                                               
time make -j 8 blah: 1m12                                                          

There is a significant overhead in lib/systemtap.exp:
echo "pass empty" > systemtap.printf/empty.exp                                     
time make installcheck RUNTESTFLAGS=empty.exp                                      
real    0m3.630s                                                                   
                                                                                   
After stripping most of its content out, I'm (still) getting:
time make installcheck RUNTESTFLAGS=empty.exp
real    0m0.923s

From the GDB testsuite/README:

Parallel testing                                                                
****************                                                                
                                                                                
If not testing with a remote host (in DejaGnu's sense), you can run             
the GDB test suite in a fully parallel mode.  In this mode, each .exp           
file runs separately and maybe simultaneously.  The test suite ensures          
that all the temporary files created by the test suite do not clash,            
by putting them into separate directories.  This mode is primarily              
intended for use by the Makefile.                                               
                                                                                
For GNU make, the Makefile tries to run the tests in parallel mode if           
any -j option is given.  For a non-GNU make, tests are not                      
parallelized.                                                                   
                                                                                
If RUNTESTFLAGS is not empty, then by default the tests are                     
serialized.  This can be overridden by either using the                         
`check-parallel' target in the Makefile, or by setting FORCE_PARALLEL           
to any non-empty value:                                                         
                                                                                
        make check-parallel RUNTESTFLAGS="--target_board=native-gdbserver"      
        make check RUNTESTFLAGS="--target_board=native-gdbserver" FORCE_PARALLEL=1
                                                                                
If you want to use runtest directly instead of using the Makefile, see          
the description of GDB_PARALLEL below.
Comment 2 David Smith 2016-02-25 16:24:06 UTC
One of the tricky parts here is noted in the gdb documentation:

====
The test suite ensures that all the temporary files created by the test suite do not clash, by putting them into separate directories.
====

I'd guess that about half of our tests create a directory for temporary files, and about half don't. Now, in most of the cases where temporary files are left in the testsuite directory, the names probably don't clash, but we can't really be sure.

The safest thing here would be to make sure that all temporary files are put in their own directory (most likely created with 'mktemp -d').
Comment 3 Frank Ch. Eigler 2016-02-25 16:26:03 UTC
> I'd guess that about half of our tests create a directory for temporary
> files, and about half don't.

In many cases, our tests create temporary files -unnecessarily-.  These
could/should be materialized into the checked-in testsuite source directory
instead.
Comment 4 Martin Cermak 2016-02-29 09:43:30 UTC
Created attachment 9050 [details]
list of slow tests (5+ mins)

As the GDB testsuite's Makefile.in suggests, slow tests should get run first. Grepping through my recent test logs, I've collected a list of tests running 5 mins or longer in some of my test environments. The final list of "slow tests" might look like this:

=======
$ zcat slowtests.txt.gz | awk '{tests[$2] += $1} END { for (test in tests) { print int(tests[test]/length(tests[test]))" "test } }' | sort -nr | awk '{if ($1>500) { print $1" "$2 } }'
90140 ./systemtap.examples/check.exp
63051 ./systemtap.pass1-4/buildok.exp
39786 ./systemtap.base/listing_mode.exp
36766 ./systemtap.syscall/nd_syscall.exp
32244 ./systemtap.unprivileged/unprivileged_myproc.exp
25596 ./systemtap.syscall/syscall.exp
25330 ./systemtap.base/sdt_misc.exp
25097 ./systemtap.pass1-4/parseok.exp
22534 ./systemtap.base/listing_mode_sanity.exp
20349 ./systemtap.pass1-4/parse-semok.exp
14486 ./systemtap.pass1-4/semok.exp
13074 ./systemtap.unprivileged/unprivileged_probes.exp
10981 ./systemtap.pass1-4/buildok-interactive.exp
8739 ./systemtap.onthefly/uprobes_onthefly.exp
7414 ./systemtap.server/server_concurrency.exp
6337 ./systemtap.onthefly/kprobes_onthefly.exp
5653 ./systemtap.base/sdt.exp
3810 ./systemtap.onthefly/hrtimer_onthefly.exp
3649 ./systemtap.onthefly/tracepoint_onthefly.exp
2663 ./systemtap.exelib/exelib.exp
2438 ./systemtap.pass1-4/semko.exp
2365 ./systemtap.pass1-4/parseko.exp
2010 ./systemtap.base/rlimit.exp
1850 ./systemtap.base/tracepoints.exp
1600 ./systemtap.base/systemtap-service.exp
1390 ./systemtap.maps/map_hash.exp
1051 ./systemtap.base/stmt_rel.exp
675 ./systemtap.base/library.exp
505 ./systemtap.server/client_args.exp
$
=======
Comment 5 Martin Cermak 2016-02-29 10:22:11 UTC
Created attachment 9051 [details]
list of files written by systemtap testcases

To get some idea about possible temp file clashes among the tests, I used the following command on a system running mostly just the testsuite:

=======
stap -ve 'global lasttest probe syscall.open { if(isinstr(filename, ".exp")) lasttest=filename  if(filename =~ "/usr/share/systemtap/testsuite/" && filename !~ "/usr/share/systemtap/testsuite/.systemtap" && _sys_open_flag_str(__int32($flags)) =~ "W") printdln(", ", filename, _sys_open_flag_str(__int32($flags)), lasttest ) }'
=======

It's not absolutely flawless, I know: it doesn't take into account files outside of the testsuite directory (like /etc stuff and so on), and it also ignores systemtap's cache (but this should be okay as long as the cache mechanism isn't buggy). Note that the standard testsuite cache location is /usr/share/systemtap/testsuite/.systemtap, but some tests appear to use /usr/share/systemtap/testsuite/.cache_test-root.

Following is the list of potentially problematic files:

=======
$ zcat writes.gz | awk '{print $1" "$3}' | sort -u | awk '{ files[$1]++ } END { for (file in files) if (files[file]>1) print file }'
"/usr/share/systemtap/testsuite/.cache_test-root/cache/5e/stapconf_5e8d30c12177c8f48b4b1538bdfc5fd5_751_hash.log",
"/usr/share/systemtap/testsuite/.cache_test-root/cache/cache_clean_interval_s",
"/usr/share/systemtap/testsuite/.cache_test-root/cache/cache_mb_limit",
"/usr/share/systemtap/testsuite/server.log",
"/usr/share/systemtap/testsuite/test",
"/usr/share/systemtap/testsuite/.cache_test-root/cache/34/stap_348eb98068ee7b8f3e54fe01d7dc4823_812_hash.log",
"/usr/share/systemtap/testsuite/test.stp",
"/usr/share/systemtap/testsuite/test.c",
$
=======

Here is the list of potentially problematic testcases (corresponding to the above file list):

=======
"./lib/compile_flags.exp"
"./systemtap.base/cache_clean.exp"
"./systemtap.base/cache.exp"
"./systemtap.base/hash_add_buildid.exp"
"./systemtap.base/pr18649.exp"
"./systemtap.base/uprobe_nd_params.exp"
"./systemtap.pass1-4/buildok-interactive.exp"
"./systemtap.server/client.exp"
"./systemtap.server/server_args.exp"
"./systemtap.server/server_concurrency.exp"
"./systemtap.server/server.exp"
"./systemtap.server/server_locale.exp"
=======

So either this should get fixed, or each test should get run in its own separate "mktemp -d" directory (not sure yet how to do this effectively and elegantly). Or both.
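For illustration, one untested way to do the per-test "mktemp -d" without touching every testcase could sit at the Makefile level. The rule shape follows the check/%.exp rule from comment #1; SYSTEMTAP_TESTDIR is a made-up Tcl variable that the testcases would have to honor, so this is only a sketch of the idea:

=======
# Untested sketch: give every .exp its own scratch directory.
check/%.exp:
        -mkdir -p outputs/$*
        scratch=`mktemp -d -t stap-$(*F)-XXXXXX`; \
        $(MAKE) $(AM_MAKEFLAGS) check-DEJAGNU \
          RUNTESTFLAGS="--outdir=outputs/$* $*.exp SYSTEMTAP_TESTDIR=$$scratch $(RUNTESTFLAGS)"; \
        rm -rf $$scratch
=======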
Comment 6 David Smith 2016-02-29 16:40:58 UTC
(In reply to Martin Cermak from comment #4)
> Created attachment 9050 [details]
> list of slow tests (5+ mins)
> 
> As the GDB testsuite's Makefile.in suggests, slow tests should get run
> first. Grepping through my recent test logs, I've collected a list of tests
> running 5 mins or longer in some of my test environments. The final list of
> "slow tests" might look like this:

In systemtap's case, I believe most (if not all) of the tests in systemtap.server must be run sequentially. Each test starts up a new server and expects only one local server to be running.
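(Just an untested idea, building on the variables from the comment #1 Makefile fragment: the systemtap.server/*.exp files could be grouped into a single runtest invocation so that they stay sequential, while the remaining .exp files keep their one-target-per-file parallel rules.)

====
# Untested sketch: keep the server tests in one sequential runtest run.
server_tests := $(filter systemtap.server/%,$(all_tests))
parallel_tests := $(filter-out systemtap.server/%,$(all_tests))
TEST_TARGETS := check/server-group $(addprefix check/,$(parallel_tests))

check/server-group:
        -mkdir -p outputs/systemtap.server
        -$(MAKE) $(AM_MAKEFLAGS) check-DEJAGNU \
          RUNTESTFLAGS="--outdir=outputs/systemtap.server $(server_tests) $(RUNTESTFLAGS)"
====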
Comment 7 David Smith 2016-03-01 20:42:41 UTC
(In reply to Martin Cermak from comment #5)

... stuff deleted ...

> Here is the list of potentially problematic testcases (respective to the
> above file list):
> 
> =======
> "./lib/compile_flags.exp"
> "./systemtap.base/cache_clean.exp"
> "./systemtap.base/cache.exp"
> "./systemtap.base/hash_add_buildid.exp"
> "./systemtap.base/pr18649.exp"
> "./systemtap.base/uprobe_nd_params.exp"
> "./systemtap.pass1-4/buildok-interactive.exp"
> "./systemtap.server/client.exp"
> "./systemtap.server/server_args.exp"
> "./systemtap.server/server_concurrency.exp"
> "./systemtap.server/server.exp"
> "./systemtap.server/server_locale.exp"
> =======
> 
> So this should either get fixed, or each test should get run in some
> separate "mktemp -d" (not sure about how to effectively+elegantly do this
> yet). Or both.

Hmm, that list is missing several culprits. After a test run, I see the following left behind in the testsuite directory:

====
&1
bad-code
cast-user.exe
debugtypes.exe
dtrace-1.od
dtrace-2.od
dtrace-test.h
dtrace-test.o
fib
hash_add_buildid
libsdt_V1_uprobe_loop.so
libsdt_V1_uprobe.so
libsdt_V2_uprobe_loop.so
libsdt_V2_uprobe.so
libsdt_V3_uprobe_loop.so
libsdt_V3_uprobe.so
main_quiesce_exe
multisym
multisym_baz.o
multisym_main.o
partial-class-type
partial-class-type-heap.o
partial-class-type-main.o
sdt_misc_.h
sdt_misc_.o
sdt_misc_V1_uprobe_loop_shared.x
sdt_misc_V1_uprobe_loop.x
sdt_misc_V1_uprobe_shared.x
sdt_misc_V1_uprobe.x
sdt_misc_V2_uprobe_loop_shared.x
sdt_misc_V2_uprobe_loop.x
sdt_misc_V2_uprobe_shared.x
sdt_misc_V2_uprobe.x
sdt_misc_V3_uprobe_loop_shared.x
sdt_misc_V3_uprobe_loop.x
sdt_misc_V3_uprobe_shared.x
sdt_misc_V3_uprobe.x
sdt_types.x
,semclean
,semout
server.log
site.exp
stap-list-servers.out
stmt_rel_user.x
stpd_cpu0
stpd_cpu1
stpd_cpu2
stpd_cpu3
stpd_cpu4
stpd_cpu5
stpd_cpu6
stpd_cpu7
task_fd_lookup
uprobe_uaddr
uprobe_uaddr_mark
utf_user_t.so
vma_vdso-m32.exe
vma_vdso-m64.exe
====

Your list doesn't include any of the sdt tests, which produce the *sdt* files. We'll have to track down where some of those other files come from.

I'll bet your stap script is finding opens with a full pathname, but missing them with a relative pathname or just a bare file name.
Comment 8 Martin Cermak 2016-03-01 21:25:34 UTC
Created attachment 9056 [details]
working version of a patch

(In reply to David Smith from comment #7)
> Hmm, that list is missing several culprits. 
Right. Will look into it.

Note that I'm currently testing the attached working version of a patch.
Comment 9 Martin Cermak 2016-03-02 14:41:21 UTC
Created attachment 9058 [details]
list of files written by systemtap testcases

> Note that (In reply to David Smith from comment #7)
> > Hmm, that list is missing several culprits. 
> Right. Will look into it.

Taking inspiration from PR17920, I updated the script as follows:

=======
# cat detect.stp 
global __lasttest

probe syscall.open
{
        #if (isinstr(filename, ".exp")) __lasttest[tid()]=filename
        if (isinstr(filename, ".exp")) __lasttest=filename
}

probe syscall.open.return
{
        #if (pid() != target()) next
        if ($return > 0)
        {
                try
                {
                        file = task_fd_lookup(task_current(), $return)
                        __filename=fullpath_struct_path(&@cast(file, "file")->f_path)
                        __flags_str=_sys_open_flag_str(__int32($flags))
                        #if(__filename =~ "/root/systemtap/testsuite/" && __flags_str =~ "W")
                        if(__flags_str =~ "W")
                        {
                                printdln(", ", __filename, __flags_str, __lasttest)
                        }
                }
                catch
                {
                        printf("*** %d: fail reading file\n", $return)
                }
        }
}
=======

After running it alongside the testsuite (stap -g --suppress-handler-errors --suppress-time-limits -o /tmp/writes detect.stp -c 'make installcheck') I got the attached unfiltered log. It looks sane with respect to comment #7. Analysing it...
Comment 10 David Smith 2016-03-02 17:29:34 UTC
Comment on attachment 9056 [details]
working version of a patch

I see you moved the "environment-sanity" tests to environment-sanity.exp in your patch. I don't see where that test is made to run first, nor where a failure in it stops all other tests from running; but perhaps I missed it.

The problem with your testsuite case changes (to pr18649.exp and uprobe_nd_params.exp) in that patch is that you just papered over the problem with writing files to the testsuite directory. You made the filenames the testcases write more unique, but that doesn't prevent another test from doing "rm *.c" in the testsuite directory and causing those 2 tests to fail.

A better fix would be something like the following (which is untested):

====
# Create a temporary directory.
if {[catch {exec mktemp -d -t staptestXXXXXX} tmpdir]} {
    verbose -log "Failed to create temporary directory: $tmpdir"
    return
}
set test_srcpath "${tmpdir}/${test}.c"
set test_exepath "${tmpdir}/${test}"
set test_stppath "${tmpdir}/${test}.stp"
====

At the test end we'd want to remove the temporary directory instead of the individual files.

Alternatively, in the case of those 2 tests, we could just go ahead and create the .c and .stp files in the systemtap.base directory. You'd still have the problem of the test creating test executables in the testsuite directory. The solution there would be similar to the above: create a temporary directory to hold the test executable.

In a perfect world the tests would never write to the testsuite directory (except for saving the test output). That way we'd ensure the tests could never interfere with each other (at least in that manner).
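(For what it's worth, here is an untested sketch of what a shared helper for that could look like. The proc names are made up; nothing like this exists in lib/systemtap.exp yet. A test would call the first proc at its start, build everything under the returned directory, and call the second proc at its end.)

====
# Untested sketch; hypothetical proc names, not part of lib/systemtap.exp.
proc stap_make_test_dir { test } {
    if {[catch {exec mktemp -d -t ${test}XXXXXX} tmpdir]} {
        verbose -log "Failed to create temporary directory: $tmpdir"
        return ""
    }
    return $tmpdir
}

proc stap_remove_test_dir { tmpdir } {
    # file delete -force removes the directory recursively, like rm -rf.
    if {$tmpdir != ""} { catch {file delete -force $tmpdir} }
}
====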
Comment 11 Martin Cermak 2016-03-16 06:55:38 UTC
I agree that separating testcase "working" directories is a good thing and I've done that for most of the tests (that actually do some such writes [about 100 testcases]) in the mcermak/paralleltest branch:

=======
8172649 Make testcases use the "artifact storage" (cont).
818a015 Make testcases use the "artifact storage" (cont).
09860f7 Make testcases use the "artifact storage" (cont).
ff2624e Make testcases use the "artifact storage".
75d1185 Use the artifacts dir "concept".
=======

There are still about 10 tests that need this update. Those are non-straightforward ones that are not passing and/or run for a long time, so it's not easy to double-check that my update wouldn't break them [even more]. I'll be addressing those as time permits.
Comment 12 Martin Cermak 2016-03-16 07:46:45 UTC
I am hitting one small make-related problem that slightly complicates running some testcases with `make -jN`, such as:

=======
-bash-4.2# make installcheck RUNTESTFLAGS='unprivileged_myproc.exp'

... stuff deleted ...

Running ./systemtap.unprivileged/unprivileged_myproc.exp ...

                === systemtap Summary ===

# of expected passes            430
# of known failures             16
# of untested testcases         12
make[1]: Leaving directory `/root/systemtap/testsuite'
if test -n ""; then mail  < systemtap.sum; fi
-bash-4.2# 
=======

versus:

=======
-bash-4.2# make -j2 installcheck RUNTESTFLAGS='unprivileged_myproc.exp'

... stuff deleted ...

Running ./systemtap.unprivileged/unprivileged_myproc.exp ...
FAIL: unprivileged myproc: --unprivileged process.function(string)
FAIL: unprivileged myproc: --privilege=stapusr process.function(number)
FAIL: unprivileged myproc: --unprivileged process.function(number).call
^Cgot a INT signal, interrupted by user 

                === systemtap Summary ===

# of expected passes            5
# of unexpected failures        3
# of known failures             1
make: *** [installcheck] Interrupt

-bash-4.2# 
=======

I hit this problem earlier (elsewhere) and fixed it in the following commit (lib/systemtap.exp):

=======
commit c034e5d51b72f697ca8e6fd2a0069fab7ca5aef3
Author: Martin Cermak <mcermak@redhat.com>
Date:   Thu Mar 3 18:00:07 2016 +0100

    Don't consider stderr output as an error case.
    
    Tcl's exec() by default fails if the stderr of the exec'd command is non-empty.
    The testsuite's operation is driven by make.  Funnily enough, if exec() calls stap
    *and* make, running somewhere alongside, decides to produce some stderr at the
    same time, exec() catches make's stderr too and -- fails. For example, when
    running:
    
    if {[catch {exec stap -p4 -e {probe begin {exit()}}} result]} {
    
    the following output can, quite surprisingly, get captured in $result:
    
    =======
    make[3]: Entering directory '/root/systemtap/testsuite'
    make[3]: Leaving directory '/usr/src/kernels/4.2.3-300.fc23.x86_64'
    /root/systemtap/testsuite/.systemtap-root/cache/9f/stap_9f465127b031099d258834cf23703c07_1401.ko
    make[3]: warning: -jN forced in submake: disabling jobserver mode.
    =======
    
    To work around this, exec -ignorestderr needs to be used. This way exec()
    doesn't consider non-empty stderr an error case, but is still able to
    catch a non-zero exit code of the executed command and treat it accordingly.

=======

I think the fix is okay, but the problem is less mysterious than I thought:

=======
# cat Makefile 
xxx:
        @stap --poison-cache -e 'probe syscall.read { log("hey") exit() }'
# 
# make xxx
hey
# 
# make -j1 xxx
hey
# 
# make -j2 xxx
make[1]: warning: -jN forced in submake: disabling jobserver mode.
hey
# 
# make -j2 xxx 2>/dev/null
hey
# 
=======

So, here, in the `make -j2 xxx` case there is extra output on stderr. This gets captured by Tcl's exec(), and if exec() is used without `-ignorestderr`, it fails. The warning message comes from make-3.82/main.c:1755 and I think it can't be selectively suppressed (sigh). So I am thinking about how to work around this in a general way applicable to all the testcases. It turns out that stap's -B option can (almost) be misused for this purpose, because -B doesn't accept only a 'NAME=VALUE' form argument, it also accepts e.g. '-j1':

=======
# stap -p4 -B '-j1' --poison-cache  --vp 0004 -e 'probe syscall.read { log("hit") exit() }' |& grep 'make.*\-j'
Running env -uARCH -uKBUILD_EXTMOD -uCROSS_COMPILE -uKBUILD_IMAGE -uKCONFIG_CONFIG -uINSTALL_PATH PATH=/usr/bin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin make -C /lib/modules/3.10.0-327.10.1.el7.x86_64/build M=/tmp/stapCNezGs modules ARCH=x86_64 CONFIG_DEBUG_INFO= -j1 V=1 -j5
#
=======

Here, however, our -j1 gets overridden by another -j5, hmmm.

For now I fail to see a nice general solution to this small problem. In the case of unprivileged_myproc.exp, I'm going to use the following update again:

=======
# git diff
diff --git a/testsuite/systemtap.unprivileged/unprivileged_myproc.exp b/testsuite/systemtap.unprivileged/unprivileged_myproc.exp
index b9ed7fa..2b31469 100644
--- a/testsuite/systemtap.unprivileged/unprivileged_myproc.exp
+++ b/testsuite/systemtap.unprivileged/unprivileged_myproc.exp
@@ -417,7 +417,7 @@ proc test_probes { privilege } {
 
        # Run stap.
        verbose -log "eval exec $cmd"
-       set rc [catch {eval exec $cmd} res_stap]
+       set rc [catch {eval exec -ignorestderr $cmd} res_stap]
        verbose -log "rc == $rc"
        verbose -log "$res_stap"
 
#
=======

If you guys see a better option, please let me know.
Comment 13 Martin Cermak 2016-03-17 14:39:58 UTC
(In reply to Martin Cermak from comment #12)
> ... The problem is less mysterious than I thought:
> 
> =======
> # cat Makefile 
> xxx:
>         @stap --poison-cache -e 'probe syscall.read { log("hey") exit() }'
> # 
> # make xxx
> hey
> # 
> # make -j1 xxx
> hey
> # 
> # make -j2 xxx
> make[1]: warning: -jN forced in submake: disabling jobserver mode.
> hey
> # 
> # make -j2 xxx 2>/dev/null
> hey
> # 
> =======

I've built a make that doesn't produce this warning and run the testsuite with and without it. With the regular make-3.82-21.el7_2.x86_64 I'm getting the following results:

# of expected passes		7938
# of unexpected failures	1184
# of expected failures		684
# of known failures		319
# of untested testcases		127
# of unsupported tests		34

With my patched make (https://url.corp.redhat.com/26fac5e) the results are as follows:

# of expected passes		9287
# of unexpected failures	180
# of expected failures		684
# of known failures		319
# of untested testcases		123
# of unsupported tests		36

Also, the testing time dropped from 150 min down to 116 min, since various timeouts are now out of the game. Since make can't be forced to drop this warning message using the available command-line switches, I've started updating individual tests to accept these make blurbs in my mcermak/paralleltest branch. Really nice to see. The above already looks like usable testing results, I'd say.
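(The per-test updates are of roughly this shape -- an untested sketch using the res_stap variable from the unprivileged_myproc.exp hunk in comment #12; this per-test approach is later reverted in comment #14:)

=======
# Untested sketch: drop make's jobserver warning from the captured output
# before matching it.
regsub -all {make(\[[0-9]+\])?: warning: -jN forced in submake: disabling jobserver mode\.\n?} \
    $res_stap "" res_stap
=======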
Comment 14 Martin Cermak 2016-03-18 09:47:51 UTC
(In reply to Martin Cermak from comment #13)
> I've started to update individual tests to accept these blurbs of make
> in my mcermak/paralleltest branch.

Reverted that and re-fixed:

commit 7ddcadc2f81ab92a7e3d6c9e9c2146328af1fb77
Author: Martin Cermak <mcermak@redhat.com>
Date:   Fri Mar 18 09:33:04 2016 +0100

    Fix the way make nesting works.
    
    First of all, revert all the 'Make <testcase>.exp make -jN compliant'
    changes. Those changes were inspired by PR10791#c13, but instead of
    addressing the root cause, they made testcases accept a valid error
    message. So, step back.
    
    The message in question is 'make: warning: -jN forced in submake:
    disabling jobserver mode.'. Make tries to communicate with its
    children using env vars (MAKEFLAGS, MAKELEVEL, MFLAGS) to ensure
    propagating -j (among others).  But stap uses make internally, and the
    following thing can happen:
    
    =======
    # stap --poison-cache -e 'probe syscall.read { log("hey") exit() }'
    hey
    # MAKEFLAGS=' --jobserver-fds=3,4 -j' stap --poison-cache -e 'probe syscall.read { log("hey") exit() }'
    make: warning: -jN forced in submake: disabling jobserver mode.
    hey
    # 
    =======
    
    So, in order to be able to (mis)use the testsuite's make -jN to
    drive the testcase's concurrency and let stap work "normally"
    at the same time, we need to make stap unaware of the parent
    (testsuite) make. Apparently unsetting MAKEFLAGS is good enough.
    
    So this commit removes several testcase updates papering over
    the problem, and fixes Makefile.in instead to let stap's make
    work independently of the testsuite's make.


Note that by making stap's make unaware of the testsuite's make, the
testsuite make's -jN now has a "linear effect". Sort of, of course. Which
is good, I think.
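(An untested illustration of the idea -- the actual Makefile.in change may look different; the rule shape is again the check/%.exp one from comment #1:)

=======
# Untested sketch: scrub make's inter-process environment before the nested
# make that runs runtest, so stap's internal kernel-module make no longer
# believes it is a submake of the testsuite run.
check/%.exp:
        -mkdir -p outputs/$*
        -env -uMAKEFLAGS -uMFLAGS -uMAKELEVEL \
          $(MAKE) $(AM_MAKEFLAGS) check-DEJAGNU \
          RUNTESTFLAGS="--outdir=outputs/$* $*.exp $(RUNTESTFLAGS)"
=======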


Test results: (Lenovo T530, make -j6 installcheck-parallel)
===========================================================

# of expected passes            9428
# of unexpected failures        84
# of expected failures          684
# of unknown successes          2
# of known failures             317
# of untested testcases         134
# of unsupported tests          14

Duration: 100 mins.
Load: around 7.
Comment 15 Frank Ch. Eigler 2016-03-27 15:44:59 UTC
While in the vicinity, it would be nice to investigate the case of a read-only $srcdir.  Some of the test cases cd to $srcdir to create temp files etc., and don't react well to $srcdir being actually read-only.  (Some of the test cases simply hang.)  If the new code takes care to make the artifact tree in the $builddir, then the read-only $srcdir case should also start working.
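(A minimal untested sketch of that, with made-up variable names: a test would derive its artifact path from the directory runtest was started in, i.e. the build tree, rather than from $srcdir:)

=======
# Untested sketch; variable names are made up.  [pwd] is runtest's working
# directory in the build tree, so a read-only $srcdir stays untouched.
set artifacts [file join [pwd] artifacts $test]
file mkdir $artifacts
set test_exepath [file join $artifacts "${test}.x"]
=======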
Comment 16 Martin Cermak 2016-04-13 13:35:53 UTC
Fixed in commit 311ffef18a6caa06d70d1b9c0641256c9ea6218a