This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
[PATCH v3 17/17] switch to fully parallel mode
- From: Tom Tromey <tromey at redhat dot com>
- To: gdb-patches at sourceware dot org
- Cc: Tom Tromey <tromey at redhat dot com>
- Date: Fri, 25 Oct 2013 14:21:04 -0600
- References: <1382732464-28121-1-git-send-email-tromey at redhat dot com>
This switches "make check" to fully parallel mode.
One primary issue facing full parallelization is the overhead of
"runtest". On my machine, if I "touch gdb.base/empty.exp", making a
new file, and then "time runtest.exp", it takes 0.08 seconds.
Multiply this by the 1008 tests in my configuration and you get about
80 seconds. This is the overhead that would theoretically be present
if every test were run as its own runtest invocation.
However, the problem isn't nearly as bad as this, for two reasons.
First, you must divide by the number of jobs, assuming perfect
parallelization -- reasonably true for small -j numbers, based on the
results I see.
Second, the current test suite parallelization approach bundles the
tests, largely by directory, but also splitting up gdb.base into two
halves.
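
To make the first point concrete, here is a minimal shell sketch of the overhead arithmetic, assuming perfect parallelization. The function name overhead_per_job is hypothetical; the inputs are the 0.08 seconds per runtest invocation and the 1008 tests quoted above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: estimate the runtest
# startup overhead carried by each job, assuming the work divides
# evenly across jobs.
# Usage: overhead_per_job SECONDS_PER_RUNTEST NUM_TESTS NUM_JOBS
overhead_per_job() {
    awk -v t="$1" -v n="$2" -v j="$3" \
        'BEGIN { printf "%.2f\n", t * n / j }'
}

# Print the estimate for a few small -j values.
for jobs in 1 2 3 4; do
    echo "-j$jobs: $(overhead_per_job 0.08 1008 "$jobs") seconds"
done
```

This only models startup cost; it says nothing about the run time of the tests themselves.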
I was curious to see how the current bundling played out in practice,
so I ran "make -j1 check RUNTEST='/bin/time runtest'". This invokes
the parallel mode (thus the bundling) and then shows the time taken by
each invocation of runtest.
Then, I ran "/bin/time make -j3 check". (See below about -j2.)
The time for the entire -j3 test run was the same as the time for
"gdb.base1". What this means is that gdb.base1 is currently the
time-limiting run, preventing further parallelization gains.
So, I reason, whatever overhead we see from full parallelization will
only be seen by "-j1" and "-j2".
I then tried a -j2 test run. This does take longer than a -j3 run,
meaning that once the gdb.base1 job finishes, make still has other
runtest invocations left to start.
Finally I tried a -j2 test run with the appended patch.
This was 9% slower than the -j2 run without the patch.
I think that is a reasonable slowdown for what is probably a rare
case. I believe this patch will yield faster test results for all -j
values greater than 2. For -j3 on my machine, the test suite is a few
seconds faster; I didn't try any larger -j values.
For -j1, I went ahead and changed the Makefile so that, if no -j
option is given, then the "check-single" mode is used. You can still
use "make -j1 check" to get single-job parallel-mode, though of course
there's no good reason to do so.
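
The way the check target is chosen can be paraphrased in shell. This is only a sketch of the logic described by the comments on saw_dash_j and CHECK_TARGET in the patch, not code from the patch; the function names are hypothetical, and the MAKEFLAGS layout is taken from the patch's own comment rather than verified against every make version.

```shell
#!/bin/sh
# Hypothetical paraphrase of the Makefile logic, for illustration only.

# saw_dash_j MAKEFLAGS: succeed if -j appears, either as a "j" in the
# first word or as a separate "-j" word.
saw_dash_j() {
    # Split the flag string into words (intentionally unquoted).
    set -- $1
    case "${1-}" in
        *j*) return 0 ;;
    esac
    for word in "$@"; do
        [ "$word" = "-j" ] && return 0
    done
    return 1
}

# choose_check_target FORCE_PARALLEL RUNTESTFLAGS MAKEFLAGS
choose_check_target() {
    if [ -n "$1" ]; then
        echo check-parallel      # FORCE_PARALLEL overrides everything
    elif [ -n "$2" ]; then
        echo check-single        # RUNTESTFLAGS serializes by default
    elif saw_dash_j "$3"; then
        echo check-parallel      # some -j option was given
    else
        echo check-single        # plain "make check"
    fi
}

choose_check_target "" "" "j"    # prints check-parallel
```

The order of the tests matters: FORCE_PARALLEL is checked first, so it wins even when RUNTESTFLAGS is set.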
This change is likely to speed up the plain "make check" scenario a
little as we will now bypass dg-extract-results.sh.
One drawback of this change is that "make -jN check" is now much more
verbose. I generally only look at the .sum and .log files, but
perhaps this will bother some.
Another interesting question is scalability of the result. The
slowest test, which limits the scalability, took 80.78 seconds. The
mean of the remaining tests is 1.08 seconds. (Note that this is just
a rough estimate, since there are still outliers.)
This means each job can run about 80.78 / 1.08 =~ 74 tests in the time
the slowest test takes. In this data set (slightly older than the
above, but materially the same) there were 948 tests, and
948 / 74 =~ 12.8. So, I think the current test suite should scale ok
up to about -j12.
We could improve this number if need be by breaking up the biggest
tests.
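
The scalability estimate above can be reproduced with a short shell sketch. The function name scaling_limit is hypothetical, and rounding down at each step is an assumption about how the rough figures were obtained.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: estimate how many tests
# fit into the run time of the slowest test, and from that the -j
# value beyond which extra jobs stop helping.
# Usage: scaling_limit SLOWEST_SECONDS MEAN_SECONDS TOTAL_TESTS
scaling_limit() {
    awk -v slow="$1" -v mean="$2" -v total="$3" 'BEGIN {
        per_job = int(slow / mean)      # tests one job can run meanwhile
        printf "%d %d\n", per_job, int(total / per_job)
    }'
}

scaling_limit 80.78 1.08 948    # prints "74 12"
```

Splitting the slowest test lowers the first argument, which raises the second number in the output.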
~ChangeLog~
2013-10-24 Tom Tromey <tromey@redhat.com>
* Makefile.in (TEST_DIRS): Remove.
(TEST_TARGETS, check-parallel): Rewrite.
(check-gdb.%, BASE1_FILES, BASE2_FILES, check-gdb.base%)
(subdir_do, subdirs): Remove.
(do-check-parallel, check/%): New targets.
(clean): Remove outputs, temp, and cache directories.
(saw_dash_j): New variable.
(CHECK_TARGET): Use it.
(check): Depend on all, site.exp. Rewrite.
(check-single): Remove dependencies.
(slow_tests, all_tests, reordered_tests): New variables.
---
gdb/testsuite/ChangeLog | 14 +++++++
gdb/testsuite/Makefile.in | 100 +++++++++++++++++++---------------------------
2 files changed, 56 insertions(+), 58 deletions(-)
diff --git a/gdb/testsuite/Makefile.in b/gdb/testsuite/Makefile.in
index a7b3d5c..46dac13 100644
--- a/gdb/testsuite/Makefile.in
+++ b/gdb/testsuite/Makefile.in
@@ -128,14 +128,23 @@ $(abs_builddir)/site.exp site.exp: ./config.status Makefile
installcheck:
-# For GNU make, try to run the tests in parallel. If RUNTESTFLAGS is
-# not empty, then by default the tests will be serialized. This can
-# be overridden by setting FORCE_PARALLEL to any non-empty value.
-# For a non-GNU make, do not parallelize.
-@GMAKE_TRUE@CHECK_TARGET = $(if $(FORCE_PARALLEL),check-parallel,$(if $(RUNTESTFLAGS),check-single,check-parallel))
+# See whether -j was given to make. Either it was given with no
+# arguments, and appears as "j" in the first word, or it was given an
+# argument and appears as "-j" in a separate word.
+@GMAKE_TRUE@saw_dash_j = $(or $(findstring j,$(firstword $(MAKEFLAGS))),$(filter -j,$(MAKEFLAGS)))
+
+# For GNU make, try to run the tests in parallel if any -j option is
+# given. If RUNTESTFLAGS is not empty, then by default the tests will
+# be serialized. This can be overridden by setting FORCE_PARALLEL to
+# any non-empty value. For a non-GNU make, do not parallelize.
+@GMAKE_TRUE@CHECK_TARGET = $(if $(FORCE_PARALLEL),check-parallel,$(if $(RUNTESTFLAGS),check-single,$(if $(saw_dash_j),check-parallel,check-single)))
@GMAKE_FALSE@CHECK_TARGET = check-single
-check: $(CHECK_TARGET)
+# Note that we must resort to a recursive make invocation here,
+# because GNU make 3.82 has a bug preventing MAKEFLAGS from being used
+# in conditions.
+check: all $(abs_builddir)/site.exp
+ $(MAKE) $(CHECK_TARGET)
# All the hair to invoke dejagnu. A given invocation can just append
# $(RUNTESTFLAGS)
@@ -151,70 +160,45 @@ DO_RUNTEST = \
export TCL_LIBRARY ; fi ; \
$(RUNTEST)
-check-single: all $(abs_builddir)/site.exp
+check-single:
$(DO_RUNTEST) $(RUNTESTFLAGS)
-# A list of all directories named "gdb.*" which also hold a .exp file.
-# We filter out gdb.base and add fake entries, because that directory
-# takes the longest to process, and so we split it in half.
-TEST_DIRS = gdb.base1 gdb.base2 $(filter-out gdb.base,$(sort $(notdir $(patsubst %/,%,$(dir $(wildcard $(srcdir)/gdb.*/*.exp))))))
-
-TEST_TARGETS = $(addprefix check-,$(TEST_DIRS))
-
-# We explicitly re-invoke make here for two reasons. First, it lets
-# us add a -k option, which makes the parallel check mimic the
-# behavior of the serial check; and second, it means that we can still
-# regenerate the sum and log files even if a sub-make fails -- which
-# it usually does because dejagnu exits with an error if any test
-# fails.
check-parallel:
- $(MAKE) -k $(TEST_TARGETS); \
+ -rm -rf cache
+ $(MAKE) -k do-check-parallel; \
$(SHELL) $(srcdir)/dg-extract-results.sh \
- $(addsuffix /gdb.sum,$(TEST_DIRS)) > gdb.sum; \
+ `find outputs -name gdb.sum -print` > gdb.sum; \
$(SHELL) $(srcdir)/dg-extract-results.sh -L \
- $(addsuffix /gdb.log,$(TEST_DIRS)) > gdb.log
-
-@GMAKE_TRUE@$(filter-out check-gdb.base%,$(TEST_TARGETS)): check-gdb.%: all $(abs_builddir)/site.exp
-@GMAKE_TRUE@ @if test ! -d gdb.$*; then mkdir gdb.$*; fi
-@GMAKE_TRUE@ $(DO_RUNTEST) --directory=gdb.$* --outdir=gdb.$* $(RUNTESTFLAGS)
-
-# Each half (roughly) of the .exp files from gdb.base.
-BASE1_FILES = $(patsubst $(srcdir)/%,%,$(wildcard $(srcdir)/gdb.base/[a-m]*.exp))
-BASE2_FILES = $(patsubst $(srcdir)/%,%,$(wildcard $(srcdir)/gdb.base/[n-z]*.exp))
-
-# Handle each half of gdb.base.
-check-gdb.base%: all $(abs_builddir)/site.exp
- @if test ! -d gdb.base$*; then mkdir gdb.base$*; fi
- $(DO_RUNTEST) $(BASE$*_FILES) --outdir gdb.base$* $(RUNTESTFLAGS)
-
-subdir_do: force
- @for i in $(DODIRS); do \
- if [ -d ./$$i ] ; then \
- if (rootme=`pwd`/ ; export rootme ; \
- rootsrc=`cd $(srcdir); pwd`/ ; export rootsrc ; \
- cd ./$$i; \
- $(MAKE) $(TARGET_FLAGS_TO_PASS) $(DO)) ; then true ; \
- else exit 1 ; fi ; \
- else true ; fi ; \
- done
+ `find outputs -name gdb.log -print` > gdb.log
+
+# Turn a list of .exp files into "check/" targets. Only examine .exp
+# files appearing in a gdb.* directory -- we don't want to pick up
+# lib/ by mistake. For example, gdb.linespec/linespec.exp becomes
+# check/gdb.linespec/linespec.exp. The list is generally sorted
+# alphabetically, but we take a few tests known to be slow and push
+# them to the front of the list to try to lessen the overall time
+# taken by the test suite -- if one of these tests happens to be run
+# late, it will cause the overall time to increase.
+slow_tests = gdb.base/break-interp.exp gdb.base/interp.exp \
+ gdb.base/multi-forks.exp
+@GMAKE_TRUE@all_tests := $(shell cd $(srcdir) && find gdb.* -name '*.exp' -print)
+@GMAKE_TRUE@reordered_tests := $(slow_tests) $(filter-out $(slow_tests),$(all_tests))
+@GMAKE_TRUE@TEST_TARGETS := $(addprefix check/,$(reordered_tests))
+
+do-check-parallel: $(TEST_TARGETS)
+ @:
+
+@GMAKE_TRUE@check/%.exp:
+@GMAKE_TRUE@ -mkdir -p outputs/$*
+@GMAKE_TRUE@ @$(DO_RUNTEST) GDB_PARALLEL=yes --outdir=outputs/$* $*.exp $(RUNTESTFLAGS)
force:;
-subdirs:
- for dir in ${ALL_SUBDIRS} ; \
- do \
- echo "$$dir:" ; \
- if [ -d $$dir ] ; then \
- (rootme=`pwd`/ ; export rootme ; \
- rootsrc=`cd $(srcdir); pwd`/ ; export rootsrc ; \
- cd $$dir; $(MAKE) $(TARGET_FLAGS_TO_PASS)); \
- fi; \
- done
-
clean mostlyclean:
-rm -f *~ core *.o a.out xgdb *.x *.grt bigcore.corefile .gdb_history
-rm -f core.* *.tf *.cl *.py tracecommandsscript copy1.txt zzz-gdbscript
-rm -f *.dwo *.dwp
+ -rm -rf outputs temp cache
if [ x"${ALL_SUBDIRS}" != x ] ; then \
for dir in ${ALL_SUBDIRS}; \
do \
--
1.8.1.4