This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: State-of-the-art in cross-platform Glibc testing?
- From: "Joseph S. Myers" <joseph at codesourcery dot com>
- To: Brooks Moses <bmoses at google dot com>
- Cc: <libc-alpha at sourceware dot org>
- Date: Thu, 11 Jul 2013 23:53:17 +0000
- Subject: Re: State-of-the-art in cross-platform Glibc testing?
- References: <CAOxa4KpUEgzOR8J2cqAJYwap1ATEiJLHG1cj_WNf3+YbTROn_g at mail dot gmail dot com>
On Wed, 10 Jul 2013, Brooks Moses wrote:
> The big limitation of cross-test-ssh.sh for me is that it requires a
> shared filesystem on both host and target, with identical paths.
> Setting up NFS sharing from the build server for this is kind of a
> pain (i.e., not really possible) in my setup, and I'd rather scp files
> back and forth. Anyone else tried that, and any advice -- or horror
> stories -- about making it work?
Well, there is no real tracking of what bits of the filesystem particular
tests depend on as inputs, or produce as outputs; there may be makefile
dependencies, but generally nothing would break at present if those were
missing or incomplete. And the bits used are scattered all over the
source and build directories.
I sort of think that the glibc build process should install things into a
sysroot directory within the build directory that mirrors a standard glibc
installation, so that "make install" would then just deal with copying
that directory's contents to the given install_root. This would simplify
running programs with the build-tree libraries, making that more like
doing so with any libc installation not in the root directory. It could
also facilitate testing with a previously built library without the
original build tree being present, provided dependencies were set up
appropriately so that testing was possible without things in that sysroot
directory getting rebuilt. It would also be clear that this sysroot
directory is needed on the system on which tests run, and that those tests
should never modify the contents of the sysroot. It might also be
possible to build tests and other programs against this directory's
contents with --sysroot (though that might not work with
non-sysroot-configured tools).
But I'm not sure that would help with your problem, although it would
simplify defining the sets of files on which tests depend.
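For the scp-based approach itself, a minimal copy-based counterpart to
cross-test-ssh.sh might look something like the sketch below. This is purely
illustrative: the function name, the CROSS_RSH/CROSS_RCP override hooks, and
the staging layout are all invented here, and nothing of the sort exists in
glibc.

```shell
# Hypothetical copy-based counterpart to cross-test-ssh.sh: stage the
# test binary on the target with scp, run it there over ssh, and
# propagate its exit status back.  CROSS_RSH/CROSS_RCP default to
# ssh/scp but can be overridden (e.g. with local stand-ins when
# testing the wrapper itself).
cross_run () {
  host="$1"; shift
  test_bin="$1"; shift
  rsh="${CROSS_RSH:-ssh}"
  rcp="${CROSS_RCP:-scp}"
  remote_dir="/tmp/glibc-test.$$"

  "$rsh" "$host" "mkdir -p $remote_dir" || return 127
  "$rcp" "$test_bin" "$host:$remote_dir/" || return 127
  "$rsh" "$host" "cd $remote_dir && ./${test_bin##*/} $*"
  status=$?
  "$rsh" "$host" "rm -rf $remote_dir"   # best-effort cleanup
  return $status
}
```

A real wrapper would also need to stage the newly built libc (or a sysroot as
discussed above) and each test's input files, and copy output files back
afterwards - which is exactly why knowing the sets of files each test depends
on matters.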
I don't know, then, to what extent tests may depend on the specific paths
to files, or on being run from within the source directory (which is where
the glibc makefiles run commands from), so issues would arise if paths
differed between the two systems (beyond the known ftwtest failures when
the paths are the same but involve a symlink on one of the two systems).
Then you have those tests that aren't testing something built for the
purposes of testing at all, but are based on examining the build tree.
These include at least check-local-headers and tst-xmmymm. My theory is
that such tests need to be distinguished from the normal tests in some way
so that they could be run at build time, separate from any testing
actually involving a system that could execute the newly built code.
(These tests of course don't actually involve running any newly built
code, so the miscellaneous build tree files they use don't need to be
visible on another system.) As a further complication,
check-local-headers is, I think, examining both files from the build of
the library itself and files from building other tests.
> Other than that, there's no timeout functionality, or retry
> functionality if the connection dies, or that sort of robustness
> support -- all of which is pretty straightforward to add, but it would
> save me some time if someone has pre-written code to share. :)
Why should "the connection dies" be any different from "someone randomly
sends a signal to a test program in native testing"? The expectation of
the testsuite code is that the test wrapper is reliable, and the
expectation of cross-test-ssh.sh is that the underlying ssh transport is
reliable. And if a test crashes the system running the tests 10% of the
time, it may be better if this shows up as a failure rather than quietly
rebooting and retrying (at least, if you want to make glibc, and the
system on which it runs, higher quality, which may well involve fixing a
kernel bug and getting that fix upstream in some cases, rather than just
getting one test run to finish).
As for timeouts - those have nothing to do with cross-testing either.
Rather, test-skeleton deals with timeouts for both native and cross.
Now, Roland referred a while back to issues with the test-skeleton
approach of forking
<http://sourceware.org/ml/libc-alpha/2012-10/msg00313.html>. If you move
to a wrapper like the one suggested there, then I suppose the wrapper would still
run on the same system as the newly built tests (in order to set rlimits
etc.) but it might be easier to ensure that all tests do get an
appropriate timeout.
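If timeout handling did move out of test-skeleton into such a wrapper, the
host-side piece might be as simple as the sketch below, assuming GNU
coreutils' timeout(1) is available where the wrapper runs; the function name
and the TIMEOUT reporting line are invented for illustration.

```shell
# Run a test command under an external time limit instead of relying on
# test-skeleton's internal fork/alarm.  timeout(1) exits with status 124
# when the limit is hit; report that distinctly so a timed-out test is
# not confused with an ordinary failure.
run_with_timeout () {
  secs="$1"; shift
  timeout "$secs" "$@"
  status=$?
  if [ "$status" -eq 124 ]; then
    echo "TIMEOUT: $*"
  fi
  return $status
}
```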
Next, given that you are running the tests (but want timeouts, so
presumably aren't expecting 100% clean results every time), what do you want
to do with the results? The testsuite is notoriously deficient in
actually giving meaningful reports indicating what passed or failed. You
can run "make check", and have it stop on an "unexpected" failure (which
may or may not be unexpected depending on the architecture and kernel
version). Or run "make -k check", and look for errors in the output. Or
run "make -k --debug=b check" and parse the debug output to get details of
tests that succeeded as well as those that failed. Or merge in some form
of Tomas Dohnalek's patch, for which the discussion tailed off at
<http://sourceware.org/ml/libc-alpha/2012-10/msg00278.html>. None of
those will give information about assertions within individual tests, or
information that a test isn't really PASS or FAIL but UNSUPPORTED /
UNRESOLVED.
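As a stopgap for the "make -k check" route, the failures (though not the
passes) can at least be scraped out of the log by looking for make's error
diagnostics - a sketch, assuming GNU make's usual "*** [target] Error N"
format:

```shell
# Pull the failed targets out of a saved "make -k check" log.  This
# recovers only failures; tests that passed, or were never run at all,
# leave no trace in the log -- which is precisely the reporting
# deficiency described above.
summarize_failures () {
  grep -o '\*\*\* \[[^]]*\] Error [0-9]*' "$1" | sort -u
}
```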
As I suggested in
<http://sourceware.org/ml/libc-alpha/2013-06/msg00497.html>, I think the
test-skeleton should provide standard functions for reporting the status
of individual test assertions and associated information (and the
test-skeleton code would then deduce the result of the overall test from
the results of individual assertions, e.g. return an exit status meaning
FAIL if any assertion FAILed). If a test couldn't use test-skeleton, e.g.
if it needed to use one of the ad hoc shell scripts for some reason, it
should still produce output in the same format so it can be collected to
produce overall logs of test results for the whole testsuite.
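For the shell-script case, "output in the same format" might look like the
following sketch; the PASS:/FAIL: line format and the helper names are made
up here purely to illustrate per-assertion reporting with the overall result
deduced from the individual assertions.

```shell
# Per-assertion reporting for an ad hoc shell-script test: each check
# emits one PASS:/FAIL: line in a uniform format that a log collector
# could aggregate, and the overall exit status is deduced from whether
# any assertion failed.
failures=0
check () {           # check <assertion-name> <command> [args...]
  name="$1"; shift
  if "$@"; then
    echo "PASS: $name"
  else
    echo "FAIL: $name"
    failures=$((failures + 1))
  fi
}
overall_status () {  # overall FAIL iff any assertion FAILed
  [ "$failures" -eq 0 ]
}
```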
More closely related to your problem of identifying inputs / outputs for
particular tests is the general issue of defining more properties of
individual tests in data, rather than ad hoc makefile rules and
dependencies. See my comments in
<http://sourceware.org/ml/libc-alpha/2012-09/msg00276.html>, as referenced
in <http://sourceware.org/ml/libc-alpha/2013-05/msg01073.html>. I suppose
you might want <test>-inputs and <test>-outputs variables, or similar, and
then new variables, analogous to test-wrapper, naming commands to run
before / after tests to copy files as needed. If you can move tests that
presently have ad hoc
rules for running them into the "tests" variable (where presently they are
handled as dependencies of the "tests" makefile target instead), by means
of having metadata about tests in that variable, that might then mean you
only need one (or a few) places in which to use the new variables, rather
than needing to use them in lots of separate ad hoc rules.
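As a purely hypothetical illustration of what such per-test metadata might
look like in a makefile (none of these variables exist in glibc; the names
just echo the existing test-wrapper convention):

```make
# Hypothetical per-test metadata; names and semantics are invented.
tests += tst-fancy
tst-fancy-inputs := tst-fancy.input $(objpfx)tst-fancy-data
tst-fancy-outputs := $(objpfx)tst-fancy.extra-out

# A copy-based test wrapper could then consult $(<test>-inputs) to know
# what to stage on the target before running a test, and
# $(<test>-outputs) to know what to copy back afterwards, from one
# generic rule instead of many separate ad hoc ones.
```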
The above lists the main issues with the glibc testsuite, and possible
approaches for fixing them, as I see it. I hope at least some of it is
relevant / helpful for what you see as deficiencies in the testsuite.
--
Joseph S. Myers
joseph@codesourcery.com