Re: State-of-the-art in cross-platform Glibc testing?


On Wed, 10 Jul 2013, Brooks Moses wrote:

> The big limitation of cross-test-ssh.sh for me is that it requires a
> shared filesystem on both host and target, with identical paths.
> Setting up NFS sharing from the build server for this is kind of a
> pain (i.e., not really possible) in my setup, and I'd rather scp files
> back and forth.  Anyone else tried that, and any advice -- or horror
> stories -- about making it work?

Well, there is no real tracking of what bits of the filesystem particular 
tests depend on as inputs, or produce as outputs; there may be makefile 
dependencies, but generally nothing would break at present if those were 
missing or incomplete.  And the bits used are scattered all over the 
source and build directories.

I sort of think that the glibc build process should install things into a
sysroot directory within the build directory that mirrors a standard glibc
installation, so that "make install" would then just deal with copying
that directory's contents to the given install_root.  This would simplify
running programs with the build-tree libraries and make that more like
doing so with any libc installation not in the root directory.  It could
also facilitate testing with a previously built library without needing
the original build tree to be present, provided dependencies were set up
so that testing did not cause things in that sysroot directory to be
rebuilt.  It would also be clear that this sysroot directory is needed on
the system on which tests run, and that those tests should never modify
its contents.  It might also be possible to build tests and other programs
against this directory's contents with --sysroot (though that might not
work with non-sysroot-configured tools).
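
For concreteness, a rough sketch of the sort of thing I mean; the layout
and the command below are only illustrative, since no such sysroot exists
in the build tree today:

  $objdir/sysroot/lib/libc.so.6
  $objdir/sysroot/lib/ld.so.1        # or the architecture's dynamic linker
  $objdir/sysroot/usr/include/...    # installed headers

  # Building a test against it, given a sysroot-capable compiler:
  gcc --sysroot=$objdir/sysroot -o tst-foo tst-foo.c

Only that sysroot directory, rather than the whole source and build trees,
would then need to be visible on the system running the tests.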

But I'm not sure that would help with your problem, although it would 
simplify defining the sets of files on which tests depend.

I also don't know to what extent tests may depend on the specific paths to
files, or on being run from within the source directory (which is where
the glibc makefiles run commands from); to the extent that they do, issues
would arise with paths being different on the different systems (beyond
the known ftwtest failures when the paths are the same but involve a
symlink on one of the two systems).

Then you have those tests that aren't testing something built for the 
purposes of testing at all, but are based on examining the build tree.  
These include at least check-local-headers and tst-xmmymm.  My theory is 
that such tests need to be distinguished from the normal tests in some way 
so that they could be run at build time, separate from any testing 
actually involving a system that could execute the newly built code.  
(These tests of course don't actually involve running any newly built 
code, so the miscellaneous build tree files they use don't need to be 
visible on another system.)  As a further complication, 
check-local-headers is, I think, examining both files from the build of 
the library itself and files from building other tests.

> Other than that, there's no timeout functionality, or retry
> functionality if the connection dies, or that sort of robustness
> support -- all of which is pretty straightforward to add, but it would
> save me some time if someone has pre-written code to share.  :)

Why should "the connection dies" be any different from "someone randomly 
sends a signal to a test program in native testing"?  The expectation of 
the testsuite code is that the test wrapper is reliable, and the 
expectation of cross-test-ssh.sh is that the underlying ssh transport is 
reliable.  And if a test crashes the system running the tests 10% of the 
time, it may be better if this shows up as a failure rather than quietly 
rebooting and retrying (at least, if you want to make glibc, and the 
system on which it runs, higher quality, which may well involve fixing a 
kernel bug and getting that fix upstream in some cases, rather than just 
getting one test run to finish).

As for timeouts, those have nothing to do with cross-testing either:
test-skeleton handles timeouts for both native and cross testing.

Now, a while back Roland referred to issues with the test-skeleton
approach of forking:
<http://sourceware.org/ml/libc-alpha/2012-10/msg00313.html>.  If you move
to a wrapper like the one suggested there, then I suppose the wrapper
would still run on the same system as the newly built tests (in order to
set rlimits etc.), but it might be easier to ensure that all tests get an
appropriate timeout.
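
For what it's worth, a minimal sketch of such a wrapper, assuming nothing
more than timeout(1) from coreutils; the limit and timeout values are only
examples, and a wrapper along the lines suggested there would presumably
do rather more:

  #!/bin/sh
  # Hypothetical native test wrapper: set resource limits, then enforce
  # a timeout around the test itself.  Values are illustrative only.
  ulimit -c 0                # no core dumps from expected crashes
  exec timeout 200 "$@"      # kill the test if it runs too long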

Next, given that you are running the tests (but want timeouts, so
presumably aren't always expecting 100% clean results), what do you want
to do with the results?  The testsuite is notoriously deficient in
actually giving meaningful reports of what passed or failed.  You can run
"make check" and have it stop on an "unexpected" failure (which may or may
not be unexpected depending on the architecture and kernel version).  Or
run "make -k check" and look for errors in the output.  Or run "make -k
--debug=b check" and parse the debug output to get details of the tests
that succeeded as well as those that failed.  Or bring in some form of
Tomas Dohnalek's patch, for which the discussion tailed off at
<http://sourceware.org/ml/libc-alpha/2012-10/msg00278.html>.  None of
those will give information about assertions within individual tests, or
indicate that a test isn't really PASS or FAIL but UNSUPPORTED /
UNRESOLVED.
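
To illustrate the "make -k check" route concretely (a rough sketch,
relying only on make's standard "*** ... Error" messages, which is about
as much structure as the output offers today):

  # Run the whole testsuite, keeping going past failures, and log it.
  make -k check 2>&1 | tee check.log
  # Failed tests appear as make errors; the target named in each error
  # line identifies the test whose rule failed.
  grep -E '\*\*\* .*Error' check.log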

As I suggested in 
<http://sourceware.org/ml/libc-alpha/2013-06/msg00497.html>, I think the 
test-skeleton should provide standard functions for reporting the status 
of individual test assertions and associated information (and the 
test-skeleton code would then deduce the result of the overall test from 
the results of individual assertions, e.g. return an exit status meaning 
FAIL if any assertion FAILed).  If a test couldn't use test-skeleton, e.g. 
if it needed to use one of the ad hoc shell scripts for some reason, it 
should still produce output in the same format so it can be collected to 
produce overall logs of test results for the whole testsuite.
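
To make that concrete, an ad hoc shell-script test might then look
something like this; the PASS: / FAIL: lines are purely illustrative, as
no such output convention exists yet:

  #!/bin/sh
  # Hypothetical test script reporting each assertion in a common
  # format that a harness could collect; the format and the file names
  # here are invented for illustration.
  status=0
  check () {
    if eval "$2"; then
      echo "PASS: tst-example: $1"
    else
      echo "FAIL: tst-example: $1"
      status=1
    fi
  }
  check "output file was generated" 'test -r tst-example.out'
  check "no stray temporary files" '! test -e tst-example.tmp'
  exit $status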

More closely related to your problem of identifying inputs / outputs for
particular tests is the general issue of defining more properties of
individual tests in data, rather than in ad hoc makefile rules and
dependencies.  See my comments in
<http://sourceware.org/ml/libc-alpha/2012-09/msg00276.html>, as referenced
in <http://sourceware.org/ml/libc-alpha/2013-05/msg01073.html>.  I suppose
you might want <test>-inputs and <test>-outputs variables, or similar, and
then new variables, analogous to test-wrapper, naming commands to run
before / after tests to copy files as needed.  If, by having such metadata
about tests, you can move tests that presently have ad hoc rules for
running them into the "tests" variable (at present they are handled as
dependencies of the "tests" makefile target instead), then you might only
need one (or a few) places in which to use the new variables, rather than
having to use them in lots of separate ad hoc rules.
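
To sketch how that might tie in with copying files to and from the target
system (all of this is hypothetical: the variable names are invented, the
makefiles would somehow need to export the per-test metadata to the
wrapper, and it glosses over the path issues noted above):

  #!/bin/sh
  # Hypothetical cross-test wrapper consuming per-test metadata that the
  # makefiles would derive from <test>-inputs / <test>-outputs.
  remote=user@target
  remotedir=/tmp/glibc-test.$$
  ssh "$remote" mkdir -p "$remotedir"
  for f in $TEST_INPUTS; do
    scp -q "$f" "$remote:$remotedir/"
  done
  ssh "$remote" "cd $remotedir && $*"
  status=$?
  for f in $TEST_OUTPUTS; do
    scp -q "$remote:$remotedir/$f" .
  done
  ssh "$remote" rm -rf "$remotedir"
  exit $status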


That covers the main issues I see with the glibc testsuite, and possible
approaches for fixing them.  I hope at least some of it is relevant /
helpful for the deficiencies you see in the testsuite.

-- 
Joseph S. Myers
joseph@codesourcery.com

