Running the frysk-core test suite, I experience intermittent hangs on the testSteppingtestInsertRemove test. Preliminary opinion is that the code is waiting for a signal that is never delivered.

Thread 1:
#0  0x00000034d200cbfb in read () from /lib64/libpthread.so.0
#1  0x000000000052b275 in frysk::testbed::ForkTestLib$ForkedInputStream::read (this=0x2aaaaaf67a50, buf=<value optimized out>, off=0, len=2048) at /home/aedil/build_farm/frysk_fresh/frysk_config/frysk-imports/frysk/testbed/cni/ForkTestLib.cxx:144
#2  0x00000034d44989a5 in java::io::BufferedInputStream::refill () from /usr/lib64/libgcj.so.7rh
#3  0x00000034d449cca0 in java::io::InputStreamReader::refill () from /usr/lib64/libgcj.so.7rh
#4  0x00000034d449cefb in java::io::InputStreamReader::read () from /usr/lib64/libgcj.so.7rh
#5  0x00000034d4498304 in java::io::BufferedReader::fill () from /usr/lib64/libgcj.so.7rh
#6  0x00000034d44a2fc8 in java::io::BufferedReader::readLine () from /usr/lib64/libgcj.so.7rh
#7  0x00000000004c5af1 in frysk.proc.TestBreakpoints.testInsertRemove()void (this=0x2aaaaaf70c40) at /home/aedil/build_farm/frysk_fresh/frysk_config/frysk-core/frysk/proc/TestBreakpoints.java:422
#8  0x00000000004c4346 in frysk.proc.TestBreakpoints.testSteppingtestInsertRemove()void (this=0xc) at /home/aedil/build_farm/frysk_fresh/frysk_config/frysk-core/frysk/proc/TestBreakpoints.java:549
#9  0x00000034d48e630c in _Jv_strtod_r () from /usr/lib64/libgcj.so.7rh
#10 0x00000034d48e6198 in _Jv_strtod_r () from /usr/lib64/libgcj.so.7rh
#11 0x00000034d41b58e5 in _Jv_CallAnyMethodA () from /usr/lib64/libgcj.so.7rh
#12 0x00000034d41b6030 in _Jv_CallAnyMethodA () from /usr/lib64/libgcj.so.7rh
#13 0x00000034d41b6383 in java::lang::reflect::Method::invoke () from /usr/lib64/libgcj.so.7rh
#14 0x0000000000540e86 in junit.framework.TestCase.runTest()void (this=<value optimized out>) at junit/framework/TestCase.java:154
#15 0x0000000000540cb6 in junit.framework.TestCase.runBare()void (this=<value optimized out>) at junit/framework/TestCase.java:127
#16 0x00000000005429d4 in junit.framework.TestResult$1.protect()void (this=<value optimized out>) at junit/framework/TestResult.java:106
#17 0x00000000005416a2 in junit.framework.TestResult.runProtected(junit.framework.Test, junit.framework.Protectable)void (this=0x2aaaab2e7190, test=0x2aaaaaf70c40, p=0x2aaaab269708) at junit/framework/TestResult.java:124
#18 0x0000000000541603 in junit.framework.TestResult.run(junit.framework.TestCase)void (this=0x2aaaab2e7190, test=0x2aaaaaf70c40) at junit/framework/TestResult.java:109
#19 0x0000000000540c84 in junit.framework.TestCase.run(junit.framework.TestResult)void (this=0xfffffffffffffe00, result=0xc) at junit/framework/TestCase.java:118
#20 0x00000000005404b1 in junit.framework.TestSuite.runTest(junit.framework.Test, junit.framework.TestResult)void (this=<value optimized out>, test=0x2aaaaaf70c40, result=0x2aaaab2e7190) at junit/framework/TestSuite.java:208
#21 0x000000000054046d in junit.framework.TestSuite.run(junit.framework.TestResult)void (this=<value optimized out>, result=0x2aaaab2e7190) at junit/framework/TestSuite.java:203
#22 0x00000000005404b1 in junit.framework.TestSuite.runTest(junit.framework.Test, junit.framework.TestResult)void (this=<value optimized out>, test=0x2aaaaaf47180, result=0x2aaaab2e7190) at junit/framework/TestSuite.java:208
#23 0x000000000054046d in junit.framework.TestSuite.run(junit.framework.TestResult)void (this=<value optimized out>, result=0x2aaaab2e7190) at junit/framework/TestSuite.java:203
#24 0x00000000005344d8 in junit.textui.TestRunner.doRun(junit.framework.Test, boolean)junit.framework.TestResult (this=0x2aaaaae06f00, suite=0x2aaaaae43a38, wait=false) at junit/textui/TestRunner.java:116
#25 0x000000000053446c in junit.textui.TestRunner.doRun(junit.framework.Test)junit.framework.TestResult (this=0xc, test=0x2aaaab1db00c) at junit/textui/TestRunner.java:109
#26 0x000000000051cae7 in frysk.junit.Runner.runCases(java.util.Collection)int (this=0x2aaaaae06f00, testClasses=<value optimized out>) at /home/aedil/build_farm/frysk_fresh/frysk_config/frysk-imports/frysk/junit/Runner.java:289
#27 0x000000000051ce30 in frysk.junit.Runner.runArchCases(java.util.Collection)int (this=0x2aaaaae06f00, testClasses=0x2aaaaab107a8) at /home/aedil/build_farm/frysk_fresh/frysk_config/frysk-imports/frysk/junit/Runner.java:330
#28 0x000000000051c0af in frysk.junit.Runner.runTestCases(java.util.Collection, frysk.Config, java.util.Collection, frysk.Config)int (this=0x2aaaaae06f00, tests=0x2aaaaab107a8, config=<value optimized out>, tests32=0x2aaaaab10668, config32=0x2aaaaadb4ab0) at /home/aedil/build_farm/frysk_fresh/frysk_config/frysk-imports/frysk/junit/Runner.java:625
#29 0x00000000004b7dc6 in TestRunner.main(java.lang.String[])void (args=0x2aaaaaae8f60) at TestRunner.java:63
#30 0x00000034d41a41b3 in gnu::java::lang::MainThread::call_main () from /usr/lib64/libgcj.so.7rh
#31 0x00000034d41f925e in gnu::java::lang::MainThread::run () from /usr/lib64/libgcj.so.7rh
#32 0x00000034d41b2738 in _Jv_ThreadRun () from /usr/lib64/libgcj.so.7rh
#33 0x00000034d4175a05 in _Jv_RunMain () from /usr/lib64/libgcj.so.7rh
#34 0x00000000004b7d16 in main (argc=-1424117748, argv=0x800) at /tmp/cc93VWxo.i:14

Thread 2:
#0  0x00000034d14c4a36 in poll () from /lib64/libc.so.6
#1  0x0000000000528397 in frysk::sys::Poll::poll (pollObserver=0x2aaaaaf67ad0, timeout=-1) at /home/aedil/build_farm/frysk_fresh/frysk_config/frysk-imports/frysk/sys/cni/Poll.cxx:197
#2  0x00000000004f2c9f in frysk.event.EventLoop.runEventLoop(boolean)void (this=0x2aaaab27e8e8, pendingOnly=32) at /home/aedil/build_farm/frysk_fresh/frysk_config/frysk-core/frysk/event/EventLoop.java:317
#3  0x00000000004f2d25 in frysk.event.EventLoop.run()void (this=0x2aaaab27e8e8) at /home/aedil/build_farm/frysk_fresh/frysk_config/frysk-core/frysk/event/EventLoop.java:416
#4  0x00000000004c4a63 in frysk.proc.TestBreakpoints$EventLoopRunner.run()void (this=0x2aaaab1d3a80) at /home/aedil/build_farm/frysk_fresh/frysk_config/frysk-core/frysk/proc/TestBreakpoints.java:943
#5  0x00000034d41b2738 in _Jv_ThreadRun () from /usr/lib64/libgcj.so.7rh
#6  0x00000034d41b8f77 in _Jv_ThreadRegister () from /usr/lib64/libgcj.so.7rh
#7  0x00000034d48f6a66 in _Jv_strtod_r () from /usr/lib64/libgcj.so.7rh
#8  0x00000034d2006305 in start_thread () from /lib64/libpthread.so.0
#9  0x00000034d14cd50d in clone () from /lib64/libc.so.6
#10 0x0000000000000000 in ?? ()

Thread 3:
#0  0x00000034d200a416 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000034d41b93bc in _Jv_CondWait () from /usr/lib64/libgcj.so.7rh
#2  0x00000034d41aed60 in java::lang::Object::wait () from /usr/lib64/libgcj.so.7rh
#3  0x000000000051e7cc in frysk.sys.Ptrace$PtraceThread.run()void (this=0x2aaaab10bcc0) at /home/aedil/build_farm/frysk_fresh/frysk_config/frysk-imports/frysk/sys/Ptrace.java:220
#4  0x00000034d41b2738 in _Jv_ThreadRun () from /usr/lib64/libgcj.so.7rh
#5  0x00000034d41b8f77 in _Jv_ThreadRegister () from /usr/lib64/libgcj.so.7rh
#6  0x00000034d48f6a66 in _Jv_strtod_r () from /usr/lib64/libgcj.so.7rh
#7  0x00000034d2006305 in start_thread () from /lib64/libpthread.so.0
#8  0x00000034d14cd50d in clone () from /lib64/libc.so.6
#9  0x0000000000000000 in ?? ()

Thread 4:
#0  0x00000034d200a416 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000034d41b93bc in _Jv_CondWait () from /usr/lib64/libgcj.so.7rh
#2  0x00000034d41a386c in gnu::gcj::runtime::FinalizerThread::run () from /usr/lib64/libgcj.so.7rh
#3  0x00000034d41b2738 in _Jv_ThreadRun () from /usr/lib64/libgcj.so.7rh
#4  0x00000034d41b8f77 in _Jv_ThreadRegister () from /usr/lib64/libgcj.so.7rh
#5  0x00000034d48f6a66 in _Jv_strtod_r () from /usr/lib64/libgcj.so.7rh
#6  0x00000034d2006305 in start_thread () from /lib64/libpthread.so.0
#7  0x00000034d14cd50d in clone () from /lib64/libc.so.6
#8  0x0000000000000000 in ?? ()
See bug 3486 (http://sources.redhat.com/bugzilla/show_bug.cgi?id=3486). I'm pretty sure this is a duplicate.
If you could get it to run with log output (TestRunner -c FINE), that would be interesting, although of course that might disrupt the precise timing and keep it from hanging again.
frysk's event loop is using waitpid now; is it still occurring?
The test has been run hundreds of times (even with multiple runs in parallel) without any hangs (but see bugs #4847 and #6044).