testTerm(frysk.proc.TestTaskTerminateObserver)junit.framework.AssertionFailedError: event loop run explictly stopped (startChild (Sig_HUP))
	at frysk.proc.TestLib.assertRunUntilStop(TestRunner)
	at frysk.proc.TestLib.assertRunUntilStop(TestRunner)
	at frysk.proc.TestLib$AckHandler.assertAwait(TestRunner)
	at frysk.proc.TestLib$AckHandler.await(TestRunner)
	at frysk.proc.TestLib$Child.<init>(TestRunner)
	at frysk.proc.TestLib$AckProcess.<init>(TestRunner)
	at frysk.proc.TestLib$DetachedAckProcess.<init>(TestRunner)
	at frysk.proc.TestTaskTerminateObserver.testTerm(TestRunner)
	at frysk.junit.Runner.runCases(TestRunner)
	at frysk.junit.Runner.runArchCases(TestRunner)
	at frysk.junit.Runner.runTestCases(TestRunner)
	at TestRunner.main(TestRunner)
This is a utrace bug, block on FC 6, not FC 5.
This is due to a change in behavior between the .17 and .18.utrace kernels. Given a non-main task that has exited but not yet been joined: in kernel.17 that task would appear in /proc in the state 'X'; in kernel.18.utrace the task completely disappears. Is this considered a change in defined behavior?
RHEL 5 bug: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=217433
The test proper was rewritten, but the problem / question still exists.

Index: frysk-core/frysk/pkglibexecdir/ChangeLog
2006-11-27  Andrew Cagney  <cagney@redhat.com>

	* funit-threadexit.c (running_thread_can_exit): New barrier.
	(main, op_thread): Use running_thread_can_exit to block thread's
	exit until after main has opened the thread's /proc/stat file.

	* funit-threadexit.c (scan_thread): Delete.
	(main): Do the scan for thread in 'X' state here, instead of in
	scan_thread.  Create only one thread.
	(condition_cond, condition_mutex): Delete.
	(thread_running_barrier): Rename "barrier".
	(thread_id): Make volatile.
	(op_thread): Simplify, use only one barrier.

Index: frysk-core/frysk/proc/ChangeLog
2006-11-27  Andrew Cagney  <cagney@redhat.com>

	* TestTaskTerminateObserver.java (TerminatingCounter.addedTo):
	Add; stop the event loop.
	(testAttachToUnJoinedTask): Rename testTerm; simplify, explicitly
	terminate the thread.
Test case added, closing.

Index: frysk-imports/tests/ChangeLog
2006-11-27  Andrew Cagney  <cagney@redhat.com>

	* frysk3491/x-state.c: New file.
	* Makefile.am (TESTS, noinst_PROGRAMS): Add frysk3491/x-state.
	(frysk3491_x_state_SOURCES, frysk3491_x_state_LDFLAGS): Define.
Moving to suspended state ...
The bug is still there in the FC6 kernel. The Red Hat bugzilla bug referred to in the comments is not accessible from outside Red Hat, so there is no public status on this problem. Can somebody in RH please clone https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=217433 as the kernel bug for FC6?
Suspending this bug while the upstream issue is resolved. (It looks like this test is illustrating a race condition that became more racy in the switch from .17 to .18 based kernels.)
Looking closer at this problem, I think that the problem is actually with the test itself. It contains the following code (specific to FC6):

    char buffer [1024];
    int n = pread (fd, buffer, sizeof (buffer), 0);
    if (n <= 0) {
      // On FC-6 the thread completly disappears from /proc.
      if (errno == ESRCH) {
        printf ("%d.%d pread returns %d (%s)\n", getpid (), gettid(), errno, strerror (errno));
        exit (1);
      }
      perror ("pread");
      exit (1);
    }

Given that the comment states that the behaviour in FC6 is that the thread disappears completely from /proc, the exit code should be 0 in this case, signaling a successful completion of the test case. I don't have CVS commit privs as far as I know, so could someone make this change (assuming of course I am right)?
(In reply to comment #9)
> Looking closer at this problem, I think that the problem is actually with the
> test itself.

That was deliberate: detect the specific condition causing the corresponding test to fail, and then exit with failure on that.

Here's a description of what is going on from roland:

> please show the /proc/pid/status contents with X state.
> The X (EXIT_DEAD) state means in the middle of being reaped.
> For a noninitial nptl thread, this means almost finished dying,
> since the threads "reap" themselves (when not ptraced). There
> is just a short race window after the thread starts dying when
> it can still be looked up in /proc. I suspect nothing changed
> but the timing. The only non-race way you can ever see X state
> is if you opened an fd on the /proc file before it died, then
> read later from that open fd.

So the kernel test needs adjusting, and a lot more comments, and the corresponding testTerm might need a re-think.
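For reference, here is a minimal sketch of the open-before-exit pattern roland describes: open the thread's stat file while the thread is still alive, let it exit, then read through the already-open fd. This is not the code in funit-threadexit.c or x-state.c; the helper names stat_state, open_task_stat and read_task_state are made up for illustration, and the /proc/PID/task/TID/stat path is assumed to be the file the test opens.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the state character from one
   /proc/.../stat line, or 0 if the line is malformed.  The comm
   field is wrapped in parentheses and may itself contain spaces or
   ')', so search backwards for the last ')'.  */
char
stat_state (const char *line)
{
  assert (line != NULL);
  const char *close = strrchr (line, ')');
  if (close == NULL || close[1] != ' ' || close[2] == '\0')
    return 0;
  return close[2];
}

/* Hypothetical helper: open /proc/PID/task/TID/stat.  Call this
   while the thread is still alive (for example while it waits on a
   barrier); the fd stays usable after the thread has exited.
   Returns -1 with errno set on failure.  */
int
open_task_stat (pid_t pid, pid_t tid)
{
  char path[64];
  snprintf (path, sizeof path, "/proc/%d/task/%d/stat",
            (int) pid, (int) tid);
  return open (path, O_RDONLY);
}

/* Hypothetical helper: read the state through an already-open fd.
   Returns the state character, or 0 if nothing could be read; when
   pread itself failed, errno holds the reason (ESRCH is what this
   thread reports on FC 6 once the task is gone).  */
char
read_task_state (int fd)
{
  char buffer[1024];
  ssize_t n = pread (fd, buffer, sizeof buffer - 1, 0);
  if (n <= 0)
    return 0;
  buffer[n] = '\0';
  return stat_state (buffer);
}
```

A test along these lines would call open_task_stat from main before releasing the thread, then call read_task_state once the thread has been told to exit. Whether that read then yields 'X' or fails with ESRCH is exactly the kernel-dependent behaviour in question, so the sketch makes no claim about which one a given kernel produces.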
There is a race between ptrace/waitpid seeing an event and /proc/$$/stat[us] seeing or reflecting that same event. Consequently what can be seen on one kernel (here X state) won't be seen on later kernels.