Bug 3010

Summary: Error with testSyscallInterrupt on ppc64: Read exit before enter exception
Product: frysk Reporter: Wu Zhou <woodzltc>
Component: generalAssignee: Yao Qi <qiyaoltc>
Status: WAITING ---    
Severity: normal CC: qiyaoltc, zhengyzy
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Host: ppc64-redhat-linux Target: ppc64-redhat-linux
Build: ppc64-redhat-linux Last reconfirmed:
Bug Depends on:    
Bug Blocks: 2188    

Description Wu Zhou 2006-08-08 04:08:29 UTC
After making setupMapsXXX in LinuxTask.java to support big endian on ppc64, we
can now run system calls related tests on this platform.

Here is one problem we encountered. Running testSyscallInterrupt will report
"ERROR":

Running testSyscallInterrupt(frysk.proc.TestTaskSyscallObserver) ...15736.15736:
wrote to the parent process fd2[1] 
15735.15735: read from child process fd2[0]
15735.15735: signaling frysk...
15735.15735: signaled frysk
15735.15735: waiting to read from fd[0]
15736.15736: wrote to fd[1]
15736.15736: DONE
ERROR
  java.lang.RuntimeException: Read exit before enter

Here is the java backtrace:

testSyscallInterrupt(frysk.proc.TestTaskSyscallObserver)java.lang.RuntimeException:
Read exit before enter
   at
java.lang.VMThrowable._ZN4java4lang11VMThrowable16fillInStackTraceEPNS0_9ThrowableE(libgcj.so.7)
   at java.lang.Throwable._ZN4java4lang9Throwable16fillInStackTraceEv(libgcj.so.7)
   at java.lang.Throwable._ZN4java4lang9ThrowableC1EPNS0_6StringE(libgcj.so.7)
   at java.lang.Exception._ZN4java4lang9ExceptionC1EPNS0_6StringE(libgcj.so.7)
   at
java.lang.RuntimeException._ZN4java4lang16RuntimeExceptionC1EPNS0_6StringE(libgcj.so.7)
   at
frysk.proc.TestTaskSyscallObserver$TestSyscallInterruptInternals$SyscallInterruptObserver(TestRunner)
   at frysk.proc.Task(TestRunner)
   at frysk.proc.TaskState$16(TestRunner)
   at frysk.proc.Task(TestRunner)
   at frysk.proc.LinuxHost$PollWaitOnSigChld$2(TestRunner)
   at frysk.sys.Wait(TestRunner)
   at frysk.proc.LinuxHost$PollWaitOnSigChld(TestRunner)
   at frysk.event.EventLoop(TestRunner)
   at frysk.event.EventLoop(TestRunner)
   at frysk.proc.TestLib(TestRunner)
   at frysk.proc.TestLib(TestRunner)
   at frysk.proc.TestTaskSyscallObserver(TestRunner)
   at
java.lang.reflect.Method._ZN4java4lang7reflect6Method6invokeEPNS0_6ObjectEP6JArrayIS4_E(libgcj.so.7)
   at frysk.junit.Runner(TestRunner)
   at TestRunner(TestRunner)
   at
gnu.java.lang.MainThread._ZN3gnu4java4lang10MainThread9call_mainEv(libgcj.so.7)
   at gnu.java.lang.MainThread._ZN3gnu4java4lang10MainThread3runEv(libgcj.so.7)


Additional information:
1. testSyscallInterruptRestart will report the same error, with a very similar,
if not the same, backtrace.  
2. I notice there is only three minor difference between these two case:
testSyscallInterruptRestart set "restart" argument to 1, and it expects both
t.readEnter and t.readExit to be 2.
Comment 1 Wu Zhou 2006-08-08 04:15:42 UTC
In the frysk core log, we find that readExit is hit two times, but readEnter is
hit only one time.  Any hints about this kind of exception?  We are
investigating the source code right now.  Any quick hints are highly appreciated.

Here is the relevant FINE log:

{frysk.proc.LinuxHost$PollWaitOnSigChld@4d8d260,sig=Sig_CHLD} execute
frysk.sys.Wait pid 15940 status 0x857f
frysk.sys.Wait pid 0 errno 0
{TaskId,15940} syscallEvent
{frysk.proc.LinuxHost@4d90760,state=running} get TaskId
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=syscallRunning}
handleSyscalledEvent
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=syscallRunning} sendrecIsa
frysk.proc.IsaFactory@41652d0 getIsa
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=syscallRunning}
notifySyscallEnter 3
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
frysk.proc.TestTaskSyscallObserver$TestSyscallInterruptInternals$SyscallInterruptObserver@4fb4a30
updateSyscallEnter READ
+---------------------+
{frysk.proc.TestTaskSyscallObserver$TestSyscallInterruptInternals$PausedReadTimerEvent@4edce90,timeMillis=1155022869997,periodMillis=0}
new long
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=syscallRunning}
sendSyscallContinue
{frysk.proc.LinuxHost$PollWaitOnSigChld@4d8d260,sig=Sig_CHLD} execute
frysk.sys.Wait pid 15940 status 0x857f
frysk.sys.Wait pid 0 errno 0
{TaskId,15940} syscallEvent
{frysk.proc.LinuxHost@4d90760,state=running} get TaskId
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=runningInSyscall}
handleSyscalledEvent
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=runningInSyscall}
notifySyscallExit 3
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
frysk.proc.TestTaskSyscallObserver$TestSyscallInterruptInternals$SyscallInterruptObserver@4fb4a30
updateSyscallExit READ
+---------------------+
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=runningInSyscall}
sendSyscallContinue
{frysk.proc.LinuxHost$PollWaitOnSigChld@4d8d260,sig=Sig_CHLD} execute
frysk.sys.Wait pid 15940 status 0xa7f
frysk.sys.Wait pid 0 errno 0
{TaskId,15940} stopped
{frysk.proc.LinuxHost@4d90760,state=running} get TaskId
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=syscallRunning}
handleSignaledEvent
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=syscallRunning}
notifySignaled(int)
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=syscallRunning}
sendSyscallContinue
{frysk.proc.LinuxHost$PollWaitOnSigChld@4d8d260,sig=Sig_CHLD} execute
frysk.sys.Wait pid 15940 status 0x857f
frysk.sys.Wait pid 0 errno 0
{TaskId,15940} syscallEvent
{frysk.proc.LinuxHost@4d90760,state=running} get TaskId
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=syscallRunning}
handleSyscalledEvent
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=syscallRunning}
notifySyscallEnter 208
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=syscallRunning}
sendSyscallContinue
{frysk.proc.LinuxHost$PollWaitOnSigChld@4d8d260,sig=Sig_CHLD} execute
frysk.sys.Wait pid 15940 status 0x857f
frysk.sys.Wait pid 0 errno 0
{TaskId,15940} syscallEvent
{frysk.proc.LinuxHost@4d90760,state=running} get TaskId
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=runningInSyscall}
handleSyscalledEvent
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=runningInSyscall}
notifySyscallExit 208
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=runningInSyscall}
sendSyscallContinue
{frysk.proc.LinuxHost$PollWaitOnSigChld@4d8d260,sig=Sig_CHLD} execute
frysk.sys.Wait pid 15940 status 0x857f
frysk.sys.Wait pid 0 errno 0
{TaskId,15940} syscallEvent
{frysk.proc.LinuxHost@4d90760,state=running} get TaskId
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=syscallRunning}
handleSyscalledEvent
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=syscallRunning}
notifySyscallEnter 172
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=syscallRunning}
sendSyscallContinue
{frysk.proc.LinuxHost$PollWaitOnSigChld@4d8d260,sig=Sig_CHLD} execute
frysk.sys.Wait pid 15940 status 0x857f
frysk.sys.Wait pid 0 errno 0
{TaskId,15940} syscallEvent
{frysk.proc.LinuxHost@4d90760,state=running} get TaskId
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=runningInSyscall}
handleSyscalledEvent
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
{frysk.proc.LinuxTask@4efdbc0,pid=15940,tid=15940,state=runningInSyscall}
notifySyscallExit 3
Get GPR0 frysk.proc.IsaPPC64$PPC64Register@417d820
frysk.proc.TestTaskSyscallObserver$TestSyscallInterruptInternals$SyscallInterruptObserver@4fb4a30
updateSyscallExit READ
+---------------------+
testSyscallInterrupt(frysk.proc.TestTaskSyscallObserver) >>>>>>>>>>>>>>>> start
tearDown
testSyscallInterrupt(frysk.proc.TestTaskSyscallObserver) kill -KILL 15940
testSyscallInterrupt(frysk.proc.TestTaskSyscallObserver) kill -CONT 15940
testSyscallInterrupt(frysk.proc.TestTaskSyscallObserver) detach -KILL 15940
frysk.sys.Wait pid -1 errno 10
testSyscallInterrupt(frysk.proc.TestTaskSyscallObserver) >>>>>>>>>>>>>>>> end
tearDown
testSyscallInterrupt(frysk.proc.TestTaskSyscallObserver) --- addError ----
ERROR: java.lang.RuntimeException: Read exit before enter
testSyscallInterrupt(frysk.proc.TestTaskSyscallObserver) ---- endTest ----
Comment 2 Yao Qi 2006-08-28 03:58:19 UTC
On X86, the sequence of these syscalls in funit-syscallint is,
read
tkill
sig_return
read
gettid
colse
gettid
write
exit

But on PPC64, the sequence is,
read
tkill
rt_sigreturn [Enter]
read [exit]
gettid
colse
....
....

Here are two points we should check,

1 Why kernel (or other thing, gcc?)  produce rt_sigreturn on PPC64, while
produce sig_return on X86?
In linux-2.6.12/arch/ppc64/kernel/signal.c,
Only setup_rt_frame could be found.(All the signals are treated as RT signal?)

2 Where is rt_sigreturn [exit] and read[Enter]?
Comment 3 Wu Zhou 2006-11-03 03:12:24 UTC
We are re-visiting all the bugs we opened and owned.  Since Yao posted a patch
for this (http://sources.redhat.com/ml/frysk/2006-q4/msg00169.html), I am mark
this as waiting for feedback.  Please feedback if you can.  Thanks.