I'm trying to run the frysk command line tools (fstack in particular, but they all seem to fail similarly) on an x86_64 Fedora 8 VM (running on VMWare Server). (The same behaviour was observed on a Ubuntu 8.04 i386 box.) It works when run against a small number of processes (e.g., bash) but fails when run against almost anything else (e.g., gnome-terminal, gconf2d, etc.). I've tried the pre-built binaries from the 0.4 RPM, as well as building the binaries from the source in the head branch. The debug trace output is as follows: 00:00:00.018 27105.27105 frysk.util.CommandlineParser: checking for pid 00:00:00.022 27105.27105 frysk.sys.Wait: adding 29 (I/O possible) 00:00:00.022 27105.27105 [Thread[Thread-1,5,main]]: new 00:00:00.023 27105.27105 [{frysk.proc.live.LinuxPtraceHost@2f8b1875}]: Host 00:00:00.025 27105.27105 [Thread[Thread-1,5,main]]: add WaitBuilder <<frysk.proc.live.LinuxWaitBuilder@2f8b5e6d>> 00:00:00.025 27105.27105 [{frysk.proc.live.LinuxPtraceHost@2f8b1875}]: requestProc 2875 00:00:00.026 27105.27105 [Thread[Thread-1,5,main]]: add Event <<frysk.proc.live.LinuxPtraceHost$3@2f8c13f5>> 00:00:00.026 27105.27105 [Thread[Thread-1,5,main]]: wakeupIfBlocked 00:00:00.027 27105.27105 [Thread[Thread-1,5,main]]: run 00:00:00.027 27105.27105 [Thread[Thread-1,5,main]]: runEventLoop pendingOnly false 00:00:00.028 27105.27105 [Thread[Thread-1,5,main]]: remove ... 00:00:00.028 27105.27105 [Thread[Thread-1,5,main]]: remove ... return <<frysk.proc.live.LinuxPtraceHost$3@2f8c13f5>> 00:00:00.029 27105.27105 [Thread[Thread-1,5,main]]: runEventLoop executing <<frysk.proc.live.LinuxPtraceHost$3@2f8c13f5>> 00:00:00.029 27105.27105 [{frysk.proc.live.LinuxPtraceHost@2f8b1875}]: execute - requestProc <<2875>> 00:00:00.030 27105.27105 [frysk.sys.proc.Stat@2f7bd185]: scan <<2875>> 00:00:00.031 27105.27105 [/proc/2875/stat]: pid <<2875>> 00:00:00.031 27105.27105 [/proc/2875/stat]: state S 00:00:00.032 27105.27105 [/proc/2875/stat]: ppid <<1>> 00:00:00.032 27105.27105 [frysk.sys.proc.Stat@2f7bd04d]: scan <<1>> 00:00:00.033 27105.27105 [/proc/1/stat]: pid <<1>> 00:00:00.033 27105.27105 [/proc/1/stat]: state S 00:00:00.034 27105.27105 [/proc/1/stat]: ppid <<0>> 00:00:00.035 27105.27105 [{frysk.proc.live.LinuxPtraceProc@2f8c7455,pid=1,state=<null>}]: new - create unattached running proc 00:00:00.036 27105.27105 frysk.proc.live.LinuxPtraceProcState: initial false 00:00:00.036 27105.27105 [{frysk.proc.live.LinuxPtraceProc@2f8c7455,pid=1,state=detached}]: add(Proc) -- a child process 00:00:00.037 27105.27105 [{frysk.proc.live.LinuxPtraceProc@2f8c7395,pid=2875,state=<null>}]: new - create unattached running proc 00:00:00.038 27105.27105 frysk.proc.live.LinuxPtraceProcState: initial false 00:00:00.039 27105.27105 frysk.proc.Task: new -- create unattached 00:00:00.040 27105.27105 [Thread[Thread-1,5,main]]: requestStop 00:00:00.041 27105.27105 [Thread[Thread-1,5,main]]: wakeupIfBlocked 00:00:00.041 27105.27105 [Thread[Thread-1,5,main]]: remove ... 00:00:00.041 27105.27105 [frysk.util.ProcStopUtil$1@2f8543c5]: parse pids [<<{frysk.proc.live.LinuxPtraceProc@2f8c7395,pid=2875,state=detached}>>] 00:00:00.042 27105.27105 [frysk.proc.ProcBlockAction@2f8c5b55]: new 00:00:00.043 27105.27105 [Thread[Thread-1,5,main]]: add Event <<frysk.proc.ProcBlockAction$3@2f8de515>> 00:00:00.043 27105.27105 [Thread[Thread-1,5,main]]: wakeupIfBlocked 00:00:00.044 27105.27105 [Thread[Thread-1,5,main]]: run 00:00:00.044 27105.27105 [Thread[Thread-1,5,main]]: runEventLoop pendingOnly false 00:00:00.045 27105.27105 [Thread[Thread-1,5,main]]: remove ... 00:00:00.045 27105.27105 [Thread[Thread-1,5,main]]: remove ... return <<frysk.proc.ProcBlockAction$3@2f8de515>> 00:00:00.045 27105.27105 [Thread[Thread-1,5,main]]: runEventLoop executing <<frysk.proc.ProcBlockAction$3@2f8de515>> 00:00:00.046 27105.27105 [{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=detached}]: requestAddInstructionObserver 00:00:00.046 27105.27105 [{frysk.proc.live.LinuxPtraceProc@2f8c7395,pid=2875,state=detached}]: requestAddInstructionObserver 00:00:00.047 27105.27105 [Thread[Thread-1,5,main]]: add Event <<Observation[observable:frysk.proc.live.TaskObservable@2f8d0215observer:frysk.proc.ProcBlockActi on$ProcBlockTaskObserver@2f8de535]>> 00:00:00.047 27105.27105 [Thread[Thread-1,5,main]]: wakeupIfBlocked 00:00:00.048 27105.27105 [{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=detached}]: requestAddTerminatedObserver 00:00:00.048 27105.27105 [{frysk.proc.live.LinuxPtraceProc@2f8c7395,pid=2875,state=detached}]: requestAddObservation 00:00:00.048 27105.27105 [Thread[Thread-1,5,main]]: add Event <<Observation[observable:frysk.proc.live.TaskObservable@2f8d02a5observer:frysk.proc.ProcBlockActi on$ProcBlockTaskObserver@2f8de535]>> 00:00:00.049 27105.27105 [Thread[Thread-1,5,main]]: wakeupIfBlocked 00:00:00.049 27105.27105 [Thread[Thread-1,5,main]]: remove ... 00:00:00.050 27105.27105 [Thread[Thread-1,5,main]]: remove ... return <<Observation[observable:frysk.proc.live.TaskObservable@2f8d0215observer:frysk.proc.ProcBlockActi on$ProcBlockTaskObserver@2f8de535]>> 00:00:00.050 27105.27105 [Thread[Thread-1,5,main]]: runEventLoop executing <<Observation[observable:frysk.proc.live.TaskObservable@2f8d0215observer:frysk.proc.ProcBlockActi on$ProcBlockTaskObserver@2f8de535]>> 00:00:00.051 27105.27105 frysk.proc.live.LinuxPtraceProcState: handleAddObserver <<{frysk.proc.live.LinuxPtraceProc@2f8c7395,pid=2875,state=detached}>> 00:00:00.051 27105.27105 frysk.proc.live.LinuxPtraceProcState: state <<{frysk.proc.live.LinuxPtraceProc@2f8c7395,pid=2875,state=detached}>> 00:00:00.051 27105.27105 frysk.proc.live.LinuxPtraceTaskState: handleAttach <<{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=detached}>> 00:00:00.052 27105.27105 [{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=detached}]: sendAttach 00:00:00.052 27105.27105 frysk.sys.ptrace.Ptrace: attach <<2875>> 00:00:00.053 27105.27105 [frysk.sys.proc.Status@2f8e5155]: stan <<2875>> 00:00:00.053 27105.27105 frysk.sys.proc.Status: state 'T' 00:00:00.054 27105.27105 frysk.sys.proc.Status: stopped false 00:00:00.054 27105.27105 frysk.sys.proc.Status: uid 500 00:00:00.054 27105.27105 frysk.sys.proc.Status: gid 500 00:00:00.055 27105.27105 [Thread[Thread-1,5,main]]: remove ... 00:00:00.055 27105.27105 [Thread[Thread-1,5,main]]: remove ... return <<Observation[observable:frysk.proc.live.TaskObservable@2f8d02a5observer:frysk.proc.ProcBlockActi on$ProcBlockTaskObserver@2f8de535]>> 00:00:00.056 27105.27105 [Thread[Thread-1,5,main]]: runEventLoop executing <<Observation[observable:frysk.proc.live.TaskObservable@2f8d02a5observer:frysk.proc.ProcBlockActi on$ProcBlockTaskObserver@2f8de535]>> 00:00:00.056 27105.27105 frysk.proc.live.LinuxPtraceProcState: handleAddObservation <<{frysk.proc.live.LinuxPtraceProc@2f8c7395,pid=2875,state=Attaching.ToMainTask}>> 00:00:00.057 27105.27105 [Thread[Thread-1,5,main]]: remove ... 00:00:00.057 27105.27105 [Thread[Thread-1,5,main]]: block timeout 0xffffffffffffffff 00:00:00.057 27105.27105 frysk.sys.Wait: waitpid 2875 -> status 0x137f WIFSTOPPED 19 (Stopped (signal)) 00:00:00.058 27105.27105 frysk.sys.Wait: waitpid 0 -> errno 0 (Success) 00:00:00.058 27105.27105 [frysk.proc.live.LinuxWaitBuilder@2f8b5e6d]: stopped <<2875>> 00:00:00.059 27105.27105 [{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=attaching}]: stoppedEvent <<SIGSTOP(19)>> 00:00:00.059 27105.27105 [{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=attaching}]: initializeAttachedState 00:00:00.060 27105.27105 frysk.sys.ptrace.Ptrace: setOptions <<2875>> options 0x5a 00:00:00.061 27105.27105 [{frysk.proc.live.LinuxPtraceProc@2f8c7395,pid=2875,state=Attaching.ToMainTask}]: performTaskAttachCompleted 00:00:00.062 27105.27105 [Thread[Thread-1,5,main]]: add Event <<[ProcEvent,procnull]>> 00:00:00.062 27105.27105 [Thread[Thread-1,5,main]]: wakeupIfBlocked 00:00:00.063 27105.27105 [Thread[Thread-1,5,main]]: checkForTimerEvents 00:00:00.063 27105.27105 [Thread[Thread-1,5,main]]: remove ... 00:00:00.063 27105.27105 [Thread[Thread-1,5,main]]: remove ... return <<[ProcEvent,procnull]>> 00:00:00.063 27105.27105 [Thread[Thread-1,5,main]]: runEventLoop executing <<[ProcEvent,procnull]>> 00:00:00.063 27105.27105 frysk.proc.live.LinuxPtraceProcState: handleTaskAttachCompleted <<{frysk.proc.live.LinuxPtraceProc@2f8c7395,pid=2875,state=Attaching.ToMainTask}>> 00:00:00.064 27105.27105 frysk.proc.live.LinuxPtraceProcState: allAttached <<{frysk.proc.live.LinuxPtraceProc@2f8c7395,pid=2875,state=Attaching.ToMainTask}>> 00:00:00.064 27105.27105 frysk.proc.live.LinuxPtraceTaskState: handleAddObservation <<{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=Attached.WaitForContinueOr Unblock}>> 00:00:00.064 27105.27105 [Thread[Thread-1,5,main]]: add Event <<frysk.proc.ProcBlockAction$1@2f8ed25d>> 00:00:00.064 27105.27105 [Thread[Thread-1,5,main]]: wakeupIfBlocked 00:00:00.065 27105.27105 frysk.proc.live.LinuxPtraceTaskState: handleAddObservation <<{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=Attached.WaitForContinueOr Unblock}>> 00:00:00.065 27105.27105 frysk.proc.live.LinuxPtraceTaskState: handleContinue <<{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=Attached.WaitForContinueOr Unblock}>> 00:00:00.065 27105.27105 [frysk.proc.Proc$ObservableXXX@2f8cb0d5]: notify -- all observers <<{frysk.proc.live.LinuxPtraceProc@2f8c7395,pid=2875,state=Attaching.ToMainTask}>> 00:00:00.066 27105.27105 [Thread[Thread-1,5,main]]: remove ... 00:00:00.066 27105.27105 [Thread[Thread-1,5,main]]: remove ... return <<frysk.proc.ProcBlockAction$1@2f8ed25d>> 00:00:00.066 27105.27105 [Thread[Thread-1,5,main]]: runEventLoop executing <<frysk.proc.ProcBlockAction$1@2f8ed25d>> 00:00:00.066 27105.27105 [frysk.proc.ProcBlockAction@2f8c5b55]: this taskList <<[]>> proc.taskList <<[{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=Attached.WaitForUnblock,sig nal=SIGNONE(0)}]>> 00:00:00.067 27105.27105 [Thread[Thread-1,5,main]]: add Event <<frysk.util.ProcStopUtil$UtilEvent@2f8cff8d>> 00:00:00.067 27105.27105 [Thread[Thread-1,5,main]]: wakeupIfBlocked 00:00:00.067 27105.27105 [{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=Attached.WaitForUnblock,signal =SIGNONE(0)}]: requestDeleteInstructionObserver 00:00:00.067 27105.27105 [{frysk.proc.live.LinuxPtraceProc@2f8c7395,pid=2875,state=running}]: requestDeleteInstructionObserver 00:00:00.068 27105.27105 [Thread[Thread-1,5,main]]: add Event <<Observation[observable:frysk.proc.live.TaskObservable@2f8d0215observer:frysk.proc.ProcBlockActi on$ProcBlockTaskObserver@2f8de535]>> 00:00:00.068 27105.27105 [Thread[Thread-1,5,main]]: wakeupIfBlocked 00:00:00.068 27105.27105 [{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=Attached.WaitForUnblock,signal =SIGNONE(0)}]: requestDeleteTerminatedObserver 00:00:00.068 27105.27105 [Thread[Thread-1,5,main]]: add Event <<Observation[observable:frysk.proc.live.TaskObservable@2f8d02a5observer:frysk.proc.ProcBlockActi on$ProcBlockTaskObserver@2f8de535]>> 00:00:00.068 27105.27105 [Thread[Thread-1,5,main]]: wakeupIfBlocked 00:00:00.069 27105.27105 [Thread[Thread-1,5,main]]: remove ... 00:00:00.069 27105.27105 [Thread[Thread-1,5,main]]: remove ... return <<frysk.util.ProcStopUtil$UtilEvent@2f8cff8d>> 00:00:00.069 27105.27105 [Thread[Thread-1,5,main]]: runEventLoop executing <<frysk.util.ProcStopUtil$UtilEvent@2f8cff8d>> 00:00:00.069 27105.27105 frysk.util.ProcStopUtil: execute live <<{frysk.proc.live.LinuxPtraceProc@2f8c7395,pid=2875,state=running}>> 00:00:00.069 27105.27105 [frysk.bindir.fstack$StackerEvent@2f574fe5]: printTasks 00:00:00.079 27105.27105 [frysk.stack.LibunwindAddressSpace@2f8b966d]: getReg <<UnwindRegistersX8664_RIP>> register <<[Register: rip type: {{frysk.value.PointerType@2f921485,name=VOIDPTR64L_T,size=8}}]>> 00:00:00.081 27105.27105 [frysk.sys.ptrace.RegisterSet@2f9692d5]: transfer <<2875>> read ... 00:00:00.082 27105.27105 [frysk.stack.LibunwindAddressSpace@2f8b966d]: read value 0x35c8acbd2f 00:00:00.082 27105.27105 [frysk.stack.LibunwindAddressSpace@2f8b966d]: getReg <<UnwindRegistersX8664_RSP>> register <<[Register: rsp type: {{frysk.value.PointerType@2f921485,name=VOIDPTR64L_T,size=8}}]>> 00:00:00.082 27105.27105 [frysk.sys.ptrace.RegisterSet@2f9692d5]: transfer <<2875>> read ... 00:00:00.082 27105.27105 [frysk.stack.LibunwindAddressSpace@2f8b966d]: read value 0x7fff505d9cd0 00:00:00.083 27105.27105 frysk.stack.Frame: Getting task: <<{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=Attached.WaitForUnblock,sig nal=SIGNONE(0)}>> 00:00:00.084 27105.27105 frysk.stack.Frame: Getting task: <<{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=Attached.WaitForUnblock,sig nal=SIGNONE(0)}>> 00:00:00.084 27105.27105 frysk.dwfl.DwflCache: entering createDwfl, task <<{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=Attached.WaitForUnblock,sig nal=SIGNONE(0)}>> 00:00:00.085 27105.27105 frysk.dwfl.DwflCache: creating new dwfl for task <<{frysk.proc.live.LinuxPtraceTask@2f8d7455,pid=2875,tid=2875,state=Attached.WaitForUnblock,sig nal=SIGNONE(0)}>> 00:00:00.085 27105.27105 [frysk.util.ProcStopUtil$1@2f8543c5]: parse failed <<exception java.lang.NullPointerException>> Error: java.lang.NullPointerException
Since I'm doing other fixes to fstack, I'll look at this one.
I've dug into this. I hope you don't mind. This actually seems to be a bug in the readlink function rather than in frysk. The call to readlink in frysk::sys::proc::Exe::get returns a much higher byte count as a return value than the actual length of the path it fills in the buffer. The buffer also seems to be partly filled with garbage: (gdb) p file $9 = "/proc/14211/exe", '\0' <repeats 258 times>, "?\224??*\000\000\000qV_?\177\000\000\000? \224??*\000\000U\204??5", '\0' <repeats 1716 times>, "?\224??*\000\000? \000\000\000\000\000\000\000*\000\000\000\000\000\000\000?\204??5", '\0' <repeats 1723 times>, "8M??5\000\000\000\b\000\000\000\000\000\000\000\000K??5\000\000\000@U?? 5\000\000\000?C??5", '\0' <repeats 11 times>, "8M?? 5\000\000\000\b\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000?\230?? 5\000\000\000?\006@?5\000\000\000@\205?\000\000\000\000\000@\210?\000"... (gdb) p len $10 = 48 (gdb) p link $11 = "/usr/libexec/gconfd-2\000\000attachment-remi (deleted)", '\0' <repeats 849 times>, "?\224? ?*\000\000\000qV_?\177\000\000\000?\224??*\000\000U\204??5", '\0' <repeats 1716 times>, "? \224??*\000\000\000qV_?\177\000\000\000?\224??*\000\000U\204??5", '\0' <repeats 1438 times> It looks like readlink fills in garbage (?) after the first NULL, and the byte count it returns includes this garbage. This confuses frysk, causing the NPE because we a null is returned from LinuxPtraceProc.getExeFile, at line 238. I was able to reproduce this problem in a sample program outside of frysk. I have glibc 2.7-2, which appears to be the latest Fedora 8 package. Also, it appears I was mistaken in saying that this problem exists on Ubuntu. I tried fstack again today on Ubuntu 8.04 (also glibc 2.7), and this did not reproduce.
Just for extra information, in Fedora 8, x86-64 I cannot reproduce on gnome-termnal. Could be a Fedora 8 specific issue? Without debug-info installed: [pmuldoon@localhost frysk_bin]$ ps -ax | grep gnome-terminal Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.7/FAQ 3686 ? Sl 0:00 gnome-terminal 3820 pts/0 R+ 0:00 grep gnome-terminal [pmuldoon@localhost frysk_bin]$ ./frysk-core/frysk/bindir/fstack 3686 Task #3686 #0 0x000000332f4da926 in __poll () #1 0x000000329603ab18 in [unknown] #2 0x000000329603b1ad in g_main_loop_run () #3 0x00000035edb83a98 in gtk_main () #4 0x000000000041959e in [unknown] #5 0x000000332f41e32a in __libc_start_main () #6 0x000000000040ef39 in [unknown] Task #3693 #0 0x000000333000db8b in __read () #1 0x000000329603852b in [unknown] #2 0x0000003296060434 in [unknown] #3 0x000000333000729a in start_thread () #4 0x000000332f4e42cd in __clone () With debug-info installed. [pmuldoon@localhost frysk_bin]$ ./frysk-core/frysk/bindir/fstack 3686 Task #3686 #0 0x000000332f4da926 in __poll () #1 0x000000329603ab18 in g_main_context_iterate () #2 0x000000329603b1ad in g_main_loop_run () #3 0x00000035edb83a98 in gtk_main () #4 0x000000000041959e in [unknown] #5 0x000000332f41e32a in __libc_start_main () #6 0x000000000040ef39 in [unknown] Task #3693 #0 0x000000333000db8b in __read () #1 0x000000329603852b in child_watch_helper_thread () #2 0x0000003296060434 in g_thread_create_proxy () #3 0x000000333000729a in start_thread () #4 0x000000332f4e42cd in __clone ()
(In reply to comment #2) > I've dug into this. I hope you don't mind. not at all! > This actually seems to be a bug in the readlink function rather than in frysk. The call to readlink in frysk::sys::proc::Exe::get returns a much higher byte count as a return value than the actual length of > the path it fills in the buffer. The buffer also seems to be partly filled with garbage: > > (gdb) p file > $9 = "/proc/14211/exe", '\0' <repeats 258 times>, "?\224??*\000\000\000qV_?\177\000\000\000? > \224??*\000\000U\204??5", '\0' <repeats 1716 times>, "?\224??*\000\000? > \000\000\000\000\000\000\000*\000\000\000\000\000\000\000?\204??5", '\0' <repeats 1723 > times>, "8M??5\000\000\000\b\000\000\000\000\000\000\000\000K??5\000\000\000@U?? > 5\000\000\000?C??5", '\0' <repeats 11 times>, "8M?? > 5\000\000\000\b\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000?\230?? > 5\000\000\000?\006@?5\000\000\000@\205?\000\000\000\000\000@\210?\000"... ah! this is sounding like the kernel; you could try "fexe" on a process showing the problem or just ls -l /proc/PID/exe? If the last one prints garbage then I think it is the kernel and we should push it to fedora. For a workaround, I guess we could sanity check using strlen and then log the problem.
Here's a repo: cagney@rat$ cp /bin/sleep /tmp cagney@rat$ /tmp/sleep 1000 & [1] 28790 cagney@rat$ cagney@rat$ cagney@rat$ fexe 28790 /tmp/sleep cagney@rat$ cagney@rat$ rm /tmp/sleep cagney@rat$ fexe 28790 Error: java.lang.NullPointerException cagney@rat$ cagney@rat$ ls -l /proc/28790/exe lrwxrwxrwx 1 cagney cagney 0 2008-06-12 08:51 /proc/28790/exe -> /tmp/sleep (deleted) Hmm, I suspect this was an "it was working"; we've a relatively straight forward test case implement.
This prints an error explainng what's gone wrong, rather than NPEing. commit e7fe8b0edbf5bc538a3e7fbcd37997bc800d27a5 Author: Andrew Cagney <cagney@toil.yyz.redhat.com> Date: Thu Jun 12 16:31:00 2008 -0400 When /proc/PID/exe is invalid throw an exception. frysk-core/frysk/bindir/ChangeLog 2008-06-12 Andrew Cagney <cagney@redhat.com> * TestFexe.java (testExeOfDeletedFile()): Enable. Match expected error. * fexe.java: Use Exe.getName. frysk-core/frysk/proc/live/ChangeLog 2008-06-12 Andrew Cagney <cagney@redhat.com> * LinuxPtraceProc.java (getExe()): Delete. frysk-sys/frysk/sys/ChangeLog 2008-06-12 Andrew Cagney <cagney@redhat.com> * cni/Errno.cxx (throwUserException): New. * cni/Errno.hxx (throwUserException): New. frysk-sys/frysk/sys/proc/ChangeLog 2008-06-12 Andrew Cagney <cagney@redhat.com> * Exe.java (getName): Rename get. (openReadOnly): New. * jni/Exe.cxx (Exe::getName): Update. * cni/Exe.cxx (Exe::getName): Update. frysk-sys/jnixx/ChangeLog 2008-06-12 Andrew Cagney <cagney@redhat.com> * exceptions.cxx (userException): Implement. * exceptions.hxx (userException): Declare. cagney@toil$ cp /bin/sleep . cagney@toil$ ./sleep 1000 & [2] 14879 cagney@toil$ ./frysk/bindir/fexe $! /notnfs/cagney/git/native/frysk-core/sleep cagney@toil$ rm sleep cagney@toil$ ./frysk/bindir/fexe $! Error: The link /proc/14879/exe points to the deleted file /notnfs/cagney/git/native/frysk-core/sleep
In my case, the binary was not deleted. If you look at the gdb output, you will see that the buffer contains the valid path, followed by a NULL, followed by garbage which happens to include "(deleted)". The byte count went to the end of "(deleted)". Of course, printing out the buffer will result in the right output, due to the NULL at the end of the correct path. There's definitely procfs or glibc weirdness here, but it has nothing to do with frysk.
It appears that as of commit 531115f2541df6ed1e8a8f16e24c519fb858b209, Errno.h includes frysk/UserException.h, but the latter does not seem to have been checked in, causing a build failure.
(In reply to comment #7) > In my case, the binary was not deleted. If you look at the gdb output, you will see that the buffer contains > the valid path, followed by a NULL, followed by garbage which happens to include "(deleted)". Ah, yes, changing the code, I found an old comment (now living in frysk/sys/proc/{cni,jni}/Exe.cxx: if ((int) strlen(link) != len) { // Assume that an EXE that has somehow ended up with an embedded // NUL character is invalid. This happens when the kernel screws // up "mv a-really-long-file $exe" leaving the updated EXE string // with something like "$exe<NUL>ally-long-file (deleted)". userException(env, "The link %s is corrupt", file); } here's the actual sequence, I think I might tweak fexe a little so it prints this out: cagney@rat$ /tmp/sleep 1000 & [1] 31378 cagney@rat$ cagney@rat$ mv /tmp/a-really-long-file /tmp/sleep cagney@rat$ ls -l /proc/31378/exe lrwxrwxrwx 1 cagney cagney 0 2008-06-13 14:21 /proc/31378/exe -> /tmp/sleep cagney@rat$ (the string actually contains "/tmp/sleep<NUL>.... (deleted)"), and as you can see the exe, while looking ok, is pointing to the wrong file. Pre-link, and possibly update, can do this. (i've re-opened the bug) Can you check for this. I'm also going to tweak fexe so its easier to see this.
Ah, this makes sense and explains what I've been seeing. I originally built frysk on Ubuntu. Prior to doing that, I updated quite a few packages and did not reboot or restart X. When I tried fstack, I would get the NPE. I then decided to try frysk on Fedora but wasn't able to get to it for several weeks. When I had some free time, I installed a Fedora 8 VM and proceeded to update all my packages (which was a fun project in itself...). I then built frysk and saw the same NPE. However, in those two weeks, I had rebooted my Ubuntu machine, and so when I went back to try frysk on it again, I could not reproduce the problem. Tricky little problem.