This is the mail archive of the frysk@sources.redhat.com mailing list for the frysk project.



Re: remote unwinding of libunwind


On Tue, 2006-09-19 at 19:13 +0800, Wu Zhou wrote:
> Noticing that quite a lot of stack-unwinding code has been checked into CVS, I spared some time to play around with it.
>    The test results seem quite satisfactory.  It can now get the function names in a
> dynamically loaded library and extract the source and line information if available.  It has also
> started to support multi-thread unwinding.
> 
> But I also noticed some small problems.  The first one showed up while I was playing with Kyle's code. It
> can step / unwind both threads now, but the unwinder seems to swallow some frames for its own
> consumption.  :-)  In the unwind session below, you will notice that four frames are reported
> for each thread.  In fact, there are six frames in each.  You can see this from the pstack
> output.
> 
> $ ./unwinddebug
> Enter the PID of the main therad: 8297
> Assuming second thread is pid 8298
> Tracing main thread!
> Frames of pid 8297:
> 
> found frame 0
> 0000000000bfb402                                  (sp=00000000bfe87ba4)
> found frame 1
> 0000000008048893 main+0x10e                       (sp=00000000bfe87d70)
> found frame 2
> 0000000000c2e724 __libc_start_main+0xdc           (sp=00000000bfe87dd0)
> found frame 3
> 0000000008048521 _start+0x21                      (sp=00000000bfe87e40)
> 
> Trace Depth = 4
> 
> Tracing second thread!
> Frames of pid 8298:
> 
> found frame 0
> 0000000000bfb402 +0x21                            (sp=00000000b7eef264)
> found frame 1
> 00000000080486b6 thread1+0x77                     (sp=00000000b7eef430)
> found frame 2
> 0000000000db440b start_thread+0xa9                (sp=00000000b7eef460)
> found frame 3
> 0000000000ce1b7e __clone+0x5e                     (sp=00000000b7eef4d0)
> 
> Trace Depth = 4
> 
> $ pstack 8297
> Thread 2 (Thread -1209074784 (LWP 8298)):
> #0  0x00bfb402 in __kernel_vsyscall ()
> #1  0x00ca3f16 in __nanosleep_nocancel () from /lib/libc.so.6
> #2  0x00ca3d3b in sleep () from /lib/libc.so.6
> #3  0x080486b6 in thread1 ()
> #4  0x00db440b in start_thread () from /lib/libpthread.so.0
> #5  0x00ce1b7e in clone () from /lib/libc.so.6
> Thread 1 (Thread -1209071296 (LWP 8297)):
> #0  0x00bfb402 in __kernel_vsyscall ()
> #1  0x00ca3f16 in __nanosleep_nocancel () from /lib/libc.so.6
> #2  0x00ca3d3b in sleep () from /lib/libc.so.6
> #3  0x08048893 in main ()
> 
> 
> The second one showed up while I was playing with Tromey's fdtrace:
> 
> # ./frysk/bindir/fdtrace /home/woodzltc/fdtrace/Closer2
> bad close() call at:
> val = 0; in function: null (<Unknown file> at line 0)
> val = 134513583; in function: doit2 (/home/woodzltc/AboutFrame/libunwind/fdtrace/Closer2.c at line 9)
> val = 134513607; in function: main (/home/woodzltc/AboutFrame/libunwind/fdtrace/Closer2.c at line 13)
> val = 12773156; in function: __libc_start_main (Unknown file at line 0)
> val = 134513409; in function: _start (Unknown file at line 0)
> bad close() call at:
> val = 0; in function: null (<Unknown file> at line 0)
> val = 134513583; in function: doit2 (/home/woodzltc/AboutFrame/libunwind/fdtrace/Closer2.c at line 9)
> val = 134513607; in function: main (/home/woodzltc/AboutFrame/libunwind/fdtrace/Closer2.c at line 13)
> val = 12773156; in function: __libc_start_main (Unknown file at line 0)
> val = 134513409; in function: _start (Unknown file at line 0)
> 
> The address of the first frame seems to be 0, and "doit()" and "close()" were swallowed as well.
> 
> Has anyone noticed these problems before?  Is there any work under way to improve this?
> 

Yup, I noticed it yesterday as well. However, Alex still has some
pending patches to go into libunwind. When those get in we'll take a
closer look at this... need to fix one problem at a time.
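
For anyone who wants to check the swallowed frames mechanically, here is a
minimal sketch in Python. It assumes the two output formats look exactly like
the unwinddebug and pstack listings quoted above; the function names
(unwinder_pcs, pstack_frames, missing_frames) are made up for illustration and
are not part of frysk or libunwind. It reports which pstack frames have no
matching program counter in the unwinder output.

```python
import re

# unwinddebug frame line, e.g. "0000000008048893 main+0x10e   (sp=00000000bfe87d70)"
UNWIND_RE = re.compile(r"^([0-9a-fA-F]{16})\s")
# pstack frame line, e.g. "#3  0x08048893 in main ()"
PSTACK_RE = re.compile(r"^#\d+\s+0x([0-9a-fA-F]+)\s+in\s+(\S+)")


def unwinder_pcs(text):
    """Return the set of program counters printed by the unwinder."""
    pcs = set()
    for line in text.splitlines():
        match = UNWIND_RE.match(line.strip())
        if match:
            pcs.add(int(match.group(1), 16))
    return pcs


def pstack_frames(text):
    """Return (pc, function name) pairs in the order pstack printed them."""
    frames = []
    for line in text.splitlines():
        match = PSTACK_RE.match(line.strip())
        if match:
            frames.append((int(match.group(1), 16), match.group(2)))
    return frames


def missing_frames(unwind_text, pstack_text):
    """Names of pstack frames whose pc never shows up in the unwinder output."""
    seen = unwinder_pcs(unwind_text)
    return [name for pc, name in pstack_frames(pstack_text) if pc not in seen]


# Main-thread data copied from the session quoted above.
UNWIND_MAIN = """
found frame 0
0000000000bfb402                                  (sp=00000000bfe87ba4)
found frame 1
0000000008048893 main+0x10e                       (sp=00000000bfe87d70)
found frame 2
0000000000c2e724 __libc_start_main+0xdc           (sp=00000000bfe87dd0)
found frame 3
0000000008048521 _start+0x21                      (sp=00000000bfe87e40)
"""

PSTACK_MAIN = """
#0  0x00bfb402 in __kernel_vsyscall ()
#1  0x00ca3f16 in __nanosleep_nocancel () from /lib/libc.so.6
#2  0x00ca3d3b in sleep () from /lib/libc.so.6
#3  0x08048893 in main ()
"""

if __name__ == "__main__":
    print(missing_frames(UNWIND_MAIN, PSTACK_MAIN))
```

On the main-thread data this prints ['__nanosleep_nocancel', 'sleep'], i.e. the
two libc frames between __kernel_vsyscall and main are the ones that go missing.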

> 
> BTW, I also observed that libunwind has only two test cases for remote unwinding.  That
> is far from enough, IMO.  Stack unwinding has quite a few different scenarios, especially in remote
> unwinding.  We have no way to be sure how it works in these scenarios if we have not tested them.
> So I expect there are other problems that we have not noticed yet.
> 
> My two cents: we need to write many more test cases to evaluate how libunwind works in various
> scenarios: single thread and multiple threads, normal operation and abnormal operation (signal frame,
> exception handler, or non-local jump)... It would be better if we could also extract the backtrace
> information from a core dump.

You're absolutely right - the testcases are lacking. I'll beef this up
when I get some time this week!
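
As a rough way to size that work, the scenarios listed in the quoted paragraph
can be crossed into a test matrix. This is only a sketch: the axis names and
labels below are shorthand for what the quoted text mentions, not existing
libunwind or frysk test names, and treating "live process vs. core dump" as a
third axis is an assumption.

```python
from itertools import product

# Axes taken from the scenarios named in the quoted paragraph.
THREADING = ["single-thread", "multi-thread"]
STACK_SHAPE = ["normal", "signal-frame", "exception-handler", "non-local-jump"]
TARGET = ["live-process", "core-dump"]


def scenario_matrix():
    """Return one label per combination of threading, stack shape and target."""
    return [
        "/".join(combo) for combo in product(THREADING, STACK_SHAPE, TARGET)
    ]


if __name__ == "__main__":
    for label in scenario_matrix():
        print(label)
```

That gives 2 x 4 x 2 = 16 combinations, against the two remote-unwinding test
cases mentioned above.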

- Mike

