backtrace problem in signal-handler

Tianwei tianwei.sheng@gmail.com
Thu Jan 14 14:23:00 GMT 2010


Hi all,
   I am not sure whether this is the right list for this problem, but I
hope you can give me some suggestions.

I have written a PMU profiling tool that monitors programs such as httpd
through the LD_PRELOAD mechanism.  The tool is self-monitoring, which
means that it reads the PMU information inside the monitored program's
own signal handler.
   I now also want to capture a backtrace in the signal handler, so I
wrote code like this:
static void signal_handler(int sig, siginfo_t *info, void *context) {
  void *stack_trace[10];
  int size = backtrace(stack_trace, 10);  /* number of frames captured */
...........
}

I use SIGIO, and signal_handler is called very frequently, driven by
the hardware PMU interrupt. The problem is that I sometimes get a
SIGSEGV inside the backtrace() function:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffeb0e18910 (LWP 29023)]
(gdb) bt
#0  0x00007ffff639cc27 in ?? () from /lib/libgcc_s.so.1
#1  0x00007ffff639d48b in _Unwind_Backtrace () from /lib/libgcc_s.so.1
#2  0x00007ffff6c4d73e in backtrace () from /lib/libc.so.6
#3  0x00007ffff7bd8baa in signal_handler (sig=29, info=0x7ffe2f0708b0,
context=0x7ffe2f070780) at racez_pmuprofiler.cc:337
#4  <signal handler called>
#5  0x00007ffff6bebbd4 in ?? () from /lib/libc.so.6
#6  0x00000000419fa618 in ?? ()
#7  0x00007ffe2f070ca0 in ?? ()
#8  0x000000000077a2d0 in ?? ()
#9  0x00000000004c6c8d in explode_time (xt=0x1, t=0, offset=0,
use_localtime=0) at time/unix/time.c:92
#10 0x00007ffecc039de8 in ?? ()
#11 0x00000000006f5c60 in ?? ()
#12 0x0000000000000000 in ?? ()

I have no clue what is happening, so I switched to libunwind to get
the backtrace:

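int get_backtrace (void** buffer, int n) {
  unw_cursor_t cursor;
  unw_context_t uc;
  unw_word_t ip, sp;
  int i = 0;
  /* Capture the current context; the walk starts in this function's frame. */
  if (unw_getcontext(&uc) != 0 || unw_init_local(&cursor, &uc) != 0)
    return 0;
  /* Test i < n first so that we never step further than we can store. */
  while (i < n && unw_step(&cursor) > 0) {
    /* Stop if either register cannot be read for this frame. */
    if (unw_get_reg(&cursor, UNW_REG_IP, &ip) < 0 ||
        unw_get_reg(&cursor, UNW_REG_SP, &sp) < 0)
      break;
    buffer[i] = (void*)ip;
    i++;
    //printf ("ip = %lx, sp = %lx\n", (long) ip, (long) sp);
  }
  return i;
}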

but I ran into a similar problem:
Program received signal SIGSEGV, Segmentation fault.
(gdb) bt
  #0  access_mem (as=0x7ffff65991c0, addr=8, val=0x7ffefbe2e678,
write=0, arg=0x7ffefbe2ed90) at x86_64/Ginit.c:164
#1  0x00007ffff638cfd0 in dwarf_get (c=<value optimized out>,
rs=0x7ffff659fd48) at ../include/tdep-x86_64/libunwind_i.h:137
#2  apply_reg_state (c=<value optimized out>, rs=0x7ffff659fd48) at
dwarf/Gparser.c:766
#3  0x00007ffff638eccb in _Ux86_64_dwarf_find_save_locs
(c=0x7ffefbe2ed90) at dwarf/Gparser.c:849
#4  0x00007ffff638fb09 in _Ux86_64_dwarf_step (c=0x7ffff65991c0) at
dwarf/Gstep.c:35
#5  0x00007ffff6392c1a in _Ux86_64_step (cursor=0x7ffff65991c0) at
x86_64/Gstep.c:42
#6  0x00007ffff7bd8bc6 in get_backtrace (buffer=0x7ffefbe2f6f0, n=10)
at racez_pmuprofiler.cc:297
#7  0x00007ffff7bd8c9f in signal_handler (sig=29, info=0x7ffefbe2f8b0,
context=0x7ffefbe2f780) at racez_pmuprofiler.cc:329
#8  <signal handler called>
#9  0x00007ffff6bec13e in ?? () from /lib/libc.so.6
#10 0x00000000004c6c8d in explode_time (xt=Cannot access memory at
address 0xffffffffffffffa8
) at time/unix/time.c:92
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Line 92 of time/unix/time.c is a call to "gmtime_r(&tt, &tm)". It seems
that the program is interrupted by the PMU inside this glibc function
and enters the signal handler; when the signal handler then tries to
walk the stack to get the backtrace, it hits the segmentation fault.
My current guess is that the stack is already corrupted before the
signal handler is entered, since even gdb cannot produce the full
backtrace, as indicated by "Backtrace stopped: previous frame inner to
this frame (corrupt stack?)". But I do not know why this fails. I read
that both backtrace and libunwind are async-signal-safe, so they
should be safe to use here.
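
Independently of whether the stack can be walked, the third argument of an SA_SIGINFO handler already carries the register state of the interrupted code, so at least the interrupted program counter can be recorded without any unwinding. The following is a rough sketch under stated assumptions: Linux with glibc on x86_64 (as the traces above suggest) or aarch64. The names pc_from_context, record_pc_handler, install_pc_sampler and last_interrupted_pc are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes REG_RIP in glibc's <ucontext.h> */
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <ucontext.h>

#if defined(__x86_64__) && !defined(REG_RIP)
#define REG_RIP 16    /* assumption: glibc's gregs index of RIP on x86_64 */
#endif

/* Hypothetical storage for the most recent interrupted PC. A real
   profiler would write into its own sample buffer instead. */
static volatile uintptr_t interrupted_pc = 0;

/* Reads the interrupted program counter from the handler's context
   argument. Returns 0 on architectures this sketch does not cover. */
static uintptr_t pc_from_context(void *context) {
    ucontext_t *uc = (ucontext_t *)context;
#if defined(__x86_64__)
    return (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__aarch64__)
    return (uintptr_t)uc->uc_mcontext.pc;
#else
    (void)uc;
    return 0;
#endif
}

static void record_pc_handler(int sig, siginfo_t *info, void *context) {
    (void)sig; (void)info;
    /* No stack walk here: only a read from the saved register state. */
    interrupted_pc = pc_from_context(context);
}

/* Installs the handler for signo; returns 0 on success, -1 on failure. */
int install_pc_sampler(int signo) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = record_pc_handler;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

uintptr_t last_interrupted_pc(void) {
    return interrupted_pc;
}
```

This yields only the innermost address, not the callers, so it is a fallback for samples whose stack cannot be walked rather than a replacement for the backtrace.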

I would really appreciate it if someone could give me some suggestions
for this problem.

Thanks so much.
Tianwei
--
Sheng, Tianwei
Inst. of High Performance Computing
Dept. of Computer Sci. & Tech.
Tsinghua Univ.



More information about the Libc-help mailing list