Bug 2354 - Adding a syscall observer remotely on rhel, and invoking a syscall causes crash
Alias: None
Product: frysk
Classification: Unclassified
Component: general
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Andrew Cagney
Depends on:
Blocks: 2357 2358
Reported: 2006-02-18 21:53 UTC by Sami Wagiaalla
Modified: 2006-02-19 17:47 UTC
Description Sami Wagiaalla 2006-02-18 21:53:42 UTC
I have seen this on tower.toronto.redhat.com. Steps:
I run frysk,
ssh in from another terminal,
get the pid of that terminal,
using frysk, I add a syscall observer to it,
I type ls in the terminal,
and boom:

#0  0x0062a7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x003497d5 in raise () from /lib/tls/libc.so.6
#2  0x0034b149 in abort () from /lib/tls/libc.so.6
#3  0x02702d49 in _Jv_Throw (value=0x65c5630)
   at ../../../libjava/exception.cc:113
#4  0x080f9e44 in throwRuntimeException (message=Could not find the frame base
for "throwRuntimeException(char const*, char const*, int)".
   at frysk/sys/cni/Errno.cxx:143
#5  0x080f8238 in handler (signum=17) at frysk/sys/cni/Poll.cxx:73
#6  <signal handler called>
#7  0x0062a7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#8  0x003df9e4 in poll () from /lib/tls/libc.so.6
#9  0x0053e7d4 in g_main_context_check () from /usr/lib/frysk/libglib-2.0.so.0
#10 0x0053edbf in g_main_loop_run () from /usr/lib/frysk/libglib-2.0.so.0
#11 0x00e93bff in link_thread_io_context () from /usr/lib/libORBit-2.so.0
#12 0x0059a1d4 in ?? () from /usr/lib/frysk/libglib-2.0.so.0
#13 0xb73b03a8 in ?? ()
#14 0x0055689c in g_static_private_free () from /usr/lib/frysk/libglib-2.0.so.0
Comment 1 Andrew Cagney 2006-02-19 17:08:41 UTC
This code, from frysk/sys/cni/Poll.cxx, is attempting to throw an exception:

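static void
handler (int signum)
{
  // No saved jump buffer: try to report it as a Java exception.
  if (!poll_jmpbuf.p)
    throwRuntimeException ("frysk.sys.Poll: bad jmpbuf", "tid",
                           frysk::sys::Tid::get ());
  // Jump back to wherever poll_jmpbuf.buf was saved, passing the signal.
  siglongjmp (poll_jmpbuf.buf, signum);
}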

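Since the signal occurred while the thread was in C code (the poll () call under glib in the backtrace), the exception can't be thrown through those frames, and hence the abort.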

Two issues turn up here:

- Why did the wrong thread get the signal?  SIGCHLD was meant to have been masked
for all but the event-loop thread.

- A trace shows stuff other than the event-loop meddling with SIGCHLD signals. 
That shouldn't be occurring (except possibly during startup before the event loop
is activated).
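
For reference, the masking arrangement the first point describes could look roughly like the sketch below. It is only an illustration and not frysk's code: the helper names (set_sigchld_blocked, sigchld_is_blocked, worker, event_loop) are made up here. It relies on new threads inheriting their creator's signal mask, so SIGCHLD is blocked before any thread is started and then unblocked only in the event-loop thread.

```cpp
#include <pthread.h>
#include <signal.h>

// Hypothetical helper: block or unblock SIGCHLD for the calling thread only.
static bool set_sigchld_blocked(bool blocked) {
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, SIGCHLD);
  return pthread_sigmask(blocked ? SIG_BLOCK : SIG_UNBLOCK, &set, nullptr) == 0;
}

// Hypothetical helper: query whether SIGCHLD is blocked in the calling thread.
static bool sigchld_is_blocked() {
  sigset_t current;
  pthread_sigmask(SIG_SETMASK, nullptr, &current);  // null set: query only
  return sigismember(&current, SIGCHLD) == 1;
}

// An ordinary thread: it inherits the creator's mask and leaves it alone.
static void *worker(void *out) {
  *static_cast<bool *>(out) = sigchld_is_blocked();
  return nullptr;
}

// The event-loop thread: the only one that unblocks SIGCHLD.
static void *event_loop(void *out) {
  set_sigchld_blocked(false);
  *static_cast<bool *>(out) = sigchld_is_blocked();
  return nullptr;
}
```

With this arrangement, a thread that later unblocks SIGCHLD on its own, or a thread created before the initial block, would still be able to receive the signal, which is one way the symptoms above could arise.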

This change avoids the symptoms.

Index: frysk-sys/frysk/sys/ChangeLog
2006-02-19  Andrew Cagney  <cagney@redhat.com>

        * cni/Poll.cxx (handler): If the signal is to the wrong thread,
        re-send it to the correct one.
        (poll_jmpbuf): Replace .p with .tid.
        (poll): Set the .tid.
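
A minimal sketch of the idea in that ChangeLog entry, not the committed patch: the handler checks which thread it is running on and, if it is the wrong one, re-sends the signal to the polling thread instead of throwing. The assumptions here are a pthread_t in place of the .tid field, pthread_kill as the re-send mechanism, an added armed flag, and SIGUSR1 in a small demo (wrong_thread, demo_forwarding) so it can be run without child processes.

```cpp
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>

// Stand-in for poll_jmpbuf in Poll.cxx. The field names are assumptions.
static struct {
  pthread_t thread;             // thread that is waiting for the signal
  volatile sig_atomic_t armed;  // non-zero while buf is valid
  sigjmp_buf buf;
} poll_jmpbuf;

static void handler(int signum) {
  if (!poll_jmpbuf.armed)
    return;  // nobody is waiting; never throw from a signal handler
  if (!pthread_equal(pthread_self(), poll_jmpbuf.thread)) {
    // Wrong thread: re-send the signal to the waiting thread.
    pthread_kill(poll_jmpbuf.thread, signum);
    return;
  }
  siglongjmp(poll_jmpbuf.buf, signum);
}

// Demo helper: a thread that receives SIGUSR1 first (the "wrong" thread).
static void *wrong_thread(void *) {
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, SIGUSR1);
  pthread_sigmask(SIG_UNBLOCK, &set, nullptr);
  pthread_kill(pthread_self(), SIGUSR1);  // handler runs here and forwards
  return nullptr;
}

// Demo: returns the signal number that reached the waiting thread.
static int demo_forwarding() {
  struct sigaction sa = {};
  sa.sa_handler = handler;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGUSR1, &sa, nullptr);

  poll_jmpbuf.thread = pthread_self();
  int got = sigsetjmp(poll_jmpbuf.buf, 1);  // 1: also save the signal mask
  if (got != 0) {
    poll_jmpbuf.armed = 0;
    return got;
  }
  poll_jmpbuf.armed = 1;

  // Keep SIGUSR1 pending in this thread until sigsuspend () opens the window.
  sigset_t block, old;
  sigemptyset(&block);
  sigaddset(&block, SIGUSR1);
  pthread_sigmask(SIG_BLOCK, &block, &old);

  pthread_t t;
  pthread_create(&t, nullptr, wrong_thread, nullptr);
  pthread_detach(t);

  sigdelset(&old, SIGUSR1);
  for (;;)
    sigsuspend(&old);  // handler runs here and jumps back to sigsetjmp
}
```

Calling demo_forwarding () from the thread that should own the signal returns SIGUSR1 even though the signal was first raised on the helper thread.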

I'll create separate bugs for the above two issues.
Comment 2 Andrew Cagney 2006-02-19 17:47:34 UTC
The specific bug is fixed; two related bugs have been created.