This is the mail archive of the gdb@sourceware.cygnus.com mailing list for the GDB project. See the GDB home page for more information.



No Subject


This problem has already been reported to the bugs list:

    http://www.cygnus.com/ml/gdb-bugs/1999-Jan/0001.html

But I would like to add some clarifying information.  I'd be willing to help
find and fix this problem, because I like gdb very much and want to use it
for my IRIX development.  However, I've already taken my investigation about
as far as I can by myself, so I need some help.

The problem occurs with the following software versions:

    gdb 4.17, as released last year
    gdb 4.17.85 (4.18 pre-release candidate), recently released
    IRIX 6.5

My gdb startup report is as follows:

	foreigner [248]--> ./gdb ./hello
	GNU gdb 4.17.85
	Copyright 1998 Free Software Foundation, Inc.
	GDB is free software, covered by the GNU General Public License, and you are
	welcome to change it and/or distribute copies of it under certain conditions.
	Type "show copying" to see the conditions.
	There is absolutely no warranty for GDB.  Type "show warranty" for details.
	This GDB was configured as "mips-sgi-irix6.5"...
	(gdb)  

The original bug report mentioned code compiled with gcc; I have the same
problem with MIPSpro C/C++.  I suspect that the compiler doesn't matter.

The problem is as reported earlier.  To summarize: it occurs when trying to
debug any program that has the POSIX threads library (-lpthread) linked in.
Gdb reports an unknown signal and then hangs:

	(gdb) run
	Starting program: /home/jct/zetetic/hello/./hello
	warning: Signal ? does not exist on this system.

Interrupting the process (C-c) results in:

	/proc/2428755: Interrupted function call.
	PIOCSTATUS or PIOCWSTOP failed.
	(gdb)

This is all as previously reported;  I can add the following:

First, it doesn't matter what the program is actually trying to do, because
it never gets to main().  Reproducing the problem requires only that the
POSIX threads library be linked in; even a simple hello-world program
triggers it when linked with the pthreads library:

    cc hello.c -o hello -lpthread

The same program without -lpthread will run without problem in the debugger.

Second, the problem appears to occur in the __start function:

	(gdb) where
	#0  __start ()
	    at /xlv55/kudzu-apr12/work/irix/lib/libc/libc_n32_M4/csu/crt1text.s:103

Finally, I used gdb to debug itself, after locating the function that was
producing the error and setting a breakpoint.  The problem signal is the
third to arrive:

	foreigner [259]--> ./gdb ./gdb
	GNU gdb 4.17.85
	Copyright 1998 Free Software Foundation, Inc.
	GDB is free software, covered by the GNU General Public License, and you are
	welcome to change it and/or distribute copies of it under certain conditions.
	Type "show copying" to see the conditions.
	There is absolutely no warranty for GDB.  Type "show warranty" for details.
	This GDB was configured as "mips-sgi-irix6.5"...
	Setting up the environment for debugging gdb.
	During symbol reading, type qualifier 'const' ignored.
	Breakpoint 1 at 0x101519e8: file utils.c, line 474.
	Breakpoint 2 at 0x1014e670: file top.c, line 2608.
	(top-gdb) break target_signal_to_host
	Breakpoint 3 at 0x100c4adc: file target.c, line 1482.
	(top-gdb) run ./hello
	Starting program: /home/jct/build/gdb-4.17.85/gdb/./gdb ./hello
	GNU gdb 4.17.85
	Copyright 1998 Free Software Foundation, Inc.
	GDB is free software, covered by the GNU General Public License, and you are
	welcome to change it and/or distribute copies of it under certain conditions.
	Type "show copying" to see the conditions.
	There is absolutely no warranty for GDB.  Type "show warranty" for details.
	This GDB was configured as "mips-sgi-irix6.5"...
	Setting up the environment for debugging gdb.
	.gdbinit:5: Error in sourced command file:
	Function "fatal" not defined.
	(gdb) run
	Starting program: /home/jct/build/gdb-4.17.85/gdb/./hello

	Breakpoint 3, target_signal_to_host (oursig=TARGET_SIGNAL_0) at target.c:1482
	1482      switch (oursig)
	(top-gdb) p oursig
	$1 = TARGET_SIGNAL_0
	(top-gdb) cont
	Continuing.

	Breakpoint 3, target_signal_to_host (oursig=TARGET_SIGNAL_0) at target.c:1482
	1482      switch (oursig)
	(top-gdb) p oursig
	$2 = TARGET_SIGNAL_0
	(top-gdb) cont
	Continuing.

	Breakpoint 3, target_signal_to_host (oursig=TARGET_SIGNAL_UNKNOWN)
	    at target.c:1482
	1482      switch (oursig)
	(top-gdb) p oursig
	$3 = TARGET_SIGNAL_UNKNOWN
	(top-gdb) printf "%d\n", oursig
	76
	(top-gdb) 

This is where it gets weird, for me at least.  Signal 76?  I've never heard
of such a beast, and it's not documented in the IRIX man pages.  However, I
do know that some operating systems, Solaris for example, use special
signals for the kernel to notify the process of changes in its runnability,
for the purposes of scheduling user threads or somesuch.  Perhaps that's
what's going on here.  Either that or the signal value is getting mangled
somewhere.  I couldn't find the signal handler, though, to check the
signal's original value.  I tried to understand the signal setup inside gdb,
but it's rather, um, complex and I wasn't able to follow it.
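
To make the "mangled somewhere" possibility concrete: if the translation
from host signal numbers into the debugger's internal enum is lossy, then
printing the enum with %d shows the enumerator's own value, not the number
the kernel delivered.  Whether that is what produced the 76 above is not
confirmed.  The following is only a minimal sketch of the idea; every
demo_* name and every enum value in it is invented for illustration and is
not taken from gdb's sources.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical, heavily trimmed stand-in for a debugger's internal
   signal enum.  These are NOT gdb's real definitions or values. */
enum demo_signal {
    DEMO_SIGNAL_0 = 0,
    DEMO_SIGNAL_HUP = 1,
    DEMO_SIGNAL_INT = 2,
    DEMO_SIGNAL_UNKNOWN = 3     /* catch-all for anything unmapped */
};

/* Host number -> internal enum.  Any host signal without an entry
   collapses to DEMO_SIGNAL_UNKNOWN, so its original number is lost. */
enum demo_signal demo_signal_from_host(int hostsig)
{
    switch (hostsig) {
    case 0:      return DEMO_SIGNAL_0;
    case SIGHUP: return DEMO_SIGNAL_HUP;
    case SIGINT: return DEMO_SIGNAL_INT;
    default:     return DEMO_SIGNAL_UNKNOWN;
    }
}

/* Internal enum -> host number.  Returns 1 and stores the host number
   in *hostsig on success; returns 0 (leaving *hostsig untouched) when
   there is no host equivalent, as for DEMO_SIGNAL_UNKNOWN. */
int demo_signal_to_host(enum demo_signal sig, int *hostsig)
{
    assert(hostsig != NULL);
    switch (sig) {
    case DEMO_SIGNAL_0:   *hostsig = 0;      return 1;
    case DEMO_SIGNAL_HUP: *hostsig = SIGHUP; return 1;
    case DEMO_SIGNAL_INT: *hostsig = SIGINT; return 1;
    default:              return 0;
    }
}
```

In this sketch, an unmapped host signal always prints as 3 once it has been
converted, whatever its original number was, and converting it back fails.
Recovering the real number would mean inspecting the value before the
host-to-internal conversion happens.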

I was able to follow the stack trace further up, though:

	(top-gdb) where
	#0  target_signal_to_host (oursig=TARGET_SIGNAL_UNKNOWN) at target.c:1482
	During symbol reading, type qualifier 'const' ignored.
	#1  0x100a52c0 in procfs_resume (pid=-1, step=0, signo=TARGET_SIGNAL_UNKNOWN)
	    at procfs.c:3808
	#2  0x100ab1fc in solib_create_inferior_hook () at irix5-nat.c:1259
	#3  0x100a1700 in fork_inferior (
	    exec_file=0x102a6788 "/home/jct/build/gdb-4.17.85/gdb/./hello",
	    allargs=0x1029ac38 "", env=0x1029b860,
	    traceme_fun=0x100a3a10 <proc_set_exec_trap>,
	    init_trace_fun=0x100a35c0 <procfs_init_inferior>, pre_trace_fun=0,
	    shell_file=0x7fff35ab "/usr/freeware/bin/bash") at fork-child.c:414
	(top-gdb)    

It looks like the function solib_create_inferior_hook is in a tight loop
waiting for something (for shared libraries to load?).  When signal 76
arrives, the child process stops, but because gdb doesn't know how to deal
with that signal number, it's unable to restart the child, hence the whole
thing hangs.
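
The suspected failure mode can be modelled in a few lines of C.  This is a
toy model of the reading given above, not gdb's code: the toy_* names, the
struct, and the idea of passing a "translatable" flag per stop event are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a traced child process. */
struct toy_child {
    int stopped;    /* 1 while the child is stopped */
    int resumes;    /* number of successful resumes so far */
};

/* Try to resume the child, passing along the signal it stopped with.
   signal_is_translatable stands in for "the internal signal value could
   be mapped back to a host signal".  Returns 1 on success, 0 if the
   child had to be left stopped. */
int toy_resume(struct toy_child *child, int signal_is_translatable)
{
    assert(child != NULL);
    if (!signal_is_translatable)
        return 0;               /* child stays stopped */
    child->stopped = 0;
    child->resumes++;
    return 1;
}

/* Startup loop: each event stops the child, and the loop then tries to
   resume it.  Returns how many events were handled before the child got
   stuck, or n if all of them were handled.  A real loop that kept
   waiting on a child it could not resume would never return. */
size_t toy_startup_loop(struct toy_child *child,
                        const int *translatable, size_t n)
{
    size_t i;

    assert(child != NULL && translatable != NULL);
    for (i = 0; i < n; i++) {
        child->stopped = 1;
        if (!toy_resume(child, translatable[i]))
            return i;
    }
    return n;
}
```

Feeding this model two translatable events followed by an untranslatable one
leaves the child stopped after two successful resumes, which matches the
pattern seen in the session above, where the third signal is the one that
cannot be handled.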

Again, I've taken this as far as I can, but would be happy to act as a
remote pair of hands and eyes if the gdb team doesn't have an IRIX 6.5
system handy for testing.

(BTW, I'm sorry about the crappy formatting of the gdb sessions above, but
I'm required to use MS Outlook from this site.)

--JT
