Zero valued N_FUN stabs in shared objects: Why?

Kevin Buettner kevinb@cygnus.com
Fri Sep 10 15:19:00 GMT 1999


[ This is going to be a rather long message, complete with code
  fragments, a patch, and lots of analysis regarding some bugs that I've
  been working on.  Even if you choose not to read the entire message,
  the salient question is as follows:

    Why do shared objects on Solaris and Linux have zero-valued
    N_FUN stabs?

  I quite understand if you choose not to read beyond this point,
  but if you know the answer to the above question, please write
  to me and tell me what you think.
]

Recently, on Linux, I noticed that stepping into shared libraries was
broken in the (Cygnus) development branch of gdb.  [It works okay using
gdb-4.17.0.11.]  I did a bit of digging and discovered that simply
defining SOFUN_ADDRESS_MAYBE_MISSING in config/i386/tm-linux.h seemed
to fix the problem.  Later, I did some more digging and found that
H.J. Lu's patches to gdb-4.17 have a similar #define for
SOFUN_ADDRESS_MAYBE_MISSING.  Great, I thought, this solves the
problem very neatly.

But it may be that defining SOFUN_ADDRESS_MAYBE_MISSING is not the
"right" thing to do.  Jim Blandy sent me the following message which
quite neatly sums up the problem:

    This flag should make you ask, "*Why* don't these stabs contain
    the information they are intended to?"

    As I understand it, SOFUN_ADDRESS_MAYBE_MISSING is meant to work
    around a bug in the way vendors' toolchains generate debug info. 
    If we have that problem, we should fix it in the toolchain, not
    propagate the workaround to other platforms.  This goes especially
    for Linux, where we actually have control over the toolchain.

    We need to figure out *why* the N_SO and N_FUN stabs have zero
    addresses.  I suspect it is due to confusion regarding relocs in
    shared libraries, which we need to actually understand and
    resolve.

I think it is worth considering the N_SO and N_FUN stabs separately,
because it may make sense to allow one, but not the other, to be zero.

The motivation for wanting these symbols to have zero addresses is
given in appendix G.2 of ``The "stabs" debug format'' by Julia
Menapace, Jim Kingdon, and David MacKenzie (if you're looking at a
paper copy or postscript), or at

    http://sourceware.cygnus.com/gdb/onlinedocs/stabs_13.html#SEC89

if you're online.  [Thanks to Elena Zannoni for pointing this out to
me.]  Here is an excerpt:

    To keep linking fast, you don't want the linker to have to
    relocate very many stabs.  Making sure this is done for N_SLINE,
    N_RBRAC, and N_LBRAC stabs is the most important thing (see the
    descriptions of those stabs for more information).  But Sun's
    stabs in ELF has taken this further, to make all addresses in the
    n_value field (functions and static variables) relative to the
    source file.  For the N_SO symbol itself, Sun simply omits the
    address.

I don't want to spend a lot of time discussing the N_SO stabs, because
the GNU toolchain does seem to be giving these proper addresses (which
are actually offsets from the start of the section).

However, the N_FUN stabs that I've been looking at in Solaris and
Linux shared libraries (generated with the GNU toolchain) have a
value of zero.  [Actually, this isn't entirely true.  The N_FUN stabs
come in pairs: the first gives the name of the function along with a
value of zero; the second has an empty string for the name and its
value gives the size of the function.]  I speculate that the reason
for making the N_FUN stabs zero-valued is to keep linking fast.
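As an illustration of the pairing described above, here is a minimal
sketch in C.  It is not gdb code: struct toy_stab and
toy_function_size() are hypothetical names, the entry layout is
simplified (real stab entries refer to their strings through a string
table index), and it assumes that other stabs may sit between the two
halves of a pair.

```c
#include <assert.h>
#include <stddef.h>

#define N_FUN 0x24		/* stab type code for a function */

/* Simplified, hypothetical stand-in for one stab entry.  */
struct toy_stab
{
  unsigned char n_type;		/* stab type, e.g. N_FUN */
  const char *n_name;		/* "name:descriptor", or "" */
  unsigned long n_value;	/* address, offset, or size */
};

/* If stabs[i] is the named N_FUN stab that opens a function, return
   the size stored in the matching empty-named N_FUN stab that closes
   it.  Return 0 if stabs[i] does not open a function or if no closing
   stab follows.  */
unsigned long
toy_function_size (const struct toy_stab *stabs, size_t count, size_t i)
{
  size_t j;

  assert (stabs != NULL && i < count);

  if (stabs[i].n_type != N_FUN || stabs[i].n_name[0] == '\0')
    return 0;

  for (j = i + 1; j < count; j++)
    if (stabs[j].n_type == N_FUN)
      /* The next N_FUN is either the closing half (empty name) or
         the start of another function (no closing half was found).  */
      return stabs[j].n_name[0] == '\0' ? stabs[j].n_value : 0;

  return 0;
}
```

Note that the size alone says nothing about where the function lives.
With a zero start value in the first stab, the address has to come
from somewhere else.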

When gdb wants to know the address associated with one of these
zero-valued N_FUN stabs, it consults the minimal symbol table,
which on Linux and Solaris is (I believe) obtained by reading the
ELF symbols.  It turns out that gdb does this rather late, possibly
too late: only when it converts partial symtabs to (complete) symtabs.

But I'm getting ahead of myself... first I need to describe the
other bug that I've been looking at.  This bug was reported by
Gal Shalif who also provided the following test case:

    --- test-gdb-virtual-functions-step.cpp ---
    #include "lib.h"
    int main()
    {
        TestGdbStepCommandWithVirtualFunctions *obj =
            create();
        obj->proc();
        return 0;
    }
    --- lib.h ---
    #include <stdio.h>
    class TestGdbStepCommandWithVirtualFunctions {
    public:
        virtual void proc() {
            printf("TestGdbStepCommandWithVirtualFunctions::proc()\n");
            return;
        }
    };

    TestGdbStepCommandWithVirtualFunctions *create();
    --- lib1.h ---
    #include "lib.h"
    class TestGdbStepCommandWithVirtualFunctions1 
        : public TestGdbStepCommandWithVirtualFunctions {
    public:
        virtual void proc() {
            printf("TestGdbStepCommandWithVirtualFunctions1::proc()\n");
            return;
        }
    };
    --- lib1.cpp ---
    #include "lib1.h"

    TestGdbStepCommandWithVirtualFunctions *create()
    {
        TestGdbStepCommandWithVirtualFunctions1 *obj = 
            new TestGdbStepCommandWithVirtualFunctions1();
        return (TestGdbStepCommandWithVirtualFunctions *)obj;
    }
    ----------------

lib1.cpp is compiled and linked into a shared object (liblib1.so) and
test-gdb-virtual-functions-step.cpp is compiled and linked into an
executable which depends on liblib1.so.  The bug is that while it is
possible to step into create(), it is not possible (on either Solaris
or Linux) to step into TestGdbStepCommandWithVirtualFunctions1::proc().
Note that this method is declared/defined in a header file.

There are two facts worth mentioning.  1) When linked statically,
everything works fine (i.e., stepping is possible where it wasn't
before).  2) If -gdwarf-2 is used instead of -g (which is stabs on
Linux and Solaris), stepping into obj->proc() also works.

It turned out that it was impossible to step into the proc() method
because, when the partial symtab was created, the texthigh field (in
a partial_symtab) was not determined correctly.  It accounted for
create(), but not for either of the proc() methods.

In my patch below, I fix this problem in partial-stab.h by obtaining
the symbol's address from the minimal symbols when an N_FUN stab value
is zero.  This will not only cause the texthigh field to be set
properly, but the symbol will have the correct address in the data
structures associated with the partial_symtab.  It is crucial that
this value be correct so that the symbol will actually be found when
find_pc_sect_psymtab() is called.
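To make the texthigh problem concrete, here is a toy model in C.  It
is only a sketch under stated assumptions: struct toy_psymtab,
toy_note_function() and toy_covers_pc() are hypothetical names, and
the way gdb really computes and searches these ranges is considerably
more involved.  The model only assumes that each function can raise
the upper bound of the range, and that a pc lookup succeeds only
inside that range.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long toy_addr;

/* Hypothetical, much-reduced stand-in for gdb's partial_symtab.  */
struct toy_psymtab
{
  toy_addr textlow;		/* lowest text address covered */
  toy_addr texthigh;		/* one past the highest address covered */
};

/* Raise PST's upper bound so that it covers [start, start + size).  */
void
toy_note_function (struct toy_psymtab *pst, toy_addr start, toy_addr size)
{
  assert (pst != NULL);
  if (start + size > pst->texthigh)
    pst->texthigh = start + size;
}

/* Nonzero if PC falls inside the range recorded for PST.  */
int
toy_covers_pc (const struct toy_psymtab *pst, toy_addr pc)
{
  assert (pst != NULL);
  return pst->textlow <= pc && pc < pst->texthigh;
}
```

In this model, noting a function with a start of zero leaves texthigh
unchanged, so a pc inside that function is not covered.  Noting it
again with the address taken from the minimal symbols extends the
range and the lookup succeeds.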

Since I'm discussing the details in my patch, I'll point out that
there were two nearly identical swatches of code which attempt to look
up the corresponding value of a minimal symbol for the case when
N_FUN's value is 0.  The first is located in minsyms.c and is neatly
contained in the function find_stab_function_addr().  The second
swatch of code appears in process_one_symbol() in dbxread.c.  It
seemed cleaner to completely remove this second swatch of code and
replace it with a call to find_stab_function_addr().  However, before
doing so I needed to change the interface to find_stab_function_addr()
slightly.  Instead of taking a pointer to a partial_symtab as its
second parameter, it now takes a character pointer representing the
(source) filename associated with the partial_symtab.

It turned out that there was another change needed for
find_stab_function_addr() and it is a change with which I am not at
all happy.  Among other things, the SOFUN_ADDRESS_MAYBE_MISSING hack
attempts to associate the name of the (containing) source file with
each minimal symbol.  However, for the above example, the symbol
proc__39TestGdbStepCommandWithVirtualFunctions1 becomes associated
with the filename "new.cc".  I have not studied the problem in any
great detail yet, but when I glanced at it a few days ago, I saw no
easy solution for getting the filename right.

So what I've done instead is to modify find_stab_function_addr().
After attempting and failing to look up the minimal symbol with the
filename, it now also attempts to look it up without the filename.
This should work well for all external symbols, but may cause problems
for statics (of which there could be several with the same name).
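The lookup order can be sketched outside gdb as follows.  This is a
standalone toy, not the patched function: struct toy_msym,
toy_lookup() and toy_find_function_addr() are hypothetical names, and
the table is a plain array instead of gdb's minimal symbol table.
Unlike the patch, the toy keeps the whole name when the stab string
contains no colon.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a minimal symbol.  */
struct toy_msym
{
  const char *name;
  const char *filename;		/* file the symbol was attributed to */
  unsigned long address;
};

/* Find NAME in TABLE.  A NULL FILENAME matches any file.  */
static const struct toy_msym *
toy_lookup (const struct toy_msym *table, size_t count,
	    const char *name, const char *filename)
{
  size_t i;

  for (i = 0; i < count; i++)
    if (strcmp (table[i].name, name) == 0
	&& (filename == NULL
	    || strcmp (table[i].filename, filename) == 0))
      return &table[i];
  return NULL;
}

/* Resolve the function named by the stab string NAMESTRING
   ("name:descriptor").  Try with FILENAME first, then fall back to a
   lookup that ignores the filename.  Return 0 if nothing matches.  */
unsigned long
toy_find_function_addr (const struct toy_msym *table, size_t count,
			const char *namestring, const char *filename)
{
  char buf[256];
  const char *colon;
  const struct toy_msym *m;
  size_t n;

  assert (table != NULL && namestring != NULL);

  colon = strchr (namestring, ':');
  n = colon != NULL ? (size_t) (colon - namestring) : strlen (namestring);
  if (n + 2 > sizeof buf)	/* room for '_' and the terminator */
    return 0;
  memcpy (buf, namestring, n);
  buf[n] = '\0';

  m = toy_lookup (table, count, buf, filename);
  if (m == NULL)
    {
      /* Retry with a trailing underscore (the Sun Fortran case).  */
      buf[n] = '_';
      buf[n + 1] = '\0';
      m = toy_lookup (table, count, buf, filename);
    }
  if (m == NULL && filename != NULL)
    {
      /* Try again without the filename.  */
      buf[n] = '\0';
      m = toy_lookup (table, count, buf, NULL);
    }
  if (m == NULL && filename != NULL)
    {
      /* Underscore variant, again without the filename.  */
      buf[n] = '_';
      buf[n + 1] = '\0';
      m = toy_lookup (table, count, buf, NULL);
    }
  return m == NULL ? 0 : m->address;
}
```

The risk with statics shows up directly in this toy.  If two files
each define a static function with the same name and the filename
passed in matches neither entry, the fallback returns whichever entry
comes first in the table, which may be the wrong one.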

The patch is below.  I invite comments and suggestions on how it
might be improved (particularly with regard to the problem involving
the association of incorrect filenames to minimal symbols).

Also, please note that this patch would be completely unnecessary if
the N_FUN stabs had non-zero values that represented offsets from the
beginning of the section.  I would definitely like to hear opinions,
expert or otherwise, on why shared objects should have zero-valued
N_FUN stabs.

Thanks,

Kevin


Index: dbxread.c
===================================================================
RCS file: /cvs/cvsfiles/devo/gdb/dbxread.c,v
retrieving revision 1.241
diff -u -r1.241 dbxread.c
--- dbxread.c	1999/09/01 20:21:12	1.241
+++ dbxread.c	1999/09/10 20:55:12
@@ -2314,34 +2314,8 @@
 	         from N_FUN symbols.  */
 	      if (type == N_FUN
 		  && valu == ANOFFSET (section_offsets, SECT_OFF_TEXT))
-		{
-		  struct minimal_symbol *msym;
-		  char *p;
-		  int n;
-
-		  p = strchr (name, ':');
-		  if (p == NULL)
-		    p = name;
-		  n = p - name;
-		  p = alloca (n + 2);
-		  strncpy (p, name, n);
-		  p[n] = 0;
-
-		  msym = lookup_minimal_symbol (p, last_source_file,
-						objfile);
-		  if (msym == NULL)
-		    {
-		      /* Sun Fortran appends an underscore to the minimal
-		         symbol name, try again with an appended underscore
-		         if the minimal symbol was not found.  */
-		      p[n] = '_';
-		      p[n + 1] = 0;
-		      msym = lookup_minimal_symbol (p, last_source_file,
-						    objfile);
-		    }
-		  if (msym)
-		    valu = SYMBOL_VALUE_ADDRESS (msym);
-		}
+		valu = 
+		  find_stab_function_addr (name, last_source_file, objfile);
 #endif
 
 #ifdef SUN_FIXED_LBRAC_BUG
Index: minsyms.c
===================================================================
RCS file: /cvs/cvsfiles/devo/gdb/minsyms.c,v
retrieving revision 2.49
diff -u -r2.49 minsyms.c
--- minsyms.c	1999/07/07 23:51:39	2.49
+++ minsyms.c	1999/09/10 20:55:13
@@ -440,9 +440,9 @@
 
 #ifdef SOFUN_ADDRESS_MAYBE_MISSING
 CORE_ADDR
-find_stab_function_addr (namestring, pst, objfile)
+find_stab_function_addr (namestring, filename, objfile)
      char *namestring;
-     struct partial_symtab *pst;
+     char *filename;
      struct objfile *objfile;
 {
   struct minimal_symbol *msym;
@@ -457,7 +457,7 @@
   strncpy (p, namestring, n);
   p[n] = 0;
 
-  msym = lookup_minimal_symbol (p, pst->filename, objfile);
+  msym = lookup_minimal_symbol (p, filename, objfile);
   if (msym == NULL)
     {
       /* Sun Fortran appends an underscore to the minimal symbol name,
@@ -465,8 +465,23 @@
          was not found.  */
       p[n] = '_';
       p[n + 1] = 0;
-      msym = lookup_minimal_symbol (p, pst->filename, objfile);
+      msym = lookup_minimal_symbol (p, filename, objfile);
     }
+
+  if (msym == NULL && filename != NULL)
+    {
+      /* Try again without the filename. */
+      p[n] = 0;
+      msym = lookup_minimal_symbol (p, 0, objfile);
+    }
+  if (msym == NULL && filename != NULL)
+    {
+      /* And try again for Sun Fortran, but without the filename. */
+      p[n] = '_';
+      p[n + 1] = 0;
+      msym = lookup_minimal_symbol (p, 0, objfile);
+    }
+
   return msym == NULL ? 0 : SYMBOL_VALUE_ADDRESS (msym);
 }
 #endif /* SOFUN_ADDRESS_MAYBE_MISSING */
Index: partial-stab.h
===================================================================
RCS file: /cvs/cvsfiles/devo/gdb/partial-stab.h,v
retrieving revision 2.67
diff -u -r2.67 partial-stab.h
--- partial-stab.h	1999/09/01 00:16:02	2.67
+++ partial-stab.h	1999/09/10 20:55:15
@@ -577,9 +577,6 @@
       case 'f':
 	CUR_SYMBOL_VALUE += ANOFFSET (objfile->section_offsets, SECT_OFF_TEXT);
 #ifdef DBXREAD_ONLY
-	/* Keep track of the start of the last function so we
-	   can handle end of function symbols.  */
-	last_function_start = CUR_SYMBOL_VALUE;
 	/* Kludges for ELF/STABS with Sun ACC */
 	last_function_name = namestring;
 #ifdef SOFUN_ADDRESS_MAYBE_MISSING
@@ -588,12 +585,16 @@
 	if (pst && textlow_not_set)
 	  {
 	    pst->textlow =
-	      find_stab_function_addr (namestring, pst, objfile);
+	      find_stab_function_addr (namestring, pst->filename, objfile);
 	    textlow_not_set = 0;
 	  }
 #endif
 	/* End kludge.  */
 
+	/* Keep track of the start of the last function so we
+	   can handle end of function symbols.  */
+	last_function_start = CUR_SYMBOL_VALUE;
+
 	/* In reordered executables this function may lie outside
 	   the bounds created by N_SO symbols.  If that's the case
 	   use the address of this function as the low bound for
@@ -620,22 +621,27 @@
       case 'F':
 	CUR_SYMBOL_VALUE += ANOFFSET (objfile->section_offsets, SECT_OFF_TEXT);
 #ifdef DBXREAD_ONLY
-	/* Keep track of the start of the last function so we
-	   can handle end of function symbols.  */
-	last_function_start = CUR_SYMBOL_VALUE;
 	/* Kludges for ELF/STABS with Sun ACC */
 	last_function_name = namestring;
 #ifdef SOFUN_ADDRESS_MAYBE_MISSING
 	/* Do not fix textlow==0 for .o or NLM files, as 0 is a legit
 	   value for the bottom of the text seg in those cases. */
+	if (CUR_SYMBOL_VALUE == ANOFFSET (objfile->section_offsets, 
+	                                  SECT_OFF_TEXT))
+	  CUR_SYMBOL_VALUE = 
+	    find_stab_function_addr (namestring, pst->filename, objfile);
 	if (pst && textlow_not_set)
 	  {
-	    pst->textlow =
-	      find_stab_function_addr (namestring, pst, objfile);
+	    pst->textlow = CUR_SYMBOL_VALUE;
 	    textlow_not_set = 0;
 	  }
 #endif
 	/* End kludge.  */
+
+	/* Keep track of the start of the last function so we
+	   can handle end of function symbols.  */
+	last_function_start = CUR_SYMBOL_VALUE;
+
 	/* In reordered executables this function may lie outside
 	   the bounds created by N_SO symbols.  If that's the case
 	   use the address of this function as the low bound for
Index: symtab.h
===================================================================
RCS file: /cvs/cvsfiles/devo/gdb/symtab.h,v
retrieving revision 1.141
diff -u -r1.141 symtab.h
--- symtab.h	1999/09/01 00:16:03	1.141
+++ symtab.h	1999/09/10 20:55:15
@@ -1219,7 +1219,7 @@
 
 #ifdef SOFUN_ADDRESS_MAYBE_MISSING
 extern CORE_ADDR find_stab_function_addr PARAMS ((char *,
-						  struct partial_symtab *,
+						  char *,
 						  struct objfile *));
 #endif
 


