[RFC] minimal symbols on mips-irix and overlapping CUs...

Joel Brobecker brobecker@gnat.com
Mon Nov 17 22:39:00 GMT 2003


We have discovered a problem on mips-irix that had me wondering about
GDB's assumptions regarding the symbol table. First, the symptoms:

We have a small Ada program that uses nested procedures. The same can
be reproduced with a C program if GCC is used (GCC provides nested
procedures as an extension). Trying to get a backtrace from inside a
nested procedure does not work. With our example, we get something like
this:

 #0  main ()
 #1  0x10013858 in main ()
 #2  0x100139d0 in gdb_no_local_symbols () at gdb_no_local_symbols.adb:37
 #3  0x10013574 in main (argc=1, argv=2147430180, envp=2147430188)
     at b~gdb_no_local_symbols.adb:204

While the expected backtrace was:

 #0  gdb_no_local_symbols.proch () at gdb_no_local_symbols.adb:28
 #1  0x10013858 in gdb_no_local_symbols.procc () at gdb_no_local_symbols.adb:32
 #2  0x100139d0 in gdb_no_local_symbols () at gdb_no_local_symbols.adb:37
 #3  0x10013574 in main (argc=1, argv=2147430180, envp=2147430188)
     at b~gdb_no_local_symbols.adb:204

The problem, in my opinion, is two-fold:

  a. The IRIX linker does not include the LOCAL symbols in the symbol
     table. They are present in the object files, but are stripped from
     the executable's symbol table. This affects at least nested
     functions and static functions.

  b. A recent version of the IRIX linker has introduced an extra asm CU
     (Compilation Unit) named "__sgi_ld_generated_code", which contains
     one function before CU gdb_no_local_symbols.adb and two after it.
     So the code range of gdb_no_local_symbols.adb is included in the
     code range of the linker CU.

What happens after GDB stops on the breakpoint and tries to find the
associated function name is the following:

  1. In find_pc_sect_symtab(), we search all symtabs for one whose
     code range includes the PC. We find the linker CU, which is the
     wrong one. But we acknowledge the fact that CUs can overlap,
     and therefore resort to find_pc_sect_psymtab() to find the correct
     one.

  2. In find_pc_sect_psymtab():
     a. We first scan the minimal symbols and find "main". We do not
        find the right function because the linker did not include it
        in the symbol table.

     b. Then, we scan all the partial symtabs for those whose code
        range includes the given address. For each qualifying one,
        we check whether it contains a function whose start address
        matches the minimal symbol address. If yes, then we have found
        the right symtab. Otherwise, we just return the first partial
        symtab we found.

So in our case, we simply end up selecting the linker CU. GDB later
realizes that there is no function matching the given PC in the CU we
selected, falls back to using the minimal symbol table, and therefore
ends up declaring that it stopped in function main().
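
To make the lookup described above easier to follow, here is a rough C
sketch of it. This is not GDB's code: struct msym, struct psymtab,
lookup_msym(), find_psymtab() and demo_selected_cu() are hypothetical,
simplified stand-ins, and every address is invented (chosen only to be
consistent with the backtrace above).

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not GDB's real structures.  */
struct msym
{
  unsigned long addr;
  const char *name;
};

struct psymtab
{
  const char *name;
  unsigned long low, high;      /* Code range is [low, high).  */
  const unsigned long *funcs;   /* Function start addresses.  */
  size_t nfuncs;
};

/* Return the minimal symbol with the highest address <= PC, or NULL.  */
static const struct msym *
lookup_msym (const struct msym *ms, size_t nms, unsigned long pc)
{
  const struct msym *best = NULL;

  for (size_t i = 0; i < nms; i++)
    if (ms[i].addr <= pc && (best == NULL || ms[i].addr > best->addr))
      best = &ms[i];
  return best;
}

/* Prefer a psymtab covering PC that has a function starting at the
   minimal symbol's address; otherwise return the first one covering PC.  */
static const struct psymtab *
find_psymtab (const struct psymtab *ps, size_t nps,
              const struct msym *ms, size_t nms, unsigned long pc)
{
  const struct msym *m = lookup_msym (ms, nms, pc);
  const struct psymtab *first = NULL;

  for (size_t i = 0; i < nps; i++)
    {
      if (pc < ps[i].low || pc >= ps[i].high)
        continue;
      if (first == NULL)
        first = &ps[i];
      if (m != NULL)
        for (size_t j = 0; j < ps[i].nfuncs; j++)
          if (ps[i].funcs[j] == m->addr)
            return &ps[i];
    }
  return first;
}

/* Look up a PC inside the nested procedure.  If COMPLETE is zero, the
   LOCAL symbol is missing from the minimal symbols, as on IRIX.
   All addresses are invented for illustration.  */
const char *
demo_selected_cu (int complete)
{
  static const unsigned long ld_funcs[]
    = { 0x10013000, 0x10013f00, 0x10013f80 };
  static const unsigned long user_funcs[]
    = { 0x10013700, 0x10013800, 0x10013900 };
  static const struct psymtab ps[] = {
    { "__sgi_ld_generated_code", 0x10013000, 0x10014000, ld_funcs, 3 },
    { "gdb_no_local_symbols.adb", 0x10013700, 0x10013a00, user_funcs, 3 },
  };
  static const struct msym ms[] = {
    { 0x10013500, "main" },
    { 0x10013900, "gdb_no_local_symbols" },
    { 0x10013700, "gdb_no_local_symbols.proch" },  /* LOCAL symbol.  */
  };
  size_t nms = complete ? 3 : 2;

  return find_psymtab (ps, 2, ms, nms, 0x10013740)->name;
}
```

In this model, demo_selected_cu (0) returns "__sgi_ld_generated_code":
the closest minimal symbol is "main", no CU covering the PC has a
function starting there, so the first covering CU wins. With the LOCAL
symbol present, demo_selected_cu (1) returns "gdb_no_local_symbols.adb".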

In my opinion, (b) is not a problem, and GDB should be able to handle it
as long as the symbol table is "complete" (i.e., as long as (a) does not
apply). I am inclined to declare this a linker problem, but can this
really be categorized as a linker problem?

My general question is the following: Has GDB been designed to assume
that the symbol table will always be complete? This has always been my
feeling, but I'm wondering if it wasn't an unwarranted assumption.

Maybe it's just a hole in find_pc_sect_psymtab() that needs correcting.
For instance, I am contemplating the idea of tweaking GDB to
automatically add new "virtual" minimal symbols after having built
the psymbols, using the following approach: for each non-type psymbol
we found, check whether we have a minimal symbol at the same address.
If not, then we have a missing msymbol, and we add it.
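
A minimal sketch of that idea in C, to show the shape of the loop
rather than a patch. struct sym and add_virtual_msyms() are
hypothetical names, the tables are plain arrays, and the linear scan
stands in for whatever sorted lookup a real implementation would use.

```c
#include <stddef.h>

/* Hypothetical record used for both psymbols and minimal symbols.  */
struct sym
{
  unsigned long addr;
  const char *name;
  int is_type;                  /* Nonzero for type psymbols.  */
};

/* For every non-type psymbol in PS whose address has no entry in MS,
   append a "virtual" minimal symbol to MS.  MS holds *NMS entries and
   has room for CAP.  Return the number of entries added.  */
size_t
add_virtual_msyms (struct sym *ms, size_t *nms, size_t cap,
                   const struct sym *ps, size_t nps)
{
  size_t added = 0;

  for (size_t i = 0; i < nps; i++)
    {
      int found = 0;

      if (ps[i].is_type)
        continue;               /* Types have no code address.  */

      for (size_t j = 0; j < *nms; j++)
        if (ms[j].addr == ps[i].addr)
          {
            found = 1;
            break;
          }

      if (!found && *nms < cap)
        {
          ms[*nms].addr = ps[i].addr;
          ms[*nms].name = ps[i].name;
          ms[*nms].is_type = 0;
          (*nms)++;
          added++;
        }
    }
  return added;
}
```

Running it a second time over the same tables adds nothing, since every
non-type psymbol then has a minimal symbol at its address.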


