This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


[RFA] fix relocation of symbol inside shared library


Hello,

We encountered this problem on sparc-solaris 2.10, but I think it can
affect other platforms too. It looks like it's been latent for quite
a long time, but we hadn't seen it before because of how unlikely it
is to cause problems. See details below.

We have a simple Ada program that raises an exception that does not
get handled.

        procedure A is
        begin
           [...]
           --  Cause an exception to be raised, and do not handle it.
           declare
              Var : Natural := 10;
           begin
              Var := - Var;
           end;
        
        end A;

This program was compiled with the shared GNAT runtime using:

        % gnatmake -f -g -gnata a -bargs -shared

We tried to insert a catchpoint on unhandled exceptions, and then
ran until the debugger caught one:

    (gdb) start
    [...]
    (gdb) catch exception unhandled
    Catchpoint 2: unhandled Ada exceptions
    (gdb) cont
    Continuing.
    
    Catchpoint 2, unhandled CONSTRAINT_ERROR at 0xff180784 in <__gnat_raise_nodefer_with_msg> (e=0xff37afb8) at a-except.adb:829
    829     a-except.adb: No such file or directory.
                in a-except.adb

The expected behavior is for GDB to automatically select the frame
corresponding to user code, in our case frame #7, for procedure A:

    (gdb) c
    Continuing.
    
    Catchpoint 2, unhandled CONSTRAINT_ERROR at 0x000121bc in a () at a.adb:33
    33            Var := - Var;

Another symptom can be seen in the backtrace:

    (gdb) bt
    #0  <__gnat_unhandled_exception> () at a-exextr.adb:186
    [...]
    #7  0x000121bc in _ada_a ()

As you can see, GDB could not find the symtab-and-line for
PC 0x000121bc, which is pretty strange given that this function was
built with debugging info.

Disass provides more insight as to what went wrong:

        (gdb) disass
        Dump of assembler code for function _ada_a:
        0x00011ff4 <_ada_a+0>:  save  %sp, -152, %sp
        0x00011ff8 <_ada_a+4>:  sethi  %hi(0x12400), %g1
        [...]
 !!! -> 0x000121ac <_end+0>:    or  %g1, 0x108, %o0     ! 0x12508 <C.2.401+8>
 !!! -> 0x000121b0 <_end+4>:    mov  0x21, %o1
        [...]
 !!! -> 0x000121e0 <_end+52>:   nop
        End of assembler dump.

An unexpected symbol "_end" found its way inside the address range
of our function.

Further analysis shows that the "_end" symbol comes from
/lib/libmd5.so.1. This symbol is attached to the .bss section:

   000121ac g    DO .bss   00000000  Base        _end

But the glitch here happens because the .bss section is empty:

   Idx Name          Size      VMA       LMA       File off  Algn
    18 .bss          00000000  000121ac  000121ac  000021ac  2**0
                     ALLOC

GDB discards this section because it is empty:

   static void
   add_to_section_table (bfd *abfd, struct bfd_section *asect,
                         void *table_pp_char)
   {
     [...]
     if (0 == bfd_section_size (abfd, asect))
       return;

As a result, GDB fails to relocate any symbol belonging to that
section. That's when our bad luck kicked in, since the unrelocated
address of "_end" falls right inside our function _ada_a. The code
we have to convert a PC into a SAL relies on a number of minimal
symbol searches, and depending on how we do the search, we end up
finding the "_end" symbol instead of "_ada_a". When this happens, we
silently return failure because the symbol's type indicates that
this is not a text address:

  /* If we know that this is not a text address, return failure.  This is
     necessary because we loop based on the block's high and low code
     addresses, which do not include the data ranges, and because
     we call find_pc_sect_psymtab which has a similar restriction based
     on the partial_symtab's texthigh and textlow.  */
  msymbol = lookup_minimal_symbol_by_pc_section (pc, section);
  if (msymbol
      && (msymbol->type == mst_data
          || msymbol->type == mst_bss
          || msymbol->type == mst_abs
          || msymbol->type == mst_file_data
          || msymbol->type == mst_file_bss))
    return NULL;
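
To make the failure mode concrete, here is a small standalone C sketch
of that lookup. It is a simplified model and not GDB's code: the msym
struct, the MST_* enum and both functions are hypothetical stand-ins
for the minimal symbol table and for the check quoted above, and the
search simply picks the symbol with the highest address not above
the PC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for GDB's minimal symbol types.  */
enum msym_type { MST_TEXT, MST_DATA, MST_BSS };

struct msym
{
  const char *name;
  unsigned long addr;
  enum msym_type type;
};

/* Return the symbol with the highest address that is <= PC, or NULL
   if no symbol precedes PC.  */
static const struct msym *
lookup_by_pc (const struct msym *syms, size_t n, unsigned long pc)
{
  const struct msym *best = NULL;
  size_t i;

  assert (syms != NULL || n == 0);
  for (i = 0; i < n; i++)
    if (syms[i].addr <= pc && (best == NULL || syms[i].addr > best->addr))
      best = &syms[i];
  return best;
}

/* Return zero if the nearest preceding symbol says PC is not a text
   address (the case where the PC-to-SAL lookup gives up), nonzero
   otherwise.  */
static int
pc_looks_like_text (const struct msym *syms, size_t n, unsigned long pc)
{
  const struct msym *m = lookup_by_pc (syms, n, pc);

  return !(m != NULL && (m->type == MST_DATA || m->type == MST_BSS));
}
```

With _ada_a at 0x11ff4 (text) and "_end" left unrelocated at 0x121ac
(bss), pc_looks_like_text returns 0 for PC 0x121bc, because "_end" is
the nearest preceding symbol. If "_end" sits at any address above that
PC, as it would once relocated into the shared library, the same call
finds _ada_a and returns 1.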

The real problem, of course, is that the address of our "_end"
symbol is wrong, and this comes from the section size check in
add_to_section_table(). I think this check should be removed.
I tried finding out about the history of that check, but it
predates the public CVS...
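
For illustration, the effect of the check can be modelled with the
C sketch below. This is a rough model under stated assumptions, not
the actual exec.c or relocation code: the section and section_table
structs, add_section, relocate_symbol and the discard_empty flag are
all hypothetical names, and relocation is reduced to adding a load
offset to symbols whose section is present in the table.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SECTIONS 8

/* Hypothetical, simplified view of a section.  */
struct section
{
  const char *name;
  unsigned long vma;
  unsigned long size;
};

struct section_table
{
  const struct section *entries[MAX_SECTIONS];
  size_t count;
};

/* Add SECT to TABLE.  When DISCARD_EMPTY is nonzero, mimic the
   current check and drop zero-sized sections.  */
static void
add_section (struct section_table *table, const struct section *sect,
             int discard_empty)
{
  if (discard_empty && sect->size == 0)
    return;
  assert (table->count < MAX_SECTIONS);
  table->entries[table->count++] = sect;
}

/* Relocate ADDR by LOAD_OFFSET if its section SECT is in TABLE;
   otherwise return ADDR unchanged, which models the bug.  */
static unsigned long
relocate_symbol (const struct section_table *table,
                 const struct section *sect,
                 unsigned long addr, unsigned long load_offset)
{
  size_t i;

  for (i = 0; i < table->count; i++)
    if (table->entries[i] == sect)
      return addr + load_offset;
  return addr;
}
```

With an empty .bss at 0x121ac and a made-up load offset of 0xff380000,
a table built with discard_empty set leaves "_end" at 0x121ac, while
a table built without the check moves it to 0xff3921ac, out of the
way of _ada_a.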

2007-01-23  Joel Brobecker  <brobecker@adacore.com>

        * exec.c (add_to_section_table): Do not discard empty sections.

I tested the attached patch against our testsuite as well as the
official testsuite. No regression, and it fixes the problem above.

OK to apply?

Thank you,
-- 
Joel

Attachment: exec.c.diff
Description: Text document

