Bug 11717 - common/.bss variables from shared libraries not displayed correctly
Summary: common/.bss variables from shared libraries not displayed correctly
Status: NEW
Alias: None
Product: gdb
Classification: Unclassified
Component: symtab
Version: 7.0
Importance: P2 normal
Target Milestone: 7.1
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-06-18 13:25 UTC by Lance Richardson
Modified: 2012-04-03 08:30 UTC
CC List: 5 users

See Also:
Host: x86_64-linux-gnu
Target: x86_64-linux-gnu
Build: x86_64-linux-gnu
Last reconfirmed:


Attachments
GDB fix for the copy-relocations. (4.03 KB, patch)
2010-07-21 08:55 UTC, Jan Kratochvil

Description Lance Richardson 2010-06-18 13:25:13 UTC
Observed with version 7.0-ubuntu under Kubuntu 9.10 with gcc version 4.4.1.

Also occurs with gdb 6.8 and gcc version 4.4.3 in an x86_64-hosted MIPS
cross-compilation toolchain.

Apparently, when linking a shared library which contains a variable in .bss with
an executable referencing that variable, ld allocates space for that variable in
the executable's .bss section and creates a duplicate symbol for this variable.

The executable's copy of the variable will be used due to the dynamic linker's
search order (executable first).

However, gdb always uses data from the shared library's .bss section (always
zero) instead of the executable's .bss section.
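
A minimal sketch of the lookup order described above, assuming a heavily
simplified model in which each loaded object exports a single symbol. The
struct object type and the resolve() function are hypothetical names used only
for illustration; a real dynamic linker also deals with hash tables, symbol
versions, and visibility.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified model of a loaded object: a name plus one
   exported symbol.  Real ELF objects carry full symbol tables. */
struct object {
    const char *name;      /* e.g. "executable" or "libshared.so" */
    const char *sym_name;  /* the single symbol this object exports */
    uint64_t    sym_addr;  /* run-time address of that symbol */
};

/* Walk the objects in search order and return the first one that defines
   `sym`, or NULL if none does.  With the executable placed first in the
   scope, its copy of a duplicated symbol always wins. */
const struct object *resolve(const struct object *scope, size_t n,
                             const char *sym)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(scope[i].sym_name, sym) == 0)
            return &scope[i];
    }
    return NULL;
}
```

Given a scope listing the executable before libshared.so, resolve() returns
the executable's definition of var, which is the copy the running program
writes to. The symptom reported here is that gdb ends up with the library's
definition instead.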

Here's a minimal example:

Contents of shared.c:

     int var;

Build shared.c into libshared.so:
     gcc -g -o libshared.so -fPIC -shared shared.c

Contents of executable.c:
     #include <stdlib.h>
     extern int var;

     int main(int argc, char **argv)
     {
         var = 42;
         abort();
         return 0;
     }

Build executable:
     gcc -g -L. -Wl,-rpath,. -o executable -lshared executable.c

Display symbols (note that "var" is defined in both files):
     nm -aA libshared.so executable | grep var
     libshared.so:0000000000201020 B var
     executable:0000000000601020 B var

GDB session demonstrating problem:
     gdb executable
     GNU gdb (GDB) 7.0-ubuntu
     Copyright (C) 2009 Free Software Foundation, Inc.
     License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
     This is free software: you are free to change and redistribute it.
     There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
     and "show warranty" for details.
     This GDB was configured as "x86_64-linux-gnu".
     For bug reporting instructions, please see:
     <http://www.gnu.org/software/gdb/bugs/>...
     Reading symbols from /home/lrichardson/test/executable...done.
     (gdb) run
     Starting program: /home/lrichardson/test/executable

     Program received signal SIGABRT, Aborted.
     0x00007ffff78a04b5 in raise () from /lib/libc.so.6
     (gdb) p &var
     $1 = (int *) 0x7ffff7dde020
     (gdb) p var
     $2 = 0
     (gdb) p *(int *)0x601020
     $3 = 42
     (gdb) info shared
     From                To                  Syms Read   Shared Object Library
     0x00007ffff7ddfaf0  0x00007ffff7df7354  Yes (*)     /lib64/ld-linux-x86-64.so.2
     0x00007ffff7bdd4a0  0x00007ffff7bdd5a8  Yes         ./libshared.so
     0x00007ffff788b730  0x00007ffff798b7fc  Yes (*)     /lib/libc.so.6
     (*): Shared library is missing debugging information.

Note that &var and the value of var (zero) come from the shared library's
version of the variable.  Using the address of var from the executable shows the
expected value (42) as modified by the program.
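
The claim that &var points into the shared library can be cross-checked with
a little address arithmetic. The sketch below assumes the executable is not
position-independent (so its nm address is also its run-time address, as the
p *(int *)0x601020 result suggests) and assumes 4 KiB pages; load_bias() and
is_page_aligned() are hypothetical helper names.

```c
#include <stdint.h>

/* Difference between where a symbol lives at run time and the address
   recorded for it in the object file (as printed by nm). */
uint64_t load_bias(uint64_t runtime_addr, uint64_t link_addr)
{
    return runtime_addr - link_addr;
}

/* Nonzero if `addr` is aligned to a 4 KiB page (assumed page size). */
int is_page_aligned(uint64_t addr)
{
    return (addr & UINT64_C(0xfff)) == 0;
}
```

With the values from the session, load_bias(0x7ffff7dde020, 0x201020) is
0x7ffff7bdd000. That is page aligned and lies 0x4a0 bytes below the
0x00007ffff7bdd4a0 start address that "info shared" lists for ./libshared.so,
which is consistent with gdb having resolved var to the library's copy.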
Comment 1 Jan Kratochvil 2010-07-21 08:55:34 UTC
Created attachment 4880
GDB fix for the copy-relocations.

This patch was first posted at:
gfortran invalid DW_AT_location for overridable variables
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40040#c7

It has heavy regressions now, though; the symbol-reading logic still needs to
be reworked.
Comment 2 Jan Kratochvil 2010-08-22 07:41:37 UTC
#include <iostream>
int main()
{
  return std::cin.eof();
}

(gdb) p std::cin
$1 = {<error reading variable>
(gdb) p &std::cin
$2 = (std::istream *) 0x7ffff7dc7b60

Symbol table '.dynsym' contains 9 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     7: 0000000000600b60   280 OBJECT  GLOBAL DEFAULT   25 _ZSt3cin@GLIBCXX_3.4 (2)

because of:
Relocation section '.rela.dyn' at offset 0x4a8 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000600b60  0000000700000005 R_X86_64_COPY          0000000000600b60 _ZSt3cin + 0
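
The Info column of the relocation entry packs the symbol index and the
relocation type into one 64-bit value. A minimal sketch of the decoding,
using local macros that mirror the ELF64_R_SYM / ELF64_R_TYPE accessors and
the R_X86_64_COPY value from <elf.h>; the R_SYM, R_TYPE, RELOC_X86_64_COPY,
decode_r_info and is_copy_reloc names are hypothetical, chosen to avoid
depending on that header.

```c
#include <stdint.h>

/* Local equivalents of the ELF64 r_info accessors: the upper 32 bits hold
   the symbol table index, the lower 32 bits the relocation type. */
#define R_SYM(info)   ((uint32_t)((info) >> 32))
#define R_TYPE(info)  ((uint32_t)((info) & UINT64_C(0xffffffff)))

/* Relocation type number of R_X86_64_COPY in the x86-64 psABI. */
#define RELOC_X86_64_COPY 5u

struct decoded_rela {
    uint32_t sym_index;  /* index into .dynsym */
    uint32_t type;       /* relocation type */
};

/* Split an r_info value into its symbol index and relocation type. */
struct decoded_rela decode_r_info(uint64_t info)
{
    struct decoded_rela d = { R_SYM(info), R_TYPE(info) };
    return d;
}

/* Nonzero if the r_info value describes a copy relocation. */
int is_copy_reloc(uint64_t info)
{
    return R_TYPE(info) == RELOC_X86_64_COPY;
}
```

For the entry above, 0x0000000700000005 decodes to symbol index 7 and type 5,
that is, a copy relocation against .dynsym entry 7, which is the _ZSt3cin
symbol at 0x600b60 in the executable.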

Unaware if it is the only problem of:
(gdb) p std::cin.eof()
Cannot access memory at address 0xffffffffffffffe8
Comment 3 eager 2012-01-31 18:04:56 UTC
As the test case shows, this problem is not limited to uninitialized variables in .bss -- it also occurs with initialized variables in .data.

The basic problem is that the DW_AT_location for the variable in the shared library is incorrect, since it does not match how the variable is accessed.  It says that var is addressed directly, while in fact it is accessed indirectly through the GOT.

GCC generates
    DW_OP_addr  var
which points to the unused copy of var in the shared library.  The correct DWARF should be
    DW_OP_addr  var@GOT
    DW_OP_deref

I'm unclear on what exactly the patch is doing, but to the extent that it has gdb ignore the generated debug info, as suggested by this comment
+		  /* Never use DW_AT_location, rely on the minimal symbols.  */
then this seems to be headed in the wrong direction.  The best fix for incorrect debug data is to generate the correct info.  Otherwise, this fix should be clearly identified as a workaround for incorrect data, with some way of disabling the workaround when correct data is present.

Unfortunately, gdb does not currently support "complex" DWARF expressions which include an indirect reference.
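
To make the difference between the two location expressions concrete, here
is a minimal sketch of a stack evaluator for just the operations mentioned in
this report. It is not gdb's evaluator. The opcode values are the ones
assigned by the DWARF specification; struct op, struct word, read_word() and
eval_location() are hypothetical names, and target memory is modeled as a
small table of 8-byte words.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as assigned by the DWARF specification. */
enum { DW_OP_addr = 0x03, DW_OP_deref = 0x06, DW_OP_plus = 0x22 };

/* One operation; `operand` is only meaningful for DW_OP_addr. */
struct op { uint8_t code; uint64_t operand; };

/* Hypothetical stand-in for target memory: a table of 8-byte words. */
struct word { uint64_t addr; uint64_t value; };

/* Look up the word stored at `addr`; return 0 on success, -1 if unmapped. */
static int read_word(const struct word *mem, size_t n, uint64_t addr,
                     uint64_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (mem[i].addr == addr) {
            *out = mem[i].value;
            return 0;
        }
    }
    return -1;
}

/* Evaluate a location expression.  On success return 0 and store the value
   left on top of the stack in *result; return -1 on any error. */
int eval_location(const struct op *expr, size_t nops,
                  const struct word *mem, size_t nwords, uint64_t *result)
{
    uint64_t stack[16];
    size_t sp = 0;

    for (size_t i = 0; i < nops; i++) {
        switch (expr[i].code) {
        case DW_OP_addr:   /* push a constant address */
            if (sp == 16)
                return -1;
            stack[sp++] = expr[i].operand;
            break;
        case DW_OP_deref:  /* replace the top entry with the word it points to */
            if (sp == 0 ||
                read_word(mem, nwords, stack[sp - 1], &stack[sp - 1]) != 0)
                return -1;
            break;
        case DW_OP_plus:   /* pop two entries, push their sum */
            if (sp < 2)
                return -1;
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        default:           /* anything else is outside this sketch */
            return -1;
        }
    }
    if (sp == 0)
        return -1;
    *result = stack[sp - 1];
    return 0;
}
```

With a memory table containing one GOT slot (at a made-up address) that holds
0x601020, the expression DW_OP_addr <library var> evaluates to the library's
unused copy, while DW_OP_addr <GOT slot>; DW_OP_deref evaluates to 0x601020,
the copy in the executable that the program actually modifies.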
Comment 4 Jakub Jelinek 2012-04-03 08:30:08 UTC
(In reply to comment #3)
> As the test case shows, this problem is not limited to uninitialized variables
> in .bss -- it also occurs with initialized variables in .data.
> 
> The basic problem is that the DW_AT_location for the variable in the shared
> library is incorrect, since it does not match how the variable is accessed.  It
> says that var is addressed directly, while in fact it is accessed indirectly
> through the GOT.
> 
> GCC generates
>     DW_OP_addr  var
> which points to the unused copy of var in the shared library.  The correct
> DWARF should be
>     DW_OP_addr  var@GOT
>     DW_OP_deref

Unfortunately, that has a couple of problems:
1) we don't want to generate runtime overhead just for debugging, so the above
   would "work" only if we have some other GOT reference to that symbol in the
   code
2) on most targets we don't have suitable relocations that would give us the 
   address of the GOT slot
Even
DW_OP_addr var@GOT
DW_OP_addr _GLOBAL_OFFSET_TABLE_
DW_OP_plus
DW_OP_deref
doesn't work on x86_64: although var@GOT in that case gives the relative offset from _GLOBAL_OFFSET_TABLE_ to the GOT entry for var, the _GLOBAL_OFFSET_TABLE_ symbol is unfortunately handled specially and thus ends up as the wrong kind of relocation.