This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Shared library call problems on PowerPC with current binutils/gdb
- From: "Ulrich Weigand" <uweigand at de dot ibm dot com>
- To: gdb-patches at sourceware dot org
- Cc: bauerman at br dot ibm dot com, amodra at bigpond dot net dot au
- Date: Tue, 29 Apr 2008 00:53:58 +0200 (CEST)
- Subject: Shared library call problems on PowerPC with current binutils/gdb
Hello,
using current binutils and gdb head on powerpc-linux, you see the
following quite annoying effect(s) with shared library calls.
I'm starting out with a simple "hello world" program (compiled with
-ffreestanding to avoid the compiler optimizing the printf to puts),
and the following debugging session:
>GNU gdb 6.8.50.20080408
>Copyright (C) 2008 Free Software Foundation, Inc.
>License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>This is free software: you are free to change and redistribute it.
>There is NO WARRANTY, to the extent permitted by law. Type "show copying"
>and "show warranty" for details.
>This GDB was configured as "powerpc64-linux"...
>(gdb) break printf
>Function "printf" not defined.
>Make breakpoint pending on future shared library load? (y or [n]) n
Hmmm. Ok, maybe because libc is not loaded yet ...
>(gdb) start
>Breakpoint 1 at 0x10000474: file hello.c, line 6.
>Starting program: /home/uweigand/a.out
>main () at hello.c:6
>6 printf ("Hello, world!\n");
>(gdb) info sharedlibrary
>From        To          Syms Read   Shared Object Library
>0x0ffc1960  0x0ffda700  Yes         /lib/ld.so.1
>0x0fe3db20  0x0ff5e230  Yes         /lib/libc.so.6
>(gdb) break printf
>Function "printf" not defined.
>Make breakpoint pending on future shared library load? (y or [n]) n
Well, libc is definitely loaded now, but I still cannot set a
breakpoint ... Maybe stepping in works?
>(gdb) s
>0x10000800 in call___do_global_ctors_aux ()
Not really.
>(gdb) start
>The program being debugged has been started already.
>Start it from the beginning? (y or n) y
>Breakpoint 2 at 0x10000474: file hello.c, line 6.
>Starting program: /home/uweigand/a.out
>main () at hello.c:6
>6 printf ("Hello, world!\n");
>(gdb) n
>0x10000800 in call___do_global_ctors_aux ()
Huh. Not even stepping over works ...
>(gdb) bt
>#0 0x10000800 in call___do_global_ctors_aux ()
>#1 0x0fe3de0c in generic_start_main () from /lib/libc.so.6
>#2 0x0fe3e060 in __libc_start_main () from /lib/libc.so.6
>#3 0x00000000 in ?? ()
... and I guess that's the reason why.
>(gdb) n
>Single stepping until exit from function call___do_global_ctors_aux,
>which has no line number information.
>0x0ffd544c in _dl_runtime_resolve () from /lib/ld.so.1
>(gdb)
>Single stepping until exit from function _dl_runtime_resolve,
>which has no line number information.
>0x0fe75820 in printf@@GLIBC_2.4 () from /lib/libc.so.6
>(gdb)
>Single stepping until exit from function printf@@GLIBC_2.4,
>which has no line number information.
>Hello, world!
>main () at hello.c:7
>7 }
Also, it's quite tedious to get back.
So, what's going on here? It looks like a combination of multiple
different problems.
1) Setting a breakpoint on a shared library function (that is called by
the main program) before libraries are loaded used to work because
of the so-called "solib trampoline" minimal symbols.
These are generated by GDB when it finds an *undefined* symbol in
the dynamic symbol table with a *non-zero* value. Those are a special
"hack" used by the linker to implement function pointer comparison
correctly; their "value" points to the PLT call stub in the main
executable used to call the shared library function.
However, over time BFD has been optimized to only use this hack when
it is actually necessary, i.e. when the symbol is in fact used for
purposes of function pointer comparisons. In simple cases like this
where the function is just called, the value of the undefined symbol
is now always 0 when using current binutils.
On the other hand, BFD now provides "synthetic symbols" that point to
those same PLT call stubs (on many targets). In fact, the synthetic
symbol "printf@plt" is actually defined. However, elfread.c does not
consider this to be a "solib trampoline" symbol for printf.
Even if it did, there is an additional complication on PowerPC:
when using the new-style "secure" PLTs, the "printf@plt" entry point
actually points to a *data* variable holding a pointer to the "glink"
stub, not the original PLT call stub itself.
This could still work, as ppc-linux-tdep.c actually contains code to
treat this as a case of "function descriptors" and would resolve to
the real target. However, "break printf" actually still wouldn't work
because linespec.c:minsym_found does not handle function descriptors ...
2) What about when libc is already loaded? Why is printf still not found?
This is because the (static) symbol table of libc.so on current powerpc
systems does not contain a symbol "printf", only "printf@@GLIBC_2.4" and
"printf@GLIBC_2.0". This is because of symbol versioning needed to handle
both 128-bit and 64-bit long double types.
Now, the *dynamic* table *does* contain "printf" (twice, with different
version information), but elfread.c ignores the dynamic table "as the
dynamic symbol table is usually a subset of the main symbol table."
Note that even if full debug information for libc.so is available, we
do not get a debug symbol "printf" either -- the two entry points are
called __printf and __nldbl_printf in the original source, and that's
what debug symbols show.
3) As to stepping in and/or over the printf call, the primary reason why
this doesn't work is that unwinding breaks. The immediate target of
the call instruction is a PLT call stub; these stubs are part of the
.text section (when using the secure PLT scheme) and have no symbols.
The immediately preceding symbol happens to be call___do_global_ctors_aux
from GCC's crtend.o, which is compiled with -finhibit-size-directive,
so that function is considered by GDB to span until the end of .text.
Thus prologue parsing of call___do_global_ctors_aux detects building of
a stack frame, and GDB assumes that the PLT call stubs have one --
which they really don't.
4) Even if that worked, stepping *into* the call would require support
for gdbarch_skip_trampoline_code, and the current ppc32 implementation
of that (for secure PLTs) simply calls find_solib_trampoline_target
-- which requires the old-style "solib trampoline" symbols to work.
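To make point 3 concrete, here is a toy model (not GDB's actual lookup code)
of how an address can end up attributed to the preceding symbol when that
symbol has no recorded size. The start addresses and the size of main are
made up for illustration; only 0x10000800 and 0x10000474 come from the
session above.

```python
import bisect

# (start address, name, size). A size of 0 models a symbol emitted without
# a size, as with -finhibit-size-directive. All start addresses and the
# size of main are hypothetical values chosen for illustration.
SYMBOLS = sorted([
    (0x10000460, "main", 0x40),
    (0x100007b0, "call___do_global_ctors_aux", 0),
])


def lookup_function(pc, symbols=SYMBOLS):
    """Return the name of the symbol taken to contain pc, or None."""
    starts = [start for start, _, _ in symbols]
    i = bisect.bisect_right(starts, pc) - 1
    if i < 0:
        return None  # pc lies below the first symbol
    start, name, size = symbols[i]
    if size and pc >= start + size:
        return None  # past the end of a symbol with a known size
    return name  # an unsized symbol is assumed to extend onwards
```

With this model, lookup_function(0x10000800) returns
"call___do_global_ctors_aux", even though that address would really belong
to a symbol-less PLT call stub.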
To fix this, I'd propose the following implementation steps:
- Extend elf_symtab_read to treat a synthetic symbol XXX@plt as a
mst_solib_trampoline symbol for XXX.
- Change elf_symtab_read to register symbols that carry a version name
simply under their base name.
- Include something along the lines of Markus' "multiply-defined
symbol" patch so that "break printf" will break on *both* definitions
(created by the two symbols with different version names) after libc
has been loaded.
- Teach minsym_found about function descriptors.
- Add extra unwinders to ppc-linux-tdep that recognize the various
PLT call and glink stubs, and properly treat them as frameless.
(This will probably require code reading ...)
- Extend ppc_skip_trampoline_code to likewise handle those stubs.
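The first three steps can be sketched roughly as follows. This is only a
Python model of the intended naming behaviour, not a patch against
elfread.c; the function names classify and build_index and the kind labels
are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def classify(raw_name):
    """Map a raw ELF symbol name to (kind, base name, version)."""
    if raw_name.endswith("@plt"):
        # Synthetic PLT symbol: treat it as a trampoline for the base name.
        return ("solib_trampoline", raw_name[: -len("@plt")], None)
    base, sep, version = raw_name.partition("@")
    if sep:
        # "printf@@GLIBC_2.4" (default) or "printf@GLIBC_2.0" (non-default).
        return ("text", base, version.lstrip("@"))
    return ("text", raw_name, None)


def build_index(raw_names):
    """Index symbols by base name, keeping every definition."""
    index = defaultdict(list)
    for raw in raw_names:
        kind, base, version = classify(raw)
        index[base].append((kind, raw, version))
    return index
```

Looking up "printf" in an index built from "printf@plt",
"printf@@GLIBC_2.4" and "printf@GLIBC_2.0" then yields one trampoline entry
and both versioned definitions, which is what "break printf" would need in
order to break on both.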
Does this look reasonable? Am I overlooking anything?
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
Ulrich.Weigand@de.ibm.com