This is the mail archive of the
mailing list for the glibc project.
Re: ppc64: Call to gettimeofday fails with segfault in __glink_PLTresolve because .plt0 is all zeros.
- From: Alan Modra <amodra at gmail dot com>
- To: Carlos O'Donell <carlos at redhat dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>, Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>
- Date: Wed, 6 Nov 2013 01:33:31 +1030
- Subject: Re: ppc64: Call to gettimeofday fails with segfault in __glink_PLTresolve because .plt0 is all zeros.
- Authentication-results: sourceware.org; auth=none
- References: <52788890 dot 7080608 at redhat dot com>
On Tue, Nov 05, 2013 at 12:56:32AM -0500, Carlos O'Donell wrote:
> # PLT call stub (note that build_plt_stub() in bfd doesn't say much about
> # why the stub does what it does so my analysis might be wrong)...
> 11dfc: f8 41 00 28 std r2,40(r1)
> 11e00: e9 62 99 a8 ld r11,-26200(r2)
> 11e04: 7d 69 03 a6 mtctr r11
> # At this point r11 points at the address of the VDSO __kernel_gettimeofday
Which says you have had a successful resolution of that plt entry at
some point. The PLT is .bss on powerpc64, so it starts all zero then
ld.so initialises the function entry address (first dword of each 2 or
3 dword plt entry) to point at the glink code if doing lazy linking or
resolves to the final addresses if LD_BIND_NOW=1.
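
As a rough illustration of the start-up behaviour described above, here is a minimal C sketch. It is a toy model, not glibc's code: the struct layout, plt_setup(), GLINK_STUB_BASE and GLINK_STUB_SIZE are names and values made up for this example, and the assumption that only the first dword is written in the lazy case comes from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one 3-dword PLT entry. */
struct plt_entry {
    uint64_t entry;  /* first dword: function entry address */
    uint64_t toc;    /* second dword: value the call stub loads into r2 */
    uint64_t aux;    /* third dword: aux pointer */
};

/* Made-up addresses, for illustration only. */
#define GLINK_STUB_BASE 0x1000u
#define GLINK_STUB_SIZE 8u

/*
 * Model of PLT initialisation at start-up.  The PLT lives in .bss, so
 * the caller is expected to pass in zeroed entries.
 *   lazy != 0: only the first dword is pointed at the glink code;
 *              the toc dword stays zero.
 *   lazy == 0: every entry gets its final resolved values
 *              (the LD_BIND_NOW=1 case).
 */
void plt_setup(struct plt_entry *plt, const struct plt_entry *final_addrs,
               size_t n, int lazy)
{
    assert(plt != NULL && final_addrs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (lazy)
            plt[i].entry = GLINK_STUB_BASE + i * GLINK_STUB_SIZE;
        else
            plt[i] = final_addrs[i];
    }
}
```

In this model a non-zero first dword with a zero second dword only happens for an entry that still points at the glink code.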
> 11e08: e8 42 99 b0 ld r2,-26192(r2)
> # At this point r2 is zero (Why?)
That is odd.
> 11e0c: 28 22 00 00 cmpldi r2,0
> # Should be "bnectr+" but my local objdump doesn't seem to know it, though gdb does.
> 11e10: 4c e2 04 20 .long 0x4ce20420
objdump -d -Mpower7
> # And because r2 is zero the plt stub does not jump to r11 but instead
> # calls the plt entry for gettimeofday.
> 11e14: 48 02 0e 3c b 32c50 <gettimeofday@plt>
The test for zero r2 is for thread safety on power7. If you're
running single threaded you should never see a zero in r2 except when
the first dword of a plt entry would take you to that glink call stub.
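
The control flow of the quoted call stub can be sketched in C as follows. This is only a model of the instructions shown in the disassembly above; struct regs, plt_call_stub() and the glink_entry parameter are hypothetical names introduced for the example, and the initial save of r2 to the stack is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one 3-dword PLT entry. */
struct plt_entry {
    uint64_t entry;
    uint64_t toc;
    uint64_t aux;
};

/* Only the registers the stub touches. */
struct regs {
    uint64_t r2;
    uint64_t r11;
    uint64_t ctr;
};

/*
 * Model of the PLT call stub.  Returns the address that control is
 * transferred to: the loaded entry address if the toc dword is
 * non-zero, otherwise glink_entry (the <sym@plt> branch target).
 */
uint64_t plt_call_stub(const struct plt_entry *e, struct regs *r,
                       uint64_t glink_entry)
{
    assert(e != NULL && r != NULL);
    r->r11 = e->entry;   /* ld     r11,off(r2)   */
    r->ctr = r->r11;     /* mtctr  r11           */
    r->r2 = e->toc;      /* ld     r2,off+8(r2)  */
    if (r->r2 != 0)      /* cmpldi r2,0; bnectr+ */
        return r->ctr;
    return glink_entry;  /* b      <sym@plt>     */
}
```

With an entry whose first dword is resolved but whose toc dword is zero, which is the state reported in the quoted analysis, the model falls through to the glink branch even though ctr already holds the resolved address.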
> # PLT entry jumps to the glink call stub:
> Dump of assembler code for function gettimeofday@plt:
> 0x00000fffb1292c50 <+0>: li r0,264
> 0x00000fffb1292c54 <+4>: b 0xfffb12923d8 <__glink_PLTresolve>
> # Enter .glink0 with index 264 in r0.
> Dump of assembler code for function __glink_PLTresolve:
> 0x00000fffb12923d8 <+0>: mflr r12
> 0x00000fffb12923dc <+4>: bcl 20,4*cr7+so,0xfffb12923e0 <__glink_PLTresolve+8>
> 0x00000fffb12923e0 <+8>: mflr r11
> 0x00000fffb12923e4 <+12>: ld r2,-16(r11)
> 0x00000fffb12923e8 <+16>: mtlr r12
> 0x00000fffb12923ec <+20>: add r12,r2,r11
> fffb12a0000-fffb12b0000 rw-p 00040000 fd:00 2887512 /usr/lib64/librpmio.so.1.0.0
> (gdb) x/9g $r12 - 24
> 0xfffb12a3e88: 0x00000fffb12a0e60 0x00000fffb12aa358
> 0xfffb12a3e98: 0x00000fffb12aa470 0x0000000000000000
> 0xfffb12a3ea8: 0x0000000000000000 0x0000000000000000
> 0xfffb12a3eb8: 0x00000fffb10e90d0 0x00000fffb12577d8
> 0xfffb12a3ec8: 0x00000fffb10e9100
> # The .plt0 entry is all zeros for the ip, toc, and aux pointer.
But that says that ld.so hasn't yet initialised the PLT for this library.
> 0x00000fffb12923f0 <+24>: ld r11,0(r12)
> # So r11 is zero.
> 0x00000fffb12923f4 <+28>: ld r2,8(r12)
> # So r2 is zero.
> 0x00000fffb12923f8 <+32>: mtctr r11
> # And this is a segfault.
> 0x00000fffb12923fc <+36>: ld r11,16(r12)
> 0x00000fffb1292400 <+40>: bctr
> 0x00000fffb1292404 <+44>: nop
> 0x00000fffb1292408 <+48>: nop
> 0x00000fffb129240c <+52>: nop
> Could it be that elf_machine_runtime_setup (dl-machine.h) never set up
> .plt0 because lazy resolution was not requested because the entire
> DSO used OPDs?
> No other function would have ever called __glink_PLTresolve because
> they will all go through their OPDs, but in this case the VDSO fails
> the PLT stub check for a non-zero toc and the stub attempts a lazy
> resolution when the dynamic loader never prepared .plt0 for it.
> Does any of that make sense?
Not a great deal. :)
> All I can say is that I'm seeing a clear failure here and it appears to
> have to do with an interaction of a DSO, an IFUNC with a VDSO symbol,
> and the dynamic loader never setting up .plt0 because it didn't need it.
My guess is that your executable or some other shared library is
calling an ifunc resolver in librpmio.so.1, before any relocation (or
plt) processing in librpmio.so.1. ifunc resolvers get called during
relocation processing. It's generally a bad idea to have non-static
resolvers, unless you know a lot about the order in which ld.so will
relocate a project's shared libraries.
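
To make the ordering hazard concrete, here is a small C sketch. It is a toy model of the guess above, not a description of how ld.so actually works: struct lib, resolver_in, INITIALISED_ENTRY and resolver_ran_too_early() are all invented for illustration, and each library is reduced to a single PLT dword.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a shared library during start-up. */
struct lib {
    uint64_t plt_entry;  /* one PLT dword; .bss, so zero when loaded */
    int resolver_in;     /* index of the library whose ifunc resolver
                            this library's relocations call, or -1 */
};

#define INITIALISED_ENTRY 0x1000u  /* made-up non-zero value */

/*
 * Process libraries in the given order.  Returns true if an ifunc
 * resolver was run inside a library whose PLT was still all zero,
 * i.e. before that library had been processed.
 */
bool resolver_ran_too_early(struct lib *libs, const int *order, size_t n)
{
    bool early = false;

    assert(libs != NULL && order != NULL);
    for (size_t i = 0; i < n; i++) {
        struct lib *l = &libs[order[i]];

        l->plt_entry = INITIALISED_ENTRY;  /* PLT set up for this library */
        if (l->resolver_in >= 0 && libs[l->resolver_in].plt_entry == 0)
            early = true;                  /* resolver sees a zero PLT */
    }
    return early;
}
```

If library 0 needs a resolver that lives in library 1, the order {0, 1} hits the zero PLT in this model while {1, 0} does not.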
Even that doesn't answer how you appear to have resolved gettimeofday.
> I can confirm that under LD_DEBUG=all I don't see "(lazy)" being printed
> which means the dynamic loader is not processing relocations lazily so
> elf_machine_runtime_setup is called with lazy==0. So that serves to
> confirm my suspicion.
> I'm putting together a small test case, but I wanted to send this out
> before I spent any more time. I wanted to get a feel from either of you
> if you've seen something like this before.
Australia Development Lab, IBM