This is the mail archive of the mailing list for the glibc project.

Re: ppc64: Call to gettimeofday fails with segfault in __glink_PLTresolve because .plt0 is all zeros.

On Tue, Nov 05, 2013 at 12:56:32AM -0500, Carlos O'Donell wrote:
> # PLT call stub (note that build_plt_stub() in bfd doesn't say much about
> # why the stub does what it does so my analysis might be wrong)...
>    11dfc:       f8 41 00 28     std     r2,40(r1)
>    11e00:       e9 62 99 a8     ld      r11,-26200(r2)
>    11e04:       7d 69 03 a6     mtctr   r11
> # At this point r11 points at the address of the VDSO __kernel_gettimeofday

Which says you have had a successful resolution of that plt entry at
some point.  The PLT is .bss on powerpc64, so it starts all zero, then
the dynamic loader initialises the function entry address (first dword
of each 2 or 3 dword plt entry) to point at the glink code if doing
lazy linking, or resolves to the final addresses if LD_BIND_NOW=1.
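
A rough C model of that initialisation may help.  It is only a sketch:
struct plt_entry, init_plt, glink_stub and final are hypothetical names
made up for illustration, not glibc's actual data structures, and it
assumes the 3 dword entry layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 3-dword powerpc64 PLT entry. */
struct plt_entry {
    uint64_t entry; /* first dword: function entry address */
    uint64_t toc;   /* second dword: TOC pointer, loaded into r2 */
    uint64_t aux;   /* third dword: aux pointer */
};

/* Illustrative only: fill a zeroed PLT as described in the message.
 * glink_stub[i] is the (assumed) address of the glink code for entry i;
 * final[i] is the (assumed) fully resolved value of entry i. */
void init_plt(struct plt_entry *plt, size_t n, int lazy,
              const uint64_t *glink_stub, const struct plt_entry *final)
{
    assert(plt != NULL && glink_stub != NULL && final != NULL);
    for (size_t i = 0; i < n; i++) {
        if (lazy) {
            /* Lazy linking: only the first dword is written, so the
             * TOC and aux dwords keep their .bss value of zero. */
            plt[i].entry = glink_stub[i];
        } else {
            /* LD_BIND_NOW=1: the entry gets its final addresses. */
            plt[i] = final[i];
        }
    }
}
```

In this model a lazily initialised entry has a non-zero first dword and
a zero TOC dword, while a bound entry has both filled in.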

>    11e08:       e8 42 99 b0     ld      r2,-26192(r2)
> # At this point r2 is zero (Why?)

That is odd.

>    11e0c:       28 22 00 00     cmpldi  r2,0
> # Should be "bnectr+" but my local objdump doesn't seem to know it, though gdb does.
>    11e10:       4c e2 04 20     .long 0x4ce20420 

objdump -d -Mpower7

> # And because r2 is zero the plt stub does not jump to r11 but instead
> # calls the plt entry for gettimeofday.
>    11e14:       48 02 0e 3c     b       32c50 <gettimeofday@plt>

The test for zero r2 is for thread safety on power7.  If you're
running single threaded you should never see a zero in r2 except when
the first dword of a plt entry would take you to that glink call stub.

> # PLT entry jumps to the glink call stub:
> Dump of assembler code for function gettimeofday@plt:
>    0x00000fffb1292c50 <+0>:	li      r0,264
>    0x00000fffb1292c54 <+4>:	b       0xfffb12923d8 <__glink_PLTresolve>
> # Enter .glink0 with index 264 in r0.
> Dump of assembler code for function __glink_PLTresolve:
>    0x00000fffb12923d8 <+0>:	mflr    r12
>    0x00000fffb12923dc <+4>:	bcl     20,4*cr7+so,0xfffb12923e0 <__glink_PLTresolve+8>
>    0x00000fffb12923e0 <+8>:	mflr    r11
>    0x00000fffb12923e4 <+12>:	ld      r2,-16(r11)
>    0x00000fffb12923e8 <+16>:	mtlr    r12
>    0x00000fffb12923ec <+20>:	add     r12,r2,r11
> fffb12a0000-fffb12b0000 rw-p 00040000 fd:00 2887512 /usr/lib64/
> (gdb) x/9g $r12 - 24
> 0xfffb12a3e88:	0x00000fffb12a0e60	0x00000fffb12aa358
> 0xfffb12a3e98:	0x00000fffb12aa470	0x0000000000000000
> 0xfffb12a3ea8:	0x0000000000000000	0x0000000000000000
> 0xfffb12a3eb8:	0x00000fffb10e90d0	0x00000fffb12577d8
> 0xfffb12a3ec8:	0x00000fffb10e9100
> # The .plt0 entry is all zeros for the ip, toc, and aux pointer.

But that says that the dynamic loader hasn't yet initialised the PLT
for this shared library.

>    0x00000fffb12923f0 <+24>:	ld      r11,0(r12)
> # So r11 is zero.
>    0x00000fffb12923f4 <+28>:	ld      r2,8(r12)
> # So r2 is zero.
>    0x00000fffb12923f8 <+32>:	mtctr   r11
> # And this is a segfault.
>    0x00000fffb12923fc <+36>:	ld      r11,16(r12)
>    0x00000fffb1292400 <+40>:	bctr
>    0x00000fffb1292404 <+44>:	nop
>    0x00000fffb1292408 <+48>:	nop
>    0x00000fffb129240c <+52>:	nop
> Could it be that elf_machine_runtime_setup (dl-machine.h) never set up
> .plt0 because lazy resolution was not requested because the entire
> DSO used OPDs?
> No other function would have ever called __glink_PLTresolve because
> they will all go through their OPDs, but in this case the VDSO fails
> the PLT stub check for a non-zero toc and the stub attempts a lazy
> resolution when the dynamic loader never prepared .plt0 for it.
> Does any of that make sense?

Not a great deal. :)

> All I can say is that I'm seeing a clear failure here and it appears to
> have to do with an interaction of a DSO, an IFUNC with a VDSO symbol,
> and the dynamic loader never setting up .plt0 because it didn't need it.

My guess is that your executable or some other shared library is
calling an ifunc resolver in this shared library before the dynamic
loader has done any relocation (or plt) processing in it.  ifunc
resolvers get called during relocation processing.  It's generally a
bad idea to have non-static resolvers, unless you know a lot about the
order in which the dynamic loader will relocate a project's shared
libraries.

Even that doesn't answer how you appear to have resolved gettimeofday.

> I can confirm that under LD_DEBUG=all I don't see "(lazy)" being printed
> which means the dynamic loader is not processing relocations lazily so
> elf_machine_runtime_setup is called with lazy==0. So that serves to
> confirm my suspicion.
> I'm putting together a small test case, but I wanted to send this out
> before I spent any more time. I wanted to get a feel from either of you
> if you've seen something like this before.
> Cheers,
> Carlos.

Alan Modra
Australia Development Lab, IBM
