Bug 21847 - ppc64le: expected localentry:0 `pthread_…' ld.so error prevents allocation startup
Summary: ppc64le: expected localentry:0 `pthread_…' ld.so error prevents allocation st...
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld (show other bugs)
Version: 2.29
: P2 normal
Target Milestone: ---
Assignee: Alan Modra
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-07-27 09:51 UTC by Florian Weimer
Modified: 2017-08-28 08:05 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Florian Weimer 2017-07-27 09:51:51 UTC
I don't know if this is a glibc bug or not.  On ppc64le with recent binutils 2.29, applicaitons fail to start with errors like:

./bin/rustc: error while loading shared libraries: ./lib/libstd-c3a1748e15265da7.so: expected localentry:0 `pthread_condattr_destroy'
/usr/bin/cmake: error while loading shared libraries: /lib64/libcurl.so.4: expected localentry:0 `pthread_mutex_destroy'
./dsimpletest: error while loading shared libraries: ../lib/libmumps_common-5.1.1.so: expected localentry:0 `pthread_cond_init'

The common theme seems to be that this only affects symbols which live both in libpthread and libc.
Comment 1 Andreas Schwab 2017-07-27 12:40:50 UTC
2.29 is the first release supporting PPC64_OPT_LOCALENTRY, so it seem like this is still buggy.
Comment 2 Florian Weimer 2017-07-27 14:10:35 UTC
List thread: https://sourceware.org/ml/libc-alpha/2017-07/msg00861.html
Comment 3 Alan Modra 2017-07-28 09:31:39 UTC
It may be that the localentry:0 optimisation needs to be turned off by default in ld.  Which is a shame, because it gives a nice speedup for plt calls to many small leaf functions.
Comment 4 Alan Modra 2017-07-28 10:00:39 UTC
The problem is this:

amodra@pike:~/build/glibc$ readelf -s ./libc.so.6 | grep pthread_condattr_destroy
  1344: 000000000012e420   108 FUNC    GLOBAL DEFAULT [<localentry>: 8]    10 pthread_condattr_destroy@@GLIBC_2.17
  5940: 000000000012e420   108 FUNC    GLOBAL DEFAULT [<localentry>: 8]    10 pthread_condattr_destroy
amodra@pike:~/build/glibc$ readelf -s ./nptl/libpthread.so.0 | grep pthread_condattr_destroy
   299: 0000000000011f70    20 FUNC    GLOBAL DEFAULT   11 pthread_condattr_destroy@@GLIBC_2.17
   181: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS pthread_condattr_destroy.
   694: 0000000000011f70    20 FUNC    GLOBAL DEFAULT   11 pthread_condattr_destroy

libpthread.so.0 has a stub implementation of pthread_condattr_destroy which consists of just
  ld r3,0
  blr

while libc.so.6 has the full implementation.  Users link against libpthread.so which means ld sees the stub symbol first before finding the real implementation in libc.so.  By the usual rules of duplicate symbol resolution involving shared libraries, ld must take the first symbol as the definition.  The stub symbol of course doesn't need a toc pointer so the optimised localentry:0 call is used.  At runtime, ld.so apparently resolves pthread_condattr_destroy to the libc version, which isn't localentry:0.

I'm not familiar with the reasons for glibc having two conflicting implementations for pthread functions, but I believe the correct fix is to make the libpthread.so stub symbols weak.
Comment 5 Florian Weimer 2017-07-28 10:11:07 UTC
(In reply to Alan Modra from comment #4)
> The problem is this:
> 
> amodra@pike:~/build/glibc$ readelf -s ./libc.so.6 | grep
> pthread_condattr_destroy
>   1344: 000000000012e420   108 FUNC    GLOBAL DEFAULT [<localentry>: 8]   
> 10 pthread_condattr_destroy@@GLIBC_2.17
>   5940: 000000000012e420   108 FUNC    GLOBAL DEFAULT [<localentry>: 8]   
> 10 pthread_condattr_destroy
> amodra@pike:~/build/glibc$ readelf -s ./nptl/libpthread.so.0 | grep
> pthread_condattr_destroy
>    299: 0000000000011f70    20 FUNC    GLOBAL DEFAULT   11
> pthread_condattr_destroy@@GLIBC_2.17
>    181: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS
> pthread_condattr_destroy.
>    694: 0000000000011f70    20 FUNC    GLOBAL DEFAULT   11
> pthread_condattr_destroy
> 
> libpthread.so.0 has a stub implementation of pthread_condattr_destroy which
> consists of just
>   ld r3,0
>   blr

But this is the full implementation.  The stub implementation is in libc.so.6.  I agree that it is bizarre that the stub implementation does more work than the real implementation, but that's just an artifact of pthread_condattr_destroy.

The problem also happens with pthread_mutex_destroy, where there is actually some code in libpthread.

> while libc.so.6 has the full implementation.  Users link against
> libpthread.so which means ld sees the stub symbol first before finding the
> real implementation in libc.so.  By the usual rules of duplicate symbol
> resolution involving shared libraries, ld must take the first symbol as the
> definition.  The stub symbol of course doesn't need a toc pointer so the
> optimised localentry:0 call is used.  At runtime, ld.so apparently resolves
> pthread_condattr_destroy to the libc version, which isn't localentry:0.

How does this optimization work?  Does the link editor detect that the implementation in libpthread does not require the TOC pointer and then activates the optimization?  What happens if we change the implementation later, so that it needs a TOC pointer?

I still think this optimization has a general issue with symbol interposition.

> I'm not familiar with the reasons for glibc having two conflicting
> implementations for pthread functions, but I believe the correct fix is to
> make the libpthread.so stub symbols weak.

That seems wrong because the libpthread implementation is the real (full) implementation.
Comment 6 Alan Modra 2017-07-28 11:28:28 UTC
> How does this optimization work?  Does the link editor detect that the
> implementation in libpthread does not require the TOC pointer and then
> activates the optimization?

Yes.

> What happens if we change the implementation later, so that it needs a TOC
> pointer?

That could be solved by versioning.

> I still think this optimization has a general issue with symbol interposition.

True.

> That seems wrong because the libpthread implementation is the real (full)
> implementation.

OK, so I had it back to front.  How is it that an executable or shared library linked against libpthread.so.0 ends up not using the libpthread.so.0 symbols at run time?
Comment 7 Jakub Jelinek 2017-07-28 16:36:58 UTC
(In reply to Alan Modra from comment #6)
> > How does this optimization work?  Does the link editor detect that the
> > implementation in libpthread does not require the TOC pointer and then
> > activates the optimization?
> 
> Yes.

Note that in that case I think it can be even exact compiler version and optimization options that affect ABI, say with -O0 or -O1 the compiler doesn't optimize some dead, but harder to be proven as dead, access to a global variable, while -O2 or -O3 or whatever other option could optimize that away and make the function not using r2 anymore.

So like that the localentry change is useful only for the development models where any single change causes rebuild of all the binaries and shared libraries and then nothing changes until following whole world rebuild.

What could work is a shared library providing a hint to the dynamic linker, this function doesn't need or clobber r2, say in some .dynsym flags, and only the dynamic linker then using that information as a hint whether to use a more or less efficient PLT sequence when calling certain symbol.  But it has to be something only determined at runtime, there can't be anything decided at static link time (because then some property that is not part of ABI is inlined into binaries or shared libraries otherwise).
Comment 8 Alan Modra 2017-07-31 23:56:43 UTC
Binutils commits 8b5f1ed8 and d44c746a addressed this bug on the binutils side.  There is a gold fix yet to be committed.
Comment 9 cvs-commit@gcc.gnu.org 2017-08-28 07:00:22 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=565ed01a4e0e3584f24580177822a5271b1c0c8b

commit 565ed01a4e0e3584f24580177822a5271b1c0c8b
Author: Alan Modra <amodra@gmail.com>
Date:   Mon Aug 28 16:27:33 2017 +0930

    [GOLD] Symbol flag for PowerPC64 localentry:0 tracking
    
    This patch provides a flag for PowerPC64 ELFv2 use in class Symbol,
    and modifies Sized_target::resolve to return whether the symbol has
    been resolved.  If not, normal processing continues.  I use this for
    PowerPC64 ELFv2 to keep track of whether a symbol has any definition
    with non-zero localentry, in order to disable --plt-localentry for
    that symbol.
    
    	PR 21847
    	* powerpc.cc (Target_powerpc::is_elfv2_localentry0): Test
    	non_zero_localentry.
    	(Target_powerpc::resolve): New function.
    	(powerpc_info): Set has_resolve for 64-bit.
    	* target.h (Sized_target::resolve): Return bool.
    	* resolve.cc (Symbol_table::resolve): Continue with normal
    	processing when target resolve returns false.
    	* symtab.h (Symbol::non_zero_localentry, set_non_zero_localentry):
    	New accessors.
    	(Symbol::non_zero_localentry_): New flag bit.
    	* symtab.cc (Symbol::init_fields): Init non_zero_localentry_.
Comment 10 cvs-commit@gcc.gnu.org 2017-08-28 07:59:56 UTC
The binutils-2_29-branch branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=188f4b19708f85539da85c8cc722589edd32389b

commit 188f4b19708f85539da85c8cc722589edd32389b
Author: Alan Modra <amodra@gmail.com>
Date:   Mon Aug 28 16:27:33 2017 +0930

    [GOLD] Symbol flag for PowerPC64 localentry:0 tracking
    
    This patch provides a flag for PowerPC64 ELFv2 use in class Symbol,
    and modifies Sized_target::resolve to return whether the symbol has
    been resolved.  If not, normal processing continues.  I use this for
    PowerPC64 ELFv2 to keep track of whether a symbol has any definition
    with non-zero localentry, in order to disable --plt-localentry for
    that symbol.
    
    	PR 21847
    	* powerpc.cc (Target_powerpc::is_elfv2_localentry0): Test
    	non_zero_localentry.
    	(Target_powerpc::resolve): New function.
    	(powerpc_info): Set has_resolve for 64-bit.
    	* target.h (Sized_target::resolve): Return bool.
    	* resolve.cc (Symbol_table::resolve): Continue with normal
    	processing when target resolve returns false.
    	* symtab.h (Symbol::non_zero_localentry, set_non_zero_localentry):
    	New accessors.
    	(Symbol::non_zero_localentry_): New flag bit.
    	* symtab.cc (Symbol::init_fields): Init non_zero_localentry_.
Comment 11 Alan Modra 2017-08-28 08:05:56 UTC
Fixed.