[PATCH] BZ #5784: Build libpthread.a with ld -r

Roland McGrath roland@hack.frob.com
Fri Sep 7 01:26:00 GMT 2012


> I think people doing such links today are already using --whole-archive to 
> get them to work, so the bloat isn't really new.  Static links generally 
> are pretty bloated - the overheads associated with static linking of 
> trivial programs are huge - but a baseline where they do at least work 
> reliably (where the testsuite failures identified by HJ are fixed, so that 
> a --disable-shared build gets clean test results) is probably a better 
> basis for reducing the bloat than one with lots of bugs.

You and I may well have different priorities.  But I think there is a
more subtle distinction here, one that is objective.

I say we should have a (long-term) goal of reducing the bloat in static
linking as much as possible (i.e. eliminating dead code from the links).
I am never going to accept an argument of, "If two wrongs don't make a
right, try three."

You're saying there's already bloat and things are broken today, so
working but even more bloated is better than broken but no more bloated.
If I accepted that premise, then it would indeed be just an issue of
priorities.

In fact, I don't think that libc/libpthread is actually broken for
static linking at all.  I haven't seen any credible claim that it is.
The only claims I see from those bug reports are that the libstdc++ code
(or whatever GCC runtime pieces it is) has assumptions about how it can
use weak references to libpthread symbols that do not jibe with the
reality of static linking.  What's broken is not in libpthread or libc.

I grant that the places where the bad assumptions are being made today
have a sticky problem to solve.  It even seems likely that they cannot
really solve it without some more help from us.  But that doesn't mean
it's us who is broken today.

Based on that analysis, I object to the notion that we should introduce
additional bloat to work around someone else's brokenness.

> That said, I'd certainly be happier if a much more in-depth analysis of 
> what the static libpthread issues are - what the cases are where only a 
> subset of objects are brought in, and which other objects being missing 
> causes problems - were posted to justify any change.  We "know" that such 
> links are broken, but don't have a clear self-contained explanation of the 
> details to justify the patch.  Maybe linking a smaller subset of objects 
> with -r would suffice, for example?

We don't need to look for a new methodology to induce gratuitous bloat
as a workaround.  We already have a bloat-inducing workaround for this
problem, in nptl/pthread_create.c's PTHREAD_STATIC_FN_REQUIRE uses.  The
simple, bloat-increasing workaround for the cited problem is to add more
symbols to that list.  HJ has collected the complete list of symbols
potentially affected this way (I assume for today's GCC trunk) in
http://sourceware.org/bugzilla/show_bug.cgi?id=5784#c2

I can probably be talked into that as a stopgap solution while we work
out and deploy something really correct with the GCC folks.  But I will
object to such changes in the absence of a plan and commitment on all
sides to solve the problem--which is a problem for users of GCC's
runtime libraries doing static linking with -lpthread caused by
implementation choices in GCC's runtime libraries, not a problem in
static linking of -lpthread per se--in a way that does not induce bloat
in all static linking of -lpthread.

I'm going to talk about g++ and libstdc++ from now on as shorthand.
But I recognize that it's actually some deeper pieces of GCC runtime
code that are used at least by some other non-C, non-C++ cases.  The
solutions I'll contemplate should be applied to the appropriate layers
of GCC stuff, not necessarily (or only) g++/libstdc++ per se.

The approach that libstdc++ is using today is based entirely on the
assumption of the shared-library semantics of using weak references.
That is, it assumes it can do:

	extern int foo (void) __attribute__ ((weak));
	extern int bar (void) __attribute__ ((weak));
	if (&foo != 0)
	  bar ();

This expects that either you get foo and bar from the same DSO,
or you have neither.  With static linking, you might have foo
but have no reason to have linked in bar, and so here you get
a positive test on &foo but a resolution of bar to zero in the call.
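The mismatch can be reproduced in a single file.  This is only a minimal
sketch, assuming an ELF target and GNU C attribute syntax.  The names
sketch_foo and sketch_bar are hypothetical stand-ins: sketch_foo plays a
symbol that did get linked (like pthread_create) and sketch_bar one that
did not.  The last function tests the symbol it is about to call rather
than a different one, which avoids the null call but is not the scheme
proposed later in this message.

```c
/* Hypothetical weak references, as a runtime library might declare them.  */
extern int sketch_foo (void) __attribute__ ((weak));
extern int sketch_bar (void) __attribute__ ((weak));

/* Only sketch_foo has a definition anywhere in this link.  In a real
   static link it would come from an archive member that happened to be
   pulled in; sketch_bar's member was never needed, so it stays undefined.  */
int
sketch_foo (void)
{
  return 1;
}

/* Nonzero if the weak reference to sketch_foo resolved to a definition.  */
int
sketch_foo_present (void)
{
  return &sketch_foo != 0;
}

/* Nonzero if the weak reference to sketch_bar resolved to a definition.  */
int
sketch_bar_present (void)
{
  return &sketch_bar != 0;
}

/* Tests the symbol that is actually called.  Returns -1 when sketch_bar
   is absent instead of jumping to address zero.  */
int
sketch_call_bar_checked (void)
{
  if (&sketch_bar != 0)
    return sketch_bar ();
  return -1;
}
```

Here the test on sketch_foo succeeds while sketch_bar resolves to zero,
which is exactly the combination the "if (&foo != 0) bar ();" pattern
does not expect.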

The existing style of workaround is to ensure that the file that defines
foo (pthread_create) also has a strong undefined reference to each bar
(pthread_mutex_lock et al) that's used that way.  But as well as
inducing bloat, this presumes the broken libstdc++ assumptions won't be
extended to new symbols used like bar.  In fact, they've already been
extended to pthread_cond_broadcast and some others that aren't in our
list, demonstrating the fragility of the workaround.

The scheme that libstdc++ employs really is just fine for dynamic
linking.  But it is fundamentally broken for static linking.  We should
not be perpetuating and tweaking the workaround for this bad design, but
instead helping come up with a replacement that doesn't have the same
fragility and bloat issues.

Off hand I've come up with one alternative scheme.

There are two pieces of the puzzle that are somewhat independent.

1. Determining threadedness.

   The existing weak reference scheme works adequately for this.
   It might be improved in a couple of ways, but this piece is not an
   immediate source of trouble.  I'll digress on the possibilities here
   for a moment, even though it's not the paramount issue.

   There are two definitions of "threadedness" that might be useful
   here.  I don't know enough off hand about the libstdc++ use of the
   pthread calls to be sure whether its correctness depends on which
   definition it uses.

   a. "static threadedness" means that the program, as linked, might
      ever create a second thread.  This is "static" in the sense that
      it is fixed at link time.  This is what libstdc++ tests today.

      There is a further wrinkle here, the possibility of what might be
      termed "dynamic static threadedness".  That is, a program linked
      without -lpthread might use dlopen (or equivalent via NSS modules
      or whatnot) to load libpthread.so (or another DSO that depends on
      it), and then be able to create a thread.  The existing scheme
      doesn't account for this possibility, and I am not aware of anyone
      having complained, so perhaps nobody actually cares.

   b. "dynamic threadedness" means that the program, running right now,
      actually has more than one thread.  (To simplify matters, we'll
      allow that it might be easier to consider a process "dynamically
      threaded" if it ever had more than one thread, even if only one is
      still alive now.)  This is what we test in libc/libpthread
      internals for various things like cancellation handling.

      We could provide an entrypoint for testing this dynamically with a
      call, e.g., bool __glibc_is_threaded (void).  The call is slightly
      more costly than the test of a constant (immediate for static
      linking, loaded GOT entry for dynamic linking) we have today, but
      probably worthwhile if it avoids thread-support overhead in a
      pthread-enabled program that happens to have only one thread.

   Testing b. is certainly preferable if it's sufficient.  It's not
   sufficient if e.g. libstdc++ uses the test to decide how to initialize
   some object that may be used in a multi-threaded way later, when
   additional threads may have been created between the initialization
   and the use.
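
   A minimal sketch of what a definition-b. test could look like, using
   the simplification that a process counts as threaded once it has ever
   had a second thread.  All names here are hypothetical; in particular
   sketch_is_threaded only stands in for the __glibc_is_threaded idea,
   and sketch_note_thread_created is an assumed hook that a
   thread-creation path would call.  Nothing here reflects how libc
   actually tracks this internally.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical sticky flag: set once, never cleared, so the process
   stays "dynamically threaded" even if only one thread is still alive.  */
static atomic_bool sketch_ever_threaded;

/* Assumed hook, called by the thread-creation path before a second
   thread starts running.  */
void
sketch_note_thread_created (void)
{
  atomic_store_explicit (&sketch_ever_threaded, true, memory_order_release);
}

/* Stand-in for the proposed entry point: a call rather than a test of
   a link-time constant.  */
bool
sketch_is_threaded (void)
{
  return atomic_load_explicit (&sketch_ever_threaded, memory_order_acquire);
}
```

   A caller would consult sketch_is_threaded () at each use, which is
   why this form is not enough for code that decides once, at
   initialization time, how an object will be used later.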

2. Referring to pthread functions in code that's dead in the
   single-threaded case.

   When the code is dead, it doesn't matter how it behaves.  That's why
   the DSO -lpthread case works today: the references are resolved to
   zero at dynamic link time and thus would crash if calls were ever
   made.

   Weak references are not really what you want here.  What you want are
   *conditional* references: conditional on the #1 test.

   The g++ driver already supports a magical -pthread option whose
   (important) effect is to add -lpthread to the link *after* the
   implicit -lstdc++.  So, a link not using pthreads looks like:
	ld program.o -lstdc++ -lgcc -lc
   while a link using pthreads looks like:
	ld program.o -lstdc++ -lpthread -lgcc -lc
   (I simplified a bit, but that's close enough for what matters here.)

   So here's a scheme:

   Instead of calling pthread_foo with a weak reference, libstdc++ calls
   __glibc_maybe_pthread_foo with a normal (strong) reference.
   libpthread defines __glibc_maybe_pthread_foo as an alias for pthread_foo.
   A link with -lpthread works just fine, equivalent to today.  But
   actually that link was:
	ld program.o -lstdc++ -lpthread -lgcc -lc -lgcc_thread_stubs
   (The order of -lgcc_thread_stubs relative to -lgcc and -lc doesn't
   really matter--it just has to be after -lpthread.)  Since -lpthread
   defined __glibc_maybe_pthread_*, all those references were satisfied
   and there was never any need to look at -lgcc_thread_stubs.  A link
   without -lpthread becomes:
	ld program.o -lstdc++ -lgcc -lc -lgcc_thread_stubs
   Now we'd get to -lgcc_thread_stubs, which would define each
   __glibc_maybe_pthread_foo symbol as a stub.  It might be a C function
   that calls abort, or it might be:
	.globl __glibc_maybe_pthread_foo
	__glibc_maybe_pthread_foo = 0
   It doesn't really matter how they're defined, since the only uses of
   them will be in dead code.  It just matters that they are indeed
   defined, so that the strong references never produce link-time errors.
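
   A rough C sketch of the two sides of this scheme, shown in one file
   only so that it links on its own.  The int (void) signature for
   __glibc_maybe_pthread_foo is made up for illustration, and
   sketch_threaded and sketch_library_operation are hypothetical names.
   In a real deployment the stub would live in a member of the
   -lgcc_thread_stubs archive and the caller in libstdc++.

```c
#include <stdlib.h>

/* Stand-in for the #1 threadedness test.  Hypothetical; always zero
   here, as in a link without -lpthread.  */
static int sketch_threaded = 0;

/* What a stub archive member might contain.  It is reachable only from
   code that is dead when there are no threads, so aborting is fine.  */
int
__glibc_maybe_pthread_foo (void)
{
  abort ();
}

/* Library-side caller: an ordinary strong reference, guarded by the
   threadedness test rather than by the address of the symbol.  */
int
sketch_library_operation (void)
{
  if (sketch_threaded)
    return __glibc_maybe_pthread_foo ();
  return 0;  /* single-threaded path: nothing to lock */
}
```

   With -lpthread earlier on the link line, its alias for pthread_foo
   would satisfy the reference first and the stub member would never be
   extracted from the archive.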

   One issue with this scheme is that libstdc++ now always uses
   __glibc_maybe_pthread_foo instead of pthread_foo, so you cannot
   interpose a different library that defines pthread_foo itself for
   some wrapper-instrumentation sort of purpose.  A slight alternative
   that avoids that issue but perhaps has other bad effects I'm not
   thinking of at the moment, is to simply use strong references to
   pthread_foo in libstdc++ and have -lgcc_thread_stubs define
   stubs named pthread_foo.

   An alternative that's more different (and off hand seems vaguely
   worse to me), but perhaps is better in some way or other, is to make
   sure that libc defines redirector functions for every pthread_foo
   that libstdc++ calls.  Then if -lpthread is missing, the strong
   references are resolved to the libc definitions, which normally do
   nothing.  This alternative has the advantage that, if paired with a
   b. solution for #1, it also correctly handles "dynamic static
   threadedness".
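
   The redirector alternative might be sketched as below.  This is an
   assumption-laden illustration, not a description of existing libc
   code: every name is hypothetical, the int (void *) signature is
   invented, and sketch_real_lock merely stands in for whatever the
   thread library would provide.

```c
#include <stddef.h>

/* Hypothetical signature for one forwarded pthread-style function.  */
typedef int (*sketch_lock_fn) (void *mutex);

/* Default target while no thread library is present: do nothing.  */
static int
sketch_lock_noop (void *mutex)
{
  (void) mutex;
  return 0;
}

static sketch_lock_fn sketch_lock_target = sketch_lock_noop;

/* Assumed hook that a thread library would call when it is loaded,
   including a late load through dlopen.  */
void
sketch_install_lock (sketch_lock_fn real)
{
  sketch_lock_target = (real != NULL) ? real : sketch_lock_noop;
}

/* The redirector that callers reference strongly; always defined.  */
int
sketch_mutex_lock (void *mutex)
{
  return sketch_lock_target (mutex);
}

/* Stand-in for the real implementation.  It counts calls so that the
   redirection is observable.  */
static int sketch_real_calls;

int
sketch_real_lock (void *mutex)
{
  (void) mutex;
  return ++sketch_real_calls;
}
```

   Because the target can be swapped after startup, a program that
   gains threads only through a dlopen'd library would still reach the
   real function.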

I don't claim to have figured out every aspect of this yet, but what
I've proposed avoids the pitfalls of the current libstdc++ strategy
without inducing bloat.  (It even allows us to get rid of the gratuitous
bloat we already have induced in pthread_create.c today, if built to
assume a new libstdc++.)

I want people concerned with this bug to get serious about considering
fleshing out something along these lines, rather than only looking for
kludges to induce the bloat that works around libstdc++'s existing bad
assumptions.


Thanks,
Roland


