Semantics of a common definition in an archive
Ali Bahrami
Ali.Bahrami@Oracle.COM
Wed Aug 19 16:46:29 GMT 2020
On 8/18/20 11:23 PM, Fangrui Song wrote:
> Thanks, Alan! Nick's 3 commits around 1999-12-10 introduced the Solaris ld
> behavior.
>
> .....I have to complain that the Solaris/HP ld treatment on common symbols is too bizarre.
> Hope Ali or Rainer can share with me the rationale.
History.
It's complicated and messy, but bizarre isn't completely fair,
because it's a natural consequence of the rule that says that defined
symbols dominate tentative symbols, which in turn dominate undefined
symbols. Hence, if you're holding a tentative symbol, then you
do need to pull in any archive members that might upgrade that
to a defined symbol.
https://docs.oracle.com/cd/E37838_01/html/E36783/chapter2-55859.html#OSLLGman-ap
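As a rough illustration of that dominance rule, here is a toy Python model. It is not any real linker's code, and the names `Kind`, `merge` and `should_extract` are made up for this sketch. It only models archive members that carry a real definition; how a tentative symbol inside an archive member is treated is left out on purpose, since linkers differ there.

```python
from enum import IntEnum


class Kind(IntEnum):
    """Symbol states, ordered so that a higher value dominates a lower one."""
    UNDEFINED = 0
    TENTATIVE = 1  # COMMON
    DEFINED = 2


def merge(held: Kind, incoming: Kind) -> Kind:
    """Resolve two views of the same symbol: the dominant state wins."""
    return max(held, incoming)


def should_extract(symtab: dict, member_defs: set) -> bool:
    """Hypothetical extraction test for one archive member.

    symtab maps symbol name -> Kind currently held by the linker.
    member_defs is the set of names the member really defines.
    The member is pulled in if one of its definitions would upgrade
    a symbol we hold as UNDEFINED or TENTATIVE.
    """
    return any(
        name in symtab and symtab[name] < Kind.DEFINED
        for name in member_defs
    )
```

With a tentative `buf` in the symbol table, `should_extract({"buf": Kind.TENTATIVE}, {"buf"})` is true, which is the case that surprises people: the member is fetched even though nothing is undefined.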
As Sun old timers like to say, this behavior is "straight from
New Jersey". We inherited it as part of SysVR4, as did HP. As Alan
already mentioned, it's all derived from the original Fortran COMMON
block model. The original Unix guys didn't like Fortran, but they
lived in a world where even they couldn't ignore it. Sadly, they made
it the default linking model, rather than having it be a non-default
option, but that's how it goes, and since we're still using the results
40 years later, it would seem that it wasn't bad enough to matter in
the larger sense.
Pinning this on SysVR4 is accurate, but incomplete, because I think it was
also true of the older versions of Bell Labs Unix, and was probably the
case for the older BSD-based SunOS versions that preceded SysVR4. However
that unfolded, it goes way back.
Tentative (common) symbols are a dumb idea, and the fact
that Unix adopted the common block model as its linking default,
rather than ref/def, is little more than a historical mistake
that we live with, but don't have to love or justify. The best
approach here is to just avoid their use, rather than worrying
overly about how they work or their efficiency. I advocate using
-fno-common on any non-toy code.
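To make the difference concrete, here is a small Python sketch. It is a toy model, not compiler or linker code, and `classify` and `resolve` are names invented here. It assumes the documented effect of -fno-common: an uninitialized global is emitted as a real definition instead of a tentative one, so a duplicate across objects becomes a multiple-definition error rather than being silently merged.

```python
def classify(initialized: bool, fno_common: bool) -> str:
    """How one file-scope 'int x;' or 'int x = 1;' is emitted."""
    if initialized or fno_common:
        return "defined"
    return "tentative"


def resolve(objects: dict, fno_common: bool = False) -> str:
    """objects maps object-file name -> whether its 'x' has an initializer.

    Returns the object that supplies 'x', or 'COMMON' if only tentative
    definitions exist. Raises ValueError on more than one real definition.
    """
    definers = [name for name, init in objects.items()
                if classify(init, fno_common) == "defined"]
    if len(definers) > 1:
        raise ValueError("multiple definition of x: " + ", ".join(definers))
    return definers[0] if definers else "COMMON"
```

In this model, two objects that each say `int x;` merge quietly by default, while the same pair raises an error once `fno_common=True`, which is the point of the flag: the accidental duplicate gets reported instead of hidden.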
COMMON is also one of those dark corners with multiple implementations,
so I'm not at all surprised to learn that some linkers handle it
differently. Most code doesn't tickle those cases, so these differences
generally go unnoticed.
>
> I will make a weak-vs-global analog:
> Suppose both libweak.a and libglobal.a define a symbol which is
> referenced by undef.o. Let's consider two link orders:
>
> * `undef.o -lweak -lglobal` will pick libweak.a and ignore libglobal.a if
> libglobal.a does not need fetching. ld does not inspect whether
> libglobal.a contains a definition which can override libweak.a!
> * `-lglobal undef.o -lweak` does not fetch libglobal.a at all
>
> So to provide a weak definition while allowing an optional strong
> definition, the strong definition needs to be surrounded in --whole-archive.
> More complaints (and a Mach-O example) in https://reviews.llvm.org/D86142#2225447
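The two link orders in the quoted example can be sketched with a toy left-to-right model in Python. This is an illustration only, not any real linker's algorithm: the symbol name `foo`, the member names `weak.o` and `global.o`, and the function `link` are all assumptions made for the sketch, and members have no references of their own here.

```python
def link(inputs):
    """Process inputs left to right, fetching archive members on demand.

    Each input is ("obj", name, defs, refs) or ("lib", name, members),
    where members maps member name -> set of symbols it defines.
    Returns the list of archive members that were fetched.
    """
    defined, undefined, fetched = set(), set(), []
    for item in inputs:
        if item[0] == "obj":
            _, _, defs, refs = item
            defined |= defs
            undefined |= refs - defined
            undefined -= defined
        else:
            _, lib, members = item
            for member, defs in members.items():
                # A member is fetched only if it satisfies a symbol that is
                # undefined right now; binding (weak/global) is not consulted.
                if defs & undefined:
                    fetched.append(lib + "(" + member + ")")
                    defined |= defs
                    undefined -= defs
    return fetched


UNDEF = ("obj", "undef.o", set(), {"foo"})
LIBWEAK = ("lib", "libweak.a", {"weak.o": {"foo"}})        # weak foo
LIBGLOBAL = ("lib", "libglobal.a", {"global.o": {"foo"}})  # global foo
```

Both `link([UNDEF, LIBWEAK, LIBGLOBAL])` and `link([LIBGLOBAL, UNDEF, LIBWEAK])` fetch only the member from libweak.a in this model: in the first order `foo` is already satisfied when libglobal.a is reached, and in the second nothing is undefined yet when libglobal.a is scanned.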
The only ELF feature uglier than COMMON is WEAK, and just
like COMMON, the only justification for it is to support
historical behavior. There's almost always a different
solution that one should consider first. WEAK might have made
sense in the very early days, but it's pretty inadequate in
a dynamic linking world.
In any event, both of these features are more about supporting
history than anything else. I think you just need to hold your nose
and find a workable path through. :-)
- Ali