Semantics of a common definition in an archive

Fangrui Song i@maskray.me
Wed Aug 19 05:23:31 GMT 2020


On 2020-08-18, Alan Modra wrote:
>On Mon, Aug 17, 2020 at 09:46:40PM -0700, Fangrui Song wrote:
>> I am playing with common definitions in an archive and have noticed a strange
>> property.
>>
>> echo '.globl _start; _start: call foo' > a.s
>> echo '.globl foo; foo: .common var,4,4' > b.s
>> echo '.globl foo; foo: .data; .globl var; var:' > c.s
>> echo '.globl foo; foo: .common var,8,8' > d.s
>>
>> gcc -c a.s b.s c.s d.s
>> ar rc b.a b.o
>> ar rc c.a c.o
>> ar rc d.a d.o
>>
>>
>> The archive index of c.a says c.o defines var.  It seems that ld is not
>> satisfied with the common definition in b.a(b.o) and checks whether c.a(c.o)
>> provides a regular definition. It does, so GNU ld pulls c.a(c.o) and errors
>> for the multiple definition of foo.
>>
>> ld.bfd a.o b.a c.a
>> # ld.bfd: c.a(c.o): in function `foo':
>> # (.text+0x0): multiple definition of `foo'; b.a(b.o):(.text+0x0): first defined here
>> # ld.bfd: warning: alignment 1 of symbol `var' in c.a(c.o) is smaller than 4 in b.a(b.o)
>>
>> Gold and LLD's semantics are different.
>>
>> gold a.o b.a c.a   # succeeded. c.a(c.o) is not pulled.
>> ld.lld a.o b.a c.a # succeeded
>>
>>
>> ld does not pull d.a(d.o) because it provides a common definition, not better
>> than the common definition in b.a(b.o). Though the archive member is inspected,
>> align/size fields are not updated.
>>
>> ld.bfd a.o b.a d.a
>> # good.
>> readelf -Ws a.out | grep var  # align=size=4; The common definition in d.a(d.o) is ignored
>> #      5: 0000000000402000     4 OBJECT  GLOBAL DEFAULT    2 var
>>
>> -----
>>
>> So, are the GNU ld behaviors described above all desired?  Apparently, when a
>> symbol in the archive index is currently common, ld does not treat it as "ground
>> truth". This behavior appears to be quite unusual. I think the documentation
>> should probably be improved to mention the desired behaviors.
>
>I believe this behaviour is to satisfy the requirements of fortran
>common blocks.  See this message
>https://sourceware.org/pipermail/binutils/1999-December/002952.html
>Thread starts at
>https://sourceware.org/pipermail/binutils/1999-December/001283.html
>
>> (GCC 10 and Clang 11 default to -fno-common for C. In Fortran, IIUC FORTRAN 77 uses
>> COMMON blocks. Fortran 90 does not recommend COMMON. These issues may become
>> less and less relevant over time.)
>

Thanks, Alan! Nick's 3 commits around 1999-12-10 introduced the Solaris ld
behavior.

I have to say that the Solaris/HP ld treatment of common symbols seems
bizarre to me. I hope Ali or Rainer can share the rationale.



Here is a weak-vs-global analogy.
Suppose both libweak.a and libglobal.a define a symbol which is
referenced by undef.o (a weak definition in libweak.a, a regular global
definition in libglobal.a).  Consider two link orders:

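* `undef.o -lweak -lglobal` picks the weak definition from libweak.a and
   ignores libglobal.a unless some other undefined symbol causes a member
   of libglobal.a to be fetched. ld does not check whether libglobal.a
   contains a definition which could override the one from libweak.a!
* `-lglobal undef.o -lweak` does not fetch libglobal.a at all: when
   libglobal.a is scanned, nothing references the symbol yet.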

So to provide a weak definition while allowing an optional strong
definition to override it, the strong definition needs to be wrapped in
--whole-archive.
More complaints (and a Mach-O example) in https://reviews.llvm.org/D86142#2225447
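
To make the two link orders concrete, here is a small Python model of
traditional left-to-right archive processing. It is only a sketch under
stated assumptions, not GNU ld's implementation: the names Obj, Archive
and link are made up for illustration, bindings are reduced to "weak" and
"global", and common symbols are left out entirely. The `whole` field
stands in for --whole-archive.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """Hypothetical relocatable object: what it defines and what it needs."""
    name: str
    defined: dict = field(default_factory=dict)   # symbol -> "weak" | "global"
    undefined: set = field(default_factory=set)


@dataclass
class Archive:
    """Hypothetical archive; whole=True models --whole-archive."""
    name: str
    members: list
    whole: bool = False


def link(inputs):
    """Process inputs left to right; return (loaded files, symbol table)."""
    symtab = {}    # symbol -> (binding, file that provides it)
    undef = set()  # symbols referenced but not yet defined
    loaded = []

    def add(obj, label):
        loaded.append(label)
        for sym, binding in obj.defined.items():
            old = symtab.get(sym)
            if old is None or (old[0] == "weak" and binding == "global"):
                symtab[sym] = (binding, label)   # a global overrides a weak
            elif old[0] == "global" and binding == "global":
                raise ValueError(f"multiple definition of {sym}")
            undef.discard(sym)
        for sym in obj.undefined:
            if sym not in symtab:
                undef.add(sym)

    for item in inputs:
        if isinstance(item, Obj):
            add(item, item.name)
            continue
        # Archive: a member is fetched only if it defines a symbol that is
        # undefined right now (or if the archive is whole). Rescan until
        # no further member is fetched.
        pending = list(item.members)
        progress = True
        while progress:
            progress = False
            for member in list(pending):
                if item.whole or any(s in undef for s in member.defined):
                    pending.remove(member)
                    add(member, f"{item.name}({member.name})")
                    progress = True
    return loaded, symtab


undef_o = Obj("undef.o", undefined={"foo"})
weak_o = Obj("weak.o", defined={"foo": "weak"})
global_o = Obj("global.o", defined={"foo": "global"})
libweak = Archive("libweak.a", [weak_o])
libglobal = Archive("libglobal.a", [global_o])

# undef.o -lweak -lglobal: the weak definition wins, libglobal.a is ignored.
print(link([undef_o, libweak, libglobal]))
# -lglobal undef.o -lweak: libglobal.a is scanned before foo is referenced.
print(link([libglobal, undef_o, libweak]))
# Forcing the strong definition in, as with --whole-archive.
print(link([undef_o, libweak, Archive("libglobal.a", [global_o], whole=True)]))
```

In this model the first two calls load only undef.o and libweak.a(weak.o),
and foo keeps its weak definition. Only the third call loads
libglobal.a(global.o) and lets the global definition override the weak one.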

