Suppress the fetch of an archive member via --defsym (glibc/elf/librtld.map.o)

Fangrui Song maskray@google.com
Mon Mar 16 15:47:06 GMT 2020


On 2020-03-16, H.J. Lu wrote:
>On Sun, Mar 15, 2020 at 10:02 PM Fangrui Song via Libc-alpha
><libc-alpha@sourceware.org> wrote:
>>
>> On 2020-03-15, Fangrui Song wrote:
>> >cd /tmp/p
>> >git clone git://sourceware.org/git/glibc.git; cd glibc
>> >mkdir Release; cd Release; ../configure --prefix=/tmp/opt
>> >make -j
>> >
>> >When linking elf/librtld.map.o
>> >
>> >% gcc -nostdlib -nostartfiles -r -o /tmp/p/glibc/Release/elf/librtld.map.o -Wl,--defsym=calloc=0 -Wl,--defsym=free=0 -Wl,--defsym=malloc=0 -Wl,--defsym=realloc=0 -Wl,--defsym=__stack_chk_fail=0 -Wl,--defsym=__stack_chk_fail_local=0 '-Wl,-(' /tmp/p/glibc/Release/elf/dl-allobjs.os /tmp/p/glibc/Release/libc_pic.a -lgcc '-Wl,-)' -Wl,-Map,/tmp/p/glibc/Release/elf/librtld.mapT
>> >
>> >Without -Wl,--defsym:
>> >
>> >dl-allobjs.os has an undefined __libc_scratch_buffer_set_array_size
>> >__libc_scratch_buffer_set_array_size fetches libc_pic.a(scratch_buffer_set_array_size.os)
>> >libc_pic.a(scratch_buffer_set_array_size.os) has an undefined free
>> >free fetches libc_pic.a(malloc.os)
>> >libc_pic.a(malloc.os) has an undefined __libc_message
>> >__libc_message fetches libc_pic.a(libc_fatal.os)
>> >
>> >libc_fatal.os will cause a multiple definition error (__GI___libc_fatal)
>> >>>>defined at dl-fxstatat64.c
>> >>>>           /tmp/p/glibc/Release/elf/dl-allobjs.os:(__GI___libc_fatal)
>> >>>>defined at libc_fatal.c
>> >>>>           libc_fatal.os:(.text+0x240) in archive /tmp/p/glibc/Release/libc_pic.a
>> >
>> >glibc/elf/Makefile uses -Wl,--defsym= (rtld-stubbed-symbols) to suppress libc_pic.a(malloc.os):
>> >
>> >% readelf -s elf/librtld.map.o | grep ABS | grep -v LOCAL
>> >   712: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  ABS __stack_chk_fail_local
>> >   826: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  ABS malloc
>> >   876: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  ABS __stack_chk_fail
>> >   905: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  ABS calloc
>> >   975: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  ABS realloc
>> >  1174: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  ABS free
>> >
>> >My question is: does the suppression via --defsym work reliably?
>> >
>> ># a.o
>> >call foo
>> >
>> ># b.a(b.o)
>> >.globl foo, free
>> >foo:
>> >free:
>> >
>> >
>> ># GNU ld --defsym is order dependent.
>> >ld.bfd a.o b.a --defsym foo=0  # b.a(b.o) is fetched. free is present
>> >ld.bfd --defsym foo=0 a.o b.a  # b.a(b.o) is not fetched. free is absent
>> >
>> ># gold --defsym is order independent. For the more complex glibc elf/librtld.map.o case, it happens to match GNU ld.
>> >gold a.o b.a --defsym foo=0    # b.a(b.o) is not fetched. free is absent
>> >gold --defsym foo=0 a.o b.a    # b.a(b.o) is not fetched. free is absent
>> >
>> ># lld --defsym is order independent. --defsym is processed the last. For elf/librtld.map.o it will report a multiple definition error.
>> ># https://sourceware.org/pipermail/libc-alpha/2020-March/111899.html is required to bypass a configure check
>> >ld.lld a.o b.a --defsym foo=0  # b.a(b.o) is not fetched. free is absent
>> >ld.lld --defsym foo=0 a.o b.a  # b.a(b.o) is not fetched. free is absent
>>
>> Sorry, to clarify the behavior of lld:
>>
>> # lld --defsym is order independent. --defsym is processed last. For elf/librtld.map.o it will report a multiple definition error.
>> ld.lld a.o b.a --defsym foo=0  # b.a(b.o) is fetched. free is present
>> ld.lld --defsym foo=0 a.o b.a  # b.a(b.o) is fetched. free is present
>
>Glibc build requires a linker compatible with ld.  Can you provide an lld
>option to make lld compatible with ld for cases like this?

As an lld contributor, I am happy to cooperate and adapt lld if the proposed semantics are reasonable.

I am concerned that --defsym's order dependence w.r.t. archive files is not obvious, given -u's behavior:

# -u inserts an undefined which fetches b.a(b.o)
ld.bfd -u foo b.a       # b.a(b.o) is fetched. free is present
# So -u can't be order dependent: if it were, b.a (not in a group) would already have been dropped by the time we saw -u
ld.bfd b.a -u foo       # b.a(b.o) is fetched. free is present


Some observations:


# GNU ld --defsym interacts with an archive
ld.bfd a.o b.a --defsym foo=0  # b.a(b.o) is fetched. free is present
ld.bfd --defsym foo=0 a.o b.a  # b.a(b.o) is not fetched. free is absent

# a.x contains one line `foo = 0;`
# -T a.x is similar to --defsym
ld.bfd a.o b.a -T a.x -o a  # b.a(b.o) is fetched. free is present
ld.bfd -T a.x a.o b.a -o a  # b.a(b.o) is not fetched. free is absent

# -u is usually order independent
# The second shows -u can't be order dependent: if it were, b.a would already have been dropped by the time we saw -u
ld.bfd -u foo b.a       # b.a(b.o) is fetched. free is present
ld.bfd b.a -u foo       # b.a(b.o) is fetched. free is present


# gold --defsym is order independent. For the more complex glibc elf/librtld.map.o case, this happens to make the link work
gold a.o b.a --defsym foo=0    # b.a(b.o) is not fetched. free is absent
gold --defsym foo=0 a.o b.a    # b.a(b.o) is not fetched. free is absent

# gold --export-dynamic-symbol (not in GNU ld) implies -u
gold --export-dynamic-symbol foo b.a    # b.a(b.o) is fetched. free is present
gold b.a --export-dynamic-symbol foo    # b.a(b.o) is fetched. free is present


# lld --defsym is order independent. --defsym is processed last. For elf/librtld.map.o it will report a multiple definition error.
ld.lld a.o b.a --defsym foo=0  # b.a(b.o) is fetched. free is present
ld.lld --defsym foo=0 a.o b.a  # b.a(b.o) is fetched. free is present
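
To make the observations above easier to compare, here is a toy Python model of archive member fetching. It is only a sketch under my own assumptions, not how any of the three linkers is implemented: the names Obj, link and the mode strings "in_order", "first" and "last" are made up for illustration. "in_order" reproduces the GNU ld results, "first" reproduces the gold results, and "last" reproduces the lld results for the a.o / b.a example.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    name: str
    defined: set
    undefined: set = field(default_factory=set)


def link(items, defsym_mode):
    """Toy model. items is the command line in order; each item is
    ("obj", Obj), ("archive", [Obj, ...]) or ("defsym", symbol_name).
    defsym_mode (hypothetical): "in_order", "first" or "last".
    Returns (names of loaded objects, set of defined symbols)."""
    defined, undefined, loaded = set(), set(), []

    def define(names):
        defined.update(names)
        undefined.difference_update(names)

    def load(obj):
        loaded.append(obj.name)
        define(obj.defined)
        undefined.update(obj.undefined - defined)

    defsyms = [arg for kind, arg in items if kind == "defsym"]
    if defsym_mode == "first":
        define(defsyms)

    for kind, arg in items:
        if kind == "defsym":
            if defsym_mode == "in_order":
                define([arg])
        elif kind == "obj":
            load(arg)
        else:
            # Archive: fetch members that define a currently undefined
            # symbol, repeating until nothing new is fetched.
            progress = True
            while progress:
                progress = False
                for member in arg:
                    if member.name not in loaded and member.defined & undefined:
                        load(member)
                        progress = True

    if defsym_mode == "last":
        define(defsyms)
    return loaded, defined


A_O = Obj("a.o", set(), {"foo"})          # a.o: call foo
B_A = [Obj("b.a(b.o)", {"foo", "free"})]  # b.a(b.o): defines foo and free

for mode in ("in_order", "first", "last"):
    late = link([("obj", A_O), ("archive", B_A), ("defsym", "foo")], mode)
    early = link([("defsym", "foo"), ("obj", A_O), ("archive", B_A)], mode)
    print(mode, "defsym last:", late[0], "defsym first:", early[0])
```

The model ignores duplicate definition errors, so it says nothing about the multiple definition error seen with elf/librtld.map.o.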


If we aim for robustness and want the librtld.map.o trick to be supported (I will add a note that gold happens to work),
I would hope that both of the following suppress b.a(b.o):

   ld.bfd a.o b.a --defsym foo=0
   ld.bfd --defsym foo=0 a.o b.a

(a) Since --defsym is similar to a symbol assignment in a linker script passed via -T, we would hope -T does not behave too differently.
(b) In a linker script, at least the input files should remain order dependent w.r.t. input files on the command line.

(a)+(b) => symbol assignments specified by -T need to be declared early, while input files specified by -T stay ordered w.r.t. input files on the command line.
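
The (a)+(b) rule can be sketched in Python as follows. This is a toy model of the proposed semantics only, under my own assumptions: hoist_assignments, fetched_members and the ("script", {...}) item shape are hypothetical names, and a linker script is reduced to a list of assigned symbols plus a list of input files.

```python
def hoist_assignments(items):
    """Split a command line into (assigned symbols, ordered inputs).
    Assignments from --defsym and -T are hoisted; input files named
    by a -T script stay at the position of that -T."""
    assigned, inputs = [], []
    for kind, arg in items:
        if kind == "defsym":
            assigned.append(arg)
        elif kind == "script":
            assigned.extend(arg["assign"])
            inputs.extend(arg["inputs"])
        else:
            inputs.append((kind, arg))
    return assigned, inputs


def fetched_members(items):
    """Return the archive members fetched under the proposed rule."""
    assigned, inputs = hoist_assignments(items)
    defined, undefined, fetched = set(assigned), set(), []

    def add(defs, undefs):
        defined.update(defs)
        undefined.difference_update(defs)
        undefined.update(undefs - defined)

    for kind, arg in inputs:
        if kind == "obj":
            add(*arg)
        else:
            # Archive: {member name: (defined symbols, undefined symbols)}
            progress = True
            while progress:
                progress = False
                for name, (defs, undefs) in arg.items():
                    if name not in fetched and defs & undefined:
                        fetched.append(name)
                        add(defs, undefs)
                        progress = True
    return fetched


A_O = (set(), {"foo"})                         # a.o: call foo
B_A = {"b.a(b.o)": ({"foo", "free"}, set())}   # b.a(b.o): foo, free

# Assignments suppress the fetch wherever they appear.
print(fetched_members([("obj", A_O), ("archive", B_A), ("defsym", "foo")]))
print(fetched_members([("script", {"assign": ["foo"], "inputs": []}),
                       ("obj", A_O), ("archive", B_A)]))
# Input files named by a script are still ordered w.r.t. a.o.
print(fetched_members([("obj", A_O),
                       ("script", {"assign": [], "inputs": [("archive", B_A)]})]))
```

Under this rule the position of --defsym or of an assignment in a -T script does not matter, but moving an archive named by the script before a.o still changes whether b.a(b.o) is fetched.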


For linker portability, projects using this trick (currently glibc is the only one) should place --defsym first to work with
existing releases of GNU ld.
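
As an illustration of "place --defsym first", a build script could reorder a linker argument list like the sketch below. defsym_first is a hypothetical helper of mine, not something glibc uses; it assumes the only spellings in play are `--defsym SYM=EXPR`, `--defsym=SYM=EXPR` and the driver form `-Wl,--defsym=SYM=EXPR`.

```python
def defsym_first(args):
    """Move every --defsym option ahead of all other arguments,
    keeping the relative order within each group."""
    defsyms, rest = [], []
    it = iter(args)
    for arg in it:
        if arg == "--defsym":
            # Two-token form: the next argument is SYM=EXPR.
            defsyms += [arg, next(it)]
        elif arg.startswith(("--defsym=", "-Wl,--defsym=")):
            defsyms.append(arg)
        else:
            rest.append(arg)
    return defsyms + rest


print(defsym_first(["a.o", "b.a", "--defsym", "foo=0"]))
```

With the a.o / b.a example this turns the first GNU ld command line into the second one, which is the order that suppresses b.a(b.o).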

The added librtld.map.o code is related to https://sourceware.org/bugzilla/show_bug.cgi?id=25486

