Bug 26551 - A definition referenced by an unneeded (--as-needed) shared object should be exported
Status: WAITING
Alias: None
Product: binutils
Classification: Unclassified
Component: ld
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2020-08-29 06:56 UTC by Fangrui Song
Modified: 2023-08-07 10:05 UTC
Last reconfirmed: 2020-08-29 00:00:00


Description Fangrui Song 2020-08-29 06:56:25 UTC
echo '.global _start; _start: ret' | as -o a.o
echo 'call _start' | as -o b.o
ld.bfd -shared b.o -o b.so

ld.bfd a.o b.so -o bfd.needed
ld.bfd a.o --as-needed b.so -o bfd.unneeded
ld.gold a.o b.so -o gold.needed
ld.gold a.o --as-needed b.so -o gold.unneeded
ld.lld a.o b.so -o lld.needed
ld.lld a.o --as-needed b.so -o lld.unneeded

bfd.unneeded does not export _start. All other 5 executables export _start.

_start should be exported. In a larger application, b.so may be referenced through a shared object chain. Linking the executable against b.so makes the intention clear that the definitions in regular objects resolving shared object references are needed.
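
A minimal sketch for checking which of the six outputs export _start, assuming binutils' readelf and awk are on PATH. The helper name exports_symbol is made up for illustration; it reads the dynamic symbol table and treats any non-UND entry with the given name as an export.

```shell
# exports_symbol FILE SYM: succeed if FILE's dynamic symbol table defines SYM.
# (Hypothetical helper; it parses the 8-column `readelf -W --dyn-syms` layout.)
exports_symbol() {
  readelf -W --dyn-syms "$1" 2>/dev/null |
    awk -v sym="$2" '
      $1 ~ /^[0-9]+:$/ {
        name = $8
        sub(/@.*/, "", name)            # drop any symbol-version suffix
        if (name == sym && $7 != "UND") found = 1
      }
      END { exit !found }'
}

# Report on whichever of the outputs from the commands above exist.
for f in bfd.needed bfd.unneeded gold.needed gold.unneeded lld.needed lld.unneeded; do
  if [ -f "$f" ]; then
    if exports_symbol "$f" _start; then
      echo "$f: _start exported"
    else
      echo "$f: _start not exported"
    fi
  fi
done
```

A file without a dynamic symbol table simply yields no matching row, so it is reported as not exporting the symbol.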

----

Here is a larger example related to --allow-shlib-undefined, demonstrating why the GNU ld behavior (a definition referenced by an unneeded (--as-needed) shared object is not exported) is not good. This may be related to bug 18652.


echo '.globl _start, myexit; _start: jmp foo; myexit: movq $60, %rax; movq $42, %rdi; syscall' | as -o a.o
echo '.globl foo; foo: jmp bar' | as -o b.o
echo '.globl bar; bar: jmp myexit' | as -o c.o
ld.bfd -shared c.o -o c.so
ld.bfd -shared b.o ./c.so -o b.so
ld.bfd --dynamic-linker /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 a.o /usr/lib/x86_64-linux-gnu/libc.so --as-needed ./b.so ./c.so -y myexit --allow-shlib-undefined -o bfd.bad
ld.bfd --dynamic-linker /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 a.o /usr/lib/x86_64-linux-gnu/libc.so --as-needed ./b.so ./c.so -y myexit -o bfd.good

c.so is an "unneeded" (in the sense of --as-needed) shared object. In the --allow-shlib-undefined command line, myexit is somehow not exported.

bfd.good exits with code 42.

% ./bfd.bad
./bfd.bad: symbol lookup error: ./c.so: undefined symbol: myexit
Comment 1 H.J. Lu 2020-08-29 17:38:43 UTC
--allow-shlib-undefined is a separate issue.  Do you have a run-time testcase
without --allow-shlib-undefined?
Comment 2 H.J. Lu 2020-08-29 19:09:35 UTC
For

ld.bfd a.o --as-needed b.so -o bfd.unneeded

since there is no DT_NEEDED tag, no dynamic section is created.
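
A small sketch for inspecting this, assuming binutils' readelf is available and that bfd.needed and bfd.unneeded from the description are in the current directory. The helper name show_needed is hypothetical; it prints only the DT_NEEDED rows of the dynamic section, so a file without a dynamic section prints nothing.

```shell
# show_needed FILE: print FILE's DT_NEEDED entries, one per line (nothing if none).
show_needed() {
  readelf -d "$1" 2>/dev/null | grep '(NEEDED)' || true
}

for f in bfd.needed bfd.unneeded; do
  if [ -f "$f" ]; then
    echo "== $f"
    show_needed "$f"
  fi
done
```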
Comment 3 Fangrui Song 2020-08-29 19:39:51 UTC
When --as-needed is in action, a shared object is added as a DT_NEEDED tag if it satisfies a non-weak undefined reference from a regular object (surviving under --gc-sections). This is the main use case of the option.

Apparently in GNU ld, --as-needed is also used to decide whether a definition needs to be exported. I think it'd be good if this task can be detached from --as-needed (like gold).

For an "unneeded" (in terms of --as-needed) shared object, it may be loaded by other shared objects.

Adding the shared object on the command line expresses an explicit intention: the "unneeded" shared object (unneeded by the executable, but possibly needed by other shared objects) may require some definitions to be exported. (The definitions may be statically known (by transitive loading of shared objects; the behavior is related to --copy-dt-needed-entries) or dynamic (dlopen).)
Comment 4 H.J. Lu 2020-08-29 21:38:28 UTC
I have a patch.  But I need a run-time test without --allow-shlib-undefined
to justify it.
Comment 5 Fangrui Song 2020-09-01 04:29:58 UTC
ld.bfd a.o --as-needed b.so

Let a.o define a function which will be called by b.so via dlopen. In this case, ld should export the function. This justification may look weak but I think it is moving toward the right direction if we consider that GNU ld from binutils 2.22 defaulted to --no-copy-dt-needed-entries (doing less shared object traversal for more proper dependency tracking and avoiding unneeded work).
Comment 6 H.J. Lu 2020-09-01 14:22:40 UTC
(In reply to Fangrui Song from comment #5)
> ld.bfd a.o --as-needed b.so
> 
> Let a.o define a function which will be called by b.so via dlopen. In this
> case, ld should export the function. This justification may look weak but I
> think it is moving toward the right direction if we consider that GNU ld
> from binutils 2.22 defaulted to --no-copy-dt-needed-entries (doing less
> shared object traversal for more proper dependency tracking and avoiding
> unneeded work).

LD creates dynamic section only if dynamic relocation is needed.
For this test, since no dynamic relocation is needed, there is no
dynamic section.  Since there is no dynamic section, there is no
dynamic symbol table.
Comment 7 Fangrui Song 2020-09-02 04:12:27 UTC
(In reply to H.J. Lu from comment #6)
> (In reply to Fangrui Song from comment #5)
> > ld.bfd a.o --as-needed b.so
> > 
> > Let a.o define a function which will be called by b.so via dlopen. In this
> > case, ld should export the function. This justification may look weak but I
> > think it is moving toward the right direction if we consider that GNU ld
> > from binutils 2.22 defaulted to --no-copy-dt-needed-entries (doing less
> > shared object traversal for more proper dependency tracking and avoiding
> > unneeded work).
> 
> LD creates dynamic section only if dynamic relocation is needed.
> For this test, since no dynamic relocation is needed, there is no
> dynamic section.  Since there is no dynamic section, there is no
> dynamic symbol table.



>a.c <<e cat
 #include <dlfcn.h>
 int foo() { return 42; }
 int main() {
   void *h = dlopen("./b.so", RTLD_LAZY);
   int (*bar)(void) = dlsym(h, "bar");
   return bar();
 }
e
>b.c <<e cat
 int foo();
 int bar() { return foo(); }
e

cc -fuse-ld=bfd -shared -fPIC b.c -ldl -o b.so
cc -fuse-ld=bfd -pie -fPIE a.c -Wl,--push-state -Wl,--as-needed ./b.so -Wl,--pop-state -ldl

./a.out => symbol lookup error: ./b.so: undefined symbol: foo

-fuse-ld=gold or -fuse-ld=lld is good.
Comment 8 H.J. Lu 2020-09-02 12:00:26 UTC
(In reply to Fangrui Song from comment #7)
> 
> >a.c <<e cat
>  #include <dlfcn.h>
>  int foo() { return 42; }
>  int main() {
>    void *h = dlopen("./b.so", RTLD_LAZY);
>    int (*bar)(void) = dlsym(h, "bar");
>    return bar();
>  }
> e
> >b.c <<e cat
>  int foo();
>  int bar() { return foo(); }
> e
> 
> cc -fuse-ld=bfd -shared -fPIC b.c -ldl -o b.so
> cc -fuse-ld=bfd -pie -fPIE a.c -Wl,--push-state -Wl,--as-needed ./b.so
> -Wl,--pop-state -ldl
> 
> ./a.out => symbol lookup error: ./b.so: undefined symbol: foo
> 
> -fuse-ld=gold or -fuse-ld=lld is good.

If it is the only use case, why not use

'--dynamic-list=DYNAMIC-LIST-FILE'
     Specify the name of a dynamic list file to the linker.  This is
     typically used when creating shared libraries to specify a list of
     global symbols whose references shouldn't be bound to the
     definition within the shared library, or creating dynamically
     linked executables to specify a list of symbols which should be
     added to the symbol table in the executable.  This option is only
     meaningful on ELF platforms which support shared libraries.
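
A hedged sketch of what that could look like for the test case in comment 7, assuming a.c from that comment is in the current directory. The file name exe.list and the output name a.dynlist are made up for illustration; the list names only foo, the symbol b.so refers back to.

```shell
# Hypothetical dynamic list: export only foo from the executable.
cat > exe.list <<'EOF'
{
  foo;
};
EOF

# Relink comment 7's a.c with the list (skipped if a.c or cc is missing).
if [ -f a.c ] && command -v cc >/dev/null 2>&1; then
  cc -fuse-ld=bfd -pie -fPIE a.c -Wl,--dynamic-list=exe.list -ldl -o a.dynlist \
    || echo "link failed" >&2
fi
```

b.so does not need to appear on this link line, since a.c reaches it only through dlopen.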
Comment 9 Michael Matz 2020-09-02 12:34:01 UTC
I think ld.bfd is completely fine to not export exe symbols only referenced by
mentioned but not otherwise needed libraries.  This follows from the traditional
behaviour that executables don't export any symbols that aren't obviously needed
in a static linking model, which is why -E exists.  I consider it not obvious
that a library which isn't needed by anything in the executable contains a
back-reference to the executable.  If you really need to support this situation
you would normally need to use -E, which btw is documented to sometimes be
necessary with dlopen games:

       -E
           ...
           If you use "dlopen" to load a dynamic object which needs to refer
           back to the symbols defined by the program, rather than some other
           dynamic object, then you will probably need to use this option when
           linking the program itself.
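
A minimal sketch of that -E route applied to comment 7's test case, assuming a.c from that comment is in the current directory. The function name relink_export_all and the output name a.exportall are hypothetical.

```shell
# relink_export_all SRC OUT: link SRC as a PIE with all global symbols
# added to the dynamic symbol table (-E is short for --export-dynamic).
relink_export_all() {
  cc -fuse-ld=bfd -pie -fPIE "$1" -Wl,-E -ldl -o "$2"
}

# Only attempt the link when the source file and a compiler driver exist.
if [ -f a.c ] && command -v cc >/dev/null 2>&1; then
  relink_export_all a.c a.exportall || echo "link failed" >&2
fi
```

This exports every global symbol rather than just foo, which is the broad behaviour the option is documented to have.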

The more controlled way for this is --dynamic-list.

Doing what you suggest would break the following invariant: you can remove any
unneeded as-needed -lxyz arguments from the link command and end up with the same
binary.  This is basically the set of libraries that 'ldd -u' would print, and is
why as-needed was implemented to start with, to save that manual work.
Comment 10 Fangrui Song 2020-09-02 17:18:19 UTC
(In reply to Michael Matz from comment #9)

I did not file this bug to find an alternative solution. I am aware of the semantics of --export-dynamic and --dynamic-list (and contributed the recent --export-dynamic-symbol and --export-dynamic-symbol-list, BTW).
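
For reference, a sketch of the per-symbol option applied to comment 7's test case. It assumes a.c from that comment is in the current directory and a GNU ld recent enough to accept --export-dynamic-symbol; the variable name EXPORT_FLAG and the output name a.exportsym are made up for illustration.

```shell
# Export just foo from the executable via the per-symbol option.
EXPORT_FLAG='-Wl,--export-dynamic-symbol=foo'

# Only attempt the link when the source file and a compiler driver exist.
if [ -f a.c ] && command -v cc >/dev/null 2>&1; then
  cc -fuse-ld=bfd -pie -fPIE a.c "$EXPORT_FLAG" -ldl -o a.exportsym \
    || echo "link failed (the linker may predate --export-dynamic-symbol)" >&2
fi
```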

First, this is already a problem with --allow-shlib-undefined. Now the question is whether it is also a problem of --no-allow-shlib-undefined. My reasoning is that placing a shared object on the linker command line is an explicit intention that its referenced symbols should be exported, even if the shared object itself is unneeded (in terms of --as-needed).
Comment 11 H.J. Lu 2020-09-02 18:42:09 UTC
(In reply to Fangrui Song from comment #10)
> (In reply to Michael Matz from comment #9)
> 
> I filed the bug not to find an alternative solution. I am wary of the
> semantics of --export-dynamic, --dynamic-list (and contributed the recent
> --export-dynamic-symbol --export-dynamic-symbol{,-list} BTW).
> 
> First, this is already a problem with --allow-shlib-undefined. Now the
> question is whether it is also a problem of --no-allow-shlib-undefined. My
> reasoning is that placing a shared object on the linker command line is an
> explicit intention that its referenced symbols should be exported, even if
> the shared object itself is unneeded (in terms of --as-needed).

You want to export symbols referenced by unused shared libraries which
may be dlopened.  But not all unused shared libraries will be dlopened.
I think this new behavior should be controlled by a new explicit option
if this behavior is desirable.