Bug 28248 - is there a way to LD_PRELOAD library before other dynamic libraries?
Status: WAITING
Alias: None
Product: glibc
Classification: Unclassified
Component: dynamic-link
Version: 2.33
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
 
Reported: 2021-08-19 12:57 UTC by Milian Wolff
Modified: 2022-08-29 09:53 UTC

Last reconfirmed: 2022-08-29 00:00:00
fweimer: security-


Description Milian Wolff 2021-08-19 12:57:15 UTC
Hey there, I hope this is the right place to report this.

I'm the author of heaptrack, which relies on LD_PRELOAD to intercept calls to malloc and friends. Only recently, I realized my assumption about LD_PRELOAD behavior is wrong - I thought it would be loaded _before_ all other dynamic libraries, but that isn't the case.

See documentation over at https://www.man7.org/linux/man-pages/man8/ld.so.8.html:

>    LD_PRELOAD
>              A list of additional, user-specified, ELF shared objects
>              to be loaded before all others.

But as the output below shows, the preloaded library appears to be loaded after the other dependencies of the binary being executed:

```
$ cat preload_dump.c
#include <stdio.h>

void __attribute__ ((constructor)) init(void)
{
    fprintf(stderr, "preload init!\n"); 
}

void __attribute__ ((destructor)) cleanup(void)
{
    fprintf(stderr, "preload cleanup!\n"); 
}
$ gcc -shared -fPIC -o preload_dump.so preload_dump.c
```

```
$ readelf -d $(which bash) | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libtinfo.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
```

```
$ LD_DEBUG=files LD_PRELOAD=$PWD/preload_dump.so bash --version
        83:
        83:     file=/opt/craft/preload_dump.so [0];  needed by bash [0]
        83:     file=/opt/craft/preload_dump.so [0];  generating link map
        83:       dynamic: 0x00007fb5dbbd3dd0  base: 0x00007fb5dbbd2000   size: 0x0000000000002029
        83:         entry: 0x00007fb5dbbd2000  phdr: 0x00007fb5dbbd2040  phnum:                  8
        83:
        83:
        83:     file=libtinfo.so.5 [0];  needed by bash [0]
        83:     file=libtinfo.so.5 [0];  generating link map
        83:       dynamic: 0x00007fb5db9b3d10  base: 0x00007fb5db78b000   size: 0x0000000000229f00
        83:         entry: 0x00007fb5db797e40  phdr: 0x00007fb5db78b040  phnum:                  7
        83:
        83:
        83:     file=libdl.so.2 [0];  needed by bash [0]
        83:     file=libdl.so.2 [0];  generating link map
        83:       dynamic: 0x00007fb5db789d68  base: 0x00007fb5db587000   size: 0x0000000000203130
        83:         entry: 0x00007fb5db587e50  phdr: 0x00007fb5db587040  phnum:                  7
        83:
        83:
        83:     file=libc.so.6 [0];  needed by bash [0]
        83:     file=libc.so.6 [0];  generating link map
        83:       dynamic: 0x00007fb5db57fb60  base: 0x00007fb5db1b9000   size: 0x00000000003cd200
        83:         entry: 0x00007fb5db1db660  phdr: 0x00007fb5db1b9040  phnum:                 10
        83:
        83:
        83:     calling init: /lib64/libc.so.6
        83:
        83:
        83:     calling init: /lib64/libdl.so.2
        83:
        83:
        83:     calling init: /lib64/libtinfo.so.5
        83:
        83:
        83:     calling init: /opt/craft/preload_dump.so
        83:
preload init!
        83:
        83:     initialize program: bash
        83:
        83:
        83:     transferring control: bash
        83:
GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
        83:
        83:     calling fini: bash [0]
        83:
        83:
        83:     calling fini: /opt/craft/preload_dump.so [0]
        83:
preload cleanup!
        83:
        83:     calling fini: /lib64/libtinfo.so.5 [0]
        83:
        83:
        83:     calling fini: /lib64/libdl.so.2 [0]
        83:

```

Note how the needed libs of the binary are loaded before the LD_PRELOAD'ed library. And note how that then obviously means the fini call for preload_dump happens before e.g. libtinfo.so's fini call.

Is there a way for me to inject a library into the dynamic loader in such a way that it is loaded before any other dependency of the executable?

In my case, the main reason why I want the above behavior is actually to enforce proper shutdown semantics: As a memory profiler, I have an atexit handler which is calling `__libc_freeres` and `__gnu_cxx::__freeres`, similar to and inspired by valgrind. But because my preloaded library is loaded *after* the other libraries, its cleanup code will run *before* that of other libraries.

Combined, this can lead to crashes when the fini of some other library calls code that depends on resources already freed by the freeres calls.

So, tl;dr:

- is there a way to force something like LD_PRELOAD, but load the library before all others?
- is there an alternative way to force an atexit handler call at the very end of the shutdown semantics, after all other libraries got unloaded?

Many thanks!
Comment 1 Alexander Monakov 2021-08-31 17:41:23 UTC
You're mixing up two separate library orderings: the symbol lookup order and the initialization order. The symbol lookup order, which is what the man page implicitly refers to, is a breadth-first order over DT_NEEDED links, with LD_PRELOAD modules appearing just after the main executable and before its first DT_NEEDED library. The initialization order is a reverse topological order over DT_NEEDED links, assuming there are no cycles.

To interpose malloc and the like, your library needs to appear prior to libc.so in the former; also, your library in all likelihood depends on libc.so, so it will appear after libc.so in the latter.

The log you've shown confirms this: preload_dump.so is loaded prior to libtinfo.so.5, but its constructors are invoked last.
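
To make the two orderings concrete, here is a minimal C sketch that models them for the bash example above. The dependency table is an assumption (each library is taken to depend only on libc.so.6), the names `lookup_order` and `init_order` are hypothetical, and the tie-breaking in `init_order` is a simplification for illustration, not glibc's actual sorting algorithm.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_DEPS 4

enum { BASH, PRELOAD, LIBTINFO, LIBDL, LIBC, NUM_OBJECTS };

/* One loaded object and the indices of the objects it needs. */
struct object {
    const char *name;
    int deps[MAX_DEPS];
    int ndeps;
};

/* Assumed dependency graph: the preloaded object is listed first among
 * the dependencies of the main program, and every library needs libc. */
const struct object objects[NUM_OBJECTS] = {
    [BASH]     = { "bash",            { PRELOAD, LIBTINFO, LIBDL, LIBC }, 4 },
    [PRELOAD]  = { "preload_dump.so", { LIBC }, 1 },
    [LIBTINFO] = { "libtinfo.so.5",   { LIBC }, 1 },
    [LIBDL]    = { "libdl.so.2",      { LIBC }, 1 },
    [LIBC]     = { "libc.so.6",       { 0 },    0 },
};

/* Symbol lookup order: breadth-first walk starting at the main program.
 * The output array doubles as the BFS queue. Returns the object count. */
int lookup_order(int out[NUM_OBJECTS])
{
    int seen[NUM_OBJECTS] = { 0 };
    int count = 0;

    out[count++] = BASH;
    seen[BASH] = 1;
    for (int head = 0; head < count; head++) {
        const struct object *obj = &objects[out[head]];
        for (int i = 0; i < obj->ndeps; i++) {
            int dep = obj->deps[i];
            assert(dep >= 0 && dep < NUM_OBJECTS);
            if (!seen[dep]) {
                seen[dep] = 1;
                out[count++] = dep;
            }
        }
    }
    return count;
}

/* Initialization order: an object is emitted only once all of its
 * dependencies have been emitted. Candidates are scanned from the end of
 * the lookup order (simplified tie-breaking). Stops early on a cycle. */
int init_order(int out[NUM_OBJECTS])
{
    int lookup[NUM_OBJECTS];
    int done[NUM_OBJECTS] = { 0 };
    int total = lookup_order(lookup);
    int count = 0;
    int progressed = 1;

    while (count < total && progressed) {
        progressed = 0;
        for (int pos = total - 1; pos >= 0; pos--) {
            int idx = lookup[pos];
            int ready = !done[idx];
            for (int i = 0; ready && i < objects[idx].ndeps; i++)
                ready = done[objects[idx].deps[i]];
            if (ready) {
                done[idx] = 1;
                out[count++] = idx;
                progressed = 1;
            }
        }
    }
    return count;
}

/* Prints the object names of an order, one per line. */
void print_order(const char *title, const int *order, int count)
{
    printf("%s:\n", title);
    for (int i = 0; i < count; i++)
        printf("  %s\n", objects[order[i]].name);
}
```

With this table, `lookup_order` yields bash, preload_dump.so, libtinfo.so.5, libdl.so.2, libc.so.6, while `init_order` yields libc.so.6, libdl.so.2, libtinfo.so.5, preload_dump.so, bash. The latter matches the `calling init` sequence in the log, with the main program initialized last.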

The ELF gABI document explains this; please have a look:
http://www.sco.com/developers/gabi/latest/ch5.dynamic.html#init_fini
Comment 2 Milian Wolff 2021-09-06 08:19:46 UTC
I see, thank you!

Is there an alternative mechanism - potentially low-level / undocumented - that I could leverage to somehow inject my interposing library in such a way that its initialization happens after libc, but before any other libraries?

I.e. instead of this:
```
        83:     calling init: /lib64/libc.so.6
        83:
        83:
        83:     calling init: /lib64/libdl.so.2
        83:
        83:
        83:     calling init: /lib64/libtinfo.so.5
        83:
        83:
        83:     calling init: /opt/craft/preload_dump.so
```

I would like to somehow achieve this:

```
        83:     calling init: /lib64/libc.so.6
        83:
        83:
        83:     calling init: /opt/craft/preload_dump.so
        83:
        83:
        83:     calling init: /lib64/libdl.so.2
        83:
        83:
        83:     calling init: /lib64/libtinfo.so.5
```

That way, I can call `__libc_freeres` safely, as no other library code runs - except that of libc itself.

Thanks
Comment 3 Florian Weimer 2021-09-06 08:32:48 UTC
(In reply to Milian Wolff from comment #2)
> I see, thank you!
> 
> Is there an alternative mechanism - potentially low-level / undocumented -
> that I could leverage to somehow inject my interposing library in such a
> way that its initialization happens after libc, but before any other
> libraries?

For glibc 2.34 on Linux, linking with -Wl,-z,initfirst should work. glibc should no longer use DF_1_INITFIRST itself, so it's available for application use.
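
For reference, `-z initfirst` makes the linker set the DF_1_INITFIRST bit in the DT_FLAGS_1 entry of the object's dynamic section. The following is a minimal sketch, assuming a glibc/Linux system with `<elf.h>` and `<link.h>`, of how that bit could be checked given a pointer to a dynamic section. The function name `has_initfirst` is hypothetical and not part of any library.

```c
#include <assert.h>
#include <elf.h>    /* DT_NULL, DT_FLAGS_1, DF_1_INITFIRST */
#include <link.h>   /* ElfW() */
#include <stddef.h>

/* Returns 1 if the DT_NULL-terminated dynamic section carries
 * DF_1_INITFIRST in its DT_FLAGS_1 entry, and 0 otherwise. */
int has_initfirst(const ElfW(Dyn) *dyn)
{
    assert(dyn != NULL);
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_FLAGS_1)
            return (dyn->d_un.d_val & DF_1_INITFIRST) != 0;
    }
    return 0; /* no DT_FLAGS_1 entry at all */
}
```

A shared object could, for instance, pass its own dynamic section (the `_DYNAMIC` symbol, which I believe `<link.h>` declares) to such a helper to confirm that the flag survived the link step.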

> That way, I can call `__libc_freeres` safely, as no other library code runs -
> except that of libc itself.

So this is about destructors? glibc currently does not guarantee that destructors run in the opposite order of constructors (even without dlclose, which can make some reordering necessary due to early unloading).
Comment 4 Milian Wolff 2021-09-09 08:35:34 UTC
Thank you, I'll try out that linker flag and report back.

> So this is about destructors?

Yes, indeed. As a heap profiler and leak checker, I would like to call `__libc_freeres` to silence leaks, similar to how valgrind does it. And I need to do that before any other library gets unloaded, as those could potentially try to dlclose plugins loaded at runtime (this is what QtCore does). Doing that after `__libc_freeres` would lead to crashes. So here I'm looking for ways of preventing said crash.

> glibc currently does not guarantee that destructors run in the opposite order of constructors

Hm that sounds unfortunate. But if you say: "don't guarantee" - does it do it normally, and there are just some corner cases where it wouldn't happen? As long as it would help with the common case I'm running into (see above), it would help me.

Cheers
Comment 5 Milian Wolff 2021-09-09 08:54:20 UTC
I just tried the linker flag with the example given above, but sadly it doesn't help: only the `calling init` order changes, while the `calling fini` order does not. This is probably what you had in mind when you raised the potential issue about constructor/destructor order.

```
$ gcc -shared -fPIC -o preload_dump.so preload_dump.c
$ LD_DEBUG=files LD_PRELOAD=$PWD/preload_dump.so bash --version |& grep -i calling
      7413:     calling init: /lib64/ld-linux-x86-64.so.2
      7413:     calling init: /usr/lib/libc.so.6
      7413:     calling init: /usr/lib/libncursesw.so.6
      7413:     calling init: /usr/lib/libdl.so.2
      7413:     calling init: /usr/lib/libreadline.so.8
      7413:     calling init: /tmp/preload_dump.so
      7413:     calling fini: bash [0]
      7413:     calling fini: /tmp/preload_dump.so [0]
      7413:     calling fini: /usr/lib/libreadline.so.8 [0]
      7413:     calling fini: /usr/lib/libdl.so.2 [0]
      7413:     calling fini: /usr/lib/libncursesw.so.6 [0]

$ gcc -shared -fPIC -Wl,-z,initfirst -o preload_dump.so preload_dump.c
$ LD_DEBUG=files LD_PRELOAD=$PWD/preload_dump.so bash --version |& grep -i calling
      7373:     calling init: /tmp/preload_dump.so
      7373:     calling init: /lib64/ld-linux-x86-64.so.2
      7373:     calling init: /usr/lib/libc.so.6
      7373:     calling init: /usr/lib/libncursesw.so.6
      7373:     calling init: /usr/lib/libdl.so.2
      7373:     calling init: /usr/lib/libreadline.so.8
      7373:     calling fini: bash [0]
      7373:     calling fini: /tmp/preload_dump.so [0]
      7373:     calling fini: /usr/lib/libreadline.so.8 [0]
      7373:     calling fini: /usr/lib/libdl.so.2 [0]
      7373:     calling fini: /usr/lib/libncursesw.so.6 [0]
```

Too bad! That would seem to indicate that I'm out of luck here and must stop calling `__libc_freeres` - or does someone have an alternative suggestion I could employ?
Comment 6 Florian Weimer 2022-08-29 09:53:15 UTC
In glibc 2.34, we stopped using -z initfirst for libpthread, which means that it is, in theory, now available for application use. Maybe using that for your shared object is now an option?