This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



alternatives for hooking dlopen() without LD_LIBRARY_PATH or LD_AUDIT?


I've got a situation where I need to hook a dlopen() made by VDDK, a proprietary library. VDDK passes a relative name, expecting it to resolve to one of several libraries (including libstdc++.so) that it installs alongside itself, and it fails to load if the name resolves to the system libstdc++.so instead. The simplest solution of providing LD_LIBRARY_PATH is enough to load VDDK, but then poisons any child processes, which likewise fail to load if they pick up VDDK's libstdc++.so instead of the system one. Up to now, we've documented throwing the burden on the end user, who has to write the convoluted:

LD_LIBRARY_PATH_save=$LD_LIBRARY_PATH \
 LD_LIBRARY_PATH=/path/to/vddklibs:$LD_LIBRARY_PATH \
 nbdkit vddk libdir=/path/to/vddklibs file --run \
   'LD_LIBRARY_PATH=$LD_LIBRARY_PATH_save; program args'

where we would rather the end-user could get away with a more concise:

nbdkit vddk libdir=/path/to/vddklibs file --run 'program args'

Sequentially, we have this scenario:

nbdkit vddk libdir=/path/to/libs file --run 'program args'
- nbdkit binary calls dlopen("/path/to/nbdkit-vddk-plugin.so")
  - nbdkit-vddk-plugin.so calls
    dlopen("/path/to/libs/libvixDiskLib.so") using the libdir= argument
    to load vddk (rather than dlopen("libvixDiskLib.so") relying on
    LD_LIBRARY_PATH)
  - vddk's initializer calls dlopen("libcrypto.so") expecting to
    open /path/to/libs/libcrypto.so; either LD_LIBRARY_PATH
    makes that possible (at which point we have to scrub it so
    that a child process is not penalized), or we have to find a
    way to rewrite vddk's dlopen call from relative into absolute
    before passing it to the real dlopen
- nbdkit binary spawns a child process to exec 'program args'
  - program does not want /path/to/libs in its search path
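
The "rewrite from relative into absolute" step in the scenario above can be sketched as a small helper. This is only an illustration: the name rewrite_relative() and its malloc-based interface are hypothetical, and it relies on the documented dlopen() behavior that a name containing a slash is treated as a pathname and is not searched for.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: if `name` contains no '/', return a newly
 * allocated "libdir/name"; otherwise return a copy of `name` unchanged
 * (dlopen() does no searching for names that contain a slash).
 * The caller frees the result.  Returns NULL on allocation failure. */
char *
rewrite_relative (const char *libdir, const char *name)
{
  size_t len;
  char *out;

  if (strchr (name, '/') != NULL) {
    len = strlen (name) + 1;
    out = malloc (len);
    if (out != NULL)
      memcpy (out, name, len);
    return out;
  }

  len = strlen (libdir) + 1 + strlen (name) + 1;
  out = malloc (len);
  if (out == NULL)
    return NULL;
  snprintf (out, len, "%s/%s", libdir, name);
  return out;
}
```

With libdir=/path/to/libs, a request for "libcrypto.so" would become "/path/to/libs/libcrypto.so", while a name that already contains a slash is passed through untouched.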

Writing my own dlopen() wrapper directly in nbdkit seems like a non-starter: my override has to come from a shared library before it can replace the shared version that would be imported from -ldl, at least for all subsequent shared library loads that want to bind to the override. And if I read 'man dlopen' correctly, since nbdkit used dlopen() to load nbdkit-vddk-plugin.so, dlopen() is already bound in the main context. So unless I use RTLD_DEEPBIND from nbdkit, nbdkit-vddk-plugin.so will also see dlopen() bound to -ldl rather than anything it loads locally. Even with RTLD_DEEPBIND, it sounds like that higher precedence lasts only for nbdkit-vddk-plugin.so and does not extend to later bindings performed for libvixDiskLib.so, which means vddk is back to -ldl's dlopen, without my hook. Thus, to hook dlopen within the same process, I need some way to create a scope where I can provide a shared dlopen() that will take precedence when resolving symbols during the load of libvixDiskLib.so, but where that hook code can still defer back to the real dlopen() from -ldl and does not penalize child processes.
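
To make the "defer back to the real dlopen()" part concrete, here is a minimal sketch of what a hook in a shim library could look like, assuming the shim is placed ahead of the real implementation in symbol lookup order. The variable shim_libdir and its hard-coded value are hypothetical; a real shim would need to be told the directory. It uses dlsym(RTLD_NEXT, ...) to find the next dlopen() in lookup order.

```c
#define _GNU_SOURCE             /* for RTLD_NEXT */
#include <dlfcn.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: directory holding VDDK's private libraries. */
static const char *shim_libdir = "/path/to/vddklibs";

/* Override of dlopen(): try the private copy of a bare library name
 * first, then fall back to the caller's original request. */
void *
dlopen (const char *filename, int flags)
{
  static void *(*real_dlopen) (const char *, int);
  char path[PATH_MAX];

  /* Look up the next dlopen() after this object in search order. */
  if (real_dlopen == NULL)
    real_dlopen = (void *(*) (const char *, int)) dlsym (RTLD_NEXT, "dlopen");
  if (real_dlopen == NULL)
    return NULL;

  /* Only names without a slash are subject to library searching. */
  if (filename != NULL && strchr (filename, '/') == NULL) {
    int n = snprintf (path, sizeof path, "%s/%s", shim_libdir, filename);
    if (n > 0 && (size_t) n < sizeof path) {
      void *handle = real_dlopen (path, flags);
      if (handle != NULL)
        return handle;
    }
  }
  return real_dlopen (filename, flags);
}
```

The sketch by itself does not solve the scoping problem described above: it only helps if the object calling dlopen() actually binds to this definition rather than to the one already bound in the main context.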

I managed to create a solution that avoids the need to set LD_LIBRARY_PATH at all by installing a shared library that hooks dlopen(), then loading both my shim and vddk via dlmopen() without the use of RTLD_GLOBAL or RTLD_DEEPBIND. More links on my solution:

https://www.redhat.com/archives/libguestfs/2020-February/msg00154.html
https://bugzilla.redhat.com/show_bug.cgi?id=1756307#c7
https://sourceware.org/bugzilla/show_bug.cgi?id=15971#c5
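
For readers who do not want to follow the links, the general shape of the dlmopen() approach can be sketched as follows. This is not the code from the patches linked above; the function name load_in_private_namespace() and both path parameters are hypothetical. It assumes glibc, where dlmopen(LM_ID_NEWLM, ...) creates a new namespace and dlinfo(RTLD_DI_LMID) reports which namespace a handle lives in.

```c
#define _GNU_SOURCE             /* for dlmopen, dlinfo, Lmid_t */
#include <dlfcn.h>
#include <stdio.h>

/* Load the shim into a fresh link-map namespace, then load the target
 * library into that same namespace.  Returns the target's handle, or
 * NULL on failure. */
void *
load_in_private_namespace (const char *shim_path, const char *lib_path)
{
  Lmid_t lmid;
  void *shim, *lib;

  /* The shim is the first object in the new namespace. */
  shim = dlmopen (LM_ID_NEWLM, shim_path, RTLD_NOW);
  if (shim == NULL) {
    fprintf (stderr, "dlmopen %s: %s\n", shim_path, dlerror ());
    return NULL;
  }

  /* Ask which namespace the shim ended up in. */
  if (dlinfo (shim, RTLD_DI_LMID, &lmid) == -1) {
    fprintf (stderr, "dlinfo: %s\n", dlerror ());
    dlclose (shim);
    return NULL;
  }

  /* Load the real library next to the shim, in the same namespace. */
  lib = dlmopen (lmid, lib_path, RTLD_NOW);
  if (lib == NULL) {
    fprintf (stderr, "dlmopen %s: %s\n", lib_path, dlerror ());
    dlclose (shim);
    return NULL;
  }
  return lib;
}
```

Because nothing here touches the environment, a child process spawned afterwards sees the same LD_LIBRARY_PATH that nbdkit started with.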

However, when Florian saw it, he suggested that my solution of dlmopen() for a shim library that overrides dlopen() is reinventing what la_objsearch() can already do. This is in part because the moment you dlmopen() a library into a separate namespace, you can't debug it (both glibc and gdb need additional patches to expose alternative namespaces for debugging), but there may be other nasty surprises lurking.

But after spending more than an hour playing with la_objsearch() and reading 'man rtld-audit', it looks like an audit library cannot be triggered in glibc except by listing it in LD_AUDIT in the environment during exec, which is back to the same problem we have with needing LD_LIBRARY_PATH in the environment. Furthermore, although I know that glibc's audit interface is slightly different from the Solaris version it was copied from, the Solaris documentation states that an audit library has some rather tough restrictions, including that using 'malloc' is unsafe (https://docs.oracle.com/cd/E36784_01/html/E36857/chapter6-3.html#scrolltoc "Some system interfaces assume that the interfaces are the only instance of their implementation within a process. Examples of such implementations are signals and malloc(3C). Audit libraries should avoid using such interfaces, as doing so can inadvertently alter the behavior of the application."). The Solaris documentation also states that a library can serve as an audit entry point without LD_AUDIT if it is registered locally, via -Wl,-paudit.so.1 when creating the shared library (https://docs.oracle.com/cd/E36784_01/html/E36857/chapter6-18.html#scrolltoc); it doesn't seem that this functionality exists with glibc (/usr/lib64/libaudit.so on Linux has nothing to do with rtld-audit).
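
For comparison, a minimal sketch of what an la_objsearch()-based audit library might look like, following the interface described in 'man rtld-audit'. The variable audit_libdir and its hard-coded path are hypothetical, and the sketch uses a static buffer rather than malloc in deference to the restrictions quoted above. It would still have to be listed in LD_AUDIT at exec time to take effect, which is exactly the limitation at issue.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical: directory holding VDDK's private libraries. */
static const char *audit_libdir = "/path/to/vddklibs";

/* Static buffer for the rewritten name (not thread-safe; this is only
 * a sketch). */
static char rewritten[4096];

/* Mandatory audit entry point: report the interface version in use. */
unsigned int
la_version (unsigned int version)
{
  (void) version;
  return LAV_CURRENT;
}

/* Called by the dynamic linker for each library search.  For an
 * original bare name that has a readable copy under audit_libdir,
 * return that copy's path; otherwise leave the name unchanged. */
char *
la_objsearch (const char *name, uintptr_t *cookie, unsigned int flag)
{
  (void) cookie;
  if (flag == LA_SER_ORIG && strchr (name, '/') == NULL) {
    int n = snprintf (rewritten, sizeof rewritten, "%s/%s",
                      audit_libdir, name);
    if (n > 0 && (size_t) n < sizeof rewritten
        && access (rewritten, R_OK) == 0)
      return rewritten;
  }
  return (char *) name;
}
```

Note that as written this would redirect every matching search in the process, not just those made by VDDK; narrowing it would need the cookie from la_objopen(), which is beyond this sketch.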

Does anyone have any ideas on how to let a shared library implement an audit interface for just its own process, without having to edit LD_AUDIT or re-exec the process? Or is there yet another way to hook a program to rewrite misbehaving dlopen() calls without relying on either dlmopen() or la_objsearch(), or requiring pre-set environment variables, or having to re-exec a process?

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

