This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



Re: dlopen_from()


Whoops, sent as text/html, bounced by the list. Trying again.

On 2020-01-27 12:05, Nick Barnes wrote:
Hi Carlos,

Please find attached my tiny example, complete with a shim library:

    $ make go
    cc -g main.c -o main -ldl -Wl,-rpath="\$ORIGIN/A"
    mkdir -p A
    cc -g -shared -fPIC A.c  -o A/libA.so -ldl -Wl,-rpath="\$ORIGIN/../B"
    mkdir -p B
    cc -g -shared -fPIC B.c -o B/libB.so
    ./main
    in main.
       in A.
         in B.
       back in A.
    back in main.
    $ make go-shim
    mkdir -p shim
    cc -g -shared -fPIC shim.c  -o shim/libshim.so -ldl
    LD_PRELOAD=shim/libshim.so ./main
    SHIM: in dlopen(libA.so, ...) ...
    SHIM: ... failed; libA.so: cannot open shared object file: No such file or directory
    main couldn't dlopen(libA.so): (null)
    Makefile:10: recipe for target 'go-shim' failed
    make: *** [go-shim] Error 1
    $ LD_LIBRARY_PATH=A:B make go-shim
    LD_PRELOAD=shim/libshim.so ./main
    SHIM: in dlopen(libA.so, ...) ...
    SHIM: ... 0x55e9dce14690
    in main.
    SHIM: in dlopen(libB.so, ...) ...
    SHIM: ... 0x55e9dce14cb0
       in A.
         in B.
       back in A.
    back in main.
    $
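
For readers without the attachment, a shim of the kind exercised above might look roughly like the following. This is a minimal sketch, not the shim.c from the tarball: it assumes the wrapper forwards to the real dlopen() found with dlsym(RTLD_NEXT, ...), and the log strings simply mimic the transcript. Because the forwarded call is made from inside the shim, the loader sees the shim, not libA.so or main, as the caller, which is why the RUNPATH of the original caller is not consulted.

```c
#define _GNU_SOURCE           /* for RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>

/* Interposed dlopen(): log the call, forward it, log the outcome.
   Sketch only; the real shim in the attachment may differ. */
void *dlopen(const char *file, int mode)
{
    /* Find the next definition of dlopen in search order (the real one). */
    void *(*real_dlopen)(const char *, int) =
        (void *(*)(const char *, int))dlsym(RTLD_NEXT, "dlopen");
    if (real_dlopen == NULL)
        return NULL;

    fprintf(stderr, "SHIM: in dlopen(%s, ...) ...\n", file ? file : "(null)");

    /* This call originates in the shim, so the loader applies the shim's
       search scope rather than that of the original caller. */
    void *handle = real_dlopen(file, mode);

    if (handle == NULL) {
        /* Reading dlerror() here clears it for the original caller. */
        const char *err = dlerror();
        fprintf(stderr, "SHIM: ... failed; %s\n", err ? err : "(unknown)");
    } else {
        fprintf(stderr, "SHIM: ... %p\n", handle);
    }
    return handle;
}
```

The "(null)" printed by main in the failing run is consistent with the shim having already read dlerror(): once the message has been returned, a second call returns NULL.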

Regards,

Nick B


On 2020-01-24 17:36, Nick Barnes wrote:

Hi Carlos,

I'll knock up some sample code for you next week. Thank you for the pointer to LD_AUDIT, which is a useful feature: we can definitely make some use of the information from there. However, that interface doesn't appear to include any function called when there is a failure in the loader. For instance, dlopen() on a nonexistent library: there's a sequence of la_objsearch() calls, working down the library search path, but no final call to indicate failure. An LD_AUDIT library would have to infer that from the lack of a subsequent la_objopen() call (for instance, using a timeout). Our application shim currently records the arguments and return value of dlopen(), along with errno/dlerror().

    la_objsearch "libZ.so", cookie 0x7fc72acbb5e0, original(1)
    la_objsearch "/home/users/nickb/notes/ld-audit/A/libZ.so", cookie 0x7fc72acbb5e0, RPATH/RUNPATH(4)
    la_objsearch "/lib/x86_64-linux-gnu/tls/x86_64/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/x86_64-linux-gnu/tls/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/x86_64-linux-gnu/tls/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/x86_64-linux-gnu/tls/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/x86_64-linux-gnu/x86_64/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/x86_64-linux-gnu/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/x86_64-linux-gnu/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/x86_64-linux-gnu/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/x86_64-linux-gnu/tls/x86_64/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/x86_64-linux-gnu/tls/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/x86_64-linux-gnu/tls/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/x86_64-linux-gnu/tls/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/x86_64-linux-gnu/x86_64/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/x86_64-linux-gnu/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/x86_64-linux-gnu/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/x86_64-linux-gnu/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/tls/x86_64/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/tls/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/tls/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/tls/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/x86_64/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/lib/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/tls/x86_64/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/tls/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/tls/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/tls/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/x86_64/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/x86_64/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
    la_objsearch "/usr/lib/libZ.so", cookie 0x7fc72acbb5e0, default directory(64)
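
An audit library producing output of this shape could be as small as the sketch below. It is an assumption about what such a library might look like, not the code that generated the trace: the helper search_origin() and the labels for flags not seen in the trace (LD_LIBRARY_PATH, ld.so.cache, secure directory) are invented for illustration, and the cookie is printed as a pointer. The la_* entry points and LA_SER_* flags are those of the rtld-audit interface; note that there is no hook for a failed search, only the absence of la_objopen().

```c
#define _GNU_SOURCE           /* link.h only declares the audit API for GNU */
#include <link.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: label for an la_objsearch() flag value. */
const char *search_origin(unsigned int flag)
{
    switch (flag) {
    case LA_SER_ORIG:    return "original";
    case LA_SER_LIBPATH: return "LD_LIBRARY_PATH";
    case LA_SER_RUNPATH: return "RPATH/RUNPATH";
    case LA_SER_CONFIG:  return "ld.so.cache";
    case LA_SER_DEFAULT: return "default directory";
    case LA_SER_SECURE:  return "secure directory";
    default:             return "unknown";
    }
}

/* Handshake: tell the loader which audit interface version we expect. */
unsigned int la_version(unsigned int version)
{
    (void)version;
    return LAV_CURRENT;
}

/* Called for each candidate path; returning name leaves the search unchanged. */
char *la_objsearch(const char *name, uintptr_t *cookie, unsigned int flag)
{
    fprintf(stderr, "la_objsearch \"%s\", cookie %p, %s(%u)\n",
            name, (void *)cookie, search_origin(flag), flag);
    return (char *)name;
}

/* Called only when an object is actually loaded; never on failure. */
unsigned int la_objopen(struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
    (void)lmid;
    fprintf(stderr, "la_objopen \"%s\", cookie %p\n", map->l_name, (void *)cookie);
    return 0;   /* do not request symbol-binding callbacks */
}
```

Built as a shared object (for example `cc -shared -fPIC audit.c -o audit.so`, file names hypothetical) it would be activated with `LD_AUDIT=./audit.so ./main`.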

I have wondered about security implications of dlopen_from(), but there don't seem to be any security barriers between shared objects in a single process, in the sense that anything one shared library is permitted to do, another shared library in the same process is also permitted to do. Maybe I'm mistaken about that?

I'll send some code over next week. Should we keep this on libc-help or take it elsewhere?

Nick B


On 2020-01-23 17:25, Carlos O'Donell wrote:

On Thu, Jan 23, 2020 at 11:41 AM Nick Barnes<nick@ellexus.com>  wrote:
On 2020-01-23 15:33, Carlos O'Donell wrote:

On Thu, Jan 23, 2020 at 10:28 AM Nick Barnes via libc-help
<libc-help@sourceware.org>  wrote:

The semantics of dlopen() depend on which shared object the calling
function is in (RUNPATH, RPATH, ORIGIN, etc). This makes it difficult
for shim libraries (using LD_PRELOAD) to wrap calls to dlopen(). There's
no documented way to get at the underlying functionality (the actual
implementing function, _dl_open, which takes a caller address). I find
myself digging through the libc source code, trying to fake up internal
data structures which will allow me to fool dlopen() that I'm calling it
from some other shared library. The only alternative seems to be some
sort of ROP attack.

Can you provide a concrete example of a shim library that doesn't work
and how this would solve the problem?

The concrete examples I'm familiar with (our Breeze and Mistral products) are proprietary, so I can't share their sources, but I expect this to be a problem for anyone trying to wrap dlopen(), and likely to become more common as application and library developers become more conscious of dependency versioning and reproducibility (and so increasingly likely to set RPATH or RUNPATH).

Can you provide example code that shows what you are doing in Breeze
and Mistral? It need not be the exact applications, but having a
working example with various empty libraries that exercise the exact
options you want to use would be very useful. Such examples serve as
starting points for conversations.

Applications and environments which want to nail down their library dependency versions often ship with binary libraries and use RPATH or RUNPATH to ensure they are the ones loaded. The same is true of third-party libraries loaded by those applications. So it's not surprising when (say) Python 3.7 installed by Anaconda has an RPATH in its binary, which it uses to load libraries such as PyTorch 1.2.0, which in turn has a (different) RPATH, which uses $ORIGIN to make sure that it gets its own binary shipped libraries.

Any system which uses LD_PRELOAD to wrap dlopen(), for instance to identify and catalogue dynamic dependencies when planning application migration into a container, will get into trouble here. The PyTorch library calls dlopen(), which the dynamic linker has resolved (thanks to LD_PRELOAD) to the wrapper in the LD_PRELOAD library. The wrapper runs and calls dlopen() itself, because it needs to record information about the call and the result, but glibc/libdl cannot find the library: the call now comes from the wrapper's shared object, which doesn't have the RPATH. The wrapper could try to fake it (by digging through the ELF header of the calling library, finding the RPATH and RUNPATH, inferring the ORIGIN and PLATFORM, and faking the search path used by dlopen), but this seems like a lot of work and is a reimplementation of large parts of glibc/elf/dl-open.c, so it would have to keep pace with any future changes there.

Thanks for the example. Is there any reason you don't use the LD_AUDIT
interface for this? It is a transparent method, loaded into a distinct
namespace, for observing binding information. I even know some
users that deploy LD_AUDIT modules to prevent non-production approved
shared objects from ever being loaded!
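
To give a sense of the "fake it" workaround Nick describes, the first step (finding the RPATH or RUNPATH of the calling library) might be sketched as follows. This is an illustration under assumptions, not code from the thread: search_path_of() is a hypothetical name, and the sketch assumes a glibc target such as x86-64 where the loader has already relocated the DT_STRTAB entry to an absolute address. It does not expand $ORIGIN or $PLATFORM, nor reproduce the rest of the search order, which is exactly the part that would duplicate glibc internals.

```c
#define _GNU_SOURCE           /* for dladdr1() and RTLD_DL_LINKMAP */
#include <dlfcn.h>
#include <link.h>
#include <stddef.h>

/* Hypothetical helper: return the raw DT_RUNPATH string of the object
   containing addr, falling back to DT_RPATH, or NULL if there is none.
   Dynamic string tokens such as $ORIGIN are returned unexpanded. */
const char *search_path_of(const void *addr)
{
    Dl_info info;
    struct link_map *map = NULL;

    /* Map the address to the link_map of the object that contains it. */
    if (dladdr1(addr, &info, (void **)&map, RTLD_DL_LINKMAP) == 0 || map == NULL)
        return NULL;

    const char *strtab = NULL;
    const ElfW(Dyn) *runpath = NULL;
    const ElfW(Dyn) *rpath = NULL;

    /* Walk the object's dynamic section looking for the relevant tags. */
    for (const ElfW(Dyn) *d = map->l_ld; d != NULL && d->d_tag != DT_NULL; d++) {
        switch (d->d_tag) {
        case DT_STRTAB:  strtab = (const char *)d->d_un.d_ptr; break; /* assumed relocated */
        case DT_RUNPATH: runpath = d; break;
        case DT_RPATH:   rpath = d; break;
        default:         break;
        }
    }

    if (strtab == NULL)
        return NULL;
    if (runpath != NULL)
        return strtab + runpath->d_un.d_val;
    if (rpath != NULL)
        return strtab + rpath->d_un.d_val;
    return NULL;
}
```

A wrapper would pass its own return address as addr; everything after this step (token expansion, directory probing, cache lookup) is what makes the approach fragile.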

Part of my evaluation with any new API is to see if there is another
way to handle the root problem.

If you really really need an interposable dlopen that works with RPATH
and RUNPATH, and all the set DSTs, then we need to understand that use
case.

Your current use case looks like it can be solved with LD_AUDIT.

Inevitably there are ways to get around this, but they are pretty fragile. I've spent several days implementing three different ones.

Was one of those ways using LD_AUDIT?

The semantics of dlopen() explicitly depend on the calling function, so inevitably the first thing the implementation does is obtain the return address and call another function with the original dlopen() arguments and that return address. My proposal is to expose (and rename) that other function. So the maintenance burden should be fairly low.

Yeah, certainly, given that the input to the new dlopen_from has the
caller's handle clearly identified to solve the search scope problem.
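
The mechanism under discussion, identifying the caller from the return address, can be illustrated with a short sketch. This is not glibc's code: caller_address() and object_containing() are hypothetical helpers, and __builtin_return_address is a GCC/Clang builtin. It shows why an interposed wrapper changes the result: whichever object contains the return address is treated as the caller, and for a forwarded call that object is the wrapper.

```c
#define _GNU_SOURCE           /* for dladdr() */
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: the address this function will return to, which
   lies inside whoever called it. noinline keeps it a genuine call. */
__attribute__((noinline)) void *caller_address(void)
{
    return __builtin_return_address(0);
}

/* Hypothetical helper: path of the loaded object containing addr,
   or NULL if no loaded object contains it. */
const char *object_containing(const void *addr)
{
    Dl_info info;

    if (dladdr(addr, &info) == 0)
        return NULL;
    return info.dli_fname;
}
```

Calling object_containing(caller_address()) from a function in libA.so would be expected to name libA.so, while the same call made from inside a preloaded shim would name the shim.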

Your API proposal would have to go through review and may change and
so the maintenance burden might also change.

The requirement is: I want to call dlopen and have the search scope be
that of another object.

We need to design an API that solves this problem but doesn't expose
undue restrictions to the implementation. I'm not saying your design
does or doesn't, I haven't thought too deeply about the implications
e.g. design, security, ELF specification, etc.

That depends largely on convincing maintainers that your new API has a
use case that users care about and is worth maintaining forever.

Yes, a tough sell. But if such an API existed, we would certainly use it, and so would anyone else serious about wrapping dlopen(). So that's a potential community of maintainers right there.

I'm not sure it's as tough a sell as you think. I already want to see
us implement fdlopen() from BSD as well, and I think we should be leading
by example and implementing interfaces that users need to solve real
problems.

If we take a step back though it might be that RPATH and RUNPATH are
the real problem, and that addressing self-contained runtimes in some
other way would help. I think that any design review of this API would
naturally take that into consideration. Which is partly why I want to
see a good "put together" example of what Anaconda is doing (I'm
familiar with their product).

In summary:
- Put together an example "thing" to talk further about the use case.
- Discuss the "thing" and possible solutions, starting with dlopen_from().

Cheers,
Carlos.

--
*Nick Barnes*
Senior Software Developer

Ellexus is the I/O profiling company.
www.ellexus.com <http://www.ellexus.com>

Ellexus Ltd is a limited company registered in England & Wales
Company registration no. 07166034
Registered address: 198 High Street, Tonbridge, Kent TN9 1BE, UK
Operating address: St John's Innovation Centre, Cowley Road, Cambridge CB4 0WS, UK

Attachment: dlopen-sample.tgz
Description: application/compressed-tar

