Request For Enhancement (RFE): Please implement some way to add an ELF object (ET_DYN or ET_EXEC) that already is in memory into the set of modules that are managed by the runtime loader rtld, instead of requiring dlopen() of a file from the filesystem. This capability is useful for managing modules that are created at runtime, and/or to help implement protection and access controls, etc.

Here is a suggestion of syntax and semantics: a new function

    #include <link.h>  /* macro-ize Elf32_Xxx vs. Elf64_Xxx */
    void *handle = dlopen_phdr(ElfW(Phdr) const *phdr, int n_phdr, int flag);

which is much like dlopen() except that the mmap()s already have been done before calling dlopen_phdr(). The "slide" value (adjustment in load address) may be computed from the difference between the input phdr parameter and the PT_PHDR.p_vaddr that is found in the vector of ElfW(Phdr)s.
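To make the slide computation concrete, here is a minimal sketch. dlopen_phdr() itself does not exist, so the sketch only applies the arithmetic to modules that the loader already manages and compares the result with the dlpi_addr that dl_iterate_phdr() reports. The names slide_from_phdr, slide_check and check_loaded_modules are hypothetical and chosen for illustration only.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a glibc function): the slide as the RFE defines
   it, i.e. where the program header table actually sits in memory minus
   where PT_PHDR.p_vaddr says it should sit.  Returns 0 if there is no
   PT_PHDR entry. */
ElfW(Addr) slide_from_phdr(const ElfW(Phdr) *phdr, int n_phdr)
{
    for (int i = 0; i < n_phdr; i++)
        if (phdr[i].p_type == PT_PHDR)
            return (ElfW(Addr))(uintptr_t)phdr - phdr[i].p_vaddr;
    return 0;
}

struct slide_check {
    int with_pt_phdr;  /* modules that carry a PT_PHDR entry */
    int mismatches;    /* of those, how many disagree with dlpi_addr */
};

/* dl_iterate_phdr() callback: compare our slide with the loader's own. */
static int slide_check_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct slide_check *sc = data;
    (void)size;
    for (int i = 0; i < info->dlpi_phnum; i++) {
        if (info->dlpi_phdr[i].p_type != PT_PHDR)
            continue;
        sc->with_pt_phdr++;
        if (slide_from_phdr(info->dlpi_phdr, info->dlpi_phnum)
            != info->dlpi_addr)
            sc->mismatches++;
        break;
    }
    return 0;  /* zero means: keep iterating over the remaining modules */
}

struct slide_check check_loaded_modules(void)
{
    struct slide_check sc = {0, 0};
    dl_iterate_phdr(slide_check_cb, &sc);
    return sc;
}
```

Calling check_loaded_modules() from a normally linked program should report zero mismatches if the RFE's formula agrees with the loader's bookkeeping. Modules without a PT_PHDR (many shared libraries) are skipped rather than assumed to have slide zero.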
I've been itching for this for quite some time. Hope it happens! Meanwhile, I'll implement a fallback using a temp file. Thanks, John
Indeed. I've wanted this as well. It would also be useful to support bundled up applications where everything lives in a single file including potentially multiple .so files embedded at known offsets within that one archive file when you cannot or do not want to extract the .so's to local storage (if any exists) in order to run the binary.
If you want to embed the .so's in the main program binary, why are you not just static linking them to begin with? That will give much better performance (no time wasted on relocations, no PIC overhead, etc.) and make your program more portable (no dependence on glibc-specific dynamic loading features or even on POSIX dlopen). With that said, I really question the validity of this feature request. Using the .so in-place from data embedded in the main program would only be possible if it's page-aligned, and would be a security risk anyway since the whole .so image would be writable, thus requiring write+exec permission on the pages that most security-enhanced systems don't even allow these days. Thus you'd have to make a copy of the whole .so image, and the copy would have to be anonymous memory that consumes actual physical runtime memory and commit charge, rather than being a file-backed mapping. In other words, it wastes a good deal more memory than loading a .so from a separate file.
(In reply to comment #3)
> If you want to embed the .so's in the main program binary, why are you not just
> static linking them to begin with?

How is static linking useful when managing modules created at runtime? Various JITters do that. Usually they don't generate full ELF, but then GDB doesn't know how to debug them.

> With that said, I really question the validity of this feature request. Using
> the .so in-place from data embedded in the main program would only be possible
> if it's page-aligned, and would be a security risk anyway since the whole .so
> image would be writable, thus requiring write+exec permission on the pages that
> most security-enhanced systems don't even allow these days. Thus you'd have to
> make a copy of the whole .so image, and the copy would have to be anonymous
> memory that consumes actual physical runtime memory and commit charge, rather
> than being a file-backed mapping. In other words, it wastes a good deal more
> memory than loading a .so from a separate file.

You appear to be making a whole lot of unwarranted assumptions in your argument. Think UPX: it has the main executable (and could have shared libraries) compressed. It decompresses them to memory, and can arrange for them to be properly aligned, and mprotect()ed RO. It is wasteful to require UPX to write such images to disk only so they can be dlopen()ed and immediately unlink()ed.
My argument was based on the usage cases presented in this bug tracker thread. Anyway, it's wasteful and backwards for things like UPX to exist at all. They trade startup time (valuable) and runtime memory usage (valuable) for disk space (dirt cheap). Even if disk space is valuable, using a compressed filesystem managed by the kernel (where demand paging will be available) is the right solution. Putting a runtime binary decompressor in your application is just bad design. I maintain that any use of this feature would also be bad design. If you really want the possibility of putting embedded .so files in your binary, it makes more sense to make a toolchain feature for embedding them in the ELF binary using the linker (where they'll be aligned and mapped with the proper permissions) rather than supporting loading from arbitrary buffers.
Comment 2 was another use case: creating single-file executables for scripting languages. For example, Python applications can be bundled into a single executable .zip file. However, when the application uses C extensions (and most applications do), it has to extract the .so from the .zip to a temporary directory, just to allow dlopen() to load it. This is not only slow, but also creates various race conditions. If it were possible to dlopen() the .so *within the zip file* (it could be stored there uncompressed, with the right alignment if necessary), or to load it and dlopen() it from a buffer, the extraction wouldn't be necessary. Note that dlopen()ing the libraries by mmap()ing selected parts of the zip file would allow for sharing between processes, and would therefore not consume more memory.
(In reply to comment #5) > My argument was based on the usage cases presented in this bug tracker thread. "An ELF object that already is in memory" means that the bytes are in the right place and have the correct access permissions, whether by mmap(), read(), or store-to-memory, followed by mprotect() as appropriate. Approximately, the bytes will occupy an interval of pages. Exactly, they will be an image of the PT_LOADs, slid by some whole number of pages. Equivalently, they will be what is described by struct dl_phdr_info during a callback from dl_iterate_phdr(). The pages need to be "blessed" as an in-memory module that the dynamic linker recognizes, and connected to the rest of the collection of modules in memory. No "mass copying" or re-arranging is necessary. In the proposed dlopen_phdr(), one of the ElfXX_Phdr will be a PT_PHDR, and the slide value for the module is equal to the difference between the actual address and the PT_PHDR.p_vaddr. (If there is no such PT_PHDR, then use zero.) Knowing that, then rtld can find the PT_DYNAMIC, and process it. Create the internal structures which keep track of a loaded module, apply the DT_SONAME, load the DT_NEEDED dependencies, connect the DT_SYMTAB, DT_STRTAB, and DT_{GNU_}HASH, perform the indicated relocations, call the DT_INIT_ARRAY functions, etc. Regarding the use case(s): Storage that is "dirt cheap" tends to be "dirt slow." A class 4 SDHC flash memory device supplies less than 4 MB/s, whereas RAM usually gives at least 100 MB/s. Most hand-held mobile devices do not offer a compressed filesystem. Managing files (including updates) using something like jffs2 requires complex code, battery energy, and perhaps a somewhat sophisticated user to understand the behavior of fragmentation. "Dirt cheap" does not mean a cost of zero, and every $0.10 matters. A device with 8MB RAM and 8GB flash storage cannot afford to use 6MB to store a program with library, if 3.5MB would be enough because of compression. 
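As an illustration of the first steps described above (start from what dl_iterate_phdr() hands to its callback, apply the slide, find PT_DYNAMIC, walk its tags), here is a minimal sketch. It only counts DT_NEEDED entries and does not resolve names through DT_STRTAB. count_needed, module_summary and summarize_modules are hypothetical names, and nothing here reflects how rtld is actually structured internally.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>

/* Count the DT_NEEDED entries of one module, given the same description
   that dl_iterate_phdr() provides.  Returns -1 if the module has no
   PT_DYNAMIC segment (for example in a statically linked program). */
int count_needed(const struct dl_phdr_info *info)
{
    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_DYNAMIC)
            continue;
        /* The dynamic section lives at slide + p_vaddr. */
        const ElfW(Dyn) *dyn =
            (const ElfW(Dyn) *)(info->dlpi_addr + ph->p_vaddr);
        int needed = 0;
        for (; dyn->d_tag != DT_NULL; dyn++)
            if (dyn->d_tag == DT_NEEDED)
                needed++;
        return needed;
    }
    return -1;
}

struct module_summary {
    int modules;       /* modules reported by dl_iterate_phdr() */
    int with_dynamic;  /* modules that have a PT_DYNAMIC segment */
    int needed_total;  /* sum of DT_NEEDED entries over those modules */
};

static int summary_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct module_summary *s = data;
    (void)size;
    s->modules++;
    int n = count_needed(info);
    if (n >= 0) {
        s->with_dynamic++;
        s->needed_total += n;
    }
    return 0;  /* continue with the next module */
}

struct module_summary summarize_modules(void)
{
    struct module_summary s = {0, 0, 0};
    dl_iterate_phdr(summary_cb, &s);
    return s;
}
```

A dlopen_phdr() implementation would have to go much further (symbol tables, relocations, initializers), but everything it needs is reachable from this same starting point.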
"Decompress to filesystem, then dlopen" costs time (and an unneeded write to flash stoage.) Distributing 3.5MB "over the air" is understandable: it's a "song" (same size as typical MP3 audio). Distributing 6MB gets noticed. I would like to live in a world where such costs did not matter (or were absorbed by somebody else), but today I am forced to pay, and probably will be for at least a couple more years.
Another use case: languages like Lisp require a global, canonical, dynamic mapping of strings to symbols. When the language runtime implements this, it duplicates a lot of the dynamic loader's work. I would like to implement Lisp symbols as ELF symbols. This way, the system handles the common case of symbol names known at compile time, whether in the executable or libraries. The only things it doesn't handle are dynamically created symbols (INTERN in Lisp). Currently, my best option is to create the symbols as memory addresses arranged to resemble a DSO. I intercept calls to dlopen() and "flush" the current set of dynamically allocated symbols. This "flushing" operation involves writing an ELF object with a fixed load address that lines up with the symbol values in use, then loading the object. I do this so the dlopen'd library will see any symbols it may refer to. I suspect this will require a mutex around mmap and friends as well as DSO operations. All this file writing and mutexing would be unneeded if we had dlopen_phdr.
If the proposal is to require the in-memory dso to already be properly mapped (alignment, permissions, etc. issues) then I withdraw most of my criticisms. However I still disagree with John Reiser's arguments about costs. If storage is really slow and it's desirable to use ram instead, you have tmpfs at your disposal. And since we're talking about systems on which the GNU C library can be used, the idea that compressed filesystems might not be available is unconvincing. UPX is backwards technology that makes no sense on a system with virtual memory or any nontrivial kernel.
I see no problem with a requirement to already have the DSO mapped and aligned with proper permissions beforehand. That makes sense. Remy described my "comment 2" use case in much better detail. The .so's are extension modules for a runtime being executed via the #! line on the bundle or similar. Python in my case, but this applies equally to any dynamic language runtime. tmpfs is not an ideal solution, as you would now be required to set up tmpfs, mount it, use it, and have some separate process, configured not to be OOM-killed, sit around and monitor the process that is using the tmpfs so that it can unmount it and free the resources when that process dies for whatever reason. Not to mention that some systems run without swap, so a tmpfs would pin the full DSO in memory rather than demand paging the parts being used, as a mapping would do.
I am planning to work on this. If anyone has done any implementation work, please speak up.
> I am planning to work on this. If anyone has done any implementation work,
> please speak up.

How far did you get with implementing this? A new gcc jit would benefit from this functionality.
(In reply to Ondrej Bilka from comment #12)
> How far did you get with implementing this?

Not very far. We had a prototype, but it proved trickier than we expected, and in particular the semantics of dlclose() on such an in-memory object proved unclear. I am currently working on a dlopen_with_offset(), which is just like dlopen() but takes an offset into the file. That would meet our actual needs, but not necessarily those of UPX or gcc/jit.
As a workaround, the closest thing I can think of is to use shm_open() with a random filename to create a file descriptor.
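A rough sketch of that workaround follows, with several assumptions stated up front: it is Linux-specific, it hands the descriptor to dlopen() through a /proc/self/fd/N path, it assumes the shared memory filesystem is not mounted noexec, and older glibc versions may need -lrt and -ldl at link time. dlopen_from_buffer is a hypothetical name. The object name here is built from the pid and a counter rather than a random number, since it only has to be unique. Note that this copies the image, so it does not give the in-place "blessing" that the RFE asks for.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: copy an ELF image from memory into an unnamed
   POSIX shared memory object and dlopen() it from there.
   Returns a dlopen() handle, or NULL on any failure. */
void *dlopen_from_buffer(const void *image, size_t len, int flags)
{
    static unsigned counter;
    char name[64], path[64];

    /* The name only has to be unique; O_EXCL catches collisions. */
    snprintf(name, sizeof name, "/dlbuf-%ld-%u", (long)getpid(), counter++);
    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return NULL;
    shm_unlink(name);  /* the open descriptor keeps the object alive */

    const char *p = image;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n <= 0) {
            close(fd);
            return NULL;
        }
        p += n;
        left -= (size_t)n;
    }

    /* Assumption: /proc is mounted, so the fd can be named as a path. */
    snprintf(path, sizeof path, "/proc/self/fd/%d", fd);
    void *handle = dlopen(path, flags);
    close(fd);  /* mappings created by dlopen() do not depend on the fd */
    return handle;
}
```

On failure the usual dlerror() call should explain what went wrong inside dlopen(); failures in shm_open() or write() are reported only as NULL in this sketch.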
(In reply to Ondrej Bilka from comment #14)
> For workarounds a closest that I could think is use shm_open with random
> filename to create file descriptor.

That shows good imagination! The main desired functionality is that of "blessing" as a loaded module the data that is already resident in pages at the appropriate addresses, without creating new copies of pages. This is somewhat like reversing dl_iterate_phdr(); see Comment #7.

Related to Comment #13: dlclose() would "remove" the accounting information and "forget" the internal object that was created by the corresponding dlopen(), but otherwise leave the data alone. Do not call munmap(), etc.
As I mentioned before, using an already-mapped-in-vm DSO with dlopen is not viable. Usually, DSOs have at least one page (where the end of .text and the beginning of .data share a page on disk) that must be mapped twice at different offsets, and likewise all subsequent data pages must be mapped offset by one page from their location in the image. Further, you need to have empty VM space for sufficiently many .bss pages past the end of the mapping. It would be possible to require the caller to arrange all of these things, but that's basically offloading A LOT of the ELF loading process onto the calling program and I don't think that makes for a reasonable public interface for glibc to provide. If you don't demand this crazy in-place usage of the DSO image, simply copying it to a temp file or shared memory object and loading it from there would work perfectly well.
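To illustrate the layout problem described here, the following sketch computes, for a single PT_LOAD entry, which page-aligned file range would be mapped at which page-aligned address, and how many extra zero pages .bss needs. seg_plan and plan_segment are hypothetical names, and the code is pure arithmetic under the usual assumption that p_vaddr and p_offset are congruent modulo the page size. It is not glibc's loader logic.

```c
#include <link.h>
#include <stddef.h>

struct seg_plan {
    ElfW(Addr) map_vaddr;   /* page-aligned address (before slide) */
    ElfW(Off)  map_offset;  /* page-aligned file offset to map from */
    size_t     file_len;    /* length of the file-backed part, whole pages */
    size_t     anon_len;    /* extra zero-filled pages needed for .bss */
    ElfW(Addr) delta;       /* map_vaddr - map_offset for this segment */
};

/* Plan the mapping for one PT_LOAD entry.  pagesz must be a power of two.
   Returns 0 on success, -1 if the entry cannot be mapped this way. */
int plan_segment(const ElfW(Phdr) *ph, size_t pagesz, struct seg_plan *out)
{
    ElfW(Addr) mask = (ElfW(Addr))pagesz - 1;

    if (ph->p_type != PT_LOAD || ph->p_memsz < ph->p_filesz)
        return -1;
    if ((ph->p_vaddr & mask) != (ph->p_offset & mask))
        return -1;  /* address and offset must agree within a page */

    ElfW(Addr) file_end = (ph->p_vaddr + ph->p_filesz + mask) & ~mask;
    ElfW(Addr) mem_end  = (ph->p_vaddr + ph->p_memsz  + mask) & ~mask;

    out->map_vaddr  = ph->p_vaddr & ~mask;
    out->map_offset = ph->p_offset & ~(ElfW(Off))mask;
    out->file_len   = (size_t)(file_end - out->map_vaddr);
    out->anon_len   = (size_t)(mem_end - file_end);
    out->delta      = out->map_vaddr - (ElfW(Addr))out->map_offset;
    return 0;
}
```

With a text segment at offset 0 and address 0, and a data segment at offset 0x1df0 and address 0x2df0, the two plans have different delta values and both cover the file page at offset 0x1000. That is the "mapped twice, then shifted by one page" situation: a single contiguous copy of the file image in memory cannot satisfy both segments.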
(In reply to John Reiser from comment #15)
> Related to Comment #13: dlclose() would "remove" the accounting information
> and "forget" the internal object that was created by the corresponding
> dlopen(), but otherwise leave the data alone. Do not call munmap(), etc.

The problem semantics weren't about munmap -- that part was easy. They were about relocations. In the UPX case, you probably don't have any. In our case, we actually do have a bona fide DSO with relocations that is at some offset in another file. Calling "pretend"-dlopen() applies them. I guess dlclose() could undo them, but we didn't get that far. Not undoing them would cause them to be applied a second time on re-dlopen(), which would be wrong.
(In reply to Rich Felker from comment #16)
> It would be possible to require the caller to arrange all of these
> things, but that's basically offloading A LOT of the ELF loading process
> onto the calling program and I don't think that makes for a reasonable
> public interface for glibc to provide.

Well, it's all in a day's work for the compiler writers who would directly use this. I like to make simple things easy and complex things possible.
It's already possible: you write into a temp file and call dlopen on the temp file. What you're asking for is not "making simple things easy and complex things possible" but rather "making simple things complex as a dubious premature optimization". As for your proposed Lisp implementation usage case, it's probably a bad idea. Even aside from the issue of avoiding symbol clashes with the C namespace (which you could avoid with prefixing of some sort, at the cost of added hashing/lookup cost), dlsym is simply not very efficient. POSIX requires it to accept invalid DSO handles (which glibc currently does not tolerate; see bug #14989) and report an error rather than crashing, which adds a good deal of otherwise-unnecessary overhead. I'm also unclear on how lookup time and space requirements scale with number of DSOs loaded (of which you may have a lot). But even if not for all these issues, it's just bad design to write one thing that depends on the implementation internals of another.
(In reply to Rich Felker from comment #19)
> It's already possible: you write into a temp file and call dlopen on the
> temp file.

"Just bad design" in your words.

> POSIX
> requires it to accept invalid DSO handles (which glibc currently does not
> tolerate; see bug #14989)

Interesting, thanks! Have you thought about a hash table (or similar) mapping handle to header?

> lookup time and space requirements scale with number of DSOs loaded (of
> which you may have a lot).

I grant that there may exist good reasons not to implement this feature in this time and place. Once we get our foot in the door with a minimal implementation, if scaling issues arise later, we optimize.
Created attachment 7266 [details] attachment-3251-0.html There is no writable storage or the ability to mount any in the situation Paul and I are looking to support.
I would also appreciate this feature, for Python.
Here is someone's proof-of-concept implementation for 64-bit Linux: https://github.com/m1m1x/memdlopen
(In reply to Gregory P. Smith from comment #21)
> Created attachment 7266 [details]
> attachment-3251-0.html

This attachment seems broken. It contains an HTML file instead of a patch, or is it just me?

> There is no writable storage or the ability to mount any in the situation
> Paul and I are looking to support.

dlopen_with_offset seems to be a quite limited solution, as even an fd is not always available. What do people think about dlopen_fstream(FILE *f, int flags); that can work with file streams? One can use fopencookie() to create any weird file stream, or one can use fmemopen() to read from memory.
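For a feel of what the stream-based idea would mean for a loader, here is a small sketch that wraps a memory buffer with fmemopen() and reads the ELF identification bytes through the resulting FILE *. dlopen_fstream() is only a proposal in this thread and is not implemented anywhere here. stream_looks_like_elf and buffer_looks_like_elf are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Read the ELF identification bytes from the start of a stream and check
   the magic number.  Returns 1 if it looks like ELF, 0 otherwise. */
int stream_looks_like_elf(FILE *f)
{
    unsigned char ident[EI_NIDENT];

    if (fseek(f, 0, SEEK_SET) != 0)
        return 0;
    if (fread(ident, 1, sizeof ident, f) != sizeof ident)
        return 0;  /* too short to hold e_ident */
    return memcmp(ident, ELFMAG, SELFMAG) == 0;
}

/* Same check on a memory buffer, going through fmemopen() so that the
   loader-side code only ever sees a FILE *. */
int buffer_looks_like_elf(void *buf, size_t len)
{
    if (len == 0)
        return 0;
    FILE *f = fmemopen(buf, len, "r");
    if (f == NULL)
        return 0;
    int ok = stream_looks_like_elf(f);
    fclose(f);
    return ok;
}
```

One consequence worth noting: a loader that only has a FILE * must read (that is, copy) the segments, because a stream gives it nothing to mmap(). That puts this approach closer to the temp-file workaround than to the in-place "blessing" discussed earlier in the thread.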
Hi guys. I posted the dlmem() patch here: https://sourceware.org/bugzilla/show_bug.cgi?id=30100 It probably sucks and all that, but it works. Perhaps someone is interested in taking it over and bringing it in line with the glibc patch guidelines, or whatever else suits.