Request For Enhancement (RFE): Please enhance dlopen() of an ET_EXEC file to work in many cases. Currently dlopen() of an ET_EXEC file always fails with the dlerror() string "cannot dynamically load executable". However, the test program below shows by example that dlopen() of the corresponding ET_DYN file (just change Elf32_Ehdr.e_type from ET_EXEC to ET_DYN) can give useful results. In particular, the file is mapped into the address space, along with all its DT_NEEDED dependencies, and can be invoked successfully at its .e_entry point. If the handle returned by dlopen() were not NULL, then dlsym() probably would work, too. All of these are useful properties that would be nice to have.

Thank you.

-----hello.c
#include <stdio.h>

int main(int argc, char *argv[])
{
    printf("Hello world.\n");
    return 0;
}
-----dlopen-exec.c
/* Show that dlopen of an ET_EXEC file would mostly work, at least if the
 * address space that is requested is currently unoccupied.
 *
 * Compile and run via:
 *   gcc -m32 -g -o hello32 hello.c
 *   gcc -m32 -g -o dlopen-exec -Ttext-segment=0x0a000000 dlopen-exec.c -ldl
 *   ./dlopen-exec
 * -Ttext-segment=0x0a000000 leaves enough room for the default 0x08048000.
 */
#include <dlfcn.h>
#include <elf.h>
#include <sys/fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* i386 code to invoke an ET_EXEC image that has been loaded into memory. */
void raw_start(Elf32_Addr entry, int original_argc, char const **original_argv)
{
    asm("mov 2*4(%ebp),%eax");  /* entry */
    asm("mov 3*4(%ebp),%ecx");  /* original_argc */
    asm("mov 4*4(%ebp),%edx");  /* original_argv */
    asm("mov %edx,%esp");       /* Trim stack. */
    asm("push %ecx");           /* New argc */
    asm("sub %edx,%edx");       /* no rtld_fini function */
    asm("jmp *%eax");           /* Goto entry. */
}

char fname[] = "/tmp/dlopen-exec-XXXXXX";

int main(int argc, char const *argv[])
{
    /* Try to dlopen an ET_EXEC file. */
    void *handle = dlopen("./hello32", RTLD_LAZY);
    if (0==handle) {
        fprintf(stderr, "dlopen ./hello32 failed:%s\n", dlerror());
    }

    /* Write a new file that is the same except for ET_DYN. */
    /* Error checking has been omitted in this section. */
    int const fdi = open("./hello32", O_RDONLY);
    struct stat sb;
    fstat(fdi, &sb);
    Elf32_Ehdr *const ehdr = malloc(sb.st_size);
    read(fdi, ehdr, sb.st_size);
    close(fdi);
    ehdr->e_type = ET_DYN;
    int const fdo = mkstemp(fname);
    write(fdo, ehdr, sb.st_size);
    close(fdo);

    /* Try to dlopen the ET_DYN file. */
    void *h2 = dlopen(fname, RTLD_LAZY);
    if (0==handle) {
        fprintf(stderr, "dlopen failed:%s\n", dlerror());
    }
    else {
        fprintf(stderr, "Success: %s\n", fname);
    }
    unlink(fname);  /* Clean up. */

    /* dlopen() "succeeded" even though the return value was 0.
     * Demonstrate success by executing the loaded program.
     */
    raw_start(ehdr->e_entry, argc, argv);
    return 0;
}
-----console log
$ gcc -m32 -g -o hello32 hello.c
$ gcc -m32 -g -o dlopen-exec -Ttext-segment=0x0a000000 dlopen-exec.c -ldl
$ ./dlopen-exec
dlopen ./hello32 failed:./hello32: cannot dynamically load executable
dlopen failed:(null)
Hello world.
$
-----
Is there a reason you can't just build an executable you want to use this way with -fPIE -pie? A PIE is both an executable and an ET_DYN. The usual handling for ET_EXEC files uses MAP_FIXED, which will clobber any existing mappings. So it is dangerous to blindly load an ET_EXEC file. There is not really any very good way for dlopen to determine that the regions used by the particular ET_EXEC file are not already mapped.
My application is an auditor (checker/verifier) and the target application already has been built by someone else, usually without -fPIE -pie.

> "The usual handling for ET_EXEC files uses MAP_FIXED, which will clobber any existing mappings."

Yes. However, instead of calling mmap(.p_vaddr,,,MAP_FIXED,,): omit the MAP_FIXED, then compare the return value with .p_vaddr. If the two addresses are equal, then the space was available and has been filled with the correct contents. (As demonstrated by the test case, that is essentially the main effect of calling dlopen on the original file but with .e_type=ET_DYN.) If the return value from mmap does not equal .p_vaddr, then the pages weren't available for some reason, and you get to decide what to do.
Current behavior is actually even better: tantalizingly close! Fixing the copy+paste error:
-----
   void *h2 = dlopen(fname, RTLD_LAZY);
-  if (0==handle) {
+  if (0==h2) {
     fprintf(stderr, "dlopen failed:%s\n", dlerror());
-----
then running:
-----
dlopen ./hello32 failed:./hello32: cannot dynamically load executable
Success: /tmp/dlopen-exec-v66w8C
Hello world.
-----
Therefore "masquerading" the ET_EXEC as ET_DYN essentially works as long as the address space is available and the operating system is in a good mood (honors the hint when the first parameter of mmap is not NULL). Many systems honor the hint as a matter of policy [default, or at least administratively chosen], so I'd like to take advantage of those cases.
Created attachment 4863: patch to elf/dl-load.c
This patch to elf/dl-load.c implements the requested enhancement.
Created attachment 4864: revised testcase (for x86_64)
This revised testcase allows for success of the requested enhancement.
The patch looks good to me! However, there are numerous style issues; the source is usually aligned by opening parenthesis, and boolean operators are always at the beginning of the next line. Also, I know why you use const == var, but everywhere else var == const is used, so it would be better to stick with that. The issues are in the second if condition, and I think the new mappref initializer is also nearly unreadable now and would benefit from splitting to multiple expressions.
Created attachment 4997: revised patch to elf/dl-load.c
Revised patch (for coding style).
I won't change this because it can never work reliably in all situations where such a call can be made. There can be address space conflicts and they cannot even be detected. This inevitably will lead to problems. There are very good reasons why this never was imagined to be implemented.
Please explain how an undetected conflict over address space could arise. The interval of pages requested by mmap is the convex hull of all of the PT_LOAD, so the interval contains each PT_LOAD. The first argument to mmap is the address of the lowest page in the range, and the flags argument _omits_ MAP_FIXED. If the kernel's allocation policy honors the suggested placement, then the kernel believes that no page in the requested range was occupied before. If any page in the requested range _was_ occupied before, then the kernel will choose some other address, else return MAP_FAILED. In all cases, comparing the desired lowest address to the return value of mmap() correctly determines success or failure, including any conflicts that might exist.
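The "convex hull of all of the PT_LOAD" mentioned above can be sketched as below. This is an illustration, not the code from the attached patch: `load_span` is a hypothetical helper name, 64-bit program headers are assumed (the original test case was 32-bit), and the caller is assumed to supply a power-of-two page size.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute the page-aligned interval [*lo, *hi)
   that contains every PT_LOAD segment in PHDR[0..PHNUM).  PAGESIZE must
   be a power of two.  Returns 0 on success, -1 if there is no PT_LOAD.  */
int
load_span (const Elf64_Phdr *phdr, size_t phnum, uint64_t pagesize,
           uint64_t *lo, uint64_t *hi)
{
  uint64_t min = UINT64_MAX;
  uint64_t max = 0;
  int found = 0;

  for (size_t i = 0; i < phnum; i++)
    {
      if (phdr[i].p_type != PT_LOAD)
        continue;
      /* Round the start down and the end up to page boundaries.  */
      uint64_t start = phdr[i].p_vaddr & ~(pagesize - 1);
      uint64_t end = (phdr[i].p_vaddr + phdr[i].p_memsz + pagesize - 1)
                     & ~(pagesize - 1);
      if (start < min)
        min = start;
      if (end > max)
        max = end;
      found = 1;
    }

  if (!found)
    return -1;
  *lo = min;
  *hi = max;
  return 0;
}
```

The loader would then request `*hi - *lo` bytes with `*lo` as the mmap hint, omit MAP_FIXED, and compare the returned address with `*lo`.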
What is the status of the bug in the post-Ulrich era?
See https://sourceware.org/glibc/wiki/Contribution%20checklist regarding submitting patches. This patch doesn't appear to have been submitted to libc-alpha in the patchwork era (March 2014 onwards), if at all, so it's not visible as a patch pending review (even with patches in patchwork, they should still be pinged weekly).
It should be noted that some hardened kernels (grsec/pax?) ignore the requested address when performing mmap without MAP_FIXED, so dlopen of ET_EXEC files would be impossible on such systems. I'm not opposed to the feature, but if it's added then such limitations should probably be documented.
We cannot support this because it is not possible to perform correct relocations if another executable has already been loaded. There is also no way to correctly execute the ELF constructors of the second executable. If you want to inject code into another executable, you can use LD_PRELOAD or LD_AUDIT, which does not have these problems.
(In reply to Florian Weimer from comment #13)
> We cannot support this because it is not possible to perform correct
> relocations if another executable has already been loaded. There is also no
> way to correctly execute the ELF constructors of the second executable.

Please give specific examples or explanations why success (or a recognizable, specific, and informative error code) is not possible. The relocations of the dlopen()ed ./hello32 are performed correctly enough to invoke printf() through the usual PLT (Procedure Linkage Table) [evidence in Comment #3], which directly contradicts the claim of Comment #13. The revised test case of the Description, and the revised patch in Comment #7, demonstrate that dlopen() of ET_EXEC can succeed. The remark of Comment #9 tells how to determine [non-]conflict of address space. Comment #13 has no example or explanation why calling the DT_INIT* functions must fail.

> If you want to inject code into another executable, you can use LD_PRELOAD
> or LD_AUDIT, which does not have these problems.

Portions of the PT_INTERP and language-support run-time library initialization run before any LD_PRELOAD or LD_AUDIT library. The goal is complete control by the auditor. If there is to be any "injection of code", it will be dlopen()ing the target executable into the auditor.
(In reply to John Reiser from comment #14)
> (In reply to Florian Weimer from comment #13)
> > We cannot support this because it is not possible to perform correct
> > relocations if another executable has already been loaded. There is also no
> > way to correctly execute the ELF constructors of the second executable.
>
> Please give specific examples or explanations why success (or a
> recognizable, specific, and informative error code) is not possible.

Here is an example. The first program is mylocaltime-export:

#include <stdio.h>
#include <time.h>

void
mylocaltime (time_t t)
{
  struct tm *tm = localtime (&t);
  printf ("tm_isdst (from other program): %d\n", tm->tm_isdst);
  printf ("daylight (from other program): %d\n", daylight);
}

int
main (void)
{
  return 0;
}

The second is mylocaltime-use:

#include <dlfcn.h>
#include <err.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void
mylocaltime2 (time_t t)
{
  struct tm *tm = localtime (&t);
  printf ("tm_isdst (from main program): %d\n", tm->tm_isdst);
  printf ("daylight (from main program): %d\n", daylight);
}

int
main (void)
{
  setenv ("TZ", "Europe/Berlin", 1);
  void *handle = dlopen ("./mylocaltime-export", RTLD_NOW);
  if (handle == NULL)
    errx (1, "dlopen: %s", dlerror ());
  void *func = dlsym (handle, "mylocaltime");
  if (func == NULL)
    errx (1, "dlsym: %s", dlerror ());
  void (*fptr) (time_t) = func;
  mylocaltime2 (1555332781);
  fptr (1555332781);
}

Running the latter produces on x86-64:

tm_isdst (from main program): 1
daylight (from main program): 1
tm_isdst (from other program): 1
daylight (from other program): 0

Such issues will be extremely difficult to debug.
We will not support this because it breaks ABI. Please use the LD_AUDIT mechanism or a modified loader that directly performs the steps you require instead.
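For reference, a minimal sketch of the LD_AUDIT mechanism suggested above, using the rtld-audit interface declared in <link.h>. The message text and the choice to return LAV_CURRENT are illustrative; this is not a replacement for the requested feature and does not address the ordering concern raised earlier in the thread.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdint.h>
#include <stdio.h>

/* Called once by the dynamic linker to negotiate the audit interface
   version.  Returning LAV_CURRENT accepts the version this library was
   compiled against.  */
unsigned int
la_version (unsigned int version)
{
  (void) version;
  return LAV_CURRENT;
}

/* Called for each object the dynamic linker loads.  An empty l_name
   normally denotes the main program.  */
unsigned int
la_objopen (struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
  (void) lmid;
  (void) cookie;
  const char *name = (map->l_name != NULL && map->l_name[0] != '\0')
                     ? map->l_name : "(main program)";
  fprintf (stderr, "audit: loaded %s\n", name);
  /* Ask to be told about symbol bindings to and from this object.  */
  return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}
```

Assuming the file is saved as audit.c (a name chosen here for illustration), it would typically be built with `gcc -shared -fPIC -o audit.so audit.c` and activated with `LD_AUDIT=./audit.so ./hello32`.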