Summary: | _nptl_setxid can loop forever if a dlmopen namespace tries to initialise pthreads after the main namespace does | ||
---|---|---|---|
Product: | glibc | Reporter: | Vivek Das Mohapatra <vivek> |
Component: | dynamic-link | Assignee: | Florian Weimer <fweimer> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | adhemerval.zanella, carlos, fweimer |
Priority: | P2 | Flags: | fweimer:
security-
|
Version: | 2.24 | ||
Target Milestone: | 2.34 | ||
Host: | Target: | ||
Build: | Last reconfirmed: | ||
Attachments: |
Test case - build two executables. One uses dlmopen and triggers the lock up, the other uses dlopen and does not.
Test case - build two executables. One uses dlmopen and triggers the lock up, the other uses dlopen and does not. Test case - loop-forever has several modes of operation which can trigger (or not) the deadlock |
Description
Vivek Das Mohapatra
2018-01-24 21:45:26 UTC
dlmopen needs fixing to keep the implementation in a consistent state with regards to threads, and that probably means keeping the implementation set of libraries the same across all dlmopen namespaces e.g. the same ld.so, libc.so, libpthread.so, etc. It is not libc, but libpthread, which contains __stack_user, but your analysis is essentially correct. All data which is directly related to the TCB needs to be shared across dlmopen boundaries, but we currently do not implement that. Blanket sharing of all of libc and its related libraries is not what everyone wants. LD_AUDIT modules tend to rely on a separate libc, for instance. My initial plan is to implement, if I can, the RTLD_UNIQUE flag I proposed which would make the targeted namespace re-use the mapping already in the main link map - I think that's got to be the basis for whatever the final behaviour and semantics turn out to be. Before I sink a lot of time into this, does that sound reasonable? (In reply to Vivek Das Mohapatra from comment #3) > My initial plan is to implement, if I can, the RTLD_UNIQUE flag I proposed > which would make the targeted namespace re-use the mapping already in the > main link map - I think that's got to be the basis for whatever the final > behaviour and semantics turn out to be. > > Before I sink a lot of time into this, does that sound reasonable? No, and I wrote it out in a wiki page with some notes about why. We should try to flesh out partial isolation first, but must keep full isolation working for LD_AUDIT. https://sourceware.org/glibc/wiki/LinkerNamespaces (In reply to Carlos O'Donell from comment #4) > We should try to flesh out partial isolation first, but must keep full > isolation working for LD_AUDIT. > > https://sourceware.org/glibc/wiki/LinkerNamespaces I take your point(s) about the limitations of RTLD_UNIQUE - my thought was that to start with I could use it for libc/libpthread (by using the flag explicitly in libcapsule) which don't have further dependencies - everything else I need to share is/would be handled by creating a shim library which managed data sharing at a higher level. The default behaviour would still be total isolation, as happens now. (In reply to Vivek Das Mohapatra from comment #5) > (In reply to Carlos O'Donell from comment #4) > > We should try to flesh out partial isolation first, but must keep full > > isolation working for LD_AUDIT. > > > > https://sourceware.org/glibc/wiki/LinkerNamespaces > > I take your point(s) about the limitations of RTLD_UNIQUE - my thought was > that to start with I could use it for libc/libpthread (by using the flag > explicitly in libcapsule) which don't have further dependencies - everything > else I need to share is/would be handled by creating a shim library which > managed data sharing at a higher level. > > The default behaviour would still be total isolation, as happens now. Consider the case where a new library in a new namespace dlopen's libpthread to create new threads. This dlopen will not be made with RTLD_UNIQUE and these new threads will be unknown to the base namespace. Instead the base namespace should load libpthread and make it available in the non-base namespace. This way the base namespace is aware that there are threads now, and that seteuid must iterate over them properly. I expect we need a SONAME list and all load's from that SONAME list must first happen in the base namespace, and then be linked into the non-base namespace. The SONAME list is minimally libc.so.6, libpthread.so.0, libdl.so.2, libgcc_s.so.1 (from gcc for unwinding), with maybe libm.so.6 for future proofing, basically anything that constitutes the implementation. This filtered loading should be turned off for total isolation namespaces required for LD_AUDIT using an internal dlopen flag. Looking into this approach: Will probably be a little while before I have something concrete. I will make sure not to trample on LD_AUDIT. So, staring at this code for a while - will the following approach to cloning an object into a secondary link map work: - allocate a new entry in the same way as _dl_new_object - memcpy the contents of the old entry to the new - set the next and previous pointers to NULL in the new struct - hook the next and prev pointers up to the new link map list The existing clone operation for ld.so seems to be a special case which pokes the real pointer into l_real in a freshly allocated struct, but doesn't care much about the details as the linker can't depend on anything and nothing relocates symbols from it (I think?) [I realise I haven't addressed the dont-apply-uniqueness-to-audit requirement yet - will look at that once I know how to make it work at all] Never mind, figured it out. Have a working proof of concept shared-link-map-entry implementation, now to make it audit-safe. Created attachment 10935 [details]
Test case - build two executables. One uses dlmopen and triggers the lock up, the other uses dlopen and does not.
usage: ./loop-forever
You may set any or all of the following env vars:
BLOCK=0 ./loop-forever # don't try to trigger the deadlock
SHARE=0 ./loop-forever # don't try to use RTLD_SHARED
DEBUG=1 ./loop-forever # print extra debug messages
RFC proof-of-concept implementation sent to list at: https://sourceware.org/ml/libc-help/2018-04/threads.html#00004 I would advise you to post it on libc-alpha [1], since it is the usual place for patch discussion. [1] https://sourceware.org/ml/libc-alpha/ Sent to libc-alpha. Created attachment 11058 [details]
Test case - loop-forever has several modes of operation which can trigger (or not) the deadlock
usage: ./loop-forever
You may set any or all of the following env vars:
BLOCK=0 ./loop-forever # don't try to trigger the deadlock
SHARE=0 ./loop-forever # don't try to use RTLD_SHARED
DEBUG=1 ./loop-forever # print extra debug messages
I believe we fixed this in glibc 2.34 with this commit: commit 90d7e7e5bd3b0683a27c658388b6515ce950c78e Author: Florian Weimer <fweimer@redhat.com> Date: Wed Apr 21 19:49:51 2021 +0200 elf: Introduce __tls_init_tp for second-phase TCB initialization TLS_INIT_TP is processor-specific, so it is not a good place to put thread library initialization code (it would have to be repeated for all CPUs). Introduce __tls_init_tp as a separate function, to be called immediately after TLS_INIT_TP. Move the existing stack list setup code for NPTL to this function. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> It depends on GL (dl_stack_user), which was introduced by: commit 1daccf403b1bd86370eb94edca794dc106d02039 Author: Florian Weimer <fweimer@redhat.com> Date: Mon Nov 16 19:33:30 2020 +0100 nptl: Move stack list variables into _rtld_global Now __thread_gscope_wait (the function behind THREAD_GSCOPE_WAIT, formerly __wait_lookup_done) can be implemented directly in ld.so, eliminating the unprotected GL (dl_wait_lookup_done) function pointer. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Plus some other changes that eventually eliminate late libpthread initialization. With late initialization completely gone, dlmopen or static dlmopen no longer clobbers the TCB. |