Created attachment 5283
proposed patch, incl. testcase
In case a library is opened with RTLD_LOCAL, dlclose()ing that library
will remove the local scope from all subsequently loaded libraries
unconditionally, even if such a library is marked as RTLD_NODELETE.
This causes subsequent lookups within that library to fail if the
library depends on other libraries than those already loaded within
the global scope.
This has been exposed in a real-world case: libproxy opens
a KDE4 plugin with RTLD_LOCAL, the plugin depends on libkde4_core,
and libkde4_core is marked as NODELETE because it has an STB_GNU_UNIQUE
symbol. The plugin is dlclose()d later, but ld.so raises a fatal
error when libkde4_core's global destructor is called (it depends
on libqt4, but libqt4 was only in the plugin's local scope
and is gone now).
The patch takes an approach that is probably too conservative. Some common shared objects, such as libpthread, are marked NODELETE, and any RTLD_LOCAL-opened object that depends on one of them will be held in memory forever. The real solution should be to rebuild the scope for the NODELETE object.
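To make the "held in memory forever" concern observable, here is a minimal sketch (not part of either attachment) that checks whether an object is still resident after dlclose(). The helper name stays_resident_after_close and its parameters are hypothetical; it relies only on the documented behaviour of RTLD_NOLOAD, which returns a handle only when the object is already loaded.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `path` is still mapped after
   dlclose(), 0 if it was unloaded, and -1 if it could not be opened. */
int stays_resident_after_close(const char *path, int extra_flags)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL | extra_flags);
    if (handle == NULL)
        return -1;
    dlclose(handle);

    /* RTLD_NOLOAD never loads anything: it only hands back a handle
       if the object is already resident in the process. */
    void *probe = dlopen(path, RTLD_NOW | RTLD_NOLOAD);
    if (probe == NULL)
        return 0;
    dlclose(probe);   /* drop the reference the probe just took */
    return 1;
}
```

Passing RTLD_NODELETE as extra_flags should make this return 1 for a library that would otherwise be unloaded. It can also return 1 with extra_flags set to 0, for example when the object was linked with -z nodelete, defines an STB_GNU_UNIQUE symbol, or is still referenced from elsewhere in the process.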
That test case doesn't show any bug. If you use the handle for a dlopen'ed object to look up a symbol, then close the object, and finally use the returned function, it is bound to fail. Whether the symbol has been found in a different object doesn't matter.
You have to provide a valid test case.
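For reference, the ordering that this objection treats as valid (look up, call, and only then close) might look like the sketch below. The helper name call_then_close and the int (*)(void) signature are assumptions for illustration and are not taken from the attached testcase.

```c
#include <dlfcn.h>
#include <stddef.h>

typedef int (*int_fn)(void);

/* Hypothetical helper: opens `path`, looks up `symbol`, calls it, and
   only then closes the handle. Stores the call's result in *result.
   Returns 0 on success, -1 if the library or symbol was not found. */
int call_then_close(const char *path, const char *symbol, int *result)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        return -1;

    /* Assign through a void ** to avoid a direct cast from void * to a
       function pointer, which ISO C does not define. */
    int_fn fn;
    *(void **)(&fn) = dlsym(handle, symbol);
    if (fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *result = fn();    /* valid: the object is still open here */
    dlclose(handle);   /* after this, fn must not be called again */
    return 0;
}
```

Calling fn after the dlclose() would be the pattern described as bound to fail, regardless of which object the symbol was actually found in.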
Created attachment 5749
Indeed, this is a better testcase, really reflecting what the proxy library
does. The important part is that there need to be two (independent)
libraries loaded, that one dependency of one of them is nodelete, and that it
has a finalizer that needs to look up something in its own dependencies that
wasn't available before. Then with unpatched glibc:
# /tmp/mm/lib64/ld-linux-x86-64.so.2 --library-path /tmp/mm/lib64/ ./app
./app: symbol lookup error: /suse/matz/src/nodeletebug/lib2.so: undefined symbol: in_lib3
with patched glibc:
thanks for the testcase, changing status now.
Is this bug still reproducible on glibc >= 2.15?
I've failed to reproduce it with 2.16+, and I suppose commit
has something to do with it, because reverting it reintroduces the bug.
I really can't make up my mind right now if Andreas' patch is a fix for this
issue, or just hides it. The testcase here needed NODELETE libraries to force
some deps to stay around. Andreas' patch has this in it:
+ * elf/dl-close.c (_dl_close_worker): Reset private search list if
+ it wasn't used.
+ else if (new_list != NULL)
+ /* We didn't change the scope array, so reset the search
+ list. */
+ imap->l_searchlist.r_list = NULL;
+ imap->l_searchlist.r_nlist = 0;
So, what happens if we _have_ changed the scope array, or used the
private search list? In other words, could the testcase from this report
be extended to make this happen and retrigger the bug, or is it fixed for
Both testcases (the attached one and the unload8 test) are equivalent: in the attached testcase lib2 cannot be unloaded due to NODELETE, in the unload8 testcase unload8mod2 cannot be unloaded due to the dlopen dependency from unload8mod3. Thus they trigger the same bug.
Should be fixed in 2.15.