Bug 12561

Summary: ld.so: dlclose() can remove required local scope elements of NODELETE linkmaps
Product: glibc
Component: dynamic-link
Version: 2.13
Status: RESOLVED FIXED
Severity: normal
Priority: P2
Target Milestone: ---
Reporter: Petr Baudis <pasky>
Assignee: Ulrich Drepper <drepper.fsp>
CC: aj, ismail, ldv, matz
Flags: fweimer: security-
Attachments: proposed patch, incl. testcase
             better testcase

Description Petr Baudis 2011-03-10 02:19:54 UTC
Created attachment 5283
proposed patch, incl. testcase

When a library is opened with RTLD_LOCAL, dlclose()ing that library
unconditionally removes the local scope from all subsequently loaded
libraries, even if such a library is marked as NODELETE and therefore
stays loaded. This causes subsequent lookups within that library to
fail if the library depends on libraries other than those already
loaded within the global scope.

This has been exposed in a real-world case where libproxy opens
a KDE4 plugin with RTLD_LOCAL, the plugin depends on libkde4_core,
and libkde4_core is marked as NODELETE due to having a STB_GNU_UNIQUE
symbol; the plugin is dlclose()d later, but ld.so raises a fatal
error when libkde4_core's global destructor is called (it depends
on libqt4, but libqt4 has been in the plugin's local scope only
and is gone now).
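
For illustration, the calling pattern described above can be sketched as follows. This is only a minimal sketch of the caller's side, not the attached testcase: the helper name load_and_unload_plugin and the idea of passing the plugin path as a parameter are assumptions made for the example. It uses only the standard dlopen()/dlclose()/dlerror() calls.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper (not from the report): open a plugin in a private
   RTLD_LOCAL scope and close it again, as libproxy is described as doing.
   Returns 0 on success, -1 if either call fails. */
int load_and_unload_plugin(const char *plugin_path)
{
    void *handle = dlopen(plugin_path, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* At this point the plugin's dependencies are reachable only through
       the plugin's local scope, not through the global scope. */

    if (dlclose(handle) != 0) {
        fprintf(stderr, "dlclose failed: %s\n", dlerror());
        return -1;
    }

    /* If one of the dependencies was NODELETE, it is still mapped here,
       and per this report its later symbol lookups may fail on an
       unpatched glibc. */
    return 0;
}
```

The helper itself succeeds; according to the description, the failure only shows up later, when the NODELETE dependency's destructor runs and tries to resolve a symbol that lived in the removed local scope.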
Comment 1 Petr Baudis 2011-03-22 01:08:42 UTC
The patch takes an approach that is probably too conservative. Some common shared objects, such as libpthread, are marked NODELETE, and any RTLD_LOCAL-opened .so that depends on one of them will be held in memory forever. The real solution should be to rebuild the scope for the NODELETE object.
Comment 2 Ulrich Drepper 2011-04-10 20:02:22 UTC
That test case doesn't show any bug.  If you use the handle for a dlopen'ed object to look up an object, then close the object, and finally use the returned function, it is bound to fail.  Whether the symbol has been found in a different object doesn't matter.

You have to provide a valid test case.
Comment 3 Michael Matz 2011-05-25 15:15:55 UTC
Created attachment 5749
better testcase

Indeed, this is a better testcase really reflecting what the proxy library
does.  The important part is that there need to be two (independent)
libraries loaded, that one dependency of one of them is nodelete, and that
it has a finalizer that needs to look up something in its own dependencies
that wasn't available before.  Then with unpatched glibc:

# make
# /tmp/mm/lib64/ld-linux-x86-64.so.2 --library-path /tmp/mm/lib64/ ./app
./app: symbol lookup error: /suse/matz/src/nodeletebug/lib2.so: undefined symbol: in_lib3

with patched glibc:
# ./app
#
Comment 4 Andreas Jaeger 2011-05-25 19:29:24 UTC
Thanks for the testcase, changing status now.
Comment 5 Dmitry V. Levin 2012-10-23 17:34:20 UTC
Is this bug still reproducible on glibc >= 2.15?

I've failed to reproduce it with 2.16+, and I suppose commit
http://sourceware.org/git/?p=glibc.git;a=commitdiff;h=glibc-2.14-208-g39dd69d
has something to do with it, because reverting it reintroduces the bug.
Comment 6 Michael Matz 2012-10-25 14:07:57 UTC
I really can't make up my mind right now if Andreas' patch is a fix for this
issue, or just hides it.  The testcase here needed NODELETE libraries to force
some deps to stay around.  Andreas' patch has this in it:

+       * elf/dl-close.c (_dl_close_worker): Reset private search list if
+       it wasn't used.
...
+         else if (new_list != NULL)
+           {
+             /* We didn't change the scope array, so reset the search
+                list.  */
+             imap->l_searchlist.r_list = NULL;
+             imap->l_searchlist.r_nlist = 0;

So, what happens if we _have_ changed the scope array, or used the
private search list?  In other words, could the testcase from this report
be extended to make this happen and retrigger the bug, or is it fixed for
good?
Comment 7 Andreas Schwab 2012-10-25 17:14:29 UTC
Both testcases (the attached one and the unload8 test) are equivalent: in the attached testcase lib2 cannot be unloaded due to NODELETE, in the unload8 testcase unload8mod2 cannot be unloaded due to the dlopen dependency from unload8mod3.  Thus they trigger the same bug.
Comment 8 Andreas Schwab 2012-11-28 10:45:42 UTC
Should be fixed in 2.15.