This is the mail archive of the
mailing list for the glibc project.
Re: A per-user or per-application ld.so.cache?
- From: "Carlos O'Donell" <carlos at redhat dot com>
- To: Florian Weimer <fweimer at redhat dot com>
- Cc: libc-alpha at sourceware dot org, Ben Woodard <woodard at redhat dot com>
- Date: Mon, 8 Feb 2016 15:19:44 -0500
- Subject: Re: A per-user or per-application ld.so.cache?
- References: <56B8E105 dot 8030906 at redhat dot com> <56B8E810 dot 1040609 at redhat dot com>
On 02/08/2016 02:10 PM, Florian Weimer wrote:
> On 02/08/2016 07:40 PM, Carlos O'Donell wrote:
>> Under what conditions might it make sense to implement
>> a per-user ld.so.cache?
>> At Red Hat we have some customers, particularly in HPC,
>> which deploy quite large applications across systems that
>> they don't themselves maintain. In this case the given
>> application could have thousands of DSOs. When you load
>> up such an application the normal search paths apply
>> and that's not very optimal.
> Are these processes short-lived?
> Is symbol lookup performance an issue as well?
Yes. So are the various O(n^2) algorithms we need to fix
inside the loader, particularly the DSO sorts we use.
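To illustrate the sorting concern only: ordering DSOs by dependency does not have to be quadratic. The sketch below is not the loader's code. It is a generic depth-first topological sort, linear in the number of DSOs plus dependency edges, and the name `sort_dsos` and the `deps` dictionary are hypothetical stand-ins for the loader's own link-map structures.

```python
def sort_dsos(deps):
    """Return DSO names so that each DSO precedes the DSOs it depends on.

    deps maps a DSO name to the list of names it needs (hypothetical
    input; the real loader walks its own internal structures).
    """
    order = []   # filled in post-order, reversed at the end
    state = {}   # name -> 1 while on the DFS stack, 2 once finished

    for root in deps:
        if root in state:
            continue
        state[root] = 1
        stack = [(root, iter(deps.get(root, ())))]
        while stack:
            name, children = stack[-1]
            for child in children:
                if child not in state:
                    state[child] = 1
                    stack.append((child, iter(deps.get(child, ()))))
                    break
                # A child that is already visited (or still on the stack,
                # which means a dependency cycle) is simply skipped.
            else:
                # All children handled: this DSO is finished.
                state[name] = 2
                order.append(name)
                stack.pop()

    order.reverse()
    return order
```

Each DSO and each dependency edge is visited once, so the cost grows linearly with the size of the dependency graph rather than with the square of the DSO count.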
> What's the total size of all relevant DSOs, combined? What does the
> directory structure look like?
I don't know. We should ask Ben Woodard to get us that data.
> Which ELF dynamic linking features are used?
I don't know.
> Is the bulk of those DSOs pulled in with dlopen, after the initial
> dynamic link? If yes, does this happen directly (many DSOs dlopen'ed
> individually) or indirectly (few of them pull in a huge cascade of
I do not believe the bulk of the DSOs are pulled in with dlopen.
Though for Python code it might be the reverse, with each
Python module being a DSO that is loaded by the interpreter.
Which means we probably have two cases:
* Long chains of DSOs (non-python applications)
* Short single DSO chains, but lots of them (python modules).
> If the processes are not short-lived and most of the DSOs are loaded
> after user code has started executing, I doubt an on-disk cache is the
> right solution.
Why would a long-lived process that uses dlopen fail to benefit from an
on-disk cache? The on-disk cache, as it is today, is already used for a
similar situation, so why not extend it? The biggest difference is that
we trust the cache we have today and mmap it into memory. We would have to
harden the code that processes that cache, but it should not be that
Would you mind expanding on your concern that the solution would not work?
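
To make the idea concrete, here is a minimal sketch of a per-user lookup cache that is treated as a hint rather than as trusted data. It only models the idea: the real ld.so.cache is a binary file that the loader maps into memory, and none of the names below (`build_cache`, `search_path`, `lookup`) exist in glibc. The validation rules are likewise an assumption about what "hardening" could mean, not a proposal for the actual format.

```python
import os

def build_cache(directories):
    """Map each file name to the first directory in the list that has it."""
    cache = {}
    for directory in directories:
        try:
            names = os.listdir(directory)
        except OSError:
            continue  # missing or unreadable directory: nothing to record
        for name in sorted(names):
            cache.setdefault(name, os.path.join(directory, name))
    return cache

def search_path(soname, directories):
    """Uncached lookup: probe every directory in order."""
    for directory in directories:
        candidate = os.path.join(directory, soname)
        if os.path.isfile(candidate):
            return candidate
    return None

def lookup(soname, cache, directories):
    """Cached lookup that validates the entry before using it."""
    path = cache.get(soname)
    if path is not None:
        in_search_path = os.path.dirname(path) in directories
        name_matches = os.path.basename(path) == soname
        if in_search_path and name_matches and os.path.isfile(path):
            return path
    # Stale, tampered, or missing entry: fall back to the slow search.
    return search_path(soname, directories)
```

With a valid cache, a lookup costs one dictionary probe and one file check instead of one probe per search directory. A bad entry costs no more than the uncached search would have.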