This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: A per-user or per-application ld.so.cache?


I've been talking to the HPC tools and system guys and, to my surprise, they favor Florian's approach, which is to change glibc's ld.so to cache the full contents of the directories it visits in the process of finding a library. Subsequent lookups would first look in this cache before looking in subsequent directories in the library search paths.

I presented to them both approaches and what I saw to be the advantages and disadvantages of both:

	The first approach is to add code to ld.so which reads the entire directory for each library search path 
	the first time that it visits it and stores the listing in a cache in memory. Then, instead of revisiting the 
	directories when it searches for the next library, it first searches the cached contents 
	of the directories. If it finds the file, it tries to open it there first; if that doesn't work, it drops 
	that cache entry and falls through to the normal library loading behavior.
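
A minimal sketch of the first approach, written in Python rather than ld.so's C purely to show the control flow. The class `DirCache` and its method names are hypothetical and do not exist in glibc; `os.path.isfile` stands in for the real attempt to open and map the object.

```python
import os


class DirCache:
    """Hypothetical in-memory cache of directory listings for a search path."""

    def __init__(self, search_path):
        self.search_path = list(search_path)
        self.listings = {}  # directory -> set of file names seen there

    def _listing(self, directory):
        # Read the whole directory only the first time it is visited.
        if directory not in self.listings:
            try:
                self.listings[directory] = set(os.listdir(directory))
            except OSError:
                self.listings[directory] = set()
        return self.listings[directory]

    def _probe(self, name):
        # Stand-in for the normal, uncached search: try every directory.
        for directory in self.search_path:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                return candidate
        return None

    def find(self, name):
        """Consult the cached listings first, then fall back to probing."""
        for directory in self.search_path:
            names = self._listing(directory)
            if name in names:
                candidate = os.path.join(directory, name)
                if os.path.isfile(candidate):
                    return candidate
                # Stale entry: drop it and fall through to the normal search.
                names.discard(name)
                break
        return self._probe(name)

    def drop(self):
        """Release the cached listings once loading is finished."""
        self.listings.clear()
```

In this sketch a complete cache miss also falls through to the uncached search; that is an assumption made for safety, not something the proposal specifies. `drop()` corresponds to the interface mentioned under the disadvantages for releasing the memory after the application has loaded its objects.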

	The advantages of this approach are:
	1) it wouldn't require any user retraining <- This turned out to be very important to them
	2) because the caches are built in memory each and every time and not stored on disk, there would not be a
	problem with cache files being out of date.

	The disadvantages are:
	1) It would consume memory for these caches. The developer advocating this said that he would be 
	willing to add an interface to drop these caches later when the app has completed its loading of objects.
	2) Every single compute node would still need to read every directory in the library search paths 
	once. <- This is one of the biggest downsides for HPC applications. It was this fact that led me to believe
	that they would prefer the second approach.
	3) If there are a lot of files in the directories other than the libraries being used the amount of memory
	being used for this cache could be notable. Notable still being measured in KB vs. MB though.
	4) It creates a second caching system parallel to the one in ld.so.cache
	5) Users would have to explicitly make code changes to drop the in-memory caches

	-------------

	The second approach is to make ldconfig a command that a normal non-root user could run. It would then 
	build a ld.so.cache file either for the user or for a specific application. Then all the nodes load this cache 
	file and know exactly where to find their libraries. They wouldn't even have to read the directories unless 
	the cache file is out of date. 
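
A rough sketch of the second approach, again in Python for illustration only. Everything here is an assumption: the real ld.so.cache is a binary format written by ldconfig, whereas this uses JSON as a stand-in, and the function names and the directory-mtime staleness check are hypothetical choices, not anything ldconfig does today.

```python
import json
import os


def build_cache(search_path):
    """Scan each directory once and record the first location of every name."""
    libs, mtimes = {}, {}
    for directory in search_path:
        try:
            names = sorted(os.listdir(directory))
            mtimes[directory] = os.stat(directory).st_mtime_ns
        except OSError:
            continue  # skip missing or unreadable directories
        for name in names:
            # Earlier directories win, matching search-path order.
            libs.setdefault(name, os.path.join(directory, name))
    return {"libs": libs, "mtimes": mtimes}


def save_cache(cache, cache_file):
    """Write the cache once, e.g. from the job launch script."""
    with open(cache_file, "w") as f:
        json.dump(cache, f)


def load_cache(cache_file):
    """Load the shared cache file on a compute node."""
    with open(cache_file) as f:
        return json.load(f)


def is_stale(cache):
    """True if any scanned directory changed or vanished since the build."""
    for directory, mtime in cache["mtimes"].items():
        try:
            if os.stat(directory).st_mtime_ns != mtime:
                return True
        except OSError:
            return True
    return False


def lookup(cache, name):
    """Return the recorded path for a library name, or None if unknown."""
    return cache["libs"].get(name)
```

A launch script could call `build_cache(os.environ.get("LD_LIBRARY_PATH", "").split(":"))` once and save the result where all nodes can read it. Note that the staleness check in this sketch still costs one stat per directory on every node, which is cheaper than reading the directories but not free.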

	Advantages:
	1) could be run before the job once for all the nodes. 
	2) the same cache could be loaded by all compute nodes
	3) no directory reading operations needed at all unless the cache file is out of date
	4) the cache file could persist between runs

	Disadvantages
	1) some user training required
	2) the cache file could be out of date or not match the OS version or architecture. This would basically 
	not happen in our environment. Especially if users put ldconfig in their job launch script. <- This was a big 
	issue in their mind. I argued that rebuilding the cache for a particular application would only take a few
	seconds and it could easily be added to a startup script. My impression is that their notion of this 
	approach may have been biased by viewing the cache as something semi-permanent as opposed to 
	something more ephemeral that could be quickly recreated.
	3) the code that loads the cache file would need to be substantially hardened to make sure it couldn't 
	be abused.
	4) They also brought up that there are cases where the paths seen on the compute nodes are different
	from the paths seen on the login nodes, and in this case pre-computing a ldcache would be difficult. I do 
	not see this as unresolvable as long as the user ldconfig also honors LD_LIBRARY_PATH when
	generating a ldcache for a particular application.

One mistake that I did make in this presentation is that I unintentionally framed it as an either-or choice ("which one of these would you prefer?") rather than even considering the possibility of implementing both approaches.

-ben


> On Feb 8, 2016, at 11:44 PM, Carlos O'Donell <carlos@redhat.com> wrote:
> 
> On 02/09/2016 01:57 AM, Florian Weimer wrote:
>> On 02/08/2016 09:36 PM, Carlos O'Donell wrote:
>> 
>>> Would you mind expanding on what you would find difficult? Words like better
>>> or worse, in a technical context, need explicit descriptions of what is
>>> better and what is worse.
>> 
>> I assume you want to keep a single cache file, right?
> 
> I had not considered otherwise, but Mike's suggestion of a LD_LIBRARY_CACHE
> which lists multiple files has its own appeal.
> 
>> If I understand the current situation correctly, the system cache is not
>> just an optimization, it is also used to effectively extend the search
>> path because otherwise, ld.so would not load libraries from
>> /usr/lib64/atlas, for example.  (I have a file
>> /etc/ld.so.conf.d/atlas-x86_64.conf which lists the directory
>> /usr/lib64/atlas.)
> 
> Yes.
> 
>> I think this means that if you do not update cache, but install new
>> system DSO versions, you might no longer be able to find all DSOs.
>> Users would need some way to know when to update their caches.
> 
> System DSOs are part of /etc/ld.so.cache, and while users might use
> their own personal cache to load system DSOs from system directories,
> it is not recommended because the user doesn't know when those files
> get updated. It's possible, but not recommended, and one should let
> /etc/ld.so.cache handle that, and the sysadmin will update that cache 
> (or package installs will).
> 
> With that out of the way, the user is responsible for caching anything
> they have access to change.
> 
>> Or we'd have to do that as part of ld.so, but that doesn't seem to be
>> particularly attractive because of the limited facilities at that point
>> of process life.  This is why I asked if the loading is triggered only
>> after user code has run.
> 
> Right, it happens very early.
> 
>>> The user would have to run 'ldconfig', and perhaps by default we update the
>>> user cache and skip updating the global cache if the user lacks the permissions
>>> to do so. Not that different from what we do today with Fedora/RHEL spec files
>>> when libraries are installed.
>> 
>> Yes, and I'm worried that keeping the cache in sync could be too confusing.
> 
> Then don't update the cache? Instead make the cache always work.
> 
> For example if you had a user/application cache that was relative to $HOME
> or $ORIGIN (dynamic string token), then it needs no updates and is relocatable?
> 
> If you want to accelerate your application you would use ldconfig to create
> a path relative cache file, and then set LD_LIBRARY_CACHE to that cache
> file, and when you start your ld.so it loads that cache.
> 
> Application developers could ship the cache file with the application and
> use a wrapper script to set the env var (like any other required env var
> for the application).
> 
> This has the added benefit of being able to accelerate RPATH lookups using
> the same strategy.
> 
> The whole plan certainly needs some more thought.
> 
> Cheers,
> Carlos.
> 

