Bug 11431 - consider madvise for ld.so
Summary: consider madvise for ld.so
Status: SUSPENDED
Alias: None
Product: glibc
Classification: Unclassified
Component: dynamic-link
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Ulrich Drepper
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-03-25 14:33 UTC by Frank Ch. Eigler
Modified: 2014-06-30 18:24 UTC
CC List: 7 users

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Description Frank Ch. Eigler 2010-03-25 14:33:24 UTC
Over the years, there have been several suggestions that
ld.so use madvise(...WILLNEED) after mmapping a DT_NEEDED
shared library to encourage more aggressive prefetching,
which has been measured to help with certain large libraries
such as Firefox's.  SUSE carries a glibc patch for this, and
other projects have hacks in them to approximate this policy.

After a brief bugzilla/google search, I haven't heard any
official disposition of this heuristic.  If it is known to be
a bad idea, perhaps we can record the reasons for that here.
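
As a rough illustration of the policy being discussed, the sketch below maps a file read-only and then issues madvise(MADV_WILLNEED) over the whole mapping. It is not ld.so code: the helper name map_with_willneed is made up for this example, and ld.so maps individual segments rather than the whole file.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes madvise() under strict -std= modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map `path` read-only and hint that the whole
   mapping will be needed soon.  Returns the mapping (its length in
   *len_out) or MAP_FAILED. */
static void *map_with_willneed(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return MAP_FAILED;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return MAP_FAILED;
    }

    size_t len = (size_t)st.st_size;
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                       /* the mapping keeps the file referenced */
    if (p == MAP_FAILED)
        return MAP_FAILED;

    /* Advisory only: a failure here is not fatal, so it is ignored. */
    (void)madvise(p, len, MADV_WILLNEED);

    *len_out = len;
    return p;
}
```

The caller is expected to munmap() the returned region when done.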
Comment 1 Ulrich Drepper 2010-04-04 02:27:25 UTC
And exactly how should madvise be used?

Take into account that there are a large number of DSOs and that they are
sometimes used.  Most of the time only a small fraction of any DSO is used in a
program.  Using madvise to get the DSO loaded will likely waste a lot of resources.

I've been thinking about adding a special flag to system libraries so that at
least they can be handled this way.  But it is a bad idea to do this in general.

If you want to convince me try using a large number of programs using lots of
DSOs in an OS environment with limited memory.
Comment 2 Frank Ch. Eigler 2010-04-06 20:36:06 UTC
Can you suggest some statistics we can gather in order to inform
heuristics within ld.so to trigger a madvise(MADV_WILLNEED)
(which itself is a heuristic for the kernel)?
Comment 3 Ulrich Drepper 2010-04-06 20:54:30 UTC
(In reply to comment #2)
> Can you suggest some statistics we can gather in order to inform
> heuristics within ld.so to trigger a madvise(MADV_WILLNEED)
> (which itself is a heuristic for the kernel)?

I don't think heuristics will work in ld.so.

It's pretty much an all or nothing selection.  For each use the madvise call is
useful or not.  This should be a setting in a file, perhaps it can be changed
after the file has been created (similar to the execstack setting).  Perhaps ELF
flags (e_flags).  Otherwise an entry in the program header.

Where heuristics may come in is in monitoring the system (using systemtap or so)
and looking at the DSO use in apps.  I.e., when the app ends, look at how many pages of
the DSO have been used over the lifetime of the app.  If a certain threshold is
reached the flag could be set.
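
One cheap approximation of this measurement, sketched here under stated assumptions, is to ask mincore() how many pages of a mapping are resident and compare that fraction against a threshold. The function names are hypothetical, and residency is only a proxy: for file-backed pages it reflects the shared page cache (including pages brought in by readahead), not strictly what this one application touched, so a tracing tool such as systemtap would give more faithful numbers.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes mincore() under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: fraction of the pages in [addr, addr + len) that
   are currently resident, or -1.0 on error.  addr must be page-aligned. */
static double resident_fraction(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);

    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return -1.0;

    size_t pages = (len + (size_t)ps - 1) / (size_t)ps;
    unsigned char *vec = malloc(pages);
    if (vec == NULL)
        return -1.0;

    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1.0;
    }

    size_t resident = 0;
    for (size_t i = 0; i < pages; i++)
        resident += vec[i] & 1;      /* low bit set means "resident" */

    free(vec);
    return (double)resident / (double)pages;
}

/* Decide whether the readahead flag would be worth setting. */
static int usage_reaches_threshold(double fraction, double threshold)
{
    return fraction >= threshold;
}
```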

Another piece of information would be how much of each segment (text, data) is
used.  This could then help lowering the threshold for using the flag if the
commonly used code/data is first in the segment.  This would then not be a
simple flag but instead each PT_LOAD entry in the program header could have a
corresponding PT_READAHEAD entry or so.  By default it could be of zero size and
using profiling tools the ideal size can be determined.
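
To make the per-segment idea concrete, here is a sketch of how a loader might turn such a readahead size into a page-aligned range for one PT_LOAD segment. PT_READAHEAD does not exist in the ELF specification or in glibc; readahead_bytes stands in for whatever such an entry would carry, and struct advise_range and readahead_range are names invented for this example.

```c
#include <assert.h>
#include <link.h>      /* ElfW(), and PT_LOAD via <elf.h> */
#include <stddef.h>
#include <stdint.h>

struct advise_range {
    uintptr_t start;   /* page-aligned start address */
    size_t len;        /* 0 means "do not advise" */
};

/* Compute the range to pass to madvise() for the first readahead_bytes
   of a PT_LOAD segment.  readahead_bytes is the hypothetical value from
   a PT_READAHEAD-style entry; 0 (the default) disables readahead. */
static struct advise_range
readahead_range(uintptr_t load_bias, const ElfW(Phdr) *load,
                size_t readahead_bytes, size_t page_size)
{
    assert(load != NULL && load->p_type == PT_LOAD);
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    struct advise_range r = { 0, 0 };

    /* Only the file-backed part of the segment can be read ahead. */
    size_t want = readahead_bytes;
    if (want > load->p_filesz)
        want = (size_t)load->p_filesz;
    if (want == 0)
        return r;

    uintptr_t mask = ~(uintptr_t)(page_size - 1);
    uintptr_t begin = load_bias + (uintptr_t)load->p_vaddr;
    uintptr_t end = (begin + want + page_size - 1) & mask;  /* round up */

    r.start = begin & mask;                                 /* round down */
    r.len = (size_t)(end - r.start);
    return r;
}
```

A non-empty result could then be handed to madvise((void *)r.start, r.len, MADV_WILLNEED) right after the segment is mapped.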

Comment 4 Taras Glek 2010-04-06 20:59:36 UTC
(In reply to comment #2)
> Can you suggest some statistics we can gather in order to inform
> heuristics within ld.so to trigger a madvise(MADV_WILLNEED)
> (which itself is a heuristic for the kernel)?
> 

I don't have an effective general heuristic. The problem is worst for programs
that use a large chunk of large libraries. Without modifying the compile-time
linker, the best heuristic I can come up with is to only WILLNEED executable
segments > 4 MB and data segments > 1 MB.
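
Spelled out as code, that size heuristic might look like the sketch below. The function name and the choice of p_filesz as the size being compared are assumptions made for this example; only the 4 MB and 1 MB cut-offs come from the text above.

```c
#include <assert.h>
#include <link.h>      /* ElfW(), and PT_LOAD / PF_X via <elf.h> */
#include <stdbool.h>
#include <stddef.h>

#define TEXT_THRESHOLD ((size_t)4 << 20)   /* 4 MB for executable segments */
#define DATA_THRESHOLD ((size_t)1 << 20)   /* 1 MB for data segments */

/* Hypothetical predicate: should this segment get MADV_WILLNEED? */
static bool segment_wants_willneed(const ElfW(Phdr) *ph)
{
    assert(ph != NULL);

    if (ph->p_type != PT_LOAD)
        return false;

    if (ph->p_flags & PF_X)
        return ph->p_filesz > TEXT_THRESHOLD;

    return ph->p_filesz > DATA_THRESHOLD;
}
```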

If we could add a compile-time ld flag to flag dependent libraries with
WILLNEED, that would be ideal for Firefox. That would take the guessing out of
ld.so and push it onto application devs who likely know exactly how the
libraries are meant to be used. Additionally libc could use an API to reset the
madvise flags to completely erase any concerns about keeping too many pages cached. 
One can already change the madvise hints by parsing /proc/<pid>/maps, but there
is no way to set madvise flags early on.
On the other hand, everyday developers should not be expected to understand
pagefaulting behavior of their apps, so the benefit would be limited.

Another sure-fire solution is to do something like the current prelink cronjob.
Except in this case the program would iterate /proc/<pid>/maps, figure out which
libraries are paging in multi-megabyte chunks and record that similar to the
prelink cache.


My concern is that I have yet to see any concrete evidence that madvise hints
actually hurt in low memory conditions. I started a thread on LKML asking about
madvise behavior in low memory conditions. I am still waiting for an in-depth
reply, but it seems that the kernel throttles caching behavior under memory
pressure.
Comment 5 Taras Glek 2010-04-06 21:02:49 UTC
The second mention of /proc/<pid>/maps should've been /proc/<pid>/smaps
Comment 6 Taras Glek 2010-04-06 22:53:40 UTC
Johannes Weiner just proved me wrong on LKML. madvise(WILLNEED) does not change
the paging behavior at all, I was misinterpreting my logs. All it does is
trigger readahead if sufficient free pages are available in the file cache.
Sounds like one can call madvise from ld.so with few downsides. In the worst
case there will be some file cache thrashing. I'd expect most systems to benefit.

If we take the conservative route, Ulrich, would you agree with triggering
madvise upon encountering a special flag in a file as you described?
Comment 7 Ulrich Drepper 2010-04-07 00:54:36 UTC
(In reply to comment #6)
> Sounds like one can call madvise from ld.so with few downsides. In the worst
> case there will be some file cache thrashing. I'd expect most systems to
> benefit.

I bet that is definitely _not_ the case.  The thrashing will just remove other
used pages.  And for what?  Experiments I've done a few years back showed that
only 15-20% of most DSOs (especially the large ones) is used.  Blindly using
madvise will hurt significantly unless you have huge amounts of memory.

I do not feel comfortable with unconditionally using madvise.  If somebody
writes a tool to harvest usage data, binutils patches to add appropriate program
header entries, and a little program to set these entries I have no problems
adding the support.  Then it is everybody's choice.  Somebody who wants
everything to be read ahead can just mark all DSOs like this.


Another point: using such a program header entry will allow the kernel to handle
the program binary itself and ld.so in the same way.
Comment 8 Gustavo Sverzut Barbieri 2010-09-26 15:43:11 UTC
I'm interested in this topic as well, so CC it is.
Comment 9 Petr Baudis 2011-02-04 01:29:33 UTC
It probably makes sense to mark this as SUSPENDED until more hard data is gathered or the more generic solution is implemented.
Comment 10 Ondrej Bilka 2013-05-09 20:23:20 UTC
To me it looks like madvise here solves the problem at the wrong level.

I would be more comfortable with gcc collecting statistics under -fprofile-generate and adding a constructor that madvises the parts that are used at startup or frequently.
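
A hand-written approximation of what such a generated constructor could look like is sketched below. gcc does not emit anything like this today; hot_startup_path is a hypothetical stand-in for a function that profiling identified as hot at startup, and a real implementation would cover profile-derived ranges rather than a single page.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes madvise() under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Counts pages for which the advice call succeeded (for inspection). */
static int startup_pages_advised;

/* Hypothetical stand-in for a function found hot at startup. */
static int hot_startup_path(int x)
{
    return x * 2 + 1;
}

/* Runs before main(): advise the page that holds the hot function. */
__attribute__((constructor))
static void advise_startup_pages(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);

    uintptr_t page = (uintptr_t)&hot_startup_path & ~((uintptr_t)ps - 1);
    if (madvise((void *)page, (size_t)ps, MADV_WILLNEED) == 0)
        startup_pages_advised++;
}
```

Note that a constructor only runs after ld.so has mapped and relocated the object, so it cannot help with page faults taken during loading itself.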