Over the years, there have been several suggestions that ld.so use madvise(...WILLNEED) after mmapping a DT_NEEDED shared library to encourage more aggressive prefetching, which has been measured to help with certain large libraries such as Firefox's. SUSE carries a glibc patch for this, and other projects have hacks in them to approximate this policy. After a brief bugzilla/google search, I haven't found any official disposition of this heuristic. If it is known to be a bad idea, perhaps we can record the reasons for that here.
And exactly how should madvise be used? Take into account that there are a large number of DSOs and that they are only sometimes used. Most of the time only a small fraction of any DSO is used in a program. Using madvise to get the DSO loaded will likely waste a lot of resources. I've been thinking about adding a special flag to system libraries so that at least they can be handled this way. But it is a bad idea to do this in general. If you want to convince me, try running a large number of programs using lots of DSOs in an OS environment with limited memory.
Can you suggest some statistics we can gather in order to inform heuristics within ld.so to trigger a madvise(MADV_WILLNEED) (which itself is a heuristic for the kernel)?
(In reply to comment #2)
> Can you suggest some statistics we can gather in order to inform
> heuristics within ld.so to trigger a madvise(MADV_WILLNEED)
> (which itself is a heuristic for the kernel)?

I don't think heuristics will work in ld.so. It's pretty much an all-or-nothing selection: for each use the madvise call is either useful or not. This should be a setting in the file; perhaps it can be changed after the file has been created (similar to the execstack setting). Perhaps ELF flags (e_flags). Otherwise an entry in the program header.

Where heuristics may come in is in monitoring the system (using systemtap or so) and looking at the DSO use in apps. I.e., when the app ends, look at how many pages of the DSO have been used over the lifetime of the app. If a certain threshold is reached the flag could be set.

Another piece of information would be how much of each segment (text, data) is used. This could then help lower the threshold for using the flag if the commonly used code/data is first in the segment. This would then not be a simple flag; instead each PT_LOAD entry in the program header could have a corresponding PT_READAHEAD entry or so. By default it could be of zero size, and using profiling tools the ideal size can be determined.
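As a rough illustration of the "how much of the DSO was used" measurement, the sketch below sums the Size and Rss fields for one library from an smaps-format stream (the format of /proc/<pid>/smaps) and compares the ratio against a threshold. This is an assumption-laden stand-in for the systemtap monitoring suggested above: dso_usage_kb and worth_flagging are hypothetical names, and Rss counts resident pages, which only approximates pages actually used because kernel readahead also makes pages resident.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sum the Size and Rss fields (in kB) of every mapping in an smaps-format
   stream whose header line mentions `dso`.  Returns the number of matching
   mappings.  Illustrative only. */
static int dso_usage_kb(FILE *smaps, const char *dso,
                        unsigned long *size_kb, unsigned long *rss_kb)
{
    assert(smaps && dso && size_kb && rss_kb);

    char line[1024];
    int matching = 0;           /* does the current mapping belong to dso? */
    int count = 0;
    *size_kb = *rss_kb = 0;

    while (fgets(line, sizeof line, smaps)) {
        unsigned long lo, hi, val;
        if (sscanf(line, "%lx-%lx", &lo, &hi) == 2) {
            /* Mapping header: "start-end perms offset dev inode path". */
            matching = strstr(line, dso) != NULL;
            count += matching;
        } else if (matching && sscanf(line, "Size: %lu kB", &val) == 1) {
            *size_kb += val;
        } else if (matching && sscanf(line, "Rss: %lu kB", &val) == 1) {
            *rss_kb += val;
        }
    }
    return count;
}

/* Hypothetical policy: flag the DSO when at least threshold_pct percent
   of its mapped size was resident. */
static int worth_flagging(unsigned long size_kb, unsigned long rss_kb,
                          unsigned threshold_pct)
{
    return size_kb != 0 && rss_kb * 100 >= (unsigned long)threshold_pct * size_kb;
}
```

A harvesting tool would open /proc/<pid>/smaps shortly before the process exits and record the result per library; the threshold value itself would have to come from measurement.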
(In reply to comment #2)
> Can you suggest some statistics we can gather in order to inform
> heuristics within ld.so to trigger a madvise(MADV_WILLNEED)
> (which itself is a heuristic for the kernel)?

I don't have an effective general heuristic. The problem is worst for programs that use a large chunk of large libraries. Without modifying the compile-time linker, the best heuristic I can come up with is to only WILLNEED executable segments > 4 MB and data segments > 1 MB.

If we could add a compile-time ld flag to mark dependent libraries with WILLNEED, that would be ideal for Firefox. That would take the guessing out of ld.so and push it onto application devs, who likely know exactly how the libraries are meant to be used. Additionally, libc could provide an API to reset the madvise flags to completely erase any concerns about keeping too many pages cached. One can already change the madvise hints by parsing /proc/<pid>/maps, but there is no way to set madvise flags early on. On the other hand, everyday developers should not be expected to understand the pagefaulting behavior of their apps, so the benefit would be limited.

Another sure-fire solution is to do something like the current prelink cronjob. Except in this case the program would iterate over /proc/<pid>/maps, figure out which libraries are paging in multi-megabyte chunks, and record that, similar to the prelink cache.

My concern is that I have yet to see any concrete evidence that madvise hints actually hurt in low memory conditions. I started a thread on LKML asking about madvise behavior in low memory conditions. I am still waiting for an in-depth reply, but it seems that the kernel throttles caching behavior under memory pressure.
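The size-threshold heuristic from this comment can be written down as a small predicate over a program header. This is only a sketch of the stated rule, not a proposed patch: segment_wants_willneed is a hypothetical name, it assumes 64-bit ELF, it compares p_filesz (the file-backed bytes that readahead could fetch), and it treats every non-executable PT_LOAD segment as "data".

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

#define EXEC_THRESHOLD (4u * 1024 * 1024)   /* 4 MB, from the comment */
#define DATA_THRESHOLD (1u * 1024 * 1024)   /* 1 MB, from the comment */

/* Return 1 if the heuristic would issue MADV_WILLNEED for this segment. */
static int segment_wants_willneed(const Elf64_Phdr *ph)
{
    assert(ph != NULL);

    if (ph->p_type != PT_LOAD)
        return 0;                           /* only loadable segments */
    if (ph->p_flags & PF_X)
        return ph->p_filesz > EXEC_THRESHOLD;
    return ph->p_filesz > DATA_THRESHOLD;   /* non-executable: treat as data */
}
```

A loader would evaluate this once per PT_LOAD entry right after mapping the segment; the thresholds are the guesses quoted above and have not been validated.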
The second mention of /proc/<pid>/maps should've been /proc/<pid>/smaps
Johannes Weiner just proved me wrong on LKML. madvise(WILLNEED) does not change the paging behavior at all; I was misinterpreting my logs. All it does is trigger readahead if sufficient free pages are available in the file cache. Sounds like one can call madvise from ld.so with few downsides. In the worst case there will be some file cache thrashing. I'd expect most systems to benefit. If we take the conservative route, Ulrich, would you agree with triggering madvise upon encountering a special flag in a file as you described?
(In reply to comment #6)
> Sounds like one can call madvise from ld.so with few downsides. In the worst
> case there will be some file cache thrashing. I'd expect most systems to
> benefit.

I bet that is definitely _not_ the case. The thrashing will just remove other used pages. And for what? Experiments I did a few years back showed that only 15-20% of most DSOs (especially the large ones) is used. Blindly using madvise will hurt significantly unless you have huge amounts of memory.

I do not feel comfortable with unconditionally using madvise. If somebody writes a tool to harvest usage data, binutils patches to add appropriate program header entries, and a little program to set these entries, I have no problem adding the support. Then it is everybody's choice. Somebody who wants everything to be read ahead can just mark all DSOs like this.

Another point: using such a program header entry will allow the kernel to handle the program binary itself and ld.so in the same way.
I'm interested in this topic as well, so CC it is.
It probably makes sense to mark this as SUSPENDED until more hard data is gathered or the more generic solution is implemented.
To me it looks like madvise here solves the problem at the wrong level. I would be more comfortable with gcc collecting statistics under -fprofile-generate and adding a constructor that madvises the parts that are used at startup or frequently.
*** Bug 260998 has been marked as a duplicate of this bug. ***