This is the mail archive of the libc-alpha
mailing list for the glibc project.
Piecemeal library loading causes slow startup of big apps
- From: Lorenzo Colitti <lorenzo at colitti dot com>
- To: libc-alpha at sources dot redhat dot com
- Cc: Owen Taylor <otaylor at redhat dot com>
- Date: Tue, 13 Sep 2005 18:26:32 +0200
- Subject: Piecemeal library loading causes slow startup of big apps
As my Google SoC project, I have been working on improving GNOME startup
time, and I see that dynamic linking is one of the culprits.
GNOME startup is mainly I/O bound, i.e. most of the time is spent
waiting for disk seeks. Proof-of-concept work I have done has reduced
the disk seeks caused by GNOME itself, but now I have reached the point
that most of the disk seeks are caused by ld.so loading dynamic libraries.
This is because libraries are not loaded immediately in one big
sequential read, but in bits and pieces. (I think this is because ld.so
mmap()s the library and only page faults the bits it needs into RAM.)
For example, gtk+ (~9MB) is loaded piecemeal in about 30 separate
out-of-order reads, such as:
(gdm-binary/3150): /usr/local/gnome/lib/libgtk-x11-2.0.so.0 0-7
(gdm-binary/3150): /usr/local/gnome/lib/libgtk-x11-2.0.so.0 687-718
(gdm-binary/3150): /usr/local/gnome/lib/libgtk-x11-2.0.so.0 653-684
(gdm-binary/3150): /usr/local/gnome/lib/libgtk-x11-2.0.so.0 34-65
(gdm-binary/3150): /usr/local/gnome/lib/libgtk-x11-2.0.so.0 8-33
(battstat-applet/4143): /usr/local/gnome/lib/libgtk-x11-2.0.so.0 447-475
These are real disk reads traced by hooking into the ext3 block read
function using a kernel patch. The format is:
(process/pid): filename start_4k_block-end_4k_block
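As a rough illustration of the trace format above, the following C sketch parses one line into its fields and computes how many bytes the read covered. The names trace_read, parse_trace_line and trace_read_bytes are hypothetical. It assumes the block range is inclusive (the sample lines suggest this, since 0-7 is followed by 8-33), that process names contain no '/', and that paths contain no whitespace.

```c
#include <stdio.h>

/* One traced disk read: "(process/pid): filename start-end" */
struct trace_read {
    char proc[64];              /* process name, e.g. "gdm-binary" */
    int pid;                    /* process id */
    char path[256];             /* file that was read */
    unsigned long start_block;  /* first 4k block of the read */
    unsigned long end_block;    /* last 4k block (assumed inclusive) */
};

/* Parse one trace line. Returns 0 on success, -1 if the line
 * does not match the expected format. */
int parse_trace_line(const char *line, struct trace_read *out)
{
    int n = sscanf(line, " (%63[^/]/%d): %255s %lu-%lu",
                   out->proc, &out->pid, out->path,
                   &out->start_block, &out->end_block);
    return n == 5 ? 0 : -1;
}

/* Bytes covered by a read, assuming an inclusive range of 4k blocks. */
unsigned long trace_read_bytes(const struct trace_read *r)
{
    return (r->end_block - r->start_block + 1) * 4096UL;
}
```

Under these assumptions, the line ending in 687-718 covers 32 blocks, i.e. 131072 bytes, and comparing start_block of each read with end_block of the previous one shows where the disk had to seek backwards.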
This way of loading libraries visibly hurts performance. If I cat the
most frequently-used libraries to /dev/null early in the startup
process, I can shave about 10% (~2s) off startup time: reading the
libraries puts them in the buffer cache, and when the linker mmaps them
it doesn't end up causing seeks. This is obviously a hack, but I think
the process could be made a lot smarter than this.
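The cat-to-/dev/null trick amounts to reading each library sequentially and discarding the data. A minimal C sketch of the same idea follows; prefetch_file is a hypothetical helper name, the 1MB chunk size is an arbitrary choice, and the sketch does not retry reads interrupted by signals.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read a file from start to finish and discard the data, so that its
 * pages end up in the kernel's cache (like "cat file > /dev/null").
 * Returns the number of bytes read, or -1 on error.
 * Not thread-safe: it uses a static scratch buffer. */
long long prefetch_file(const char *path)
{
    static char buf[1 << 20];   /* 1MB scratch buffer, contents ignored */
    long long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    /* Large sequential reads give the kernel a chance to avoid seeks. */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;

    close(fd);
    return n < 0 ? -1 : total;
}
```

A startup helper could call prefetch_file() on the handful of most frequently-used libraries before the applications that need them are launched.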
For example, would LD_BIND_NOW help me (I suspect not)? Is there a
compile-time hint that can tell the linker to load the whole library using
read() instead of mmap()? If not, could it be implemented?
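To illustrate the kind of hint being asked about: a program that maps a file itself can ask the kernel to read it ahead with madvise(MADV_WILLNEED). The sketch below only shows that idea in user code; it is not how ld.so behaves, map_with_readahead is a hypothetical name, and whether the kernel answers the advice with one sequential read is an assumption that would need measuring.

```c
#define _DEFAULT_SOURCE /* expose madvise() in glibc headers */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only and advise the kernel that all of it will
 * be needed soon. Returns the mapping and stores its length in *len_out,
 * or returns NULL on error (including empty files). */
void *map_with_readahead(const char *path, size_t *len_out)
{
    struct stat st;
    void *addr;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    if (addr == MAP_FAILED)
        return NULL;

    /* Advisory only: if the kernel rejects the hint, the mapping still
     * works and pages are faulted in on demand as before. */
    (void)madvise(addr, (size_t)st.st_size, MADV_WILLNEED);

    *len_out = (size_t)st.st_size;
    return addr;
}
```

The caller releases the mapping with munmap(addr, len) when done.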
P.S. Of course, this analysis was done on GNOME, but it's probably a
problem common to many large apps that people actually use, like firefox