This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: Performance problem of dlopen


On Thu, Oct 24, 2002 at 01:30:09PM +0200, Joerg Budischewski wrote:
> ok, I have attached a small C program which just loads shared 
> libraries given on command line and prints out the needed time. You can 
> simply build it with
> 
> gcc -ldl -o shlload shlload.c
> .
> 
> Additionally I have attached a textfile with a list of libraries for the 
> 643 (developer) build of openoffice (note: you may also use a production 
> build 1.0.x, simply replace 643 by 641. Some libraries may have changed, 
> but the effect is still measurable).
> 
> Make sure to have a . in your LD_LIBRARY_PATH. Starting
> 
> ./shlload -v libwrp643li.so
> 
> Gives on my system 3150ms needed to load.
> 
> Using
> xargs ./shlload -v < lst
> 
> gives about 1850ms on my system.

Had to modify the list (are you using GCC 3.0 and not 3.2 BTW?), and got
lower times (OOo 1.0.1):
LD_LIBRARY_PATH=. /tmp/x libwrp641li.so
Duration 1450
LD_LIBRARY_PATH=. /tmp/x `cat /tmp/l`
Duration 984
(PIII/600MHz).
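
For readers without the attachment, a timing harness of this kind might look
roughly like the sketch below. It is not the original shlload.c: the function
name time_dlopen_ms, the RTLD_NOW | RTLD_GLOBAL flags and the use of
clock_gettime are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper, not the original shlload.c: dlopen() each named
   library in order and return the elapsed wall-clock time in
   milliseconds, or -1 if any library fails to load.  RTLD_NOW forces
   all relocations (and thus all symbol lookups) to happen inside the
   timed region; RTLD_GLOBAL is an assumption about how the libraries
   should see each other's symbols.  */
long
time_dlopen_ms (int count, char *const names[], int verbose)
{
  struct timespec start, end;

  assert (count >= 0);
  clock_gettime (CLOCK_MONOTONIC, &start);
  for (int i = 0; i < count; i++)
    {
      if (verbose)
        fprintf (stderr, "loading %s\n", names[i]);
      if (dlopen (names[i], RTLD_NOW | RTLD_GLOBAL) == NULL)
        {
          fprintf (stderr, "%s\n", dlerror ());
          return -1;
        }
    }
  clock_gettime (CLOCK_MONOTONIC, &end);

  return (end.tv_sec - start.tv_sec) * 1000L
         + (end.tv_nsec - start.tv_nsec) / 1000000L;
}
```

A main() would pass argc - 1 and argv + 1 and print the result as a
"Duration" line. With older glibc, -ldl is needed at link time, placed after
the source file.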

Anyway, the difference is clear. If you load libwrp641li.so directly, the
symbol lookup path is set immediately to all the libs loaded because of it,
in the required breadth-first order, so say libtl641li.so's _ZTS11INetIStream
symbol is looked up in:
31446:  symbol=_ZTS11INetIStream;  lookup in file=/tmp/x
31446:  symbol=_ZTS11INetIStream;  lookup in file=/lib/libdl.so.2
31446:  symbol=_ZTS11INetIStream;  lookup in file=/usr/lib/libstdc++.so.5
31446:  symbol=_ZTS11INetIStream;  lookup in file=/lib/i686/libc.so.6
31446:  symbol=_ZTS11INetIStream;  lookup in file=/lib/i686/libm.so.6
31446:  symbol=_ZTS11INetIStream;  lookup in file=/lib/libgcc_s.so.1
31446:  symbol=_ZTS11INetIStream;  lookup in file=/lib/ld-linux.so.2
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libwrp641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libsvx641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libsfx641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libfwe641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libsb641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libso641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libj641li_g.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libtk641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libsvt641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libsvl641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libofa641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libvcl641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libsot641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libsal.so.3
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libvos2gcc3.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libtl641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libcppu.so.3
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libcppuhelper3gcc3.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libutl641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libcomphelp2.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=/lib/libdl.so.2
31446:  symbol=_ZTS11INetIStream;  lookup in file=/lib/i686/libpthread.so.0
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libstlport_gcc.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=/usr/lib/libstdc++.so.5
31446:  symbol=_ZTS11INetIStream;  lookup in file=/lib/i686/libm.so.6
31446:  symbol=_ZTS11INetIStream;  lookup in file=/lib/libgcc_s.so.1
31446:  symbol=_ZTS11INetIStream;  lookup in file=/lib/i686/libc.so.6
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libxo641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libgo641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libucbhelper1gcc3.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libsalhelper3gcc3.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=./libxcr641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=/usr/X11R6/lib/libX11.so.6
31446:  symbol=_ZTS11INetIStream;  lookup in file=/usr/X11R6/lib/libXext.so.6
31446:  symbol=_ZTS11INetIStream;  lookup in file=/usr/lib/openoffice/program/libpsp641li.so
31446:  symbol=_ZTS11INetIStream;  lookup in file=/usr/lib/libfreetype.so.6
31446:  symbol=_ZTS11INetIStream;  lookup in file=/usr/X11R6/lib/libSM.so.6
31446:  symbol=_ZTS11INetIStream;  lookup in file=/usr/X11R6/lib/libICE.so.6
31446:  symbol=_ZTS11INetIStream;  lookup in file=/lib/ld-linux.so.2
31446:  binding file ./libtl641li.so to ./libtl641li.so: normal symbol `_ZTS11INetIStream'

while if you dlopen the libraries one by one (preferably in the
actual dependency order, e.g. as shown
by LD_LIBRARY_PATH=. prelink -nv ./libwrp641li.so), then on average the
symbol lookup path is about half as long (it starts with 3 or however many
libs and grows until all libs are loaded). The above symbol then goes through:

31498:  symbol=_ZTS11INetIStream;  lookup in file=/tmp/x
31498:  symbol=_ZTS11INetIStream;  lookup in file=/lib/libdl.so.2
31498:  symbol=_ZTS11INetIStream;  lookup in file=/usr/lib/libstdc++.so.5
31498:  symbol=_ZTS11INetIStream;  lookup in file=/lib/i686/libc.so.6
31498:  symbol=_ZTS11INetIStream;  lookup in file=/lib/i686/libm.so.6
31498:  symbol=_ZTS11INetIStream;  lookup in file=/lib/libgcc_s.so.1
31498:  symbol=_ZTS11INetIStream;  lookup in file=/lib/ld-linux.so.2
31498:  symbol=_ZTS11INetIStream;  lookup in file=./libtl641li.so
31498:  symbol=_ZTS11INetIStream;  lookup in file=./libsal.so.3
31498:  symbol=_ZTS11INetIStream;  lookup in file=./libvos2gcc3.so
31498:  symbol=_ZTS11INetIStream;  lookup in file=/lib/libdl.so.2
31498:  symbol=_ZTS11INetIStream;  lookup in file=/lib/i686/libpthread.so.0
31498:  symbol=_ZTS11INetIStream;  lookup in file=./libstlport_gcc.so
31498:  symbol=_ZTS11INetIStream;  lookup in file=/usr/lib/libstdc++.so.5
31498:  symbol=_ZTS11INetIStream;  lookup in file=/lib/i686/libm.so.6
31498:  symbol=_ZTS11INetIStream;  lookup in file=/lib/libgcc_s.so.1
31498:  symbol=_ZTS11INetIStream;  lookup in file=/lib/i686/libc.so.6
31498:  symbol=_ZTS11INetIStream;  lookup in file=/lib/ld-linux.so.2
31498:  binding file ./libtl641li.so to ./libtl641li.so: normal symbol `_ZTS11INetIStream'
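
The "about half" estimate can be checked with a deliberately crude model. It
assumes each of n libraries does one lookup that walks its whole scope, and
it ignores the objects already loaded by the main program. The function names
are made up for this sketch.

```c
#include <assert.h>

/* Crude model: count objects probed when each of n libraries performs
   one symbol lookup that walks its entire lookup scope.  */

/* All n libraries brought in by a single dlopen: every lookup walks
   the full scope of n objects.  */
unsigned long
probes_single_dlopen (unsigned long n)
{
  return n * n;
}

/* Libraries dlopened one by one in dependency order: the k-th library
   sees only the k objects loaded so far.  */
unsigned long
probes_one_by_one (unsigned long n)
{
  unsigned long total = 0;

  for (unsigned long k = 1; k <= n; k++)
    total += k;
  assert (total == n * (n + 1) / 2);
  return total;
}
```

For n = 50 this gives 2500 probes against 1275, i.e. a ratio of 0.51. Real
lookups stop early at a strong definition, so the model only bounds the
effect.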

There is one thing which ld.so can do better (it is even implemented
in ld.so, but disabled for now because e.g. librt.so depends
on the old behaviour). ATM, if symbol lookup encounters a weak symbol,
it keeps searching to see whether there is a strong symbol further along
the path. If there is, it uses the strong symbol; otherwise it uses the
first weak symbol found. The currently disabled behaviour is to stop at
the first symbol found, whether weak or strong.
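
The two policies can be compared on a toy scope. This is only a model of the
description above, not ld.so's actual code; the type and function names are
invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* How one object in the scope defines the symbol (toy model).  */
enum sym_binding { SYM_UNDEFINED, SYM_WEAK, SYM_STRONG };

struct lookup_result
{
  int index;       /* Scope position of the chosen definition, -1 if none.  */
  size_t probed;   /* Number of objects examined.  */
};

/* Walk a lookup scope for one symbol.  With stop_at_weak == 0 this
   models the behaviour described as current: remember the first weak
   definition but keep looking for a strong one.  With stop_at_weak != 0
   it models the disabled behaviour: the first definition of either
   kind wins.  */
struct lookup_result
scope_lookup (const enum sym_binding *scope, size_t n, int stop_at_weak)
{
  struct lookup_result r = { -1, 0 };

  assert (scope != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      r.probed++;
      if (scope[i] == SYM_STRONG)
        {
          r.index = (int) i;
          return r;
        }
      if (scope[i] == SYM_WEAK)
        {
          if (stop_at_weak)
            {
              r.index = (int) i;
              return r;
            }
          if (r.index < 0)
            r.index = (int) i;   /* First weak definition is the fallback.  */
        }
    }
  return r;
}
```

With only weak definitions in the scope, both policies pick the same object,
but the first one always walks to the end of the path. That matches the
traces above, where the search continues past libtl641li.so even though the
symbol is eventually bound there.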

But even when this is changed, the symbol lookup when loading the libs one
by one in the right order will always be faster, since on average the symbol
lookup path will be shorter.

>  > What would IMHO help most for OpenOffice startup spent in the dynamic
>  > linker
>  > (in addition to aggressive profiling and speeding up often used
>  > functions)
>  > is to merge shared libraries which are
>  > a) linked statically to the OOo programs
>  > b) dlopened during start of every OOo program (or at least all the
>  > important
>  >    ones; IMHO spadmin or setup startup performance is not critical,
>  > while
>  >    soffice with the usual invocations is)
>  > into libOpenOffice.so and link it into soffice etc. programs
>  > statically.
> Uhmm, I will comment on this later (maybe tomorrow).
> 
>  > That way you kill lots of duplication (which means less relocations,
>  > less
>  > memory used and fewer dynamic symbols) and also prelink(8) can speed
>  > it
> Where can I get prelink ?

ftp://people.redhat.com/jakub/prelink/
You need a recent glibc for it (2.3 or newer; 2.2.93 in RHL 8 should be
enough too).
Note that if the above proposal is implemented, i.e. one big libOpenOffice.so
plus modules which are not common to writer, draw, impress etc., then when
prelinked, symbol lookup time will be 0 for the main program and will only
show up for the additional modules (and even for them, the symbol lookup
path will be way shorter, by ~50 libs). If the main program links
against, say, 10 shared libs, then it will also contain few
symbol conflicts which prelink has to resolve at startup time (the cost of
each is the same as for a normal RELATIVE reloc, and those on the other hand
are not performed when prelink is successful).

To give you some numbers (times in clock cycles on a PIII/600MHz):
Prelinked konqueror:
LD_DEBUG=statistics konqueror
31582:
31582:  runtime linker statistics:
31582:    total startup time in dynamic loader: 5786281 clock cycles
31582:              time needed for relocation: 1935703 clock cycles (33.4%)
31582:                   number of relocations: 0
31582:        number of relocations from cache: 2058
31582:             time needed to load objects: 3491610 clock cycles (60.3%)

LD_DEBUG=statistics konqueror.unprelinked
31628:
31628:  runtime linker statistics:
31628:    total startup time in dynamic loader: 259348260 clock cycles
31628:              time needed for relocation: 255535421 clock cycles (98.5%)
31628:                   number of relocations: 25794
31628:        number of relocations from cache: 59802
31628:             time needed to load objects: 3495178 clock cycles (1.3%)

LD_DEBUG=statistics /usr/lib/openoffice/program/soffice.bin
31642:
31642:  runtime linker statistics:
31642:    total startup time in dynamic loader: 3528042 clock cycles
31642:              time needed for relocation: 1046375 clock cycles (29.6%)
31642:                   number of relocations: 0
31642:        number of relocations from cache: 3870
31642:             time needed to load objects: 2127923 clock cycles (60.3%)

LD_DEBUG=statistics /usr/lib/openoffice/program/soffice.bin.unprelinked
31653:
31653:  runtime linker statistics:
31653:    total startup time in dynamic loader: 105077447 clock cycles
31653:              time needed for relocation: 102643480 clock cycles (97.6%)
31653:                   number of relocations: 13242
31653:        number of relocations from cache: 9507
31653:             time needed to load objects: 2091623 clock cycles (1.9%)
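
To put those totals side by side, the small helpers below compute the ratio
between two runs and convert cycles to milliseconds. The conversion assumes
the reported clock cycles tick at the 600 MHz CPU clock; the function names
are made up for this sketch.

```c
#include <assert.h>

/* Ratio of two cycle counts, e.g. unprelinked / prelinked startup.  */
double
speedup (unsigned long long slow_cycles, unsigned long long fast_cycles)
{
  assert (fast_cycles != 0);
  return (double) slow_cycles / (double) fast_cycles;
}

/* Convert cycles to milliseconds for a given clock rate in MHz.
   Assumes the counter ticks at the CPU clock rate.  */
double
cycles_to_ms (unsigned long long cycles, double mhz)
{
  assert (mhz > 0.0);
  return (double) cycles / (mhz * 1000.0);
}
```

With the totals above, konqueror spends roughly 45 times fewer cycles in the
dynamic loader when prelinked (about 432 ms down to under 10 ms at 600 MHz),
and soffice.bin roughly 30 times fewer (about 175 ms down to about 6 ms).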

All these statistics cover the time until the program reaches its main,
with soffice.bin linking only against the following libraries:

        libsvl641li.so => /usr/lib/openoffice/program/libsvl641li.so (0x41d76000)
        libvcl641li.so => /usr/lib/openoffice/program/libvcl641li.so (0x41947000)
        libcppu.so.3 => /usr/lib/openoffice/program/libcppu.so.3  (0x41445000)
        libcppuhelper3gcc3.so => /usr/lib/openoffice/program/libcppuhelper3gcc3.so (0x41324000)
        libtl641li.so => /usr/lib/openoffice/program/libtl641li.so (0x41890000)
        libvos2gcc3.so => /usr/lib/openoffice/program/libvos2gcc3.so (0x412f1000)
        libsal.so.3 => /usr/lib/openoffice/program/libsal.so.3 (0x4161b000)
        libutl641li.so => /usr/lib/openoffice/program/libutl641li.so (0x41575000)
        libucbhelper1gcc3.so => /usr/lib/openoffice/program/libucbhelper1gcc3.so (0x413a0000)
        libcomphelp2.so => /usr/lib/openoffice/program/libcomphelp2.so (0x41cf7000)
        libsalhelper3gcc3.so => /usr/lib/openoffice/program/libsalhelper3gcc3.so (0x4125f000)
        libXext.so.6 => /usr/X11R6/lib/libXext.so.6 (0x4124f000)
        libSM.so.6 => /usr/X11R6/lib/libSM.so.6 (0x41288000)
        libICE.so.6 => /usr/X11R6/lib/libICE.so.6 (0x4126f000)
        libX11.so.6 => /usr/X11R6/lib/libX11.so.6 (0x4116f000)
        libdl.so.2 => /lib/libdl.so.2 (0x4116a000)
        libpthread.so.0 => /lib/i686/libpthread.so.0 (0x4136e000)
        libstlport_gcc.so => /usr/lib/openoffice/program/libstlport_gcc.so (0x417d2000)
        libstdc++.so.5 => /usr/lib/libstdc++.so.5 (0x414bf000)
        libm.so.6 => /lib/i686/libm.so.6 (0x41146000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x4143b000)
        libc.so.6 => /lib/i686/libc.so.6 (0x41015000)
        libpsp641li.so => /usr/lib/openoffice/program/libpsp641li.so (0x41c36000)
        libfreetype.so.6 => /usr/lib/libfreetype.so.6 (0x41474000)
        libsot641li.so => /usr/lib/openoffice/program/libsot641li.so (0x41293000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x41000000)

If all the /usr/lib/openoffice/program libs in the above list were
merged into one, I think the number of conflicts (3870) could go
down substantially (and thus even speed things up).

>  > The C++ ABI changed and unfortunately it is not exactly startup time
>  > friendly :(.
> Is there some simple explanation which can be given in a few words. I 
> e.g. can imagine, that every virtual function needs to be relocated or so ?

The difference is mainly in the format of virtual tables and
RTTI tables, which all use global weak symbols.

	Jakub

