This is the mail archive of the
elfutils-devel@sourceware.org
mailing list for the elfutils project.
Performance of libdw
- From: Andrey Ponomarenko <aponomarenko at rosalab dot ru>
- To: elfutils-devel at lists dot fedorahosted dot org
- Date: Fri, 12 Jul 2013 18:49:59 +0400
- Subject: Performance of libdw
Hi,
I've found an O(N^2) loop in the libdw library (N - number of symbols in
the library). The dwfl_module_getsym function is called N times, but it
calls resolve_symbol function each time and it loops through the number
of symbols again.
So `readelf --debug-dump=info shared_object.so` command is very slow on
large input objects. As an example, it takes several minutes to analyse
libOpenImageIO.so.1.1.11 (27mb), but readelf from binutils takes just
several seconds to do it.
The output of sprof for libelf:
% cumulative self self
time seconds seconds calls us/call name
42.16 28.42 28.42 1858018814 0.02 gelf_getsymshndx
39.49 55.04 26.62 1798636744 0.01 gelf_getshdr
18.35 67.41 12.37 1798635186 0.01 elf_getscn
0.00 67.41 0.00 1562 0.00 elf_nextscn
0.00 67.41 0.00 123 0.00 elf_ndxscn
0.00 67.41 0.00 61 0.00 elf_strptr
0.00 67.41 0.00 9 0.00 elf_getdata
0.00 67.41 0.00 7 0.00 gelf_getphdr
...
The output of sprof for libdw:
% cumulative self self
time seconds seconds calls us/call name
60.97 69.93 69.93 0 0.00 dwfl_module_getsym
26.25 100.04 30.11 0 0.00 search_table
6.73 107.76 7.72 0 0.00
dwfl_adjusted_address
4.87 113.34 5.58 0 0.00
dwfl_adjusted_st_value
0.61 114.04 0.70 0 0.00
dwfl_adjusted_aux_sym_addr
0.10 114.16 0.12 861603 0.14 dwarf_getattrs
0.10 114.27 0.11 0 0.00
__libdw_form_val_len
0.05 114.33 0.06 0 0.00 __libdw_find_attr
0.03 114.37 0.04 0 0.00 lookup
0.00 114.67 0.00 95891 0.00
dwfl_module_addrsym
...
The gelf_getsymshndx, gelf_getshdr and elf_getscn functions from libelf
are called 2 billion times and it takes about 60% of total execution
time. Another 26% of execution time are taken by the search_table
function that is called by the dwfl_module_addrsym and loops through the
number of symbols too. The dwfl_module_addrsym is called for each
symbol. So it also takes O(N^2).
How can dwfl_module_getsym and dwfl_module_addrsym functions may be
optimised to not to loop through all symbols in the library in order to
find information for just one of them? This can make the library to be
10 times faster.
Thank you.
--
Andrey Ponomarenko, ROSA Lab.