This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.
[PATCH] Improve objdump -S performance
- From: Andi Kleen <andi at firstfloor dot org>
- To: binutils at sources dot redhat dot com
- Date: Sun, 26 Apr 2009 17:27:02 +0200
- Subject: [PATCH] Improve objdump -S performance
Hi,
objdump -S runs really slowly (as in hours of CPU time) compared to objdump -d
on large ELF files without debug information. Profiling shows nearly
all the time is spent in elf_find_function, which is called
as a fallback when the dwarf2 line lookup fails. elf_find_function
goes through all the symbols, and since objdump calls it
for every instruction, that's really slow.
One way to fix that would have been to use a better
data structure instead of an array for the symbols,
or at least to do a binary search on a sorted array, but
either would have needed new entry points in BFD and brought other
complications.
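
For illustration, the binary-search alternative could look roughly like the
sketch below. It is a standalone toy, not BFD code: struct func_sym and
find_func_sorted are hypothetical names, and it assumes the function symbols
have already been collected into an array sorted by ascending start offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a function symbol; not a BFD type.  */
struct func_sym
{
  uint64_t start;       /* Start offset within the section.  */
  const char *name;
};

/* Return the last entry whose start is <= OFFSET, or NULL if OFFSET
   lies before the first entry.  SYMS must be sorted by ascending
   start.  Takes O(log n) steps instead of visiting every symbol.  */
static const struct func_sym *
find_func_sorted (const struct func_sym *syms, size_t n, uint64_t offset)
{
  size_t lo = 0, hi = n;        /* Search window is [lo, hi).  */

  assert (syms != NULL || n == 0);

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (syms[mid].start <= offset)
        lo = mid + 1;
      else
        hi = mid;
    }

  /* lo is now the number of entries with start <= offset.  */
  return lo == 0 ? NULL : &syms[lo - 1];
}
```

The catch is that something has to build and own the sorted array, which is
where the extra entry points would come in.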
I ended up implementing this simple last-hit cache. With
that, objdump -S on the file without debug info is still a factor of ~3
slower than -d, but at least it's bearable now.
I didn't put the last-hit cache into the BFD structure, but kept
it static for now to limit the impact on the binary interface of libbfd.
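
As a rough standalone illustration of the last-hit idea (not the patch
itself), consider the toy below. All names in it (struct toy_sym,
toy_find_function, toy_invalidate_cache, slow_scans) are hypothetical. It
also assumes that each symbol carries a known size and that functions do not
overlap, so the fast path can check an upper bound; that bound is an
assumption of this sketch, not something taken from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a function symbol; not a BFD type.  */
struct toy_sym
{
  uint64_t start;       /* Start offset within the section.  */
  uint64_t size;        /* Assumed to be known in this sketch.  */
  const char *name;
};

static const struct toy_sym *last_hit;  /* NULL means the cache is empty.  */
static unsigned long slow_scans;        /* Counts full scans, for illustration.  */

/* Must be called whenever the symbol array goes away or changes.  */
static void
toy_invalidate_cache (void)
{
  last_hit = NULL;
}

static const struct toy_sym *
toy_find_function (const struct toy_sym *syms, size_t n, uint64_t offset)
{
  const struct toy_sym *best = NULL;
  size_t i;

  assert (syms != NULL || n == 0);

  /* Fast path: consecutive instructions usually fall into the same
     function, so try the previous answer first.  */
  if (last_hit != NULL
      && offset >= last_hit->start
      && offset - last_hit->start < last_hit->size)
    return last_hit;

  /* Slow path: linear scan for the symbol with the highest start
     that is still <= OFFSET.  */
  slow_scans++;
  for (i = 0; i < n; i++)
    if (syms[i].start <= offset
        && (best == NULL || syms[i].start >= best->start))
      best = &syms[i];

  if (best != NULL)
    last_hit = best;
  return best;
}
```

Walking the offsets of one function in order then costs one full scan for
the first instruction and a couple of comparisons for each of the rest.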
-Andi
2009-04-24 Andi Kleen <ak@linux.intel.com>
* elf.c (elf_invalidate_cache, elf_cache_match,
elf_update_cache): Define for last hit cache.
(elf_find_function): Add calls to cache functions.
(_bfd_elf_close_and_cleanup): Call elf_invalidate_cache.
Index: bfd/elf.c
===================================================================
RCS file: /cvs/src/src/bfd/elf.c,v
retrieving revision 1.480
diff -u -r1.480 elf.c
--- bfd/elf.c 26 Mar 2009 12:23:52 -0000 1.480
+++ bfd/elf.c 26 Apr 2009 15:17:30 -0000
@@ -52,6 +52,7 @@
static bfd_boolean elf_read_notes (bfd *, file_ptr, bfd_size_type) ;
static bfd_boolean elf_parse_notes (bfd *abfd, char *buf, size_t size,
file_ptr offset);
+static void elf_invalidate_cache (void);
/* Swap version information in and out. The version information is
currently size independent. If that ever changes, this code will
@@ -7044,6 +7045,51 @@
return bfd_default_set_arch_mach (abfd, arch, machine);
}
+/* Simple last hit cache to avoid quadratic behaviour in objdump with
+ each instruction. It's only quadratic for each symbol now.
+ Assumes no one calls BFD multi-threaded. */
+
+static bfd_vma cached_low_func;
+static elf_symbol_type *cached_symbol;
+static const char *cached_filename;
+static bfd *cached_abfd;
+
+static void
+elf_invalidate_cache (void)
+{
+ cached_symbol = NULL;
+}
+
+static asymbol *
+elf_cache_match (bfd *abfd, asection *section, bfd_vma offset, const char **fn)
+{
+ elf_symbol_type *c = cached_symbol;
+
+ if (c == NULL || abfd != cached_abfd)
+ return NULL;
+
+ if (bfd_get_section (&c->symbol) != section)
+ return NULL;
+
+ if (c->symbol.value >= cached_low_func && c->symbol.value <= offset)
+ {
+ *fn = cached_filename;
+ return (asymbol *)c;
+ }
+
+ return NULL;
+}
+
+static void
+elf_update_cache (bfd *abfd, elf_symbol_type *symbol, const char *filename,
+ bfd_vma low_func)
+{
+ cached_abfd = abfd;
+ cached_symbol = symbol;
+ cached_filename = filename;
+ cached_low_func = low_func;
+}
+
/* Find the function to a particular section and offset,
for error reporting. */
@@ -7076,6 +7122,10 @@
low_func = 0;
state = nothing_seen;
+ func = elf_cache_match (abfd, section, offset, &filename);
+ if (func != NULL)
+ goto out;
+
for (p = symbols; *p != NULL; p++)
{
elf_symbol_type *q;
@@ -7116,6 +7166,9 @@
if (func == NULL)
return FALSE;
+ elf_update_cache (abfd, (elf_symbol_type *) func, filename, low_func);
+
+ out:
if (filename_ptr)
*filename_ptr = filename;
if (functionname_ptr)
@@ -7378,6 +7431,8 @@
_bfd_dwarf2_cleanup_debug_info (abfd);
}
+ elf_invalidate_cache ();
+
return _bfd_generic_close_and_cleanup (abfd);
}
--
ak@linux.intel.com -- Speaking for myself only.