This is the mail archive of the mailing list for the systemtap project.

using libdwfl offline archive support

Now on the systemtap ftp site and coming soon to Fedora systems near you,
is elfutils-0.130.  This version adds support for offline debug archives.

An offline archive is an ar archive containing a complete ELF file for each
libdwfl module to be considered.  You can use such a file with libdwfl
offline support (e.g. in -e options to eu-addr2line et al), producing a
Dwfl_Module for each archive member, with the member name as module name.

Additionally, offline ET_REL files (.ko), either inside an offline archive
or not, may now have nonzero sh_addr settings and DWARF relocs previously
applied and removed.

The normal way to make an offline debug archive is with eu-unstrip and ar.
The new script (eu-)make-debug-archive automates this for you.  It boils
down to the equivalent of:

	mkdir tmp
	eu-unstrip -d tmp -m -a -R -K
	ar cq debug.a tmp/*
	rm -rf tmp

This uses the new -R flag (aka --relocate), which applies address changes
to ET_REL files as mentioned above.  This means that when using those files
inside the archive, there will be no relocation required, reducing the
startup cost and eliminating all COW touching of mmap'd file pages.

The -K option (--offline-kernel) uses the same library call that systemtap
uses.  This code now looks for an installed offline debug archive for the
kernel first thing.  If it finds one, it uses only that and does not
proceed with finding the vmlinux and .ko files.  With the standard path
setting, it will look for /lib/modules/RELEASE/debug.a and then
/usr/lib/debug/lib/modules/RELEASE/debug.a to contain the offline archive.
eu-make-debug-archive installs one in the latter place by default, but can
be used for arbitrary locations too.
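
To make the lookup order concrete, here is a minimal sketch in shell.  It is
not how libdwfl implements the search (that is C code driven by the
debuginfo path); it only mirrors the two default locations named above.
The function name and the optional root-prefix argument are made up for
illustration.

```shell
# Hypothetical helper, not part of elfutils: print the first debug.a found
# in the two default locations described above, checked in the same order.
# The optional second argument is a root prefix, useful only for testing.
find_debug_archive() {
    release=${1:-$(uname -r)}
    root=${2:-}
    for dir in "$root/lib/modules/$release" \
               "$root/usr/lib/debug/lib/modules/$release"; do
        if [ -f "$dir/debug.a" ]; then
            printf '%s\n' "$dir/debug.a"
            return 0
        fi
    done
    return 1    # no archive installed for this release
}
```

Calling find_debug_archive with no arguments checks the running kernel's
release and prints nothing (with a nonzero exit status) if no archive exists.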

eu-unstrip with the (new) -n option just finds all the debuginfo and prints
a line about each module.  So this is a no-op that approximates just the
startup cost of systemtap revving up libdwfl to start working.

-bash-3.2$ /usr/bin/time eu-unstrip -n -K > /dev/null
1.39user 1.42system 0:02.82elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+140132minor)pagefaults 0swaps
-bash-3.2$ eu-make-debug-archive --kernel --sudo
-bash-3.2$ /usr/bin/time eu-unstrip -n -K > /dev/null
0.08user 0.04system 0:00.13elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+9026minor)pagefaults 0swaps

To isolate the performance gains a little, I remade a debug archive by hand
without using -R.  To compare (the -e is equivalent to what -K does when
debug.a is installed):

-bash-3.2$ /usr/bin/time eu-unstrip -n > /dev/null -e /usr/lib/debug/lib/modules/`uname -r`/debug.a
0.09user 0.04system 0:00.13elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+9027minor)pagefaults 0swaps
-bash-3.2$ /usr/bin/time eu-unstrip -n -e /tmp/debug-norel.a > /dev/null
0.82user 0.33system 0:01.16elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+72871minor)pagefaults 0swaps

This shows that more than half the cost is cut by opening and mmap'ing one
file instead of 3051.  The additional savings from avoiding applying
relocations to .ko DWARF sections are almost 90% of the remainder, which is
more than I was expecting.  These are all hot-cache times, where all the
files are certainly already in core and the disk was probably not even
read.  In the first-use-today scenario, the time spent on i/o might make
this improvement an even larger factor.  Notably, the archive file made
without -R is nearly twice as large as the -R archive (because all those
.rela.debug* sections get truncated), which might be a factor.  But the
fault count is not merely twice as high but eight times as high, which
suggests the COW is a lot of it.
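
For anyone who wants to check the arithmetic, the ratios quoted above can
be recomputed from the elapsed times and minor-fault counts in the
transcripts.  This is only a sketch: the numbers are copied from the runs
shown, and the function name is invented for the purpose.

```shell
# Recompute the ratios from the transcripts above.  Only the input numbers
# come from the measured runs; timing_summary is a made-up name.
timing_summary() {
    awk 'BEGIN {
        files = 2.82; norel = 1.16; rel = 0.13   # elapsed seconds
        printf "one archive instead of separate files: %.1f%% less time\n", (files - norel) / files * 100
        printf "-R on top of that: %.1f%% of the remainder\n", (norel - rel) / norel * 100
        printf "overall speedup: %.1fx\n", files / rel
        printf "minor faults, no -R vs -R: %.1fx\n", 72871 / 9027
    }'
}
```

Running timing_summary reports roughly 58.9% and 88.8% for the two savings,
a 21.7x overall speedup, and an 8.1x difference in minor faults.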

This is all new code and not tested very thoroughly.  Take all numbers
with salt as it might be saving all the overhead by removing all the
information, etc.  I have not tested this with systemtap at all.
If you have any strange new problems, then try:

	sudo rm -f /usr/lib/debug/lib/modules/*/debug.a

and see if they suddenly go away.  If eu-make-debug-archive --kernel
makes them come right back again, then tell me all about it.

Also, be sure you use the elfutils-0.130-3 rpm (0.130-0.3 on sourceware),
or have elfutils-0.130-fixes.patch applied if you build your own.  
The 0.130-1 that was there on Tuesday is buggy, as is the unpatched
elfutils-0.130.tar.gz file.

If systemtap has a new problem when using the archive, it may be a bug in
handling the address bias.  Heretofore, all modules have had a zero bias,
so some sloppy code could have squeaked by.  The rules haven't changed.
The uniform, proper application of address bias to information used with
{dwarf,elf,ebl}_* calls (vs dwfl_*) is already necessary to handle user
DSOs correctly.

eu-make-debug-archive --kernel is a "caching" no-op the second time
(without --force).  The script is simple: it just checks whether the
archive is already newer than /lib/modules/$release/modules.dep.  That file
usually gets touched by installing extra kmod packages; then you need to
recreate the archive to include the new modules and their debuginfo.  (If
you update the archive without first installing the kmod-foo-debuginfo
package too, that module goes into the archive with no debuginfo.  If the
kmod has a build ID, the debuginfo can be found anyway, but otherwise it
won't be.)

libdwfl always uses the archive if it exists.  There is no way to make it
ignore the archive except to change debuginfo-path so it won't look in
/usr/lib/debug.  (It's too much of a nonportable kludge for the library
to look for the modules.dep timestamp.)  So if one ever uses
eu-make-debug-archive --kernel, then one has to be sure to run it before
using libdwfl if new kernel modules may have been installed for the
existing kernel version.

If systemtap wanted to be foolproof in this regard, it could always run
eu-make-debug-archive beforehand, or it could optimize the common case
with its own st_mtime comparison of the two expected files.
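
A rough sketch of that timestamp comparison, written in shell rather than
as the C st_mtime check systemtap would presumably use.  It assumes the
archive lives in the default install location and relies on the -nt test
operator, which bash and most other shells provide.  The function name and
the root-prefix argument are hypothetical.

```shell
# Hypothetical helper: succeed only if the installed debug.a exists and is
# newer than modules.dep for the given kernel release.
# The optional second argument is a root prefix, useful only for testing.
debug_archive_is_current() {
    release=${1:-$(uname -r)}
    root=${2:-}
    archive="$root/usr/lib/debug/lib/modules/$release/debug.a"
    moddep="$root/lib/modules/$release/modules.dep"
    [ -f "$archive" ] && [ "$archive" -nt "$moddep" ]
}
```

A caller could then rebuild only when needed, for example with
"debug_archive_is_current || eu-make-debug-archive --kernel --sudo".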

So, for 20x faster libdwfl startup, I'm glad I went off half-cocked after
mentioning the idea to Frank, without waiting for measurements of the
translator's time to see how significant libdwfl startup time is overall.
I still don't have any sense of what fraction of the cited "translator is
too slow" overhead this represents.  But I figure 2.6 seconds isn't nothing.

