How to associate Elf with Dwfl_Module returned by dwfl_report_module

Milian Wolff mail@milianw.de
Wed Mar 21 13:01:00 GMT 2018


On Dienstag, 20. März 2018 23:05:49 CET Mark Wielaard wrote:
> Hi Milian,

Hey Mark :)

> On Sat, Mar 17, 2018 at 02:14:48PM +0100, Milian Wolff wrote:
> > a recurring issue in the libdwfl integration of perf and perfparser are
> > supposedly overlapping modules. The perf data file contains the exact
> > mappings for all files corresponding to the actual mmap events that
> > occurred during runtime, e.g.:
> > 
> > ```
> > $ perf script --show-mmap-events | grep MMAP | grep stdc
> > heaptrack_print 13962 87163.483450: PERF_RECORD_MMAP2 13962/13962:
> > [0x7fd0aea84000(0x387000) @ 0 08:03 413039 3864781083]: r-xp
> > /usr/lib/libstdc+ +.so.6.0.24
> > heaptrack_print 13962 87163.483454: PERF_RECORD_MMAP2 13962/13962:
> > [0x7fd0aebfc000(0x1ff000) @ 0x178000 08:03 413039 3864781083]: ---p
> > /usr/lib/ libstdc++.so.6.0.24
> > heaptrack_print 13962 87163.483458: PERF_RECORD_MMAP2 13962/13962:
> > [0x7fd0aedfb000(0xd000) @ 0x177000 08:03 413039 3864781083]: rw-p
> > /usr/lib/
> > libstdc++.so.6.0.24
> > heaptrack_print 13962 87163.484860: PERF_RECORD_MMAP2 13962/13962:
> > [0x7fd0aedfb000(0xc000) @ 0x177000 08:03 413039 3864781083]: r--p
> > /usr/lib/
> > libstdc++.so.6.0.24
> > ```
> > So far, both perf and perfparser are using dwfl_report_elf to report the
> > file. But that API is deducing the mapping addresses internally, which
> > may or may not be what we saw at runtime. I suspect that this is the
> > reason for some issues we are seeing, such as supposedly overlapping
> > modules.
> 
> How exactly are you calling dwfl_report_elf?

Here's the code for the perf tools:

https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/
perf/util/unwind-libdw.c?h=perf/core#n52

Here's the code for the perfparser:

http://code.qt.io/cgit/qt-creator/perfparser.git/tree/app/
perfsymboltable.cpp#n479

Let's concentrate on perf for now, but perfparser has similar logic:

We parse the mmap events in the perf.data file and store that information. 
Note that the perf.data file does not contain events for munmap calls. Then 
while unwinding the callstack of a perf sample, we lookup the most recent mmap 
event for every given instruction pointer address, and ensure that the 
corresponding ELF was registered with libdw.

> Specifically are you using false for the add_p_vaddr argument?

Yes, we are.

> And could you provide some example where the reported address is
> wrong/different from the start address of the Dwfl_Module?

I don't think it's the start address that is wrong, rather it's the end 
address. But it's hard for me to come up with a small selfcontained example at 
this stage. I am regularly seeing broken backtraces for samples where I have 
the gut feeling that missing reported ELFs are to blame. But we report 
everything, except for scenarios where the mmap events seemingly overlap. This 
overlapping is, as far as I can see, actually a side effect of remapping 
taking place in the dynamic linker (i.e. a single dlopen/dynamic linked 
library can yield multiple mmap events). One way or another, we end up with a 
situation where we cannot report an ELF to dwfl due to two issues:

a) either ELF tells us we are overlapping some module and just stops which is 
bad, since we would actually much prefer the newly reported ELF to take 
precedence

b) we find an mmap event that with a non-zero pgoff, and have no clue how to 
call dwfl_report_elf and just give up.

In both cases, I was hopeing for dwfl_report_module to help since it seemingly 
allows me to exactly recreate the mapping that was traced originally.

> > Looking at the Dwfl API, I cannot figure out how to feed the mapping
> > directly. There's dwfl_report_module, but how would I associate an Elf*
> > and int fd with it, as done by dwfl_report_elf?
> 
> When using dwfl_report_module the find_elf callback will be called that
> you registered with dwfl_begin. That callback is called "lazily" the
> first time dwfl_module_getelf is called. The callback can then set the
> Elf*. But that does mean you have to keep track yourself (or immediately
> call dwfl_module_getelf).

Ah, thanks!

> I would like to understand better what is really going wrong with
> dwfl_report_elf before diving into using dwfl_module_report.

See above, I would very much value your input. I'm still far from having fully 
grasped this situation.

Thanks

-- 
Milian Wolff
mail@milianw.de
http://milianw.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part.
URL: <http://sourceware.org/pipermail/elfutils-devel/attachments/20180321/45235595/attachment.sig>


More information about the Elfutils-devel mailing list