Can dwarf_getscopes{,_die} performance be improved?

Mark Wielaard mark@klomp.org
Thu Jun 25 22:38:45 GMT 2020


Hi Milian,

On Mon, 2020-06-22 at 10:29 +0200, Milian Wolff wrote:
> On Monday, 15 June 2020 18:54:41 CEST Josh Stone wrote:
> > On 6/13/20 10:40 AM, Milian Wolff wrote:
> > > Has anyone an idea on how to post-process the DWARF data to optimize
> > > the lookup of inlined frames?
> > 
> > SystemTap implements its own cache for repeated lookups -- see
> > dwflpp::get_die_parents().
> 
> Thanks, I've come up with something similar over the weekend before reading 
> your mail. The performance boost is huge (5x and more).
> 
> Looking at your code, I think that I'm not yet handling a few corner cases 
> (such as imported units). That, paired with the fact that at least three users 
> of this API have apparently by now come up with a similar solution clearly 
> makes a case for upstreaming this into a common API.

Yes, I think having an elfutils/libdw API for this would be very
useful. And it would also be useful for eu-addr2line and eu-stack when
looking up inlined functions.

The imported (partial) units are a little tricky because they cross CUs
(and sometimes even Dwarfs, for example when dealing with
dwz/multi-files). It also means that a Die inside the partial unit can
have multiple parents, because it might have been imported through
different imports. But I guess that if we associate a parent cache with
one CU, then this is clean (unless the CU imports the same partial unit
multiple times...).
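
As a rough illustration of why keying the cache on the importing CU
helps, here is a toy model (not elfutils code). The names parent_edge,
cached_parent and NO_PARENT are made up for this example, offsets are
plain numbers, and the model simply assumes the parent of an imported
DIE is whatever the importing CU recorded for it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache record: inside the CU at offset `cu`, the DIE at
   offset `die` has the DIE at offset `parent` as its parent.  */
struct parent_edge
{
  size_t cu;
  size_t die;
  size_t parent;
};

#define NO_PARENT ((size_t) -1)

/* Look up the parent of `die` as seen from the importing CU `cu`.
   Linear search for brevity; a real cache would sort or hash.
   Returns NO_PARENT when that CU has no record for the DIE.  */
size_t
cached_parent (const struct parent_edge *edges, size_t n,
               size_t cu, size_t die)
{
  for (size_t i = 0; i < n; i++)
    if (edges[i].cu == cu && edges[i].die == die)
      return edges[i].parent;
  return NO_PARENT;
}
```

With one record per (CU, DIE) pair, the same partial-unit DIE can have
a different parent in each importing CU without ambiguity. A CU that
imports the same partial unit twice would still produce two records
with the same key, which is the case left open above.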

> I believe that there is a lot of data that potentially needs to be cached. 
> Additionally, doing it behind the scenes may raise questions regarding multi 
> threaded usage of the API (see my other mail relating to that).
> 
> Which means: an explicit API to opt-in to this behavior is safest and best I 
> believe. Maybe something as simple as the following would be sufficient?
> 
> ```
> /* Cache the parent DIE chain to speed up repeated dwarf_getscopes calls.
> 
>    Returns -1 for errors or 0 if the parent chain was cached already. */
> extern int dwarf_cache_parent_dies(Dwarf_Die *cudie);
> ```
> 
> Alternatively a function that returns the cache could be considered, which 
> would then require new versions of dwarf_getscopes* that take the cache as an 
> argument.

I think an API that makes the "caching" explicit might be best. Maybe
we can call it a DieTree? We would then have a function to create a
DieTree and (new) functions that take a DieTree (and a Dwarf_Die) to
operate on it. The user can then also destroy the DieTree again when
done.
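
To make the idea a bit more concrete, here is a minimal sketch of what
such an interface could look like. None of these names exist in libdw:
DieTree, dietree_create, dietree_parent, dietree_scopes and
dietree_destroy are hypothetical, and struct toy_die is only a stand-in
for Dwarf_Die so the example is self-contained. A real version would
presumably walk the CU with dwarf_child and dwarf_siblingof and key the
table on dwarf_dieoffset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a DIE: an offset plus first-child/next-sibling links.  */
struct toy_die
{
  size_t offset;
  struct toy_die *child;
  struct toy_die *sibling;
};

/* Hypothetical DieTree: (die, parent) offset pairs sorted by die offset.  */
struct dietree_entry { size_t die; size_t parent; };
typedef struct { struct dietree_entry *entries; size_t count, cap; } DieTree;

#define DIETREE_NO_PARENT ((size_t) -1)

static int
dietree_add (DieTree *t, size_t die, size_t parent)
{
  if (t->count == t->cap)
    {
      size_t ncap = t->cap ? t->cap * 2 : 16;
      struct dietree_entry *n = realloc (t->entries, ncap * sizeof *n);
      if (n == NULL)
        return -1;
      t->entries = n;
      t->cap = ncap;
    }
  t->entries[t->count].die = die;
  t->entries[t->count].parent = parent;
  t->count++;
  return 0;
}

/* Record `die` and all its following siblings, then recurse into children.  */
static int
dietree_walk (DieTree *t, const struct toy_die *die, size_t parent)
{
  for (; die != NULL; die = die->sibling)
    if (dietree_add (t, die->offset, parent) != 0
        || dietree_walk (t, die->child, die->offset) != 0)
      return -1;
  return 0;
}

static int
dietree_cmp (const void *a, const void *b)
{
  const struct dietree_entry *x = a, *y = b;
  return (x->die > y->die) - (x->die < y->die);
}

/* Walk the whole CU once and remember every DIE's parent.  */
DieTree *
dietree_create (const struct toy_die *cudie)
{
  DieTree *t = cudie ? calloc (1, sizeof *t) : NULL;
  if (t == NULL)
    return NULL;
  if (dietree_add (t, cudie->offset, DIETREE_NO_PARENT) != 0
      || dietree_walk (t, cudie->child, cudie->offset) != 0)
    {
      free (t->entries);
      free (t);
      return NULL;
    }
  qsort (t->entries, t->count, sizeof *t->entries, dietree_cmp);
  return t;
}

static const struct dietree_entry *
dietree_find (const DieTree *t, size_t die)
{
  struct dietree_entry key = { die, 0 };
  return bsearch (&key, t->entries, t->count, sizeof key, dietree_cmp);
}

/* Parent offset of `die`, or DIETREE_NO_PARENT for the CU DIE or an
   unknown offset.  */
size_t
dietree_parent (const DieTree *t, size_t die)
{
  const struct dietree_entry *e = dietree_find (t, die);
  return e ? e->parent : DIETREE_NO_PARENT;
}

/* Fill `scopes` innermost-first with `die` and its ancestors, at most
   `max` entries.  Returns the number stored, or -1 if `die` is unknown.  */
int
dietree_scopes (const DieTree *t, size_t die, size_t *scopes, int max)
{
  if (dietree_find (t, die) == NULL)
    return -1;
  int n = 0;
  for (size_t cur = die; cur != DIETREE_NO_PARENT && n < max;
       cur = dietree_parent (t, cur))
    scopes[n++] = cur;
  return n;
}

void
dietree_destroy (DieTree *t)
{
  if (t != NULL)
    free (t->entries);
  free (t);
}
```

The walk over the CU happens once in dietree_create; after that each
scope lookup is a binary search per ancestor instead of a fresh
traversal, and the user decides when the memory is released.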

Cheers,

Mark

