Can dwarf_getscopes{,_die} performance be improved?
Mark Wielaard
mark@klomp.org
Thu Jun 25 22:38:45 GMT 2020
Hi Milian,
On Mon, 2020-06-22 at 10:29 +0200, Milian Wolff wrote:
> On Monday, 15 June 2020 18:54:41 CEST Josh Stone wrote:
> > On 6/13/20 10:40 AM, Milian Wolff wrote:
> > > Has anyone an idea on how to post-process the DWARF data to optimize the
> > > lookup of inlined frames?
> >
> > SystemTap implements its own cache for repeated lookups -- see
> > dwflpp::get_die_parents().
>
> Thanks, I've come up with something similar over the weekend before reading
> your mail. The performance boost is huge (5x and more).
>
> Looking at your code, I think that I'm not yet handling a few corner cases
> (such as imported units). That, paired with the fact that at least three users
> of this API have apparently by now come up with a similar solution clearly
> makes a case for upstreaming this into a common API.
Yes, I think having an elfutils/libdw API for this would be very
useful. And it would also be useful for eu-addr2line and eu-stack when
looking up inlined functions.
The imported (partial) units are a little tricky because they cross CUs
(and sometimes even Dwarfs, for example when dealing with
dwz/multi-files). It also means that a DIE inside a partial unit can
have multiple parents, because the unit might have been imported through
different imports. But I guess that if we associate a parent cache with
one CU, then this is clean (unless the CU imports the same partial unit
multiple times...).
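To make the "one parent cache per CU" idea concrete, here is a minimal
sketch on a toy model. It does not use libdw at all: DIEs are just
section offsets, and every name in it (struct parent_edge,
toy_parent_lookup, toy_edges) is made up for illustration. It also
assumes one possible convention, namely that the top-level DIE of a
partial unit gets the importing DW_TAG_imported_unit DIE as its parent.
The point is only that the same DIE offset resolves to a different
parent depending on which CU's cache you ask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, not libdw: a DIE is identified by its section offset, and
   an edge records which DIE is its parent as seen from one CU.  */
struct parent_edge
{
  uint64_t cu_off;      /* CU whose cache this edge belongs to.  */
  uint64_t die_off;     /* The child DIE.  */
  uint64_t parent_off;  /* Its parent when reached from that CU.  */
};

/* Look up the parent of DIE_OFF as seen from the CU at CU_OFF.
   Returns 0 and stores the parent offset on success, -1 if unknown.  */
static int
toy_parent_lookup (const struct parent_edge *edges, size_t nedges,
                   uint64_t cu_off, uint64_t die_off,
                   uint64_t *parent_off)
{
  assert (edges != NULL || nedges == 0);
  for (size_t i = 0; i < nedges; i++)
    if (edges[i].cu_off == cu_off && edges[i].die_off == die_off)
      {
        *parent_off = edges[i].parent_off;
        return 0;
      }
  return -1;
}

/* The partial unit DIE at 0x300 is imported by two CUs, so it has a
   different parent in each CU's cache.  All offsets are invented.  */
static const struct parent_edge toy_edges[] = {
  { 0x0b,  0x40,  0x0b },   /* CU 0x0b: import DIE 0x40 under the CU DIE.  */
  { 0x0b,  0x300, 0x40 },   /* CU 0x0b: partial unit reached via 0x40.  */
  { 0x10b, 0x150, 0x10b },  /* CU 0x10b: import DIE 0x150 under the CU DIE.  */
  { 0x10b, 0x300, 0x150 },  /* CU 0x10b: same partial unit, other parent.  */
};
```

A linear scan is used only to keep the sketch short; a real cache would
want a hash table or sorted array keyed the same way. The case where one
CU imports the same partial unit twice is not handled here either, since
the (CU, DIE) key is then no longer unique.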
> I believe that there is a lot of data that potentially needs to be cached.
> Additionally, doing it behind the scenes may raise questions regarding
> multi-threaded usage of the API (see my other mail relating to that).
>
> Which means: an explicit API to opt-in to this behavior is safest and best I
> believe. Maybe something as simple as the following would be sufficient?
>
> ```
> /* Cache parent DIEs chain to speed up repeated dwarf_getscopes calls.
>
> Returns -1 for errors or 0 if the parent chain was cached already. */
> extern int dwarf_cache_parent_dies(Dwarf_Die *cudie);
> ```
>
> Alternatively a function that returns the cache could be considered, which
> would then require new versions of dwarf_getscopes* that take the cache as an
> argument.
I think an API that makes the "caching" explicit might be best. Maybe
we can call it a DieTree? We would then have a function to create a
DieTree and (new) functions that take a DieTree (and a Dwarf_Die) to
operate on it. The user can then also destroy the DieTree again when
done.
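Roughly the shape I have in mind, as a sketch only. Nothing like this
exists in libdw today, and the names (ToyDieTree, toy_dietree_create,
toy_dietree_scopes, toy_dietree_destroy) are hypothetical. To keep it
self-contained it works on a toy input where DIEs are array indexes and
parent[i] gives the parent of DIE i (-1 for the CU DIE), instead of on
real Dwarf_Die structures. Like dwarf_getscopes, the lookup returns the
innermost scope first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical explicit cache object, owned by the caller.  */
typedef struct
{
  int *parent;   /* parent[i] is the index of DIE i's parent, or -1.  */
  size_t ndies;
} ToyDieTree;

/* Build the tree once from the toy parent array.  Returns NULL on
   allocation failure or if the input contains an invalid index.  */
static ToyDieTree *
toy_dietree_create (const int *parent, size_t ndies)
{
  if (parent == NULL || ndies == 0)
    return NULL;
  ToyDieTree *tree = malloc (sizeof *tree);
  if (tree == NULL)
    return NULL;
  tree->parent = malloc (ndies * sizeof *tree->parent);
  if (tree->parent == NULL)
    {
      free (tree);
      return NULL;
    }
  for (size_t i = 0; i < ndies; i++)
    {
      if (parent[i] < -1 || (parent[i] >= 0 && (size_t) parent[i] >= ndies))
        {
          free (tree->parent);
          free (tree);
          return NULL;
        }
      tree->parent[i] = parent[i];
    }
  tree->ndies = ndies;
  return tree;
}

/* Store the scope chain of DIE in SCOPES (innermost first, root last).
   SCOPES has room for MAX entries.  Returns the chain length, or -1 if
   DIE is out of range or the chain does not fit (which also stops a
   cyclic input from looping forever).  */
static int
toy_dietree_scopes (const ToyDieTree *tree, int die, int *scopes,
                    size_t max)
{
  assert (tree != NULL && scopes != NULL);
  if (die < 0 || (size_t) die >= tree->ndies)
    return -1;
  size_t n = 0;
  for (int cur = die; cur != -1; cur = tree->parent[cur])
    {
      if (n == max)
        return -1;
      scopes[n++] = cur;
    }
  return (int) n;
}

/* Release the tree.  Passing NULL is allowed.  */
static void
toy_dietree_destroy (ToyDieTree *tree)
{
  if (tree != NULL)
    {
      free (tree->parent);
      free (tree);
    }
}
```

Since the user creates and destroys the tree explicitly, they also
decide how long it lives and which thread touches it, so the library
does not need any hidden state for this.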
Cheers,
Mark