This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] libdw: Make dwarf_getfuncs find all (defining) DW_TAG_subprogram DIEs.


On 09/20/2013 06:55 AM, Mark Wielaard wrote:
> Hi Josh,
> 
> On Thu, 2013-09-19 at 18:42 -0700, Josh Stone wrote:
>> With systemtap-debuginfo-2.3-1.fc19.x86_64, my patched stap.git found
>> 75713 probes in 'process("/usr/bin/stap").function("*")' (as counted by
>> the pass-2 -v line).  Unpatched stap.git with your patched elfutils.git
>> found only 61056 probes.  :(
> 
> The attached patch fixes the partial/imported unit issue you found.
> Which actually made the implementation simpler. I also added some
> comments to the libdw.h dwarf_getfuncs documentation.
> 
> But it still doesn't seem to find as many probes as you do:
> Pass 2: analyzed script: 72231 probe(s) [...]
> 
> Of course this does much more than just finding the defining
> subprograms. So we might not be comparing the exact same thing. Could
> you post you systemtap patch somewhere? I am just using the packaged
> systemtap-2.3-1.fc19.x86_64 on itself with LD_LIBRARY_PATH pointing to a
> elfutils git build with the below patch.

I think the apparent difference in your new patch is some oddity in
stap's git master.  With your new patch for elfutils.git w/ stap-2.3, I
also get 72231, and w/ stap.git I get 75713.

The additions in stap.git look fishy.  There are a lot with pc=0, which
shouldn't happen, like:
probe abort@@GLIBC_2.2.5@:-1 process=/usr/bin/stap reloc=.dynamic pc=0x0

Then some have no file:line info, but these might be fine, like:
probe _fini@:-1 process=/usr/bin/stap reloc=.dynamic pc=0x1edf14

But anyway, it doesn't appear to be an elfutils issue.


As requested, I attached my systemtap patch to bypass dwarf_getfuncs and
do our own full walk.  It sometimes makes stap faster overall, but at
the cost of more memory (having preemptively cached more things).  It
finds the exact same functions as your new dwarf_getfuncs.

Here are some (non-scientific) comparisons of time and memory use, with:
  "old-eu" = systemtap.git and elfutils.git
  "new-eu" = systemtap.git and elfutils.git with your patch
  "mine" = systemtap.git with my patch and plain elfutils.git

> stap -vl 'process("/usr/bin/stap").function("*")'
> old-eu: 34613 probe(s), 293796virt/94812res/12848shr/84688data kb, in 580usr/20sys/601real ms.
> new-eu: 75713 probe(s), 405508virt/175636res/13120shr/196372data kb, in 1610usr/40sys/1649real ms.
> mine: 75713 probe(s), 426504virt/196700res/13136shr/217368data kb, in 1530usr/40sys/1570real ms.

> stap -vl 'process("/usr/bin/stap").function("main")'
> old-eu: 1 probe(s), 243908virt/45228res/10920shr/34800data kb, in 50usr/10sys/49real ms.
> new-eu: 1 probe(s), 244344virt/47144res/12328shr/35208data kb, in 220usr/0sys/227real ms.
> mine: 1 probe(s), 270636virt/73208res/12336shr/61500data kb, in 240usr/10sys/249real ms.

So my patch saves a little time on the wildcard, loses in the specific
case, and uses a big chunk more memory either way.  Of course, "old-eu"
doesn't find nearly as many probes for the wildcard, so it just shows
the cost of correctness here.

How about looking at the kernel?
(kernel-debuginfo-3.10.11-200.fc19.x86_64)

> stap -vl 'kernel.function("*")'
> old-eu: 97588 probe(s), 556896virt/326000res/105412shr/223176data kb, in 3040usr/80sys/3128real ms.
> new-eu: 97588 probe(s), 552236virt/321392res/105436shr/218488data kb, in 3490usr/90sys/3583real ms.
> mine: 97588 probe(s), 619132virt/388240res/105448shr/285384data kb, in 3300usr/100sys/3402real ms.

> stap -vl syscall.open
> old-eu: 2 probe(s), 398600virt/162556res/100152shr/62776data kb, in 450usr/20sys/470real ms.
> new-eu: 2 probe(s), 394272virt/158604res/100544shr/58416data kb, in 960usr/20sys/980real ms.
> mine: 2 probe(s), 476192virt/240440res/100544shr/140336data kb, in 1140usr/20sys/1167real ms.

Again, my patch has a slight time advantage on the wildcard, but loses
that on more specific probes, and uses much more memory.  But this also
shows that the new dwarf_getfuncs slowed down about .5s either way, with
no benefit.  Although, if we ever find a way to get the kernel using
partial units, we will need this anyway.

I'm not really inclined to commit my systemtap patch at this point - it
uses too much memory for inconsistent time gain/loss.  Its only clear
advantage is that we don't have to wait for new elfutils. :P
diff --git a/dwflpp.cxx b/dwflpp.cxx
index 176b602..ccb6f2f 100644
--- a/dwflpp.cxx
+++ b/dwflpp.cxx
@@ -510,55 +510,6 @@ dwflpp::func_is_exported()
 }
 
 void
-dwflpp::cache_inline_instances (Dwarf_Die* die)
-{
-  // If this is an inline instance, link it back to its origin
-  Dwarf_Die origin;
-  if (dwarf_tag(die) == DW_TAG_inlined_subroutine &&
-      dwarf_attr_die(die, DW_AT_abstract_origin, &origin))
-    {
-      vector<Dwarf_Die>*& v = cu_inl_function_cache[origin.addr];
-      if (!v)
-        v = new vector<Dwarf_Die>;
-      v->push_back(*die);
-    }
-
-  // Recurse through other scopes that may contain inlines
-  Dwarf_Die child, import;
-  if (dwarf_child(die, &child) == 0)
-    do
-      {
-        switch (dwarf_tag (&child))
-          {
-          // tags that could contain inlines
-          case DW_TAG_compile_unit:
-          case DW_TAG_module:
-          case DW_TAG_lexical_block:
-          case DW_TAG_with_stmt:
-          case DW_TAG_catch_block:
-          case DW_TAG_try_block:
-          case DW_TAG_entry_point:
-          case DW_TAG_inlined_subroutine:
-          case DW_TAG_subprogram:
-            cache_inline_instances(&child);
-            break;
-
-          // imported dies should be followed
-          case DW_TAG_imported_unit:
-            if (dwarf_attr_die(&child, DW_AT_import, &import))
-              cache_inline_instances(&import);
-            break;
-
-          // nothing to do for other tags
-          default:
-            break;
-          }
-      }
-    while (dwarf_siblingof(&child, &child) == 0);
-}
-
-
-void
 dwflpp::iterate_over_inline_instances (int (* callback)(Dwarf_Die * die, void * arg),
                                        void * data)
 {
@@ -566,7 +517,7 @@ dwflpp::iterate_over_inline_instances (int (* callback)(Dwarf_Die * die, void *
   assert (func_is_inline ());
 
   if (cu_inl_function_cache_done.insert(cu->addr).second)
-    cache_inline_instances(cu);
+    build_cu_caches(cu);
 
   vector<Dwarf_Die>* v = cu_inl_function_cache[function->addr];
   if (!v)
@@ -583,16 +534,39 @@ dwflpp::iterate_over_inline_instances (int (* callback)(Dwarf_Die * die, void *
 
 
 void
-dwflpp::cache_die_parents(cu_die_parent_cache_t* parents, Dwarf_Die* die)
+dwflpp::cache_dies(cu_die_parent_cache_t* parents,
+                   cu_function_cache_t* functions,
+                   Dwarf_Die* die)
 {
   // Record and recurse through DIEs we care about
-  Dwarf_Die child, import;
+  const char* name;
+  Dwarf_Die child, import, origin;
   if (dwarf_child(die, &child) == 0)
     do
       {
+        bool record_child = false, recurse = false;
+
         switch (dwarf_tag (&child))
           {
-          // normal tags to recurse
+          case DW_TAG_subprogram:
+            record_child = recurse = true;
+            if (!dwarf_hasattr(&child, DW_AT_declaration)
+                && (name = dwarf_diename(&child)))
+              functions->insert(make_pair(string(name), child));
+            break;
+
+          case DW_TAG_inlined_subroutine:
+            record_child = recurse = true;
+            if (dwarf_attr_die(&child, DW_AT_abstract_origin, &origin))
+              {
+                vector<Dwarf_Die>*& v = cu_inl_function_cache[origin.addr];
+                if (!v)
+                  v = new vector<Dwarf_Die>;
+                v->push_back(child);
+              }
+            break;
+
+          // normal tags to record and recurse
           case DW_TAG_compile_unit:
           case DW_TAG_module:
           case DW_TAG_lexical_block:
@@ -600,18 +574,15 @@ dwflpp::cache_die_parents(cu_die_parent_cache_t* parents, Dwarf_Die* die)
           case DW_TAG_catch_block:
           case DW_TAG_try_block:
           case DW_TAG_entry_point:
-          case DW_TAG_inlined_subroutine:
-          case DW_TAG_subprogram:
           case DW_TAG_namespace:
           case DW_TAG_class_type:
           case DW_TAG_structure_type:
-            parents->insert(make_pair(child.addr, *die));
-            cache_die_parents(parents, &child);
+            record_child = recurse = true;
             break;
 
           // record only, nothing to recurse
           case DW_TAG_label:
-            parents->insert(make_pair(child.addr, *die));
+            record_child = true;
             break;
 
           // imported dies should be followed
@@ -619,7 +590,7 @@ dwflpp::cache_die_parents(cu_die_parent_cache_t* parents, Dwarf_Die* die)
             if (dwarf_attr_die(&child, DW_AT_import, &import))
               {
                 parents->insert(make_pair(import.addr, *die));
-                cache_die_parents(parents, &import);
+                cache_dies(parents, functions, &import);
               }
             break;
 
@@ -627,25 +598,50 @@ dwflpp::cache_die_parents(cu_die_parent_cache_t* parents, Dwarf_Die* die)
           default:
             break;
           }
+
+        if (record_child)
+          parents->insert(make_pair(child.addr, *die));
+        if (recurse)
+          cache_dies(parents, functions, &child);
       }
     while (dwarf_siblingof(&child, &child) == 0);
 }
 
 
-cu_die_parent_cache_t*
-dwflpp::get_die_parents()
+void
+dwflpp::build_cu_caches(Dwarf_Die* cu)
 {
-  assert (cu);
-
   cu_die_parent_cache_t *& parents = cu_die_parent_cache[cu->addr];
   if (!parents)
     {
+      // [invariant] functions and parents are always built together for a CU
+      cu_function_cache_t *& functions = cu_function_cache[cu->addr];
+      assert(!functions);
+
       parents = new cu_die_parent_cache_t;
-      cache_die_parents(parents, cu);
+      functions = new cu_function_cache_t;
+      cache_dies(parents, functions, cu);
       if (sess.verbose > 4)
-        clog << _F("die parent cache %s:%s size %zu", module_name.c_str(),
-                   cu_name().c_str(), parents->size()) << endl;
+        {
+          const char* cu_name = dwarf_diename(cu);
+          clog << _F("die parent cache %s:%s size %zu", module_name.c_str(),
+                     cu_name, parents->size()) << endl;
+          clog << _F("function cache %s:%s size %zu", module_name.c_str(),
+                     cu_name, functions->size()) << endl;
+        }
+      mod_info->update_symtab(functions);
     }
+}
+
+
+cu_die_parent_cache_t*
+dwflpp::get_die_parents()
+{
+  assert (cu);
+
+  cu_die_parent_cache_t *& parents = cu_die_parent_cache[cu->addr];
+  if (!parents)
+    build_cu_caches(cu);
   return parents;
 }
 
@@ -931,24 +927,13 @@ dwflpp::declaration_resolve(Dwarf_Die *type)
 }
 
 
-int
-dwflpp::cu_function_caching_callback (Dwarf_Die* func, void *arg)
+cu_function_cache_t*
+dwflpp::get_cu_functions(Dwarf_Die* cu)
 {
-  cu_function_cache_t* v = static_cast<cu_function_cache_t*>(arg);
-  const char *name = dwarf_diename(func);
-  if (!name)
-    return DWARF_CB_OK;
-
-  v->insert(make_pair(string(name), *func));
-  return DWARF_CB_OK;
-}
-
-
-int
-dwflpp::mod_function_caching_callback (Dwarf_Die* cu, void *arg)
-{
-  dwarf_getfuncs (cu, cu_function_caching_callback, arg, 0);
-  return DWARF_CB_OK;
+  cu_function_cache_t *& functions = cu_function_cache[cu->addr];
+  if (!functions)
+    build_cu_caches(cu);
+  return functions;
 }
 
 
@@ -960,17 +945,7 @@ dwflpp::iterate_over_functions (int (* callback)(Dwarf_Die * func, base_query *
   assert (module);
   assert (cu);
 
-  cu_function_cache_t *v = cu_function_cache[cu->addr];
-  if (v == 0)
-    {
-      v = new cu_function_cache_t;
-      cu_function_cache[cu->addr] = v;
-      dwarf_getfuncs (cu, cu_function_caching_callback, v, 0);
-      if (sess.verbose > 4)
-        clog << _F("function cache %s:%s size %zu", module_name.c_str(),
-                   cu_name().c_str(), v->size()) << endl;
-      mod_info->update_symtab(v);
-    }
+  cu_function_cache_t *v = get_cu_functions(cu);
 
   cu_function_cache_t::iterator it;
   cu_function_cache_range_t range = v->equal_range(function);
@@ -1037,6 +1012,17 @@ dwflpp::iterate_over_functions (int (* callback)(Dwarf_Die * func, base_query *
 
 
 int
+dwflpp::mod_function_caching_callback (Dwarf_Die* cu, void *arg)
+{
+  dwflpp* o = static_cast<dwflpp*>(arg);
+  cu_function_cache_t * functions = o->get_cu_functions(cu);
+  cu_function_cache_t * mod_functions = o->mod_function_cache[o->module_dwarf];
+  mod_functions->insert(functions->begin(), functions->end());
+  return DWARF_CB_OK;
+}
+
+
+int
 dwflpp::iterate_single_function (int (* callback)(Dwarf_Die * func, base_query * q),
                                  base_query * q, const string& function)
 {
@@ -1047,12 +1033,11 @@ dwflpp::iterate_single_function (int (* callback)(Dwarf_Die * func, base_query *
   if (!module_dwarf)
     return rc;
 
-  cu_function_cache_t *v = mod_function_cache[module_dwarf];
+  cu_function_cache_t *& v = mod_function_cache[module_dwarf];
   if (v == 0)
     {
       v = new cu_function_cache_t;
-      mod_function_cache[module_dwarf] = v;
-      iterate_over_cus (mod_function_caching_callback, v, false);
+      iterate_over_cus (mod_function_caching_callback, this, false);
       if (sess.verbose > 4)
         clog << _F("module function cache %s size %zu", module_name.c_str(),
                    v->size()) << endl;
diff --git a/dwflpp.h b/dwflpp.h
index ba5e9bd..908286b 100644
--- a/dwflpp.h
+++ b/dwflpp.h
@@ -345,12 +345,15 @@ private:
 
   std::set<void*> cu_inl_function_cache_done; // CUs that are already cached
   cu_inl_function_cache_t cu_inl_function_cache;
-  void cache_inline_instances (Dwarf_Die* die);
 
   mod_cu_die_parent_cache_t cu_die_parent_cache;
-  void cache_die_parents(cu_die_parent_cache_t* parents, Dwarf_Die* die);
   cu_die_parent_cache_t *get_die_parents();
 
+  void cache_dies(cu_die_parent_cache_t* parents,
+                  cu_function_cache_t* functions,
+                  Dwarf_Die* die);
+  void build_cu_caches(Dwarf_Die* cu);
+
   Dwarf_Die* get_parent_scope(Dwarf_Die* die);
 
   /* The global alias cache is used to resolve any DIE found in a
@@ -374,8 +377,8 @@ private:
                                                   const std::string&, void *),
                                  void * data);
 
+  cu_function_cache_t* get_cu_functions(Dwarf_Die* cu);
   static int mod_function_caching_callback (Dwarf_Die* func, void *arg);
-  static int cu_function_caching_callback (Dwarf_Die* func, void *arg);
 
   bool has_single_line_record (dwarf_query * q, char const * srcfile, int lineno);
 

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]