How to debug broken unwinding?

Mark Wielaard
Thu Jun 15 17:45:00 GMT 2017

On Tue, 2017-06-13 at 18:17 +0200, Mark Wielaard wrote:
> The tricky case is that last case. If the previous frame is a signal
> frame then this frame is also an activation (so you don't need to
> subtract 1 to get the actual address that was executing). I am not sure
> I fully understand anymore why that is. This must be a frame that is
> itself not an initial or signal frame, but that is called (activated)
> from a signal frame.
> But assuming that is the correct semantics then your original
> observation seems to explain what is going on. The code doesn't account
> for the fact that during an unwind other modules can be reported that
> might make it able to unwind the current frame. And it caches the
> failure state of the unwind of the current frame, so it doesn't retry
> when asked.
> I think we could reset the state of the current frame if any more
> modules are reported. But is it possible for your parser or perf to
> report modules as soon as it knows about them, not lazily during an
> unwind? Is reporting lazily at the last possible moment an optimization
> that really helps? I would assume that you end up reporting all modules
> anyway. So it actually seems easier to report everything upfront.

The following should reset the state of the current thread frame if a
module is added during unwinding. I haven't replicated your issue yet to
verify. And adding modules outside of dwfl_report_begin/end during
unwind wasn't really intended to work. But since there is code out there
doing it anyway maybe we should try to support it. Does the following
help your case?

diff --git a/libdwfl/dwfl_module.c b/libdwfl/dwfl_module.c
index 510bd69..dc81538 100644
--- a/libdwfl/dwfl_module.c
+++ b/libdwfl/dwfl_module.c
@@ -156,6 +156,9 @@ use (Dwfl_Module *mod, Dwfl_Module **tailp, Dwfl *dwfl)
       dwfl->lookup_module = NULL;
+  /* Note that the module list changed.  */
+  dwfl->dwfl_module_state++;
   return mod;
diff --git a/libdwfl/frame_unwind.c b/libdwfl/frame_unwind.c
index 4dc9c43..9b2c2d4 100644
--- a/libdwfl/frame_unwind.c
+++ b/libdwfl/frame_unwind.c
@@ -520,6 +520,7 @@ new_unwound (Dwfl_Frame *state)
   unwound = malloc (sizeof (*unwound) + sizeof (*unwound->regs) * nregs);
   if (unlikely (unwound == NULL))
     return NULL;
+  thread->dwfl_module_state = process->dwfl->dwfl_module_state;
   state->unwound = unwound;
   unwound->thread = thread;
   unwound->unwound = NULL;
@@ -700,7 +701,22 @@ internal_function
 __libdwfl_frame_unwind (Dwfl_Frame *state)
   if (state->unwound)
-    return;
+    {
+      /* If it wasn't an error, then go for it.  */
+      if (state->unwound->pc_state != DWFL_FRAME_STATE_ERROR)
+	return;
+      /* If it was an error and the dwfl module list is the same, too bad.  */
+      uint64_t t_state, d_state;
+      t_state = state->thread->dwfl_module_state;
+      d_state = state->thread->process->dwfl->dwfl_module_state;
+      if (t_state == d_state)
+	return;
+      /* OK, lets try again. */
+      free (state->unwound);
+      state->unwound = NULL;
+    }
   /* Do not ask dwfl_frame_pc for ISACTIVATION, it would try to unwind STATE
      which would deadlock us.  */
   Dwarf_Addr pc;
diff --git a/libdwfl/libdwflP.h b/libdwfl/libdwflP.h
index 7d5f795..9d38373 100644
--- a/libdwfl/libdwflP.h
+++ b/libdwfl/libdwflP.h
@@ -131,6 +131,11 @@ struct Dwfl
   Dwfl_Module **lookup_module;	/* Module associated with segment, or null.  */
   int *lookup_segndx;		/* User segment index, or -1.  */
+  /* State (counter) of Dwfl module list.  Used to check by Dwfl_Thread
+     when updating Dwfl_Frame state if there was a previous error that
+     might be resolved when using an updated Dwfl module list.  */
+  uint64_t dwfl_module_state;
   /* Cache from last dwfl_report_segment call.  */
   const void *lookup_tail_ident;
   GElf_Off lookup_tail_vaddr;
@@ -244,6 +249,10 @@ struct Dwfl_Thread
      Later the processed frames get freed and this pointer is updated.  */
   Dwfl_Frame *unwound;
   void *callbacks_arg;
+  /* State of the Dwfl Modules last time an unwind wof the Dwfl_Frame
+     was attempted that resulted in an error.  Indicates it can be
+     retried if it is different from the state of the Dwfl.  */
+  uint64_t dwfl_module_state;
 /* See its typedef in libdwfl.h.  */

More information about the Elfutils-devel mailing list