This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch v7 3/5] x86* unwinder: libdwfl/


On Sat, 19 Oct 2013 03:05:23 +0200, Mark Wielaard wrote:
> Hi Jan,
> 
> Sorry, still skipped some parts. Still need to go through
> dwfl_frame_core.c and dwfl_frame_pid.c. But I did review all the rest.
> 
> Again mostly nitpicks. The only real issue I would like you to
> think/comment on is around the dwfl_thread_next comments.

Maybe you want threads to be iterated through callbacks?

Maybe we should make a separate mail thread for yet another API update.


> Everything
> else looks fine. I did have some comments here and there, but nothing I
> think must be changed, just would be good to see if my comments make
> sense, even if they are nitpicks that aren't worth it to change right
> now. Except for the dwfl_thread_next thing I believe the public
> interface is abstract/good enough to make changes to implementation
> details later if we wish.
> 
> I'll comment on dwfl_frame_core.c and dwfl_frame_pid.c later, but a
> quick look didn't immediately show anything obviously bad there either.
> 
> On Sun, 2013-10-13 at 12:24 +0200, Jan Kratochvil wrote:
> > On Fri, 11 Oct 2013 22:47:14 +0200, Jan Kratochvil wrote:
> > libdw/
> > 2013-06-23  Jan Kratochvil  <jan.kratochvil@redhat.com>
> > 
> > 	* cfi.h (struct Dwarf_Frame_s): Make the comment more specific.
> > 	* libdw.map (ELFUTILS_0.156): Add dwfl_attach_state, dwfl_pid,
> > 	dwfl_thread_dwfl, dwfl_thread_tid, dwfl_frame_thread,
> > 	dwfl_thread_state_registers, dwfl_thread_state_register_pc,
> > 	dwfl_next_thread, dwfl_thread_getframes and dwfl_frame_pc.
> 
> This part looks fine.
> 
> > libdwfl/
> > 2013-10-13  Jan Kratochvil  <jan.kratochvil@redhat.com>
> > 	    Mark Wielaard  <mjw@redhat.com>
> 
> That is a bit too much credit :) You wrote 99.9% of this.
> I only nitpick.

I guess I can+should also your Signed-off-by line there.


> > 	* Makefile.am (AM_CPPFLAGS): Add ../libasm.
> 
> Why?

Because we need libeblP.h to check 'ebl->class == ELFCLASS32' as AFAIK there
is no public ebl API to fetch 'ebl->class'.

We need to know ELFCLASS32 so that we can do:
dwfl_frame_reg_set:
  if (ebl->class == ELFCLASS32)
    val &= 0xffffffff;

At least I believe if DWARF unwinds some register it should be cut to its real
hardware width.

There are various ways how to solve it but none of them find as a clear win:
 * As is.
 * New backend ebl method like normalize_reg() which cuts b32..b63 on i386.
   For example s390 backend later will have normalize_pc() to cut b31.
 * New ebl_get_class().


> > +++ b/libdwfl/dwfl_frame.c
> >
> > +#ifndef MIN
> > +# define MIN(a, b) ((a) < (b) ? (a) : (b))
> > +#endif
> 
> Not used in this file.

Deleted.


> > +/* Set STATE->pc_set from STATE->regs according to the backend.  Return true on
> > +   success, false on error.  */
> > +static bool
> > +state_fetch_pc (Dwfl_Frame *state)
> > +{
> > +  switch (state->pc_state)
> > +    {
> > +    case DWFL_FRAME_STATE_PC_SET:
> > +      return true;
> > +    case DWFL_FRAME_STATE_PC_UNDEFINED:
> > +      abort ();
> > +    case DWFL_FRAME_STATE_ERROR:;
> 
> trailing ";"

dwfl_frame.c:45:7: error: a label can only be part of a statement and a declaration is not a statement

But I guess a { block } is more appropriate there anyway so I have changed it.


> > +      Ebl *ebl = state->thread->process->ebl;
> > +      Dwarf_CIE abi_info;
> > +      if (ebl_abi_cfi (ebl, &abi_info) != 0)
> > +	{
> > +	  __libdwfl_seterrno (DWFL_E_LIBEBL);
> > +	  return false;
> > +	}
> > +      unsigned ra = abi_info.return_address_register;
> > +      /* dwarf_frame_state_reg_is_set is not applied here.  */
> > +      if (ra >= ebl_frame_nregs (ebl))
> > +	{
> > +	  __libdwfl_seterrno (DWFL_E_LIBEBL_BAD);
> > +	  return false;
> > +	}
> > +      state->pc = state->regs[ra];
> > +      state->pc_state = DWFL_FRAME_STATE_PC_SET;
> > +      return true;
> > +    }
> > +  abort ();
> > +}
> 
> I find the DWFL_FRAME_STATE_ERROR state somewhat confusing.
> Would it make sense to rename that to DWFL_FRAME_STATE_UNSET?

I find *_UNSET a bit confusing with existing DWFL_FRAME_STATE_PC_UNDEFINED.

It could be called for example *_UNINITIALIZED.


> Or have both an UNSET and an ERROR state?
> 
> In which cases do frames exist that have an unset/error state?

First I would like to clarify all these DWFL_FRAME_STATE_* enum values are
only internal to libdwfl/ , they are also only in libdwflP.h.

I have unified *_UNSET and *_ERROR as the code is more simple that way, I can
just initialize it to *_ERROR and if I see an error I just return.  Only if
the code sets it to some other state in the end the processing has been
successful.

After this explanation if you still think *_UNINITIALIZED and *_ERROR should be
separate then I sure can do so.


> > +/* Do not call it on your own, to be used by thread_* functions only.  */
> > +
> > +static void
> > +state_free (Dwfl_Frame *state)
> > +{
> > +  Dwfl_Thread *thread = state->thread;
> > +  assert (thread->unwound == state);
> > +  thread->unwound = state->unwound;
> > +  free (state);
> > +}
> 
> This is always used as:
> 
>   while (thread->unwound)
>     state_free (thread->unwound);
> 
> So just fold that loop into the function and call it thread_state_free
> or something.

I wanted to make each object type self-contained, this is destructor of
Dwfl_Frame.  Mixing it with Dwfl_Thread object seems not right for OO design.

Do you still want that change?


> > +/* Do not call it on your own, to be used by thread_* functions only.  */
> > +
> > +static Dwfl_Frame *
> > +state_alloc (Dwfl_Thread *thread)
> > +{
> > +  assert (thread->unwound == NULL);
> > +  Ebl *ebl = thread->process->ebl;
> > +  size_t nregs = ebl_frame_nregs (ebl);
> > +  if (nregs == 0)
> > +    return NULL;
> > +  assert (nregs < sizeof (((Dwfl_Frame *) NULL)->regs_set) * 8);
> > +  Dwfl_Frame *state = malloc (sizeof (*state) + sizeof (*state->regs) * nregs);
> > +  if (state == NULL)
> > +    return NULL;
> > +  state->thread = thread;
> > +  state->signal_frame = false;
> > +  state->pc_state = DWFL_FRAME_STATE_ERROR;
> > +  memset (state->regs_set, 0, sizeof (state->regs_set));
> > +  thread->unwound = state;
> > +  state->unwound = NULL;
> > +  return state;
> > +}
> 
> Comment might be a bit more specific. Only used at the start of
> dwfl_thread_getframes.

This is unrelated where this self-contained object is currenly used.

Do you still want that change?


> And maybe rename it thread_state_alloc if you
> rename the above to show they should match up.

I can rename those but I am lost what design/naming it tracks.

Maybe I could rather rename s/Dwfl_Frame *state/Dwfl_Frame *frame/ ?


> > +/* Free and unlink THREAD from the internal lists.  PREV_THREAD must be NULL if
> > +   THREAD was the first one or PREV_THREAD must be the preceding thread.  */
> > +static void
> > +thread_free (Dwfl_Thread *thread, Dwfl_Thread *prev_thread)
> > +{
> > +  Dwfl_Process *process = thread->process;
> > +  assert (prev_thread == NULL || prev_thread->process == process);
> > +  assert (prev_thread != NULL || process->thread == thread);
> > +  assert (prev_thread == NULL || prev_thread->next == thread);
> > +  if (thread->thread_detach_needed)
> > +    {
> > +      assert (thread->tid > 0);
> > +      if (process->callbacks->thread_detach)
> > +	process->callbacks->thread_detach (thread, thread->callbacks_arg);
> > +    }
> 
> I think you don't need the thread_detach_needed flag in thread. This
> function is only called from __libdwfl_process_free or dwfl_next_thread

The goal was that Dwfl_Thread object will no longer be self-contained.  It
should not matter where and how it is being used.  Moreover if you throw an
exception from some callback then thread_detach_needed makes sense.

But that is probably too C++ correct for plain C so I have removed
thread_detach_needed.


> at which point the thread must already have been detached. The attached
> state is only true within dwfl_thread_getframes after
> set_initial_registers has been called, and when needed thread_detach is
> already called in that function itself.
> 
> > +  while (thread->unwound)
> > +    state_free (thread->unwound);
> > +  if (prev_thread == NULL)
> > +    process->thread = thread->next;
> > +  else
> > +    prev_thread->next = thread->next;
> > +  free (thread);
> > +}
> > +
> > +/* Allocate new Dwfl_Thread and link it to PROCESS.  PREV_THREAD must be NULL
> > +   if this is the first thread for PROCESS, otherwise PREV_THREAD must be the
> > +   last thread of PROCESS.  */
> > +static Dwfl_Thread *
> > +thread_alloc (Dwfl_Process *process, Dwfl_Thread *prev_thread)
> > +{
> > +  assert (prev_thread == NULL || prev_thread->process == process);
> > +  assert (prev_thread != NULL || process->thread == NULL);
> > +  assert (prev_thread == NULL || prev_thread->next == NULL);
> > +  Dwfl_Thread *thread = malloc (sizeof (*thread));
> > +  if (thread == NULL)
> > +    return NULL;
> > +  thread->process = process;
> > +  thread->unwound = NULL;
> > +  thread->tid = 0;
> > +  thread->next = NULL;
> > +  if (prev_thread == NULL)
> > +    process->thread = thread;
> > +  else
> > +    prev_thread->next = thread;
> > +  return thread;
> > +}
> 
> OK. So process->thread is always the first thread (or NULL) and
> prev_thread when this function is called will be the last(est)_thread.

Yes.  The prev_thread parameter exists there because Dwfl_Process does not
have pointer to the tail of the list.  And we do not want to traverse the list
each time as it would be O(n^2) for the whole program run.


> > +void
> > +internal_function
> > +__libdwfl_process_free (Dwfl_Process *process)
> > +{
> > +  Dwfl *dwfl = process->dwfl;
> > +  if (process->callbacks->detach != NULL)
> > +    process->callbacks->detach (dwfl, process->callbacks_arg);
> > +  while (process->thread)
> > +    thread_free (process->thread, NULL);
> > +  assert (dwfl->process == process);
> > +  dwfl->process = NULL;
> > +  if (process->ebl_close)
> > +    ebl_closebackend (process->ebl);
> > +  free (process);
> > +}
> 
> OK. Had to think for a sec why the while (process->thread) loop worked,
> even after having just read thread_free...

Anything to change?


> > +/* Allocate new Dwfl_Process for DWFL.  */
> > +static void
> > +process_alloc (Dwfl *dwfl)
> > +{
> > +  Dwfl_Process *process = malloc (sizeof (*process));
> > +  if (process == NULL)
> > +    return;
> > +  process->dwfl = dwfl;
> > +  process->thread = NULL;
> > +  dwfl->process = process;
> > +}
> 
> I think this would be slightly clearer if it returned the Dwfl_Process
> pointer (or NULL on failure). But not a big deal.

Just the caller would have to assert (process == dwfl->process); anyway so
I found it easier this way.


> > +bool
> > +dwfl_attach_state (Dwfl *dwfl, int machine, pid_t pid,
> > +		   const Dwfl_Thread_Callbacks *thread_callbacks, void *arg)
> > +{
> > +  if (thread_callbacks == NULL || thread_callbacks->next_thread == NULL
> > +      || thread_callbacks->set_initial_registers == NULL)
> > +    {
> > +      __libdwfl_seterrno (DWFL_E_UNKNOWN_ERROR);
> > +      return false;
> > +    }
> 
> That is not a great error. Maybe we need DWFL_E_INVALID_ARGUMENT?

Yes, done:
	  DWFL_ERROR (INVALID_ARGUMENT, N_("Invalid argument"))


> > +pid_t
> > +dwfl_pid (Dwfl *dwfl)
> > +{
> > +  if (dwfl->process == NULL)
> > +    {
> > +      __libdwfl_seterrno (DWFL_E_NO_ATTACH_STATE);
> > +      return -1;
> > +    }
> > +  return dwfl->process->pid;
> > +}
> > +INTDEF(dwfl_pid)
> 
> Setting dwfl_errno seems a little redundant since there is no other
> reason this call can fail. Thought it is harmless I guess.

I thought the general elfutils public API rule is that if a function fails it
sets the library's errno.


> > +Dwfl_Thread *
> > +dwfl_next_thread (Dwfl *dwfl, Dwfl_Thread *prev_thread)
> > +{
> > +  assert (prev_thread == NULL || prev_thread->process->dwfl == dwfl);
> 
> Instead of an assert maybe another candidate for
> DWFL_E_INVALID_ARGUMENT?

Yes, done.


> > +  if (prev_thread != NULL && prev_thread->next != NULL)
> > +    return prev_thread->next;
> > +  Dwfl_Process *process = dwfl->process;
> > +  if (process == NULL)
> > +    {
> > +      __libdwfl_seterrno (DWFL_E_NO_ATTACH_STATE);
> > +      return NULL;
> > +    }
> 
> hohum, so someone is using a Dwfl_Thread * after calling dwfl_end?
> Or is there any other way this could happen? Here maybe an assert might
> be in order instead of returning an error?

You can get this error when you never call dwfl_attach_state() for DWfl:

	Dwfl *dwfl = dwfl_begin (&callbacks);
	dwfl_next_thread (dwfl, NULL);


> > +  Dwfl_Thread *nthread = thread_alloc (process, prev_thread);
> 
> Wait, don't we have to check prev_thread == NULL && process->thread !=
> NULL first?

thread_alloc() always works, both for allocating first thread and the last
thread.  It asserts you do not pass prev_thread as a non-last one.


> > +  if (nthread == NULL)
> > +    {
> > +      __libdwfl_seterrno (DWFL_E_NOMEM);
> > +      return NULL;
> > +    }
> > +  nthread->tid = process->callbacks->next_thread (dwfl, nthread,
> > +						  process->callbacks_arg,
> > +						  &nthread->callbacks_arg);
> > +  if (nthread->tid < 0)
> > +    {
> > +      thread_free (nthread, prev_thread);
> > +      __libdwfl_seterrno (DWFL_E_NEXT_THREAD_FAIL);
> > +      return NULL;
> > +    }
> 
> Pondering whether the dwfl_errno can't already be set by the next_thread
> callback. If so, we are overriding some potentially useful error message
> with something fairly generic.

I have updated Dwfl_Thread_Callbacks->next_thread() comment to explicitly
state so.
  if (nthread->tid < 0)
    {
      Dwfl_Error saved_errno = dwfl_errno (void)
      thread_free (nthread, prev_thread);
      __libdwfl_seterrno (saved_errno);
      return NULL;
    }
+
  /* Called to iterate through threads.  Returns next TID (thread ID) on
     success, a negative number on failure and zero if there are no more
     threads.  dwfl_errno () should be set if negative number has been
     returned.  NTHREAD is the new thread being created.  *THREAD_ARGP may be


> > +  if (nthread->tid == 0)
> > +    {
> > +      thread_free (nthread, prev_thread);
> > +      __libdwfl_seterrno (DWFL_E_NOERROR);
> > +      return NULL;
> > +    }
> > +  return nthread;
> > +}
> > +INTDEF(dwfl_next_thread)
> 
> One issue here seems to be that you cannot easily reread or rediscover
> all threads of a process. Once you walked through all current threads
> that seems to be it. I think we might have to allow "restarting" the
> thread list by providing a NULL prev_thread a second time. That would
> "invalidate" all current threads attached and return "fresh new
> threads". Does that make sense?

In some way yes but I do not think such corner case is worth complicating the
API.  Currently dwfl_next_thread is designed so that you can use it even for
already iterated threads, it just always returns the next one.  Therefore using
dwfl_next_thread to fetch the first thread should not be different.

There remains a problem that currently the API is designed for non-intrusive
but possibly wrong backtracing, such as that for /usr/bin/perf.  Modules get
reported still before any thread is frozen, virtual memory mapping may change
before dwfl_thread_getframes gets called, moreover the mapping even may change
from other threads while backtracing one of the threads.

Therefore keeping Dwfl valid for too long does not make much sense now.

In future there may be an extension to freeze all threads of a process before
reporting its virtual memory mapping.  But then again resetting the threads
iterator is not useful as frozen threads cannot create/destroy another thread.


> > +int
> > +dwfl_thread_getframes (Dwfl_Thread *thread,
> > +		       int (*callback) (Dwfl_Frame *state, void *arg),
> > +		       void *arg)
> > +{
> > +  assert (thread->unwound == NULL);
> 
> What makes sure this is always true? See below, are we cleaning up on
> success or only on error?

There was missing cleanup of thread->unwound from successful unwind.
Still here should not be assert() anyway as we could be called from CALLBACK
(that would be very wrong from the application but it may happen).

Changed it to:
  if (thread->unwound != NULL)
    {
      /* We had to be called from inside CALLBACK.  */
      __libdwfl_seterrno (DWFL_E_ATTACH_STATE_CONFLICT);
      return -1;
    }


> Make sure we document that if set_initial_registers returns false, we
> don't call thread_detach.

OK, done:
     safe to unwind.  Returns true on success or false and sets dwfl_errno ()
     on failure.  In the case of a failure thread_detach will not be called.
     This method must not be NULL.  */


> > +  thread->thread_detach_needed = true;
> 
> I don't think you need this to keep track of when to call detach_thread,
> and it makes the code less readable IMHO (there now is always an assert
> that checks it is true). See also the comment above.

It is there to keep Dwfl_Thread self-contained as explained somewhere above.
It should be possible to call thread_free() "any time" and it should not break.


> > +  if (! state_fetch_pc (thread->unwound))
> > +    {
> > +      assert (thread->thread_detach_needed);
> > +      if (process->callbacks->thread_detach)
> > +	process->callbacks->thread_detach (thread, thread->callbacks_arg);
> > +      thread->thread_detach_needed = false;
> > +      while (thread->unwound)
> > +	state_free (thread->unwound);
> > +      return -1;
> > +    }
> > +
> > +  Dwfl_Frame *state = thread->unwound;
> > +  do
> > +    {
> > +      int err = callback (state, arg);
> > +      if (err != DWARF_CB_OK)
> > +	{
> > +	  assert (thread->thread_detach_needed);
> > +	  if (process->callbacks->thread_detach)
> > +	    process->callbacks->thread_detach (thread, thread->callbacks_arg);
> > +	  thread->thread_detach_needed = false;
> > +	  while (thread->unwound)
> > +	    state_free (thread->unwound);
> > +	  return err;
> > +	}
> > +      __libdwfl_frame_unwind (state);
> > +      state = state->unwound;
> > +    }
> > +  while (state && state->pc_state == DWFL_FRAME_STATE_PC_SET);
> 
> We cleaned up the frames before returning err, shouldn't we also here?

Yes, fixed, thanks.


> > +  Dwfl_Error err = dwfl_errno ();
> > +  assert (thread->thread_detach_needed);
> > +  if (process->callbacks->thread_detach)
> > +    process->callbacks->thread_detach (thread, thread->callbacks_arg);
> > +  thread->thread_detach_needed = false;
> > +  if (state == NULL || state->pc_state == DWFL_FRAME_STATE_ERROR)
> > +    {
> > +      __libdwfl_seterrno (err);
> > +      return -1;
> > +    }
> > +  assert (state->pc_state == DWFL_FRAME_STATE_PC_UNDEFINED);
> 
> Why this assert?

The code above checks if DWFL_FRAME_STATE_PC_SET and later if
DWFL_FRAME_STATE_ERROR so this point of code should be reached only for the
case of DWFL_FRAME_STATE_PC_UNDEFINED.  It is both documenting the code and
also providing a protection against forgotten code path if a new
DWFL_FRAME_STATE_* value gets defined in the future.


> > diff --git a/libdwfl/dwfl_frame_pc.c b/libdwfl/dwfl_frame_pc.c
> > +bool
> > +dwfl_frame_pc (Dwfl_Frame *state, Dwarf_Addr *pc, bool *isactivation)
> > +{
> > +  assert (state->pc_state == DWFL_FRAME_STATE_PC_SET);
> > +  *pc = state->pc;
> > +  if (isactivation)
> > +    {
> > +      /* Bottom frame?  */
> > +      if (state == state->thread->unwound)
> > +	*isactivation = true;
> > +      /* *ISACTIVATION is logical or of current and previous frame state.  */
> > +      else if (state->signal_frame)
> > +	*isactivation = true;
> > +      else
> > +	{
> > +	  /* Not affected by unsuccessfully unwound frame.  */
> > +	  __libdwfl_frame_unwind (state);
> > +	  if (state->unwound == NULL
> > +	      || state->unwound->pc_state != DWFL_FRAME_STATE_PC_SET)
> > +	    *isactivation = false;
> > +	  else
> > +	    *isactivation = state->unwound->signal_frame;
> > +	}
> 
> I don't understand this last else block part where isactivation depends
> on the unwound state. Could you add a comment explaining or a reference
> to some document that defines this part?

I considered it was documented by these two lines:

> > +      /* *ISACTIVATION is logical or of current and previous frame state.  */
> > +     /* Not affected by unsuccessfully unwound frame.  */

Is this expansion of comments good enough now?

      /* Bottom frame?  */
      if (state == state->thread->unwound)
        *isactivation = true;
      /* *ISACTIVATION is logical union of whether current or previous frame
         state is SIGNAL_FRAME.  */
      else if (state->signal_frame)
        *isactivation = true;
      else
        {
          /* If the previous frame has unwound unsuccessfully just silently do
             not consider it could be a SIGNAL_FRAME.  */
          __libdwfl_frame_unwind (state);
          if (state->unwound == NULL
              || state->unwound->pc_state != DWFL_FRAME_STATE_PC_SET)
            *isactivation = false;
          else
            *isactivation = state->unwound->signal_frame;
        }


> > diff --git a/libdwfl/dwfl_frame_unwind.c b/libdwfl/dwfl_frame_unwind.c
> 
> Nitpick. Maybe rename to just frame_unwind.c or libdwfl_frame_unwind.c
> to indicate that it doesn't contain any public functions?

Done.  (in a commit only, not in the posted diff)


> > +static int
> > +bra_compar (const void *key_voidp, const void *elem_voidp)
> > +{
> > +  Dwarf_Word offset = (uintptr_t) key_voidp;
> > +  const Dwarf_Op *op = elem_voidp;
> > +  return (offset > op->offset) - (offset < op->offset);
> > +}
> 
> Nitpick: Rename bra_compare?

It is a parameter to bsearch() which calls that parameter 'compar'.
I have no idea why is that parameter 'compar' but I try to be compliant with it.
As qsort() also calls it 'compar' it does not seem to be a POSIX typo.


> > +/* FIXME: Handle bytecode deadlocks and overflows.  */
> 
> In the systemtap runtime unwinder we "cheated" and don't support
> backward branches/skips (I doubt those are really ever used in CFI).
> Would that help here?

I do not find some limit on maximum executed instructions complicated, it just
was not implemented in this first critical-path only part.

Skipping backward branches is an interesting idea if you have found it works
in practice.


> > +static bool
> > +expr_eval (Dwfl_Frame *state, Dwarf_Frame *frame, const Dwarf_Op *ops,
> > +	   size_t nops, Dwarf_Addr *result)
> > +{
[...]
> > +  size_t stack_used = 0, stack_allocated = 0;
> 
> > +  bool
> > +  push (Dwarf_Addr val)
> > +  {
> > +    if (stack_used == stack_allocated)
[...]
> > +    return true;
> > +  }
> 
> Nitpick. I like functions to have one line of whitespace before/after to
> make them stand out.

OK, done.


> > +  bool
[...]
> > +  }
> 
> Likewise.

Done likewise.


> > +  Dwarf_Addr val1, val2;
> > +  bool is_location = false;
> > +  for (const Dwarf_Op *op = ops; op < ops + nops; op++)
> > +    switch (op->atom)
> > +    {
> > +      case DW_OP_reg0 ... DW_OP_reg31:
> > +	if (! state_get_reg (state, op->atom - DW_OP_reg0, &val1)
> > +	    || ! push (val1))
> > +	  {
> > +	    free (stack);
> > +	    return false;
> > +	  }
> > +	break;
> > +      case DW_OP_regx:
> > +	if (! state_get_reg (state, op->number, &val1) || ! push (val1))
> > +	  {
> > +	    free (stack);
> > +	    return false;
> > +	  }
> > +	break;
> > +      case DW_OP_breg0 ... DW_OP_breg31:
> > +	if (! state_get_reg (state, op->atom - DW_OP_breg0, &val1))
> > +	  {
> > +	    free (stack);
> > +	    return false;
> > +	  }
> > +	val1 += op->number;
> > +	if (! push (val1))
> > +	  {
> > +	    free (stack);
> > +	    return false;
> > +	  }
> > +	break;
> > +      case DW_OP_bregx:
> > +	if (! state_get_reg (state, op->number, &val1))
> > +	  {
> > +	    free (stack);
> > +	    return false;
> > +	  }
> > +	val1 += op->number2;
> > +	if (! push (val1))
> > +	  {
> > +	    free (stack);
> > +	    return false;
> > +	  }
> > +	break;
> > +      case DW_OP_lit0 ... DW_OP_lit31:
> > +	if (! push (op->atom - DW_OP_lit0))
> > +	  {
> > +	    free (stack);
> > +	    return false;
> > +	  }
> > +	break;
> > +      case DW_OP_plus_uconst:
> > +	if (! pop (&val1) || ! push (val1 + op->number))
> > +	  {
> > +	    free (stack);
> > +	    return false;
> > +	  }
> > +	break;
> > +      case DW_OP_call_frame_cfa:;
> > +	Dwarf_Op *cfa_ops;
> > +	size_t cfa_nops;
> > +	Dwarf_Addr cfa;
> > +	if (dwarf_frame_cfa (frame, &cfa_ops, &cfa_nops) != 0
> > +	    || ! expr_eval (state, frame, cfa_ops, cfa_nops, &cfa)
> 
> Here when you recurse into expr_eval you could pass NULL for frame (and
> check that here too) to indicate no DW_OP_call_frame_cfa is expected.

Great idea, done.
        if (frame == NULL
            || dwarf_frame_cfa (frame, &cfa_ops, &cfa_nops) != 0
            || ! expr_eval (state, NULL, cfa_ops, cfa_nops, &cfa)
            || ! push (cfa))

Added for expr_eval():
/* If FRAME is NULL is are computing CFI frame base.  In such case another 
   DW_OP_call_frame_cfa is no longer permitted.  */



> > +	    || ! push (cfa))
> > +	  {
> > +	    __libdwfl_seterrno (DWFL_E_LIBDW);
> > +	    free (stack);
> > +	    return false;
> > +	  }
> > +	is_location = true;
> > +	break;
> > +      case DW_OP_stack_value:
> > +	is_location = false;
> > +	break;
> > +      case DW_OP_deref:
> > +	if (! pop (&val1)
> > +	    || process->callbacks->memory_read == NULL
> > +	    || ! process->callbacks->memory_read (process->dwfl, val1, &val1,
> > +						  process->callbacks_arg)
> > +	    || ! push (val1))
> > +	  {
> > +	    free (stack);
> > +	    return false;
> > +	  }
> 
> Nitpick. In the case of process->callbacks->memory_read == NULL no
> dwfl_errno is set.

OK, fixed:
      case DW_OP_deref:
        if (process->callbacks->memory_read == NULL)
          {
            __libdwfl_seterrno (DWFL_E_LIBDW);
            free (stack);
            return false;
          }
        if (! pop (&val1)
            || ! process->callbacks->memory_read (process->dwfl, val1, &val1,
                                                  process->callbacks_arg)
            || ! push (val1))
          {
            free (stack);
            return false;
          }
        break;


> > +	break;
> > +      case DW_OP_nop:
> > +	break;
> > +      case DW_OP_dup:
> > +	if (! pop (&val1) || ! push (val1) || ! push (val1))
> > +	  {
> > +	    free (stack);
> > +	    return false;
> > +	  }
> > +	break;
> > +      case DW_OP_const1u:
> > +      case DW_OP_const1s:
> > +      case DW_OP_const2u:
> > +      case DW_OP_const2s:
> > +      case DW_OP_const4u:
> > +      case DW_OP_const4s:
> > +      case DW_OP_const8u:
> > +      case DW_OP_const8s:
> > +      case DW_OP_constu:
> > +      case DW_OP_consts:
> > +	if (! push (op->number))
> > +	  {
> > +	    free (stack);
> > +	    return false;
> > +	  }
> > +	break;
> > +      case DW_OP_bra:
> > +	if (! pop (&val1))
> > +	  {
> > +	    free (stack);
> > +	    return false;
> > +	  }
> > +	if (val1 == 0)
> > +	  break;
> > +	/* FALLTHRU */
> > +      case DW_OP_skip:;
> > +	Dwarf_Word offset = op->offset + 1 + 2 + (int16_t) op->number;
> > +	const Dwarf_Op *found = bsearch ((void *) (uintptr_t) offset, ops, nops,
> > +					 sizeof (*ops), bra_compar);
> 
> Clever. But I wonder if we could do something in dwarf_getlocation to
> make this easier? Maybe record the decoded op number there somehow and
> store it in number2? Not something to solve/do now though.

Maybe.  Not now, though.


> > +	if (found == NULL)
> > +	  {
> > +	    free (stack);
> > +	    /* PPC32 vDSO has such invalid operations.  */
> 
> How horrible. Old kernels only?

IIRC I was looking even at some recent PPC but not too recent PPC as I do not
have access to too recent PPCs.  It is still definitely needed for RHEL-5+,
just not sure if it applies for the 'ports' branch, IMO not.
(That is 'ports' branch is about old hosts, not about old targets.)


> > +	    __libdwfl_seterrno (DWFL_E_INVALID_DWARF);
> > +	    return false;
> > +	  }
> > +	/* Undo the 'for' statement increment.  */
> > +	op = found - 1;
> > +	break;
> > +      case DW_OP_drop:
> > +	if (! pop (&val1))
> > +	  {
> > +	    free (stack);
> > +	    return false;
> > +	  }
> > +	break;
> > +#define BINOP(atom, op)							\
> > +      case atom:							\
> > +	if (! pop (&val2) || ! pop (&val1) || ! push (val1 op val2))	\
> > +	  {								\
> > +	    free (stack);						\
> > +	    return false;						\
> > +	  }								\
> > +	break;
> > +      BINOP (DW_OP_and, &)
> > +      BINOP (DW_OP_shl, <<)
> > +      BINOP (DW_OP_plus, +)
> > +      BINOP (DW_OP_mul, *)
> > +#undef BINOP
> > +#define BINOP_SIGNED(atom, op)						\
> > +      case atom:							\
> > +	if (! pop (&val2) || ! pop (&val1)				\
> > +	    || ! push ((int64_t) val1 op (int64_t) val2))		\
> > +	  {								\
> > +	    free (stack);						\
> > +	    return false;						\
> > +	  }								\
> > +	break;
> > +      BINOP_SIGNED (DW_OP_le, <=)
> > +      BINOP_SIGNED (DW_OP_ge, >=)
> > +      BINOP_SIGNED (DW_OP_eq, ==)
> > +      BINOP_SIGNED (DW_OP_lt, <)
> > +      BINOP_SIGNED (DW_OP_gt, >)
> > +      BINOP_SIGNED (DW_OP_ne, !=)
> > +#undef BINOP_SIGNED
> > +      default:
> > +	__libdwfl_seterrno (DWFL_E_UNSUPPORTED_DWARF);
> > +	return false;
> 
> Which OPs are missing (DW_OP_div and DW_OP_mod it seems, be careful with
> divide by zero)? Lets make sure we distinquish between OPs that are
> valid in CFI, but we don't handle and OPs that aren't valid in CFI
> (INVALID_DWARF).

Some CFI-valid OPs are really missing, they just were not in the critical-path
initial post.


> > +    }
> > +  if (! pop (result))
> > +    {
> > +      free (stack);
> > +      return false;
> > +    }
> > +  free (stack);
> > +  if (is_location
> > +      && (process->callbacks->memory_read == NULL
> > +	  || ! process->callbacks->memory_read (process->dwfl, *result, result,
> > +						process->callbacks_arg)))
> > +    return false;
> 
> Again that pesky process->callbacks->memory_read == NULL doesn't set
> dwfl_errno.

Fixed.

  if (is_location)
    {
      if (process->callbacks->memory_read == NULL)
        {
          __libdwfl_seterrno (DWFL_E_LIBDW);
          return false;
        }
      if (! process->callbacks->memory_read (process->dwfl, *result, result,
                                             process->callbacks_arg)))
        return false;
    }


> > +  return true;
> > +}
> > +
> > +/* The logic is to call __libdwfl_seterrno for any CFI bytecode interpretation
> > +   error so one can easily catch the problem with a debugger.  Still there are
> > +   archs with invalid CFI for some registers where the registers are never used
> > +   later.  Therefore we continue unwinding leaving the registers undefined.
> > +
> > +   The only exception is PC itself, when there is an error unwinding PC we
> > +   return false.  Otherwise we would return successful end of backtrace seeing
> > +   an undefined PC register (due to an error unwinding it).  */
> 
> The first part of the comment makes sense. The last part doesn't match
> the implementation, see also at the end.

I have deleted the second paragraph.


> > +static void
> > +handle_cfi (Dwfl_Frame *state, Dwarf_Addr pc, Dwarf_CFI *cfi)
> > +{
> > +  Dwarf_Frame *frame;
> > +  if (INTUSE(dwarf_cfi_addrframe) (cfi, pc, &frame) != 0)
> > +    {
> > +      __libdwfl_seterrno (DWFL_E_LIBDW);
> > +      return;
> > +    }
> > +  Dwfl_Thread *thread = state->thread;
> > +  Dwfl_Process *process = thread->process;
> > +  Ebl *ebl = process->ebl;
> > +  size_t nregs = ebl_frame_nregs (ebl);
> 
> This could return zero. Is that a problem here (return error)? Or did we
> already check earlier (assert > 0)?

dwfl_thread_getframes() already checks if (ebl_frame_nregs (ebl) == 0)
and dwfl_thread_getframes() is the only way how to get Dwfl_Frame *.
Therefore put an assert here.


> > +  Dwfl_Frame *unwound;
> > +  unwound = malloc (sizeof (*unwound) + sizeof (*unwound->regs) * nregs);
> > +  state->unwound = unwound;
> > +  unwound->thread = thread;
> > +  unwound->unwound = NULL;
> > +  unwound->signal_frame = frame->fde->cie->signal_frame;
> > +  unwound->pc_state = DWFL_FRAME_STATE_ERROR;
> > +  memset (unwound->regs_set, 0, sizeof (unwound->regs_set));
> > +  for (unsigned regno = 0; regno < nregs; regno++)
> > +    {
> > +      Dwarf_Op reg_ops_mem[3], *reg_ops;
> > +      size_t reg_nops;
> > +      if (dwarf_frame_register (frame, regno, reg_ops_mem, &reg_ops,
> > +				&reg_nops) != 0)
> > +	{
> > +	  __libdwfl_seterrno (DWFL_E_LIBDW);
> > +	  continue;
> > +	}
> > +      Dwarf_Addr regval;
> > +      if (reg_nops == 0)
> > +	{
> > +	  if (reg_ops == reg_ops_mem)
> > +	    {
> > +	      /* REGNO is undefined.  */
> > +	      unsigned ra = frame->fde->cie->return_address_register;
> > +	      if (regno == ra)
> > +		unwound->pc_state = DWFL_FRAME_STATE_PC_UNDEFINED;
> > +	      continue;
> > +	    }
> > +	  else if (reg_ops == NULL)
> > +	    {
> > +	      /* REGNO is same-value.  */
> > +	      if (! state_get_reg (state, regno, &regval))
> > +		continue;
> > +	    }
> > +	  else
> > +	    {
> > +	      __libdwfl_seterrno (DWFL_E_INVALID_DWARF);
> > +	      continue;
> > +	    }
> > +	}
> > +      else if (! expr_eval (state, frame, reg_ops, reg_nops, &regval))
> > +	{
> > +	  /* PPC32 vDSO has various invalid operations, ignore them.  The
> > +	     register will look as unset causing an error later, if used.
> > +	     But PPC32 does not use such registers.  */
> > +	  continue;
> > +	}
> > +      if (! dwfl_frame_reg_set (unwound, regno, regval))
> > +	{
> > +	  __libdwfl_seterrno (DWFL_E_INVALID_REGISTER);
> > +	  continue;
> > +	}
> > +    }
> > +  if (unwound->pc_state == DWFL_FRAME_STATE_ERROR
> > +      && dwfl_frame_reg_get (unwound, frame->fde->cie->return_address_register,
> > +			     &unwound->pc))
> > +    {
> > +      /* PPC32 __libc_start_main properly CFI-unwinds PC as zero.  Currently
> > +	 none of the archs supported for unwinding have zero as a valid PC.  */
> > +      if (unwound->pc == 0)
> > +	unwound->pc_state = DWFL_FRAME_STATE_PC_UNDEFINED;
> > +      else
> > +	unwound->pc_state = DWFL_FRAME_STATE_PC_SET;
> > +    }
> > +}
> 
> If at the end unwound->pc_state == DWFL_FRAME_STATE_ERROR should we
> signal an error by setting state->unwound?

I do not understand you here.  handle_cfi() at its top already does:

  unwound = malloc (sizeof (*unwound) + sizeof (*unwound->regs) * nregs);
  state->unwound = unwound;

So state->unwound->pc_state == DWFL_FRAME_STATE_ERROR which means an error.


> > +void
> > +internal_function
> > +__libdwfl_frame_unwind (Dwfl_Frame *state)
> > +{
> > +  if (state->unwound)
> > +    return;
> 
> I am not sure when this happens.

It was more meaningful with the former API, before you introduced the callback
there.  One had randomly accessible all the frames at once.


> Is this for when someone calls dwfl_frame_pc and it hits that last
> activation check I don't fully understand?

Currently yes.


> > +  Dwarf_Addr pc;
> > +  bool ok = INTUSE(dwfl_frame_pc) (state, &pc, NULL);
> > +  assert (ok);
> > +  /* Do not ask dwfl_frame_pc for ISACTIVATION, it would try to unwind STATE
> > +     which would deadlock us.  */
> 
> Move this comment up a bit to where you actually call dwfl_frame_pc.

Done.


> And the comment here is "check whether this is the initial frame or a
> signal frame. Then we need to unwind from the adjusted pc."

Done; just the comment should be exactly the opposite one:

  /* Check whether this is the initial frame or a signal frame.  
     Then we need to unwind from the original, unadjusted PC.  */


> > +  if (state != state->thread->unwound && ! state->signal_frame)
> > +    pc--;
> > +  Dwfl_Module *mod = INTUSE(dwfl_addrmodule) (state->thread->process->dwfl, pc);
> > +  if (mod != NULL)
> > +    {
> > +      Dwarf_Addr bias;
> > +      Dwarf_CFI *cfi_eh = INTUSE(dwfl_module_eh_cfi) (mod, &bias);
> > +      if (cfi_eh)
> > +	{
> > +	  handle_cfi (state, pc - bias, cfi_eh);
> > +	  if (state->unwound)
> > +	    return;
> > +	}
> > +      Dwarf_CFI *cfi_dwarf = INTUSE(dwfl_module_dwarf_cfi) (mod, &bias);
> > +      if (cfi_dwarf)
> > +	{
> > +	  handle_cfi (state, pc - bias, cfi_dwarf);
> > +	  if (state->unwound)
> > +	    return;
> > +	}
> > +    }
> > +  __libdwfl_seterrno (DWFL_E_NO_DWARF);
> > +}
> 
> handle_cfi might set dwflerrno when state->unwound == NULL, in that case
> we probably shouldn't override it with the generic DWFL_E_NO_DWARF,
> might want to put that in the else branch of (mod != NULL)?

The reason is that while the idea of DWFL_FRAME_STATE_PC_UNDEFINED terminating
the backtrace is nice in real world it does not happen.  It was (I have) only
recently fixed it for x86* and non-x86* archs may never be fixed.  So the
decision is left up to the caller whether DWFL_E_NO_DWARF is considered as
a proper backtrace terminator.  tests/run-backtrace.sh has for it:
  # In some cases we cannot reliably find out we got behind _start.
  if cmp -s <(echo "${abs_builddir}/backtrace: dwfl_thread_getframes: No DWARF information found") <(uniq <$1); then

There may be multiple different error codes if CFI is not found, at least
besides DWFL_E_NO_DWARF also DWFL_E_NO_MATCH.  So that DWFL_E_NO_DWARF is
rather a part of API.  I should have documented it:
/* Iterate through the frames for a thread.  Returns zero if all frames
   have been processed by the callback, returns -1 on error, or the
   value of the callback when not DWARF_CB_OK.  -1 returned on error will set
   dwfl_errno ().  DWFL_E_NO_DWARF means no further unwind information was
   found, which is up to the caller whether it is considered the same as
   returned zero; some backtraced systems will return value -1 and
   DWFL_E_NO_DWARF instead of the more correct value zero.  Keeps calling the
   callback with the next frame while the callback returns DWARF_CB_OK, till
   there are no more frames.  On start will call the set_initial_registers
   callback and on return will call the detach_thread callback of the
   Dwfl_Thread.  */


> Also a nitpick, but might this be slightly nicer to read if this and
> handle_cfi returned stat->unwound to make the caller error checking more
> "natural"?

This is only an internal function and I find more complicated in the caller to
solve what if the function returned NULL but state->unwound is not NULL etc.
I would prefer to keep it as is.


> > +typedef struct
> > +{
> > +  /* Called to iterate through threads.  Returns next TID (thread ID) on
> > +     success, a negative number on failure and zero if there are no more
> > +     threads.  NTHREAD is the new thread being created.  *THREAD_ARGP may be
> > +     optionally set by the implementation, THREAD_ARGP is never NULL.
> > +     This method must not be NULL.  */ 
> > +  pid_t (*next_thread) (Dwfl *dwfl, Dwfl_Thread *nthread, void *dwfl_arg,
> > +			void **thread_argp)
> > +    __nonnull_attribute__ (1, 2);
> 
> Maybe explicitly mention that THREAD_ARG will be passed to other
> callbacks that take a Dwfl_Thread argument? 

Put there:
     *THREAD_ARGP will be passed to set_initial_registers or thread_detach
     callbacks together with Dwfl_Thread *thread.  


> > +  /* Called during unwinding to access memory (stack) state.  Returns true for
> > +     successfully read *RESULT or false and sets dwfl_errno () on failure.
> > +     This method may be NULL - in such case dwfl_thread_getframes will return
> > +     only the initial frame.  */
> > +  bool (*memory_read) (Dwfl *dwfl, Dwarf_Addr addr, Dwarf_Word *result,
> > +                       void *dwfl_arg)
> > +    __nonnull_attribute__ (1, 3);
> > +
> > +  /* Called on initial unwind to get the initial register state of the first
> > +     frame.  Should call dwfl_thread_state_registers, possibly multiple times
> > +     for different ranges and possibly also dwfl_thread_state_register_pc, to
> > +     fill in initial (DWARF) register values.  After this call, till at least
> > +     thread_detach is called, the thread is assumed to be frozen, so that it is
> > +     safe to unwind.  Returns true on success or false and sets dwfl_err () on
> > +     failure.  This method must not be NULL.  */ 
> > +  bool (*set_initial_registers) (Dwfl_Thread *thread, void *thread_arg)
> > +    __nonnull_attribute__ (1);
> > +
> > +  /* Called by dwfl_end.  All thread_detach method calls have been already
> > +     done.  This method may be NULL.  */
> > +  void (*detach) (Dwfl *dwfl, void *dwfl_arg)
> > +    __nonnull_attribute__ (1);
> > +
> > +  /* Called when unwinding is done.  No callback will be called after
> > +     this method has been called.  Iff set_initial_registers was called for
> > +     a TID thread_detach will be called before the detach method above.
> > +     This method may be NULL.  */
> > +  void (*thread_detach) (Dwfl_Thread *thread, void *thread_arg)
> > +    __nonnull_attribute__ (1);
> > +} Dwfl_Thread_Callbacks;
> 
> OK. But as mentioned above lets document that thread_detach will only be
> called if set_initial_registers was called first and returned true.

set_initial_registers now states:
     on failure.  In the case of a failure thread_detach will not be called.

And thread_detach now also states:
     this method has been called.  Iff set_initial_registers was called for
     a TID and it returned success thread_detach will be called before the 
     detach method above.  This method may be NULL.  */


> > +/* Gets the next known thread, if any.  To get the initial thread
> > +   provide NULL as previous thread PREV_THREAD.  On error function returns NULL
> > +   and sets dwfl_errno ().  When no more threads are found function returns
> > +   NULL and dwfl_errno () is set to 0 - dwfl_errmsg (0) returns NULL then.  */
> > +Dwfl_Thread *dwfl_next_thread (Dwfl *dwfl, Dwfl_Thread *prev_thread)
> > +  __nonnull_attribute__ (1);
> 
> Does this interface imply you can "rewalk" the threads given some
> returned Dwfl_Thread * you once got, that might not be the last one
> returned by this function for that particular Dwfl? The implementation
> implies so, but do we want to guarantee that? (I think I would like to
> require the user to always pass in the last returned Dwfl_Thread. But
> what about the life-time of Dwfl_Thread objects in that case?) Also we
> should document what happens if you pass NULL as prev_thread a second
> time. Does it invalidate all previous Dwfl_Threads returned before? (I
> would say yes, and it does a fresh new walk in case thread
> appeared/disappeared).

It was already discussed above that new fresh walk is not useful.
So yes, it "rewalks" the threads.

I was trying to keep the inteface free from callbacks, when the callbacks are
there already the other option is to make a callback-style iterator even for
the thread list.  In such case dwfl_next_thread() would sure no longer exist.


> > +/* Iterate through the frames for a thread.  Returns zero if all frames
> > +   have been processed by the callback, returns -1 on error, or the
> > +   value of the callback when not DWARF_CB_OK.  Keeps calling the
> > +   callback with the next frame while the callback returns
> > +   DWARF_CB_OK, till there are no more frames.  On start will call the
> > +   set_initial_registers callback and on return will call the
> > +   detach_thread callback of the Dwfl_Thread.  */
> > +int dwfl_thread_getframes (Dwfl_Thread *thread,
> > +                           int (*callback) (Dwfl_Frame *state, void *arg),
> > +                           void *arg)
> > +  __nonnull_attribute__ (1, 2);
> > +
> > +/* Return *PC (program counter) for thread-specific frame STATE.
> > +   Set *ISACTIVATION according to DWARF frame "activation" definition.
> > +   Typically you need to substract 1 from *PC if *ACTIVATION is false to safely
> > +   find function of the caller.  ACTIVATION may be NULL.  PC must not be NULL.
> > +   Function returns false if it failed to find *PC.  */
> > +bool dwfl_frame_pc (Dwfl_Frame *state, Dwarf_Addr *pc, bool *isactivation)
> > +  __nonnull_attribute__ (1, 2);
> 
> OK.
> 
> Do we also need a dwfl_detach_state ()?
> In case a user wants to reuse a Dwfl with new pid/callbacks?

It was already discussed before. Dwfl represents virtual memory mapping, with
different PID - or even after resuming current PID for a moment - the mapping
is no longer useful.


> > +++ b/libdwfl/libdwflP.h
> > +#include "libeblP.h"
> 
> Do you need this? Can't you use libebl.h?

It was discussed above by the text starting with:
	Because we need libeblP.h to check 'ebl->class == ELFCLASS32' as AFAIK there
	is no public ebl API to fetch 'ebl->class'.


> > +typedef struct Dwfl_Process Dwfl_Process;
> >  
> >  /* gettext helper macros.  */
> >  #define _(Str) dgettext ("elfutils", Str)
> > @@ -74,7 +77,19 @@
> >    DWFL_ERROR (BADELF, N_("not a valid ELF file"))			      \
> >    DWFL_ERROR (WEIRD_TYPE, N_("cannot handle DWARF type description"))	      \
> >    DWFL_ERROR (WRONG_ID_ELF, N_("ELF file does not match build ID"))	      \
> > -  DWFL_ERROR (BAD_PRELINK, N_("corrupt .gnu.prelink_undo section data"))
> > +  DWFL_ERROR (BAD_PRELINK, N_("corrupt .gnu.prelink_undo section data"))      \
> > +  DWFL_ERROR (LIBEBL_BAD, N_("Internal error due to ebl"))		      \
> > +  DWFL_ERROR (CORE_MISSING, N_("Missing data in core file"))		      \
> > +  DWFL_ERROR (INVALID_REGISTER, N_("Invalid register"))			      \
> > +  DWFL_ERROR (PROCESS_MEMORY_READ, N_("Error reading process memory"))	      \
> > +  DWFL_ERROR (PROCESS_NO_ARCH, N_("Have not found ELF module in a process"))  \
> 
> Maybe "Couldn't determine ELF architecture"?

It is "Couldn't determine architecture from any of the ELF files in Dwfl".
So the singular form of "ELF" is not right there.

Used:
  DWFL_ERROR (PROCESS_NO_ARCH, N_("Couldn't find architecture of any ELF"))   \


> > +  DWFL_ERROR (PARSE_PROC, N_("Error parsing /proc filesystem"))		      \
> > +  DWFL_ERROR (NO_THREAD, N_("No thread found"))				      \
> 
> This one is never used.

Deleted.


> > +  DWFL_ERROR (INVALID_DWARF, N_("Invalid DWARF"))			      \
> > +  DWFL_ERROR (UNSUPPORTED_DWARF, N_("Unsupported DWARF"))		      \
> > +  DWFL_ERROR (NEXT_THREAD_FAIL, N_("Unable to find more threads"))	      \
> > +  DWFL_ERROR (ATTACH_STATE_CONFLICT, N_("Dwfl already has attached state"))   \
> > +  DWFL_ERROR (NO_ATTACH_STATE, N_("Dwfl has no attached state"))
> 
[...]
> > +/* This holds information common for all the threads/tasks/TIDs of one process
> > +   for backtraces.  */
> > +
> > +struct Dwfl_Process
> > +{
> > +  struct Dwfl *dwfl;
> > +  pid_t pid;
> > +  const Dwfl_Thread_Callbacks *callbacks;
> > +  void *callbacks_arg;
> > +  struct ebl *ebl;
> > +  bool ebl_close:1;
> > +  Dwfl_Thread *thread;
> > +};
> 
> thread could use a comment explaining it is the (currently) last thread.
> Or maybe rename it to last_thread or cur_thread to make that clear?

It is the first thread.  This is why callers have to pass that "prev_thread"
as the last thread is not stored anywhere.

I have renamed it to "first_thread".


> > +/* See its typedef in libdwfl.h.  */
> > +
> > +struct Dwfl_Thread
> > +{
> > +  Dwfl_Process *process;
> > +  Dwfl_Thread *next;
> > +  pid_t tid;
> > +  bool thread_detach_needed : 1;
> > +  /* Bottom frame.  */
> > +  Dwfl_Frame *unwound;
> > +  void *callbacks_arg;
> > +};
> 
> I don't think you really need thread_detach_needed. attaching/getting
> initial registers and detaching should only happen in
> dwfl_thread_getframes.

Therefore removed it.


> I found unwound a little confusing, maybe rename it to outer_frame?
> But maybe it is just that bottom/outer are slightly confusing names in
> general.

Exactly.


> current_frame maybe? Or I am nitpicking and you should just
> ignore me for this one.

Ignoring.


> > +/* See its typedef in libdwfl.h.  */
> > +
> > +struct Dwfl_Frame
> > +{
> > +  Dwfl_Thread *thread;
> > +  /* Previous (outer) frame.  */
> > +  Dwfl_Frame *unwound;
> 
> Do we really need to keep all old frames around like this?

You have redesigned from scratch the API.  The original goal was to:
 * Have no callbacks.
 * Keep cached everything that's possible - never do anything twice.

I did not want to influence your decisions so I kept it as is.
But I see now the change to the callbacks I wanted to avoid is not complete.


> Can't we cleanup up the old frame in dwfl_thread_getframes after having
> called the callback for it (and creating the next)? Not that it wastes
> huge amounts of space. More curious if it would result in a simpler
> datastructure.

With the current design there should be a callback for threads and then all
the inter-struct pointers can be removed.

Now the API is in half-way between the two designs, sharind disadvantages of
both design styles.


> > +  bool signal_frame : 1;
> > +  enum
> > +  {
> > +    /* This structure is still being initialized or there was an error
> > +       initializing it.  */
> > +    DWFL_FRAME_STATE_ERROR,
> > +    /* PC field is valid.  */
> > +    DWFL_FRAME_STATE_PC_SET,
> > +    /* PC field is undefined, this means the next (inner) frame was the
> > +       outermost frame.  */
> > +    DWFL_FRAME_STATE_PC_UNDEFINED
> > +  } pc_state;
> 
> See comment above, could/should we split the ERROR and UNSET state?

Explained far above.


> > +  /* Either initialized from appropriate REGS element or on some archs
> > +     initialized separately as the return address has no DWARF register.  */
> > +  Dwarf_Addr pc;
> > +  /* (1 << X) bitmask where 0 <= X < ebl_frame_nregs.  */
> > +  uint64_t regs_set[3];
> > +  /* REGS array size is ebl_frame_nregs.  */
> > +  Dwarf_Addr regs[];
> > +};
> 
> bitmask tells which regs are valid.

Put there:
  /* REGS array size is ebl_frame_nregs.  
     REGS_SET tells which of the REGS are valid.  */
  Dwarf_Addr regs[];


> > +/* Update STATE->UNWOUND for the unwound frame.
> > +   Functions sets dwfl_errno ().  */
> > +extern void __libdwfl_frame_unwind (Dwfl_Frame *state)
> > +  internal_function;
> 
> Document "On error STATE->UNWOUND will be NULL and sets dwfl_errno ()".

Put there:
/* Update STATE->unwound for the unwound frame.
   On error STATE->unwound == NULL
   or STATE->unwound->pc_state == DWFL_FRAME_STATE_ERROR;
   in such case dwfl_errno () is set.
   If STATE->unwound->pc_state == DWFL_FRAME_STATE_PC_UNDEFINED
   then STATE was the last valid frame.  */


> (And nitpick maybe just let this return state->unwound to make error
> checking simpler.)

Explained far above.


Thanks,
Jan


diff --git a/libdwfl/dwfl_frame.c b/libdwfl/dwfl_frame.c
index ee9e328..34b4727 100644
--- a/libdwfl/dwfl_frame.c
+++ b/libdwfl/dwfl_frame.c
@@ -30,10 +30,6 @@
 #include <sys/ptrace.h>
 #include <unistd.h>
 
-#ifndef MIN
-# define MIN(a, b) ((a) < (b) ? (a) : (b))
-#endif
-
 /* Set STATE->pc_set from STATE->regs according to the backend.  Return true on
    success, false on error.  */
 static bool
@@ -45,23 +41,25 @@ state_fetch_pc (Dwfl_Frame *state)
       return true;
     case DWFL_FRAME_STATE_PC_UNDEFINED:
       abort ();
-    case DWFL_FRAME_STATE_ERROR:;
-      Ebl *ebl = state->thread->process->ebl;
-      Dwarf_CIE abi_info;
-      if (ebl_abi_cfi (ebl, &abi_info) != 0)
-	{
-	  __libdwfl_seterrno (DWFL_E_LIBEBL);
-	  return false;
-	}
-      unsigned ra = abi_info.return_address_register;
-      /* dwarf_frame_state_reg_is_set is not applied here.  */
-      if (ra >= ebl_frame_nregs (ebl))
-	{
-	  __libdwfl_seterrno (DWFL_E_LIBEBL_BAD);
-	  return false;
-	}
-      state->pc = state->regs[ra];
-      state->pc_state = DWFL_FRAME_STATE_PC_SET;
+    case DWFL_FRAME_STATE_ERROR:
+      {
+	Ebl *ebl = state->thread->process->ebl;
+	Dwarf_CIE abi_info;
+	if (ebl_abi_cfi (ebl, &abi_info) != 0)
+	  {
+	    __libdwfl_seterrno (DWFL_E_LIBEBL);
+	    return false;
+	  }
+	unsigned ra = abi_info.return_address_register;
+	/* dwarf_frame_state_reg_is_set is not applied here.  */
+	if (ra >= ebl_frame_nregs (ebl))
+	  {
+	    __libdwfl_seterrno (DWFL_E_LIBEBL_BAD);
+	    return false;
+	  }
+	state->pc = state->regs[ra];
+	state->pc_state = DWFL_FRAME_STATE_PC_SET;
+      }
       return true;
     }
   abort ();
@@ -108,18 +106,12 @@ thread_free (Dwfl_Thread *thread, Dwfl_Thread *prev_thread)
 {
   Dwfl_Process *process = thread->process;
   assert (prev_thread == NULL || prev_thread->process == process);
-  assert (prev_thread != NULL || process->thread == thread);
+  assert (prev_thread != NULL || process->first_thread == thread);
   assert (prev_thread == NULL || prev_thread->next == thread);
-  if (thread->thread_detach_needed)
-    {
-      assert (thread->tid > 0);
-      if (process->callbacks->thread_detach)
-	process->callbacks->thread_detach (thread, thread->callbacks_arg);
-    }
   while (thread->unwound)
     state_free (thread->unwound);
   if (prev_thread == NULL)
-    process->thread = thread->next;
+    process->first_thread = thread->next;
   else
     prev_thread->next = thread->next;
   free (thread);
@@ -132,7 +124,7 @@ static Dwfl_Thread *
 thread_alloc (Dwfl_Process *process, Dwfl_Thread *prev_thread)
 {
   assert (prev_thread == NULL || prev_thread->process == process);
-  assert (prev_thread != NULL || process->thread == NULL);
+  assert (prev_thread != NULL || process->first_thread == NULL);
   assert (prev_thread == NULL || prev_thread->next == NULL);
   Dwfl_Thread *thread = malloc (sizeof (*thread));
   if (thread == NULL)
@@ -142,7 +134,7 @@ thread_alloc (Dwfl_Process *process, Dwfl_Thread *prev_thread)
   thread->tid = 0;
   thread->next = NULL;
   if (prev_thread == NULL)
-    process->thread = thread;
+    process->first_thread = thread;
   else
     prev_thread->next = thread;
   return thread;
@@ -155,8 +147,8 @@ __libdwfl_process_free (Dwfl_Process *process)
   Dwfl *dwfl = process->dwfl;
   if (process->callbacks->detach != NULL)
     process->callbacks->detach (dwfl, process->callbacks_arg);
-  while (process->thread)
-    thread_free (process->thread, NULL);
+  while (process->first_thread)
+    thread_free (process->first_thread, NULL);
   assert (dwfl->process == process);
   dwfl->process = NULL;
   if (process->ebl_close)
@@ -172,7 +164,7 @@ process_alloc (Dwfl *dwfl)
   if (process == NULL)
     return;
   process->dwfl = dwfl;
-  process->thread = NULL;
+  process->first_thread = NULL;
   dwfl->process = process;
 }
 
@@ -183,7 +175,7 @@ dwfl_attach_state (Dwfl *dwfl, int machine, pid_t pid,
   if (thread_callbacks == NULL || thread_callbacks->next_thread == NULL
       || thread_callbacks->set_initial_registers == NULL)
     {
-      __libdwfl_seterrno (DWFL_E_UNKNOWN_ERROR);
+      __libdwfl_seterrno (DWFL_E_INVALID_ARGUMENT);
       return false;
     }
   if (dwfl->process != NULL)
@@ -270,7 +262,11 @@ INTDEF(dwfl_frame_thread)
 Dwfl_Thread *
 dwfl_next_thread (Dwfl *dwfl, Dwfl_Thread *prev_thread)
 {
-  assert (prev_thread == NULL || prev_thread->process->dwfl == dwfl);
+  if (prev_thread != NULL && prev_thread->process->dwfl != dwfl)
+    {
+      __libdwfl_seterrno (DWFL_E_INVALID_ARGUMENT);
+      return NULL;
+    }
   if (prev_thread != NULL && prev_thread->next != NULL)
     return prev_thread->next;
   Dwfl_Process *process = dwfl->process;
@@ -290,8 +286,9 @@ dwfl_next_thread (Dwfl *dwfl, Dwfl_Thread *prev_thread)
 						  &nthread->callbacks_arg);
   if (nthread->tid < 0)
     {
+      Dwfl_Error saved_errno = dwfl_errno ();
       thread_free (nthread, prev_thread);
-      __libdwfl_seterrno (DWFL_E_NEXT_THREAD_FAIL);
+      __libdwfl_seterrno (saved_errno);
       return NULL;
     }
   if (nthread->tid == 0)
@@ -311,6 +308,7 @@ dwfl_thread_getframes (Dwfl_Thread *thread,
 {
   if (thread->unwound != NULL)
     {
+      /* We had to be called from inside CALLBACK.  */
       __libdwfl_seterrno (DWFL_E_ATTACH_STATE_CONFLICT);
       return -1;
     }
@@ -333,13 +331,10 @@ dwfl_thread_getframes (Dwfl_Thread *thread,
 	state_free (thread->unwound);
       return -1;
     }
-  thread->thread_detach_needed = true;
   if (! state_fetch_pc (thread->unwound))
     {
-      assert (thread->thread_detach_needed);
       if (process->callbacks->thread_detach)
 	process->callbacks->thread_detach (thread, thread->callbacks_arg);
-      thread->thread_detach_needed = false;
       while (thread->unwound)
 	state_free (thread->unwound);
       return -1;
@@ -351,10 +346,8 @@ dwfl_thread_getframes (Dwfl_Thread *thread,
       int err = callback (state, arg);
       if (err != DWARF_CB_OK)
 	{
-	  assert (thread->thread_detach_needed);
 	  if (process->callbacks->thread_detach)
 	    process->callbacks->thread_detach (thread, thread->callbacks_arg);
-	  thread->thread_detach_needed = false;
 	  while (thread->unwound)
 	    state_free (thread->unwound);
 	  return err;
@@ -365,10 +358,10 @@ dwfl_thread_getframes (Dwfl_Thread *thread,
   while (state && state->pc_state == DWFL_FRAME_STATE_PC_SET);
 
   Dwfl_Error err = dwfl_errno ();
-  assert (thread->thread_detach_needed);
   if (process->callbacks->thread_detach)
     process->callbacks->thread_detach (thread, thread->callbacks_arg);
-  thread->thread_detach_needed = false;
+  while (thread->unwound)
+    state_free (thread->unwound);
   if (state == NULL || state->pc_state == DWFL_FRAME_STATE_ERROR)
     {
       __libdwfl_seterrno (err);
diff --git a/libdwfl/dwfl_frame_pc.c b/libdwfl/dwfl_frame_pc.c
index 9781777..99001dd 100644
--- a/libdwfl/dwfl_frame_pc.c
+++ b/libdwfl/dwfl_frame_pc.c
@@ -42,12 +42,14 @@ dwfl_frame_pc (Dwfl_Frame *state, Dwarf_Addr *pc, bool *isactivation)
       /* Bottom frame?  */
       if (state == state->thread->unwound)
 	*isactivation = true;
-      /* *ISACTIVATION is logical or of current and previous frame state.  */
+      /* *ISACTIVATION is logical union of whether current or previous frame
+	 state is SIGNAL_FRAME.  */
       else if (state->signal_frame)
 	*isactivation = true;
       else
 	{
-	  /* Not affected by unsuccessfully unwound frame.  */
+	  /* If the previous frame has unwound unsuccessfully just silently do
+	     not consider it could be a SIGNAL_FRAME.  */
 	  __libdwfl_frame_unwind (state);
 	  if (state->unwound == NULL
 	      || state->unwound->pc_state != DWFL_FRAME_STATE_PC_SET)
diff --git a/libdwfl/dwfl_frame_unwind.c b/libdwfl/dwfl_frame_unwind.c
index d1119fe..4e32958 100644
--- a/libdwfl/dwfl_frame_unwind.c
+++ b/libdwfl/dwfl_frame_unwind.c
@@ -59,6 +59,8 @@ bra_compar (const void *key_voidp, const void *elem_voidp)
   return (offset > op->offset) - (offset < op->offset);
 }
 
+/* If FRAME is NULL is are computing CFI frame base.  In such case another
+   DW_OP_call_frame_cfa is no longer permitted.  */
 /* FIXME: Handle bytecode deadlocks and overflows.  */
 
 static bool
@@ -73,6 +75,7 @@ expr_eval (Dwfl_Frame *state, Dwarf_Frame *frame, const Dwarf_Op *ops,
     }
   Dwarf_Addr *stack = NULL;
   size_t stack_used = 0, stack_allocated = 0;
+
   bool
   push (Dwarf_Addr val)
   {
@@ -90,6 +93,7 @@ expr_eval (Dwfl_Frame *state, Dwarf_Frame *frame, const Dwarf_Op *ops,
     stack[stack_used++] = val;
     return true;
   }
+
   bool
   pop (Dwarf_Addr *val)
   {
@@ -101,6 +105,7 @@ expr_eval (Dwfl_Frame *state, Dwarf_Frame *frame, const Dwarf_Op *ops,
     *val = stack[--stack_used];
     return true;
   }
+
   Dwarf_Addr val1, val2;
   bool is_location = false;
   for (const Dwarf_Op *op = ops; op < ops + nops; op++)
@@ -165,8 +170,9 @@ expr_eval (Dwfl_Frame *state, Dwarf_Frame *frame, const Dwarf_Op *ops,
 	Dwarf_Op *cfa_ops;
 	size_t cfa_nops;
 	Dwarf_Addr cfa;
-	if (dwarf_frame_cfa (frame, &cfa_ops, &cfa_nops) != 0
-	    || ! expr_eval (state, frame, cfa_ops, cfa_nops, &cfa)
+	if (frame == NULL
+	    || dwarf_frame_cfa (frame, &cfa_ops, &cfa_nops) != 0
+	    || ! expr_eval (state, NULL, cfa_ops, cfa_nops, &cfa)
 	    || ! push (cfa))
 	  {
 	    __libdwfl_seterrno (DWFL_E_LIBDW);
@@ -179,8 +185,13 @@ expr_eval (Dwfl_Frame *state, Dwarf_Frame *frame, const Dwarf_Op *ops,
 	is_location = false;
 	break;
       case DW_OP_deref:
+	if (process->callbacks->memory_read == NULL)
+	  {
+	    __libdwfl_seterrno (DWFL_E_LIBDW);
+	    free (stack);
+	    return false;
+	  }
 	if (! pop (&val1)
-	    || process->callbacks->memory_read == NULL
 	    || ! process->callbacks->memory_read (process->dwfl, val1, &val1,
 						  process->callbacks_arg)
 	    || ! push (val1))
@@ -283,22 +294,24 @@ expr_eval (Dwfl_Frame *state, Dwarf_Frame *frame, const Dwarf_Op *ops,
       return false;
     }
   free (stack);
-  if (is_location
-      && (process->callbacks->memory_read == NULL
-	  || ! process->callbacks->memory_read (process->dwfl, *result, result,
-						process->callbacks_arg)))
-    return false;
+  if (is_location)
+    {
+      if (process->callbacks->memory_read == NULL)
+	{
+	  __libdwfl_seterrno (DWFL_E_LIBDW);
+	  return false;
+	}
+      if (! process->callbacks->memory_read (process->dwfl, *result, result,
+					     process->callbacks_arg))
+	return false;
+    }
   return true;
 }
 
 /* The logic is to call __libdwfl_seterrno for any CFI bytecode interpretation
    error so one can easily catch the problem with a debugger.  Still there are
    archs with invalid CFI for some registers where the registers are never used
-   later.  Therefore we continue unwinding leaving the registers undefined.
-
-   The only exception is PC itself, when there is an error unwinding PC we
-   return false.  Otherwise we would return successful end of backtrace seeing
-   an undefined PC register (due to an error unwinding it).  */
+   later.  Therefore we continue unwinding leaving the registers undefined.  */
 
 static void
 handle_cfi (Dwfl_Frame *state, Dwarf_Addr pc, Dwarf_CFI *cfi)
@@ -313,6 +326,7 @@ handle_cfi (Dwfl_Frame *state, Dwarf_Addr pc, Dwarf_CFI *cfi)
   Dwfl_Process *process = thread->process;
   Ebl *ebl = process->ebl;
   size_t nregs = ebl_frame_nregs (ebl);
+  assert (nregs > 0);
   Dwfl_Frame *unwound;
   unwound = malloc (sizeof (*unwound) + sizeof (*unwound->regs) * nregs);
   state->unwound = unwound;
@@ -386,11 +400,13 @@ __libdwfl_frame_unwind (Dwfl_Frame *state)
 {
   if (state->unwound)
     return;
+  /* Do not ask dwfl_frame_pc for ISACTIVATION, it would try to unwind STATE
+     which would deadlock us.  */
   Dwarf_Addr pc;
   bool ok = INTUSE(dwfl_frame_pc) (state, &pc, NULL);
   assert (ok);
-  /* Do not ask dwfl_frame_pc for ISACTIVATION, it would try to unwind STATE
-     which would deadlock us.  */
+  /* Check whether this is the initial frame or a signal frame.
+     Then we need to unwind from the original, unadjusted PC.  */
   if (state != state->thread->unwound && ! state->signal_frame)
     pc--;
   Dwfl_Module *mod = INTUSE(dwfl_addrmodule) (state->thread->process->dwfl, pc);
diff --git a/libdwfl/libdwfl.h b/libdwfl/libdwfl.h
index 8f94b90..11f22dc 100644
--- a/libdwfl/libdwfl.h
+++ b/libdwfl/libdwfl.h
@@ -581,8 +581,11 @@ typedef struct
 {
   /* Called to iterate through threads.  Returns next TID (thread ID) on
      success, a negative number on failure and zero if there are no more
-     threads.  NTHREAD is the new thread being created.  *THREAD_ARGP may be
+     threads.  dwfl_errno () should be set if negative number has been
+     returned.  NTHREAD is the new thread being created.  *THREAD_ARGP may be
      optionally set by the implementation, THREAD_ARGP is never NULL.
+     *THREAD_ARGP will be passed to set_initial_registers or thread_detach
+     callbacks together with Dwfl_Thread *thread.  
      This method must not be NULL.  */ 
   pid_t (*next_thread) (Dwfl *dwfl, Dwfl_Thread *nthread, void *dwfl_arg,
 			void **thread_argp)
@@ -601,8 +604,9 @@ typedef struct
      for different ranges and possibly also dwfl_thread_state_register_pc, to
      fill in initial (DWARF) register values.  After this call, till at least
      thread_detach is called, the thread is assumed to be frozen, so that it is
-     safe to unwind.  Returns true on success or false and sets dwfl_err () on
-     failure.  This method must not be NULL.  */ 
+     safe to unwind.  Returns true on success or false and sets dwfl_errno ()
+     on failure.  In the case of a failure thread_detach will not be called.
+     This method must not be NULL.  */ 
   bool (*set_initial_registers) (Dwfl_Thread *thread, void *thread_arg)
     __nonnull_attribute__ (1);
 
@@ -613,8 +617,8 @@ typedef struct
 
   /* Called when unwinding is done.  No callback will be called after
      this method has been called.  Iff set_initial_registers was called for
-     a TID thread_detach will be called before the detach method above.
-     This method may be NULL.  */
+     a TID and it returned success thread_detach will be called before the
+     detach method above.  This method may be NULL.  */
   void (*thread_detach) (Dwfl_Thread *thread, void *thread_arg)
     __nonnull_attribute__ (1);
 } Dwfl_Thread_Callbacks;
@@ -672,11 +676,15 @@ Dwfl_Thread *dwfl_next_thread (Dwfl *dwfl, Dwfl_Thread *prev_thread)
 
 /* Iterate through the frames for a thread.  Returns zero if all frames
    have been processed by the callback, returns -1 on error, or the
-   value of the callback when not DWARF_CB_OK.  Keeps calling the
-   callback with the next frame while the callback returns
-   DWARF_CB_OK, till there are no more frames.  On start will call the
-   set_initial_registers callback and on return will call the
-   detach_thread callback of the Dwfl_Thread.  */
+   value of the callback when not DWARF_CB_OK.  -1 returned on error will set
+   dwfl_errno ().  DWFL_E_NO_DWARF means no further unwind information was
+   found, which is up to the caller whether it is considered the same as
+   returned zero; some backtraced systems will return value -1 and
+   DWFL_E_NO_DWARF instead of the more correct value zero.  Keeps calling the
+   callback with the next frame while the callback returns DWARF_CB_OK, till
+   there are no more frames.  On start will call the set_initial_registers
+   callback and on return will call the detach_thread callback of the
+   Dwfl_Thread.  */
 int dwfl_thread_getframes (Dwfl_Thread *thread,
                            int (*callback) (Dwfl_Frame *state, void *arg),
                            void *arg)
diff --git a/libdwfl/libdwflP.h b/libdwfl/libdwflP.h
index 1de09be..0f18c66 100644
--- a/libdwfl/libdwflP.h
+++ b/libdwfl/libdwflP.h
@@ -82,15 +82,15 @@ typedef struct Dwfl_Process Dwfl_Process;
   DWFL_ERROR (CORE_MISSING, N_("Missing data in core file"))		      \
   DWFL_ERROR (INVALID_REGISTER, N_("Invalid register"))			      \
   DWFL_ERROR (PROCESS_MEMORY_READ, N_("Error reading process memory"))	      \
-  DWFL_ERROR (PROCESS_NO_ARCH, N_("Have not found ELF module in a process"))  \
+  DWFL_ERROR (PROCESS_NO_ARCH, N_("Couldn't find architecture of any ELF"))   \
   DWFL_ERROR (PARSE_PROC, N_("Error parsing /proc filesystem"))		      \
-  DWFL_ERROR (NO_THREAD, N_("No thread found"))				      \
   DWFL_ERROR (INVALID_DWARF, N_("Invalid DWARF"))			      \
   DWFL_ERROR (UNSUPPORTED_DWARF, N_("Unsupported DWARF"))		      \
   DWFL_ERROR (NEXT_THREAD_FAIL, N_("Unable to find more threads"))	      \
   DWFL_ERROR (ATTACH_STATE_CONFLICT, N_("Dwfl already has attached state"))   \
   DWFL_ERROR (NO_ATTACH_STATE, N_("Dwfl has no attached state"))	      \
-  DWFL_ERROR (NO_UNWIND, N_("unwinding not supported for this architecture"))
+  DWFL_ERROR (NO_UNWIND, N_("Unwinding not supported for this architecture")) \
+  DWFL_ERROR (INVALID_ARGUMENT, N_("Invalid argument"))
 
 #define DWFL_ERROR(name, text) DWFL_E_##name,
 typedef enum { DWFL_ERRORS DWFL_E_NUM } Dwfl_Error;
@@ -218,7 +218,7 @@ struct Dwfl_Process
   void *callbacks_arg;
   struct ebl *ebl;
   bool ebl_close:1;
-  Dwfl_Thread *thread;
+  Dwfl_Thread *first_thread;
 };
 
 /* See its typedef in libdwfl.h.  */
@@ -258,7 +258,8 @@ struct Dwfl_Frame
   Dwarf_Addr pc;
   /* (1 << X) bitmask where 0 <= X < ebl_frame_nregs.  */
   uint64_t regs_set[3];
-  /* REGS array size is ebl_frame_nregs.  */
+  /* REGS array size is ebl_frame_nregs.
+     REGS_SET tells which of the REGS are valid.  */
   Dwarf_Addr regs[];
 };
 
@@ -525,8 +526,12 @@ extern Dwfl_Module *__libdwfl_report_offline (Dwfl *dwfl, const char *name,
 extern void __libdwfl_process_free (Dwfl_Process *process)
   internal_function;
 
-/* Update STATE->UNWOUND for the unwound frame.
-   Functions sets dwfl_errno ().  */
+/* Update STATE->unwound for the unwound frame.
+   On error STATE->unwound == NULL
+   or STATE->unwound->pc_state == DWFL_FRAME_STATE_ERROR;
+   in such case dwfl_errno () is set.
+   If STATE->unwound->pc_state == DWFL_FRAME_STATE_PC_UNDEFINED
+   then STATE was the last valid frame.  */
 extern void __libdwfl_frame_unwind (Dwfl_Frame *state)
   internal_function;
 

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]