This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch v7 3/5] x86* unwinder: libdwfl/


On Thu, 2013-10-24 at 18:36 +0200, Jan Kratochvil wrote:
> On Tue, 22 Oct 2013 23:31:50 +0200, Mark Wielaard wrote:
> > I would just call it when constructing the Dwfl_Process and store it
> > there since we are reusing it every time. Then the above simply becomes
> > if (process->class == ELFCLASS32)
> 
> I do not think such microoptimizations are worth it nowadays.
> 
> Moreover I even removed the inline definition and changed it to normal
> declaration (with definition in libdwfl/frame_unwind.c).  gcc -flto should do
> such kind of stuff (it degrades performance in my tests but thats offtopic).
> 
> I hope that's fine with you.

Yeah. It would be premature optimization probably.
If there are any performance issues we should profile before I guess.

> > > > I don't understand this last else block part where isactivation depends
> > > > on the unwound state. Could you add a comment explaining or a reference
> > > > to some document that defines this part?
> > > 
> > > I considered it was documented by these two lines:
> > > 
> > > > > +      /* *ISACTIVATION is logical or of current and previous frame state.  */
> > > > > +     /* Not affected by unsuccessfully unwound frame.  */
> > > 
> > > Is this expansion of comments good enough now?
> > > 
> > >       /* Bottom frame?  */
> > >       if (state == state->thread->unwound)
> > >         *isactivation = true;
> > >       /* *ISACTIVATION is logical union of whether current or previous frame
> > >          state is SIGNAL_FRAME.  */
> > >       else if (state->signal_frame)
> > >         *isactivation = true;
> > >       else
> > >         {
> > >           /* If the previous frame has unwound unsuccessfully just silently do
> > >              not consider it could be a SIGNAL_FRAME.  */
> > >           __libdwfl_frame_unwind (state);
> > >           if (state->unwound == NULL
> > >               || state->unwound->pc_state != DWFL_FRAME_STATE_PC_SET)
> > >             *isactivation = false;
> > >           else
> > >             *isactivation = state->unwound->signal_frame;
> > >         }
> 
> ------------------------------------------------------------------------------
> (gdb) l 1
> 1	#include <signal.h>
> 2	#include <unistd.h>
> 3	void f(int signo) {
> 4	}
> 5	int main(void) {
> 6	  signal(SIGINT,f);
> 7	  pause ();
> 8	  return 0;
> 9	}
> (gdb) b f
> Breakpoint 1 at 0x400587: file sigint.c, line 4.
> (gdb) r
> Starting program: sigint 
> <ctrl-c>
> Program received signal SIGINT, Interrupt.
> 0x00000030304bd330 in __pause_nocancel () at ../sysdeps/unix/syscall-template.S:81
> 81	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
> (gdb) signal SIGINT
> Continuing with signal SIGINT.
> Breakpoint 1, f (signo=2) at sigint.c:4
> 4	}
> (gdb) bt
> #0  f (signo=2) at sigint.c:4
> #1  <signal handler called>
> #2  0x00000030304bd330 in __pause_nocancel () at ../sysdeps/unix/syscall-template.S:81
> #3  0x00000000004005a1 in main () at sigint.c:7
> (gdb) up
> #1  <signal handler called>
> (gdb) disas
> Dump of assembler code for function __restore_rt:
> => 0x0000003030435a90 <+0>:	mov    $0xf,%rax
>    0x0000003030435a97 <+7>:	syscall 
>    0x0000003030435a99 <+9>:	nopl   0x0(%rax)
> End of assembler dump.
> ------------------------------------------------------------------------------
> > 
> > I think I just don't fully understand the semantics of the 'S'
> > augmentation. That is an eh_frame CFI extension, not described in the
> > DWARF spec. I understand that if this is a signal_frame then the PC
> > doesn't need to be adjusted.
> 
> state->signal_frame is true only for frame #1.  I rather do not see so clear
> why we should not adjust PC for the signal frame itself.

Indeed. So, that is because we unwind into the restore_rt stub that will
do the sig_return syscall isn't it? It is special because it was never
really called, it is just the place where we return when a signal
happens.

>   But as one can see
> we must not adjust PC for the signal frame as we are on very first instruction
> of __restore_rt.

Aha, ok, so restore_rt is the frame with 'S'. And what I had assumed the
signal handler, function f, doesn't, since it is just a normal function.

> > But why don't we have adjust the PC for
> > this frame if we were called from a signal_frame?
> 
> We are unwinding the opposite direction so we are determining whether to
> adjust PC for the frame that "called" (was interrupted by) the signal frame.
> The frame called from a signal frame (#0) is already processed before we ever
> found there is some signal frame and that one is unrelated.
> 
> I hope it is clear why the frame unwound from a signal frame (#2) must not
> have adjusted PC.

Right, I had the unwinding frame order wrong. Indeed a frame that is
interrupted for a signal frame, shouldn't have the PC adjusted. Because
it didn't do a real "call instruction" that might have moved the PC
already.

So. The initial frame doesn't need PC adjustment since it was
interrupted and has the actual PC value. A frame that is returns to a
signal frame doesn't need its PC adjusted since it wasn't entered
through a call (and if we cannot determine where we return to, then
assume the frame isn't special). And a signal frame itself doesn't need
its PC adjusted because it is "special".

Yes, now all the comments and the code make sense. Sorry for all the
confusion. This is tricky. Thanks for the example.

> > > Some CFI-valid OPs are really missing, they just were not in the critical-path
> > > initial post.
> > 
> > IMHO it would be really good if we listed the known missing ones in one
> > (fall-through) case statement that is UNSUPPORTED_DWARF and make the
> > default INVALID_DWARF. That way we also have a concrete TODO list.
> 
> Put into libdwfl/frame_unwind.c:
>       // case DW_OP_addr:
>       // case DW_OP_GNU_encoded_addr:
>       // case DW_OP_pick:
>       // case DW_OP_over:
>       // case DW_OP_swap:
>       // case DW_OP_rot:
>       // case DW_OP_deref_size:
>       // case DW_OP_abs:
>       // case DW_OP_neg:
>       // case DW_OP_not:
>       // case DW_OP_div:
>       // case DW_OP_minus:
>       // case DW_OP_mod:
>       // case DW_OP_or:
>       // case DW_OP_shr:
>       // case DW_OP_shra:
>       // case DW_OP_xor:

I would really like it if these were all just in one block and had:

        __libdwfl_seterrno (DWFL_E_UNSUPPORTED_DWARF);
        return false;

And then change the default: to:

        __libdwfl_seterrno (DWFL_E_INVALID_DWARF);
        return false;

That makes it slightly easier to see if it is "our fault" or bad unwind
info.

> > > > > +  if (state != state->thread->unwound && ! state->signal_frame)
> > > > > +    pc--;
> > > > > +  Dwfl_Module *mod = INTUSE(dwfl_addrmodule) (state->thread->process->dwfl, pc);
> > > > > +  if (mod != NULL)
> > > > > +    {
> > > > > +      Dwarf_Addr bias;
> > > > > +      Dwarf_CFI *cfi_eh = INTUSE(dwfl_module_eh_cfi) (mod, &bias);
> > > > > +      if (cfi_eh)
> > > > > +	{
> > > > > +	  handle_cfi (state, pc - bias, cfi_eh);
> > > > > +	  if (state->unwound)
> > > > > +	    return;
> > > > > +	}
> > > > > +      Dwarf_CFI *cfi_dwarf = INTUSE(dwfl_module_dwarf_cfi) (mod, &bias);
> > > > > +      if (cfi_dwarf)
> > > > > +	{
> > > > > +	  handle_cfi (state, pc - bias, cfi_dwarf);
> > > > > +	  if (state->unwound)
> > > > > +	    return;
> > > > > +	}
> > > > > +    }
> > > > > +  __libdwfl_seterrno (DWFL_E_NO_DWARF);
> > > > > +}
> > > > 
> > > > handle_cfi might set dwflerrno when state->unwound == NULL, in that case
> > > > we probably shouldn't override it with the generic DWFL_E_NO_DWARF,
> > > > might want to put that in the else branch of (mod != NULL)?
> > > 
> > > The reason is that while the idea of DWFL_FRAME_STATE_PC_UNDEFINED terminating
> > > the backtrace is nice in real world it does not happen.  It was (I have) only
> > > recently fixed it for x86* and non-x86* archs may never be fixed.  So the
> > > decision is left up to the caller whether DWFL_E_NO_DWARF is considered as
> > > a proper backtrace terminator.  tests/run-backtrace.sh has for it:
> > >   # In some cases we cannot reliably find out we got behind _start.
> > >   if cmp -s <(echo "${abs_builddir}/backtrace: dwfl_thread_getframes: No DWARF information found") <(uniq <$1); then
> > > 
> > > There may be multiple different error codes if CFI is not found, at least
> > > besides DWFL_E_NO_DWARF also DWFL_E_NO_MATCH.  So that DWFL_E_NO_DWARF is
> > > rather a part of API.
> > 
> > hmmm, that doesn't really work since the DWFL_E_ error codes are private
> > (libdwflP.h). You can get an error string describing the error, but not
> > really compare them easily. An error is an error as far as the user is
> > concerned.
> > 
> > Since we cannot guarantee unwinding finishes cleanly, the user just has
> > to accept that unwinding often just ends in an error. Ugly, but
> > unavoidable I am afraid. We will just have to work towards making the
> > whole toolchain DWARF unwinding "clean" eventually (some far future
> > dream). At least DWARF4 6.4.4 seems very clear on how to mark the end of
> > an unwind.
> 
> OK, true, in such case I have removed the DWFL_E_NO_DWARF override and updated
> the testcase:
>    # In some cases we cannot reliably find out we got behind _start.
> -  if cmp -s <(echo "${abs_builddir}/backtrace: dwfl_thread_getframes: No DWARF information found") <(uniq <$1); then
> +  if cmp -s <(echo "${abs_builddir}/backtrace: dwfl_thread_getframes: no matching address range") <(uniq <$1); then
> 
> Updated dwfl_thread_getframes comment to:
> /* Iterate through the frames for a thread.  Returns zero if all frames
>    have been processed by the callback, returns -1 on error, or the value of
>    the callback when not DWARF_CB_OK.  -1 returned on error will 
>    set dwfl_errno ().  Some systems return error instead of zero on end of the 
>    backtrace, for cross-platform compatibility callers should consider error as
>    a zero.  Keeps calling the callback with the next frame while the callback

Sorry for a late suggestion, but I only thought of it now. Before I
didn't fully appreciate why you wanted to see the difference between
"real error" and "no information available".

Since dwfl_thread_getframes returns an int, could we change it to:

/* Iterate through the frames for a thread.  Returns zero if all
   frames have been processed successfully by the callback.  Returns
   1 when there are no more frames because there is no unwind
   information found for the last frame's PC value,
   returns -1 on error, or the value of the callback when not
   DWARF_CB_OK.  When the return value is -1 or 1 will set
   dwfl_errno () to indicate why unwinding stopped.  dwfl_errno will be
   zero if the callback returned something different from DWARF_CB_OK.
   Keeps calling the callback with the next frame while the callback
   returns DWARF_CB_OK, till there are no more frames.  On start will
   call the set_initial_registers callback and on return will call the
   detach_thread callback of the Dwfl_Thread.  */

That way you can more easily decide whether to report an error or not.
-1 is a "true" error. 1 is most likely not a real error (though it could
be).

Then in dwfl_thread_getframes you would just do:

  int err = callback (state, arg);
  if (err != DWARF_CB_OK)
    {
      if (process->callbacks->thread_detach)
        process->callbacks->thread_detach (thread, thread->callbacks_arg);
      thread_free_all_states (thread);
      __libdwfl_seterrno (DWFL_E_NOERROR);
      return err;
    }

and

  if (state == NULL || state->pc_state == DWFL_FRAME_STATE_ERROR)
    {
      __libdwfl_seterrno (err);
      return (err == DWFL_E_NO_DWARF
              || (err == DWFL_E_ADDR_OUTOFRANGE) ? 1 :-1;
    }

Then we do need to notice the DWARF_E_NO_MATCH in handle_cfi and
transform it to a DWFL_E_NO_MATCH with the following trick:

   if (INTUSE(dwarf_cfi_addrframe) (cfi, pc, &frame) != 0)
    {

      int err = INTUSE(dwarf_errno) ();
      __libdwfl_seterrno (DWARF_E_NO_MATCH
		     	  ? DWFL_E_NO_MATCH : DWFL_E (LIBDW, err));
      return;
    }

Thanks,

Mark


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]