Fix a crash when stepping and unwinding fails

Mark Kettenis mark.kettenis@xs4all.nl
Tue Feb 21 20:54:00 GMT 2006


> Date: Tue, 21 Feb 2006 15:28:33 -0500
> From: Daniel Jacobowitz <drow@false.org>
> 
> On Tue, Feb 21, 2006 at 09:15:16PM +0100, Mark Kettenis wrote:
> > > It's still not great, but at least it's an improvement over crashing.
> > > It is reasonably likely that we've just stepped over a standard
> > > function call, and that consequently the function return
> > > address is in the standard place for the architecture; in fact,
> > > GDB used to have a hook for this, before the frame overhaul:
> > > SAVED_PC_AFTER_CALL.  But it's gone now and there's no easy analogue,
> > > and it was never 100% reliable anyway.  So unfortunately, if we
> > > single-step out to an address that we can't find a way to unwind from,
> > > we'll stop instead of stepping out.
> > 
> > How can this happen?  Both affected calls to
> > insert_step_resume_breakpoint_at_frame() are in the same
> > 
> >   if (frame_id_eq (frame_unwind_id (get_current_frame ()), step_frame_id))
> >     {
> > 
> > block.  Assuming that step_frame_id isn't equal to null_frame_id, this
> > means that we *can* unwind.
> 
> There's your problem: you're assuming that step_frame_id isn't equal to
> null_frame_id.  But in fact it is.

But if step_frame_id is equal to null_frame_id, we shouldn't be trying
to insert step-resume-breakpoints.  It means that step_frame_id is
still uninitialized, since step_frame_id is initialized by:

  step_frame_id = get_frame_id (get_current_frame ());

(or equivalent code), and unwinding from the sentinel frame should always
yield a frame ID that's different from null_frame_id.
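
To make the argument concrete, here is a minimal, self-contained sketch
of the comparison guarding the step-resume code.  It is a toy model,
not GDB's implementation: every model_* name is hypothetical, the
struct layout is invented for illustration, and the rule that an
invalid ID never compares equal (not even to itself) is an assumption
of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a frame ID; NOT GDB's real struct frame_id.  */
struct model_frame_id
{
  uint64_t stack_addr;  /* Identifies the frame's position on the stack.  */
  uint64_t code_addr;   /* Identifies the function the frame belongs to.  */
  bool valid;           /* False only for the "null" ID.  */
};

/* Stand-in for null_frame_id: the ID of a frame nobody could identify.  */
static const struct model_frame_id model_null_frame_id = { 0, 0, false };

/* Build a valid ID from a stack address and a code address.  */
static struct model_frame_id
model_frame_id_build (uint64_t stack_addr, uint64_t code_addr)
{
  struct model_frame_id id = { stack_addr, code_addr, true };
  return id;
}

/* Compare two IDs.  ASSUMPTION of this sketch: an invalid ID never
   compares equal, not even to another invalid ID.  */
static bool
model_frame_id_eq (struct model_frame_id l, struct model_frame_id r)
{
  if (!l.valid || !r.valid)
    return false;
  return l.stack_addr == r.stack_addr && l.code_addr == r.code_addr;
}

/* Model of the "did we step into a subroutine?" check: true when the
   caller of the current frame is the frame we started stepping in.  */
static bool
model_stepped_into_subroutine (struct model_frame_id caller_of_current,
                               struct model_frame_id step_frame_id)
{
  return model_frame_id_eq (caller_of_current, step_frame_id);
}
```

In this model the check only succeeds when both IDs are valid, so a
step-resume breakpoint would never be requested on the strength of an
uninitialized step_frame_id.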

> If we can't unwind past the current frame, then that means the last
> frame sniffer (generally the prologue analyzer), which is required
> to accept any frame given to it, could not make heads or tails of
> it.  Which in turn means it doesn't know what the frame's ID is, so
> it gets left as invalid.  Which means the current frame will have an
> ID of null_frame_id.
> 
> That's what's happening to me, although I seem to recall something
> similar could be produced by stepping across main without debug info.

I think it can happen if you're trying to step "over" main from within
the C runtime code.  But in that case the frame ID won't be
null_frame_id.

> That seems like a good change indeed, but probably wouldn't fix this
> problem.
> 
> Hmm, what does frame_pc_unwind do when we've hit the last frame?  I'm
> not sure it's meaningful.

How can we hit the last frame?  If we're hitting the last frame, where
did we come from?

It may very well be that there are GDB bugs that make step_frame_id
equal to null_frame_id.  If we can't trace those bugs right now, we
should probably sprinkle a few gdb_assert() calls around and fix the
issues as we hit them.
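
As a rough illustration of what such assertions could look like, the
sketch below checks the frame ID both where it is recorded and where it
is about to be used.  Everything here is hypothetical: MODEL_ASSERT is
a stand-in written for this example rather than gdb_assert() itself,
and the model_* names and struct layout are invented, not taken from
GDB.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an internal assertion macro: report the
   failed condition and abort.  */
#define MODEL_ASSERT(expr)                                      \
  do                                                            \
    {                                                           \
      if (!(expr))                                              \
        {                                                       \
          fprintf (stderr, "%s:%d: assertion `%s' failed\n",    \
                   __FILE__, __LINE__, #expr);                  \
          abort ();                                             \
        }                                                       \
    }                                                           \
  while (0)

/* Simplified stand-in for a frame ID; NOT GDB's real struct frame_id.  */
struct model_frame_id
{
  uint64_t stack_addr;
  uint64_t code_addr;
  bool valid;           /* False means "null" / uninitialized.  */
};

/* Zero-initialized, so it starts out invalid, like an unset ID.  */
static struct model_frame_id model_step_frame_id;

/* Record the frame we start stepping in; a null ID here is a bug.  */
static void
model_set_step_frame_id (struct model_frame_id id)
{
  MODEL_ASSERT (id.valid);
  model_step_frame_id = id;
}

/* To be called just before a step-resume breakpoint would be inserted:
   the recorded ID must have been initialized by then.  */
static void
model_check_step_frame_id (void)
{
  MODEL_ASSERT (model_step_frame_id.valid);
}
```

With checks like these, a null step_frame_id aborts at the point where
it is first recorded or consumed, instead of surfacing later as a crash
somewhere in the unwinder.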

Mark