This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: [RFC] Move the frame zero PC check earlier


> Date: Tue, 16 May 2006 16:45:03 -0400
> From: Daniel Jacobowitz <drow@false.org>
> 
> I won't reply paragraph-by-paragraph; there's getting to be a lot of
> paragraphs.  Please, let me know if I've cut something that I shouldn't
> have.
> 
> On Sat, May 13, 2006 at 06:42:44PM +0200, Mark Kettenis wrote:
> > > Explanatory output ("why did that backtrace stop?") is available in
> > > "set debug frame 1".  If you think it's routinely useful, then we can
> > > make it available in some prettier form, perhaps in "info frame" for
> > > the outermost frame.
> > 
> > If we can reliably tell that a frame is the outermost frame, we might
> > indeed print that as part of "info frame".
> 
> That's just about the opposite of what I'm suggesting.  I think that
> "the stack ended jaggedly" might be useful in "info frame".

So we disagree.

> > > Also, I don't think that "gdb is confused" errors are as desirable as
> > > you think they are.  This extra frame has been reported to me as a bug
> > > at least three times that I can think of (twice for RTOSes and once for
> > > Linux KGDB).
> > 
> > I can imagine you'd like to get these people off your back.  And
> > perhaps they're right that the extra frame is caused by a bug in GDB.
> > But that bug is not the printing of the extra frame itself.  The bug
> > is GDB not being able to determine that it is at the end of the stack,
> > which might actually be a bug in the compiler or system libraries
> > they're using.
> 
> No!  No no no.
> 
> First of all, I'm not just trying to get them off my back.  I think
> they're right and it shouldn't be displayed.  Second, this _is_ GDB
> being able to determine that it's at the end of the stack.

Yes, if GDB is able to detect that the stack has ended, it
should definitely not print another, bogus frame.

> A return address of zero is a fairly common convention for this.

Fairly common, perhaps, but not universal.

> It's natural, if you think about it.  On architectures with a
> well entrenched frame pointer (exhibit A, our earlier conversation
> about cache->base and %ebp on x86) then that can be initialized to zero
> either before calling a generic higher-level function or else by
> handwritten startup code.  The same is true for the return address; it
> can be set to return to nowhere.  Neither "makes sense", so they are
> useful markers.  About the only other option is the stack pointer, and
> you can't do that unless you're calling handwritten startup code that
> also knows where the stack is supposed to go - pretty rare in modern
> systems where that code is being called by anything other than a reset
> vector.

So there are several conventions, and these make sense for a specific
ISA or perhaps even a specific OS.

> > Then we should improve the unwinder.  If we didn't error out with that
> > error, the backtrace would never end.
> 
> As you well know, in many cases it is either impractical or downright
> impossible to improve the prologue unwinder, e.g. when OS vendors ship
> system libraries with neither unwind information nor symbols.

And this is exactly the case where I think the jagged end of the
backtrace is important.  It indicates that GDB lost track somewhere
and that the backtrace can't be trusted.

> > > And Joel recently reported that Ada tasking generates this message
> > > on at least one platform, and users are unhappy about that, too.
> > 
> > IIRC this is a case where the outermost frame wasn't marked properly,
> > or at least not detected as such by GDB.  That's the problem that
> > needs to be fixed.
> 
> I guess that depends what you mean by "marked properly" and what you
> mean by "fixed".
> 
> The problem was that a routine in the system libraries was called
> directly from new threads.  The name of the routine is OS-dependent
> and maybe even OS-version-dependent.  Changing the system libraries is
> out of the question here; all the world isn't free software and I
> believe the system in question was either HP-UX or OSF.  Recognizing it
> by name would, I suppose, work - but you'd have to be careful of the
> user reusing a fairly generic function name!  Or else limit the check
> to a specific shared object, and ditch caring about static linking.

If this is on a proprietary system and the system vendor did not
provide a reliable way to determine we're falling off the stack,
there's not much we can do.  That's one of the reasons why we should
encourage people to use free software, which can be changed to do the
right thing.

> > > And when we've run out of useful information, the stack appears to
> > > end, and we're quite justified in reporting that the stack ended.
> > > It's quite complex enough already without reporting "but the end of
> > > the stack looks a little funny to me...".
> > 
> > No, if a stack doesn't end properly on a platform where it should end
> > properly, that's useful information that should be reported to the
> > user.
> 
> I answered this one in another message, and it's basically the same as
> my first bit above in this message.  I think it's precisely proper to
> report this as an end of stack condition.
> 
> We're reading the stack frame from memory, so we're automatically
> vulnerable to displaying corrupted information if the user has
> scribbled on the stack.  But I don't think that merits displaying
> something that we know is garbage.  A non-zero unknown PC might be
> garbage or it might be code we don't have symbols for; we display it as
> if it were code we didn't have a symbol for.  I posit that there's a
> difference in kind between that and a zero PC, which might be garbage
> or might be the end of the stack; and by analogy, I'm suggesting we
> display it as if it is the end of the stack.

But the zero PC is *not* universal.  Therefore it should be treated
the same as the non-zero garbage PC.
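
The two positions can be made concrete with a minimal sketch in Python.
This is not GDB code: it assumes the unwinder has already produced a
list of PCs, and `backtrace`, `zero_pc_ends_stack`, the addresses and
the symbol names are all hypothetical.  Exempting the innermost frame
from the zero check is likewise an assumption of the sketch.

```python
from typing import Dict, List


def backtrace(pcs: List[int], symbols: Dict[int, str],
              zero_pc_ends_stack: bool) -> List[str]:
    """Render one line per unwound frame.

    pcs[0] is the innermost frame.  An unknown PC is shown as "??",
    as if it were code without symbols.  When zero_pc_ends_stack is
    True, a zero PC in a caller frame silently ends the backtrace;
    when False, it is printed like any other unknown PC.
    """
    lines = []
    for level, pc in enumerate(pcs):
        # Assumption: only caller frames (level > 0) are subject to
        # the zero-PC check; the innermost frame is always shown.
        if zero_pc_ends_stack and level > 0 and pc == 0:
            break
        name = symbols.get(pc, "??")
        lines.append(f"#{level}  0x{pc:08x} in {name} ()")
    return lines


# Hypothetical thread whose start routine was entered with a zero
# return address.
SYMBOLS = {0x08048400: "worker", 0x08048350: "thread_start"}
PCS = [0x08048400, 0x08048350, 0x0]

if __name__ == "__main__":
    for policy in (True, False):
        print(f"zero_pc_ends_stack={policy}")
        print("\n".join(backtrace(PCS, SYMBOLS, policy)))
```

With the flag set, the sample stops after `thread_start`.  With it
cleared, a third line, `#2  0x00000000 in ?? ()`, is printed; that line
is the extra frame under discussion.  A non-zero unknown PC is rendered
as `??` under either policy.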

> Anyway, that's my opinion.  I have no idea how to proceed on this;
> I don't really expect any of that to change your mind, and you probably
> don't use any system where this extra frame is a serious annoyance.

I cannot imagine a single extra frame being a serious annoyance.
I can see that the extra frame loses its signalling function on
systems where it's seen a lot in cases where the stack actually ends
that way.

Mark

