This is the mail archive of the
gdb-patches@sources.redhat.com
mailing list for the GDB project.
Re: The gdb x86 function prologue parser
Date: Wed, 8 Jun 2005 09:58:05 -0700
From: Jason Molenda <jason-swarelist@molenda.com>
Hi Daniel, thanks for the comments.
On Wed, Jun 08, 2005 at 09:24:31AM -0400, Daniel Jacobowitz wrote:
> I looked at your table.
>
> (A) You've added jump instructions to it. Assuming that I'm following
> what you're doing with this table correctly, I'm not real comfortable
> with that without special cases checking the targets of the jumps.
These showed up in a couple of functions that had hand-written
assembly at the start of the function. For example, there's one that has a
special agreement with its caller about the contents of EDX, and
it'd compare EDX to a value and then jump to an alternate location
if it matched. The only way this code is executed is if the PC is
past the jmp instruction, so I wasn't too concerned about it --
SOMEHOW we got past the jmp and back again.
You are aware of the fact that the prologue scanner is used for two
purposes? If not, here they are:
1. Finding out the gory details about the stack frame being executed.
2. Determining the first bit of code after the prologue, i.e. the
first bit of real code.
For (2) following jumps is usually a very bad thing to do.
That said, I'm not sure this dual usage of the prologue scanner really
makes sense these days. There is a certain lack of consistency in how
gdb handles this anyway. Maybe the best thing to do is to not use
the prologue scanner for (2) at all.
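To make the two uses concrete, here is a minimal sketch of a toy scanner. It is not GDB's i386 prologue analyzer; the names `PrologueInfo` and `scan_prologue` are made up for illustration, and it only recognizes the textbook `push %ebp; mov %esp,%ebp; sub $imm8,%esp` sequence. Called with a limit (the offset of the current PC), it answers (1): how much of the frame setup has actually run. Called without a limit, it answers (2): where the first non-prologue instruction is. It stops at anything it does not recognize, jumps included, and never follows them.

```python
from dataclasses import dataclass


@dataclass
class PrologueInfo:
    saved_ebp: bool          # saw `push %ebp` (opcode 0x55)
    frame_pointer_set: bool  # saw `mov %esp,%ebp` (89 e5 or 8b ec)
    locals_size: int         # imm8 from `sub $imm8,%esp` (83 ec ib), else 0
    prologue_end: int        # offset of the first byte not consumed


def scan_prologue(code, limit=None):
    """Scan `code` from offset 0, looking no further than `limit` bytes.

    Hypothetical helper for illustration only. `limit` plays the role of
    "current PC minus function start"; None means scan the whole buffer.
    """
    end = len(code) if limit is None else min(limit, len(code))
    info = PrologueInfo(False, False, 0, 0)
    pos = 0
    if pos < end and code[pos] == 0x55:
        info.saved_ebp = True
        pos += 1
        if pos + 2 <= end and code[pos:pos + 2] in (b"\x89\xe5", b"\x8b\xec"):
            info.frame_pointer_set = True
            pos += 2
            if pos + 3 <= end and code[pos:pos + 2] == b"\x83\xec":
                info.locals_size = code[pos + 2]
                pos += 3
    # Anything else (including a jmp) ends the scan; targets are not followed.
    info.prologue_end = pos
    return info


# push %ebp; mov %esp,%ebp; sub $0x18,%esp; jmp +2; nop; nop
example = bytes([0x55, 0x89, 0xE5, 0x83, 0xEC, 0x18, 0xEB, 0x02, 0x90, 0x90])
whole = scan_prologue(example)             # use (2): where does real code start?
partial = scan_prologue(example, limit=1)  # use (1): PC is just past the push
```

With the PC just past the `push`, the scanner reports that EBP has been saved but the frame pointer is not yet valid, which is exactly the kind of detail use (1) needs and use (2) does not care about.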
>
> > And for goodness sakes, if we can't figure out anything
> > about a function that's not at the top of the stack, don't you think
> > it'd be reasonable to assume that the function has set up a stack
> > frame and saved the caller's EBP?
>
> Because there is a GDB policy to determine information about the frame
> based on the current frame, not based on where it lies on the stack.
> I've experimented with this before; this change can have some weird
> consequences... for instance, in any case where we can backtrace
> through "foo" only because of the addition of this case, we won't be
> able to backtrace through "foo" if it is on top of the stack.
I'd say that's an expected behavior, but yes, it's true that this
can happen. It'd be great if the prologue analyzer never got confused
and could always figure out how to find a function's caller's saved
fp/pc, but even if we switch to using the opcodes disassembler so
we never lose on another instruction, on MacOS X we can have libraries
where the functions up the stack have no symbols whatsoever.
We have no idea where the function might begin--all we know is a saved
address in the middle of a function. In such a situation, is it
preferable that we can't backtrace past tricky functions like these?
After a month of working on the x86 port, I got so frustrated I wrote
a user command that could backtrace --
define x86-bt
  # Frame 0 comes straight from the live registers.
  set $frameno = 1
  set $cur_ebp = $ebp
  printf "frame 0 EBP: 0x%08x EIP: 0x%08x\n", $ebp, $eip
  x/1i $eip
  # The saved EBP is at [EBP], the return address at [EBP + 4].
  set $prev_ebp = *((uint32_t *) $cur_ebp)
  set $prev_eip = *((uint32_t *) ($cur_ebp + 4))
  # Follow the chain until a saved EBP of 0 marks the outermost frame.
  while $prev_ebp != 0
    printf "frame %d EBP: 0x%08x EIP: 0x%08x\n", $frameno, $prev_ebp, $prev_eip
    x/1i $prev_eip
    set $cur_ebp = $prev_ebp
    set $prev_ebp = *((uint32_t *) $cur_ebp)
    set $prev_eip = *((uint32_t *) ($cur_ebp + 4))
    set $frameno = $frameno + 1
  end
end
because I was having to do backtraces by manually walking the stack
so often. That's when I said, "enough is enough, this is stupid that
gdb can't do this."
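For readers who want to poke at the idea outside gdb, the same frame-pointer walk can be sketched in Python over a simulated memory image. This is only an illustration of what the x86-bt command above does: `walk_frames` and `read_u32` are hypothetical names, the addresses are invented, and the `max_frames` cap is an addition (the gdb command has no such guard) so that a corrupted chain that loops cannot run forever.

```python
def walk_frames(read_u32, ebp, eip, max_frames=64):
    """Follow the saved-EBP chain, as the x86-bt command does.

    read_u32(addr) must return the 32-bit word stored at addr.
    Returns a list of (frame number, EBP, EIP) tuples.
    max_frames is an assumption added here, not part of the gdb command.
    """
    frames = [(0, ebp, eip)]
    cur_ebp = ebp
    prev_ebp = read_u32(cur_ebp)       # caller's saved EBP at [EBP]
    prev_eip = read_u32(cur_ebp + 4)   # return address at [EBP + 4]
    frameno = 1
    while prev_ebp != 0 and frameno < max_frames:
        frames.append((frameno, prev_ebp, prev_eip))
        cur_ebp = prev_ebp
        prev_ebp = read_u32(cur_ebp)
        prev_eip = read_u32(cur_ebp + 4)
        frameno += 1
    return frames


# Invented two-frame stack: the frame at 0x1000 was called from the frame
# at 0x1020, whose own saved EBP is 0 (end of chain).
memory = {
    0x1000: 0x1020, 0x1004: 0x08048111,
    0x1020: 0x0000, 0x1024: 0x08048222,
}
frames = walk_frames(memory.__getitem__, ebp=0x1000, eip=0x08048000)
for number, frame_ebp, frame_eip in frames:
    print("frame %d EBP: 0x%08x EIP: 0x%08x" % (number, frame_ebp, frame_eip))
```

Like the gdb command, this only works when every function on the stack really did push EBP and set up a frame; a frameless function in the chain makes the walk skip its caller.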
> You can find more information about this in the list archives, in
> plenty of places; most recently Mark pulled together an implementation
> of "set i386 trust-frame-pointer".
Yeah, I couldn't comment at the time. Mark's change was wrong.
Oh, I'm so happy I don't live in that silly corporate world where you
grudgingly have to bite your tongue, because taking part in a
technical discussion would reveal a little too much about the
company's strategy ;-).
He said himself,
You probably want to reset it to 0 before continuing your program
since I found out that bad things happen with some of the tests
in the gdb testsuite with this turned on.
http://sourceware.org/ml/gdb/2005-04/msg00177.html
That's neither necessary nor acceptable. Mark's initial
reading of the Sleep() vs SleepEx() case was IMO not correct.
http://sourceware.org/ml/gdb/2005-04/msg00156.html
Sleep() sets up a stack frame, then jumps to SleepEx().
SleepEx doesn't set up a stack frame, but that's fine --
Sleep() did. This is another instance that bolsters my
"if the function MUST have stored the caller's pc/fp, assume
it did" method -- if you try to analyze SleepEx() where
the PC is, you'll see a frameless function. But it's in
the middle of the stack; it can't be frameless.
Ah, but SleepEx() is a valid entry point itself. That's where things
get tricky. I'm not saying that there is no way out, but I simply
don't see it.
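The disagreement can be boiled down to a few lines. The sketch below is a hypothetical restatement of the "if the function MUST have stored the caller's pc/fp, assume it did" rule, not code from any patch; `frame_is_set_up` and its parameters are invented names. It shows both sides: the rule lets an unanalyzable function unwind when it sits in the middle of the stack, and it gives a different answer for the very same function when it is frame 0, which is the inconsistency Daniel describes.

```python
def frame_is_set_up(frame_level, scan_saw_frame_setup):
    """Hypothetical decision rule, for illustration only.

    frame_level: 0 for the innermost frame, > 0 for callers up the stack.
    scan_saw_frame_setup: True if prologue analysis found push/mov of EBP.
    """
    if scan_saw_frame_setup:
        return True
    # Analysis found nothing. A function that is not on top of the stack
    # has called someone, so guess that a frame exists. The guess can be
    # wrong when control arrived by a jump into another entry point.
    return frame_level > 0


# Same unanalyzable function, two stack positions, two different answers.
in_middle_of_stack = frame_is_set_up(frame_level=2, scan_saw_frame_setup=False)
on_top_of_stack = frame_is_set_up(frame_level=0, scan_saw_frame_setup=False)
```

In the Sleep()/SleepEx() case the guess happens to be right only because Sleep() set up the frame before jumping; when SleepEx() is entered directly, nothing in this rule can tell the two situations apart.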
> > + /* We found a function-start address,
> > + or $pc is at 0x0 (someone jmp'ed thru NULL ptr). */
> > + if ((cache->pc != 0 || current_pc == 0)
>
> No way that's right. A jump through 0x0 is no different from a jump
> through any other unmapped, non-code address. Normally one uses a
> different frame unwinder for that case.
OK, I didn't know the right practice. Right now it goes through
i386_cache_frame. It's "frameless", of course, but we don't have
a function symbol for it so cache->pc (WTF is up with that structure
variable name, anyway--it means the start address of the function for
this frame) is 0 (i.e. unset).
It's probably historic; perhaps we should rename it.
Mark