new AVR prologue scanner (was a review and questions on avr_scan_prologue())

Petr Hluzín
Sun Feb 21 01:20:00 GMT 2010


On 16 February 2010 06:12, Weddington, Eric <> wrote:
>> -----Original Message-----
>> From: Petr Hluzín []
>> Also:
>> avr_scan_prologue() recognizes several well-known prologues. Is
>> there a reason why it does not use the general prologue analysis
>> algorithm as described in the documentation [2]?
> IIRC, the AVR GCC prologue and epilogue used to be some fixed
> code. Recently the AVR GCC port generates RTL-based prologues
> and epilogues. I'm sure AVR GDB has not kept up.
>> I think universal prologue analysis is quite easy with AVR arch. The
>> code might be shorter (though less clear).
>> I might try to write the code if you are interested.
>> (The current prologue scan code chokes on hand-crafted assembly.)
> Yes, we would be very interested.

Here is my idea of how it might work. (Sorry for the length.)

This algorithm uses the SP and IP values from the target (or the called
frame); other values are used only where explicitly noted. It keeps
the following state:
* for every register: the byte position after the return address where
it was saved (if it was saved) - the current implementation does that
* for every register: a literal value it was loaded with (if any)
* for every register: the value of the stack/frame size at the time
SPL/SPH was moved in (if ever)
* a flag recording whether we encountered a call (for __prologue_saves__()
handling) - this limits number of subroutines scanned - which might be

== The loop:
In a loop, process every instruction starting from the suspected
procedure entry point:

push: Increase the frame size. If the register was not already pushed or
damned, record its stack position.

pop: Decrease frame size. Some previously saved register may become
"unsaved" - should we handle this? (Also an ADD may trigger this.)
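
A minimal sketch of the push/pop handling, in Python. The opcode masks
are the standard AVR encodings (push is 1001 001r rrrr 1111, pop is
1001 000d dddd 1111). The function name, the dict/set used for
bookkeeping and the convention that position 1 is the first byte after
the return address are assumptions made for the example. Un-saving on
pop is left out, since that question is still open.

```python
PUSH_POP_MASK = 0xFE0F
PUSH_BITS = 0x920F
POP_BITS = 0x900F


def step_push_pop(insn, frame_size, saved_at, damned):
    """Return the frame size after a push or pop (unchanged otherwise).

    saved_at maps register number -> stack position of its first push.
    damned is the set of registers overwritten before being saved.
    """
    reg = (insn >> 4) & 0x1F
    if (insn & PUSH_POP_MASK) == PUSH_BITS:
        frame_size += 1
        if reg not in saved_at and reg not in damned:
            saved_at[reg] = frame_size
    elif (insn & PUSH_POP_MASK) == POP_BITS:
        frame_size -= 1
    return frame_size


saved = {}
size = step_push_pop(0x93CF, 0, saved, set())     # push r28
size = step_push_pop(0x93DF, size, saved, set())  # push r29
```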

ldi: Load of a constant into a register:
** note the new literal content; if the register was not already
pushed, mark it as damned
** discard the SP value in the register, if any
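
The ldi case could look roughly like this in Python. The decoding
follows the AVR encoding 1110 KKKK dddd KKKK (ldi only reaches r16..r31).
The helper names and the plain dict/set containers are made up for the
sketch.

```python
def decode_ldi(insn):
    """Return (register, constant) for an ldi opcode, else None."""
    if (insn & 0xF000) != 0xE000:
        return None
    reg = 16 + ((insn >> 4) & 0x0F)
    value = ((insn >> 4) & 0xF0) | (insn & 0x0F)
    return reg, value


def apply_ldi(insn, literal, saved_at, damned, sp_copy):
    """Update the tracking containers; return True if insn was an ldi."""
    decoded = decode_ldi(insn)
    if decoded is None:
        return False
    reg, value = decoded
    literal[reg] = value        # note the new literal content
    if reg not in saved_at:
        damned.add(reg)         # overwritten before it was pushed
    sp_copy.pop(reg, None)      # any SP value in the register is gone
    return True


literal, damned, sp_copy = {}, set(), {28: 4}
handled = apply_ldi(0xE0C5, literal, {}, damned, sp_copy)  # ldi r28, 5
```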

move-in of an SPL/SPH value (in): Copy the lower/upper half of the
frame size to the register.

subi, sbci, sbiw: subtract a literal from a register containing an
SPL/SPH value (i.e. frame size):
* add the subtracted value to the frame size stored in the register.
* This also handles the case when the SP value was already moved out to
the IO reg or there has been a push while SP was in the register
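
For illustration, decoding the three subtract-immediate forms and
applying sbiw to a tracked SP copy might be sketched as below (Python).
The encodings are the standard ones: subi is 0101 KKKK dddd KKKK, sbci is
0100 KKKK dddd KKKK, sbiw is 1001 0111 KKdd KKKK. Keeping the frame size
in a dict keyed by the low register of the pair is an assumption of this
sketch, and the subi/sbci pair with its carry is not modelled.

```python
def decode_sub_imm(insn):
    """Return (mnemonic, register, constant) or None."""
    top = insn & 0xF000
    if top in (0x5000, 0x4000):
        reg = 16 + ((insn >> 4) & 0x0F)
        k = ((insn >> 4) & 0xF0) | (insn & 0x0F)
        return ("subi" if top == 0x5000 else "sbci", reg, k)
    if (insn & 0xFF00) == 0x9700:
        reg = 24 + 2 * ((insn >> 4) & 0x03)   # r24, r26, r28 or r30
        k = ((insn >> 2) & 0x30) | (insn & 0x0F)
        return ("sbiw", reg, k)
    return None


def apply_sbiw(insn, sp_frame):
    """sp_frame maps the low register of a pair holding an SP copy
    to the frame size that copy currently represents."""
    decoded = decode_sub_imm(insn)
    if decoded and decoded[0] == "sbiw" and decoded[1] in sp_frame:
        sp_frame[decoded[1]] += decoded[2]


sp_frame = {28: 2}              # Y holds SP, two bytes already pushed
apply_sbiw(0x9721, sp_frame)    # sbiw r28, 1
```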

sub, sbc: subtract register containing a constant from a register
containing a SPL/SPH value (i.e. frame size):
* handle as the case above

subi, sbci, sbiw: subtract literal from a register already containing
a constant:
* (unoptimized code?) adjust the constant <- we have to track the
carry if we want 16b. Yuck, might not impl that.

subi, sbci, sbiw: subtract literal (remaining cases): ignore

out: moving-out a register with SPL/SPH to appropriate IO register:
* record new frame size (note: might be negative if gcc moves SPL first)
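
Recognizing the move-in and move-out of SPL/SPH comes down to decoding
in/out and checking the I/O address. A small Python sketch follows; the
encodings (in: 1011 0AAd dddd AAAA, out: 1011 1AAr rrrr AAAA) and the
I/O addresses 0x3d/0x3e for SPL/SPH are the usual AVR ones, while the
function names are invented here.

```python
SPL_IO = 0x3D
SPH_IO = 0x3E


def decode_in_out(insn):
    """Return ("in" | "out", register, io_address) or None."""
    top = insn & 0xF800
    if top not in (0xB000, 0xB800):
        return None
    reg = (insn >> 4) & 0x1F
    io_addr = ((insn >> 5) & 0x30) | (insn & 0x0F)
    return ("in" if top == 0xB000 else "out", reg, io_addr)


def sp_half(io_addr):
    """Return "low" for SPL, "high" for SPH, None for anything else."""
    return {SPL_IO: "low", SPH_IO: "high"}.get(io_addr)


move_in = decode_in_out(0xB7CD)    # in r28, 0x3d
move_out = decode_in_out(0xBFDE)   # out 0x3e, r29
```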

add, adc, adiw: The same logic for adding.

mov, movw: Move between registers: record them.
* the source register may contain SP, handle that
* do not modify the stack position where a register was pushed
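
A possible shape for the mov/movw case, sketched in Python. The
encodings are mov: 0010 11rd dddd rrrr and movw: 0000 0001 dddd rrrr
(register pairs). The "contents" dict, holding whatever is known about
a register (a constant or an SP half), is a made-up representation; the
saved stack positions are deliberately not passed in, so they cannot be
modified.

```python
def decode_move(insn):
    """Return a list of (dst, src) register copies, empty if not a move."""
    if (insn & 0xFC00) == 0x2C00:                       # mov Rd, Rr
        dst = (insn >> 4) & 0x1F
        src = ((insn >> 5) & 0x10) | (insn & 0x0F)
        return [(dst, src)]
    if (insn & 0xFF00) == 0x0100:                       # movw Rd+1:Rd, Rr+1:Rr
        dst = 2 * ((insn >> 4) & 0x0F)
        src = 2 * (insn & 0x0F)
        return [(dst, src), (dst + 1, src + 1)]
    return []


def apply_move(insn, contents):
    """Copy the tracked content (constant or SP half) along with the move."""
    for dst, src in decode_move(insn):
        if src in contents:
            contents[dst] = contents[src]
        else:
            contents.pop(dst, None)   # destination now holds something unknown


contents = {24: ("const", 5), 29: ("sp", "high")}
apply_move(0x2FC8, contents)          # mov r28, r24
```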

Any other instruction modifying a register: discard all info for the
reg, specifically:
* if it was not saved then mark it as damned
* mark it as not containing SPL/SPH
* mark it as not containing a constant
* this leads to a lot of code

'rcall .+0': handle as push of 2 or 3 bytes.
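
As a tiny sketch (Python): 'rcall .+0' encodes as 0xd000, and the
number of bytes it pushes equals the size of the program counter on the
stack, 2 bytes on most devices and 3 on those with more than 128 KB of
flash. The function name and the pc_bytes parameter are hypothetical.

```python
RCALL_PLUS_0 = 0xD000


def rcall_frame_growth(insn, pc_bytes=2):
    """Return how many bytes 'rcall .+0' adds to the frame, 0 otherwise."""
    if pc_bytes not in (2, 3):
        raise ValueError("pc_bytes must be 2 or 3")
    return pc_bytes if insn == RCALL_PLUS_0 else 0


growth_small = rcall_frame_growth(0xD000)
growth_large = rcall_frame_growth(0xD000, pc_bytes=3)
```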

call, rcall: Follow it, it might be a __prologue_saves__().
icall, eicall: Give up.

ijmp, (eijmp): A jump to calculated address and we know the value:
* possibly a __prologue_saves__() return, follow it.

ijmp, (eijmp): A jump to calculated address and we do not know the
value: give up.

A conditional jump or skip: This is a problem if -fomit-frame-pointer.
Shall we do e.g. recursion? and keep an array of processed

Other instructions either have no effect on the frame shape, or a setup
using them would be both sub-optimal and not naive.

Loop exit conditions:
* reached the instruction pointed to by IP
* examined too many instructions
If we have lost track of how SP was manipulated, then use the frame ptr
offset together with the SP and frame ptr values from the target.
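
The loop skeleton with its two exit conditions might be sketched like
this in Python. Everything here is hypothetical naming: "code" is a dict
from word address to 16-bit opcode, "step" is whatever per-instruction
handler is plugged in, and max_insns is an arbitrary limit. For brevity
the sketch treats every instruction as one word, which is not true for
call, jmp, lds and sts.

```python
def scan_prologue(code, entry_pc, current_pc, step, state, max_insns=100):
    """Walk from entry_pc towards current_pc, feeding opcodes to step().

    Returns (state, reason) where reason is "reached_ip" or "gave_up".
    """
    pc = entry_pc
    examined = 0
    while pc != current_pc:
        if examined >= max_insns or pc not in code:
            return state, "gave_up"
        state = step(state, code[pc])
        pc += 1
        examined += 1
    return state, "reached_ip"


def count_pushes(count, insn):
    # Toy handler: only counts push instructions.
    return count + 1 if (insn & 0xFE0F) == 0x920F else count


code = {0: 0x93CF, 1: 0x93DF, 2: 0xB7CD}   # push r28; push r29; in r28, 0x3d
result = scan_prologue(code, 0, 2, count_pushes, 0)
```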

Prologue types/considerations:
* a bunch of locals, later allocating another bunch of locals, and again
* pushes,pops,pushes,... - no frame pointer
* frame pointer combined with pushes and pops
* frame pointer combined with pushes and pops and -mcall-saves
* interrupt service routines (ISR): require no special code/care
* should we handle a situation with push or pop while a SP copy is
being subtracted? (I think we do now)
* should we handle incorrect prologues? Can we fail silently (=print mess)?
* should we handle prologues that are suboptimal (e.g. unnecessary
modifications of vital registers)?
* when to give up looking for a -mcall-saves routine? Until first
examined subroutine? This means that compiler must not use the routine
outside of prologue.
* how to handle/detect main()? Or its caller in C-runtime?
* how to detect thread entry functions?
* detect "st Y+,r0"?
* -fomit-frame-pointer:
This is tougher than I thought. (Many functions I've seen are OK. The
problem is with certain loops, switch-case and full if-else.)
I suggest analyzing basic blocks reachable without a call until an
instruction with the current IP is found (or we give up), keeping a list
of addresses of analyzed basic blocks. This is potentially slow.
(Might cache the results - source of bugs.)
I do not know how other archs handle it.
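
The basic-block idea for -fomit-frame-pointer could be sketched as a
plain worklist search (Python). This is only an illustration under
strong assumptions: "successors" and "delta" are hypothetical callbacks
giving a block's successor addresses and its net change of frame size,
and the frame size at a block is assumed to be the same along every
path that reaches it.

```python
def frame_size_at(entry, target, successors, delta, max_blocks=64):
    """Return the frame size on entry to the target block, or None."""
    analyzed = set()            # addresses of basic blocks already done
    work = [(entry, 0)]
    while work:
        block, size = work.pop()
        if block == target:
            return size
        if block in analyzed:
            continue
        if len(analyzed) >= max_blocks:
            return None         # give up
        analyzed.add(block)
        for nxt in successors(block):
            work.append((nxt, size + delta(block)))
    return None


# Toy control flow: block 0 pushes two bytes, block 4 loops on itself.
edges = {0: [4], 4: [4, 8], 8: []}
deltas = {0: 2, 4: 0, 8: 0}
size_at_8 = frame_size_at(0, 8, lambda b: edges.get(b, []),
                          lambda b: deltas.get(b, 0))
```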

A backup strategy?
E.g. search stack for bytes which (when interpreted as a return
address) lie just after a call instruction. Others?
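
A rough Python sketch of such a stack search. Assumptions, all mine: a
2-byte return address stored high byte first in memory, flash given as a
dict from word address to 16-bit opcode, and made-up function names. The
call opcodes checked are rcall (1101 kkkk kkkk kkkk), icall (0x9509),
eicall (0x9519) and the two-word call (1001 010k kkkk 111k). False
positives are to be expected, e.g. when the second word of some other
instruction happens to look like a call.

```python
def follows_call(flash, ret):
    """True if the word address ret lies just after a call instruction."""
    prev1 = flash.get(ret - 1)
    prev2 = flash.get(ret - 2)
    if prev1 is not None:
        if (prev1 & 0xF000) == 0xD000:      # rcall
            return True
        if prev1 in (0x9509, 0x9519):       # icall, eicall
            return True
    if prev2 is not None and (prev2 & 0xFE0E) == 0x940E:   # call (2 words)
        return True
    return False


def candidate_return_addresses(stack, flash):
    """Return (offset, word_address) for each plausible return address."""
    found = []
    for off in range(len(stack) - 1):
        ret = (stack[off] << 8) | stack[off + 1]
        if follows_call(flash, ret):
            found.append((off, ret))
    return found


flash = {0x10: 0x940E, 0x11: 0x0040, 0x20: 0xD005}
stack = bytes([0xAA, 0x00, 0x12, 0x00, 0x21])
candidates = candidate_return_addresses(stack, flash)
```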

The proposed code will fail when single-stepping the last 4 instructions
of __prologue_saves__() - unless it takes into consideration the
register values from the target. (It will self-heal after a return to

The proposed code will fail on code where we cannot see the beginning
of the function, e.g. a binary without any symbols or a C++ exception
pad (not on AVR yet). (This can be improved, though.)

Petr Hluzin
