Internals/Prologue Analysis

Prologue Analysis

To produce a backtrace and allow the user to manipulate older frames’ variables and arguments, GDB needs to find the base addresses of older frames, and discover where those frames’ registers have been saved. Since a frame’s “callee-saves” registers get saved by younger frames if and when they’re reused, a frame’s registers may be scattered unpredictably across younger frames. This means that changing the value of a register-allocated variable in an older frame may actually entail writing to a save slot in some younger frame.

Modern versions of GCC emit Dwarf call frame information (“CFI”), which describes how to find frame base addresses and saved registers. But CFI is not always available, so as a fallback GDB uses a technique called prologue analysis to find frame sizes and saved registers. A prologue analyzer disassembles the function’s machine code starting from its entry point, and looks for instructions that allocate frame space, save the stack pointer in a frame pointer register, save registers, and so on. Obviously, this can’t be done accurately in general, but it’s tractable to do well enough to be very helpful. Prologue analysis predates the GNU toolchain’s support for CFI; at one time, prologue analysis was the only mechanism GDB used for stack unwinding at all, when the function calling conventions didn’t specify a fixed frame layout.

In the olden days, function prologues were generated by hand-written, target-specific code in GCC, and treated as opaque and untouchable by optimizers. Looking at this code, it was usually straightforward to write a prologue analyzer for GDB that would accurately understand all the prologues GCC would generate. However, over time GCC became more aggressive about instruction scheduling, and began to understand more about the semantics of the prologue instructions themselves; in response, GDB’s analyzers became more complex and fragile. Keeping the prologue analyzers working as GCC (and the instruction sets themselves) evolved became a substantial task.

To try to address this problem, the code in prologue-value.h and prologue-value.c provides a general framework for writing prologue analyzers that are simpler and more robust than ad-hoc analyzers. When we analyze a prologue using the prologue-value framework, we’re really doing “abstract interpretation” or “pseudo-evaluation”: running the function’s code in simulation, but using conservative approximations of the values registers and memory would hold when the code actually runs. For example, if our function starts with the instruction:

addi r1, 42     # add 42 to r1

we don’t know exactly what value will be in r1 after executing this instruction, but we do know it’ll be 42 greater than its original value.

If we then see an instruction like:

addi r1, 22     # add 22 to r1

we still don’t know what r1's value is, but again, we can say it is now 64 greater than its original value.

If the next instruction were:

mov r2, r1      # set r2 to r1's value

then we can say that r2's value is now the original value of r1 plus 64.

It’s common for prologues to save registers on the stack, so we’ll need to track the values of stack frame slots, as well as the registers. So after an instruction like this:

mov (fp+4), r2

then we’d know that the stack slot four bytes above the frame pointer holds the original value of r1 plus 64.

And so on.

Of course, this can only go so far before it gets unreasonable. If we wanted to be able to say anything about the value of r1 after the instruction:

xor r1, r3      # exclusive-or r1 and r3, place result in r1

then things would get pretty complex. But remember, we’re just doing a conservative approximation; if exclusive-or instructions aren’t relevant to prologues, we can just say r1’s value is now “unknown”. We can ignore things that are too complex, if that loss of information is acceptable for our application.

So when we say “conservative approximation” here, what we mean is an approximation that is either accurate, or marked “unknown”, but never inaccurate.

Using this framework, a prologue analyzer is simply an interpreter for machine code, but one that uses conservative approximations for the contents of registers and memory instead of actual values. Starting from the function’s entry point, you simulate instructions up to the current PC, or an instruction that you don’t know how to simulate. Now you can examine the state of the registers and stack slots you’ve kept track of.

To see how large your stack frame is, just check the value of the stack pointer register; if it’s the original value of the SP minus a constant, then that constant is the stack frame’s size. If the SP’s value has been marked as “unknown”, then that means the prologue has done something too complex for us to track, and we don’t know the frame size.
To see where we’ve saved the previous frame’s registers, we just search the values we’ve tracked — stack slots, usually, but registers, too, if you want — for something equal to the register’s original value. If the calling conventions suggest a standard place to save a given register, then we can check there first, but really, anything that will get us back the original value will probably work.

This does take some work. But prologue analyzers aren’t quick-and-simple pattern patching to recognize a few fixed prologue forms any more; they’re big, hairy functions. Along with inferior function calls, prologue analysis accounts for a substantial portion of the time needed to stabilize a GDB port. So it’s worthwhile to look for an approach that will be easier to understand and maintain. In the approach described above:

It’s easier to see that the analyzer is correct: you just see whether the analyzer properly (albeit conservatively) simulates the effect of each instruction.
It’s easier to extend the analyzer: you can add support for new instructions, and know that you haven’t broken anything that wasn’t already broken before.
It’s orthogonal: to gather new information, you don’t need to complicate the code for each instruction. As long as your domain of conservative values is already detailed enough to tell you what you need, then all the existing instruction simulations are already gathering the right data for you.

The file prologue-value.h contains detailed comments explaining the framework and how to use it.

None: Internals/Prologue Analysis (last edited 2015-02-15 02:22:27 by SimonMarchi)