This is the mail archive of the mailing list for the elfutils project.

Re: [PATCH] backends: Hook abi_cfi for arm.

> BTW I used LR as return_address_register on s390 and ppc because there is no
> DWARF number for PC and the CFI seems wrongly addressing LR there (moreover
> under two different numbers - and really not that one of them should be PC).

return_address_register does not have to be a register number that has a
meaning assigned in the spec.  It is really just a private choice for the
particular CIE.  If there is no specified number for the PC, then it can be
any number that is not specified as being for another particular register.

> So in the case of ARM having real DWARF number for PC it should be
> probably used for return_address_register.  But I do not know how to read
> .ARM.extab to verify how the unwind tables look there - if they use the
> register 15.

There is readelf -u in binutils and see
(actual spec in
).  And you can look at .debug_frame tables.  But that's really neither
here nor there.  It doesn't matter what the compiler does for some
particular FDEs (or for all that it emits).  All that matters is that the
number you choose for a particular CIE is one for which its FDEs correctly
describe how to rematerialize the return address, and that it is not a
number specified to mean a particular register whose value in the caller
differs from the return address.

For machines with a link-register calling convention, "the caller's value
of the link register" is the same thing as "the return address" on entry.
The benefit of distinguishing them is that you can more precisely describe
what is required for unwinding without falsely stating that more is
required.  That is, the link register is call-clobbered in the ABI and to
unwind to the caller it would be sufficient to set the PC to the return
address while not restoring the link register.  By using a different
return_address_register, you could describe a situation where, e.g. the
return address is recoverable from the stack or another register, but the
link register per se will be left as "undefined".  In the case of ARM,
imagine the code:

	mov r0, lr
	mov lr, #0
	bx r0

Now, it would be correct enough to say that return_address_register is r0,
that r0's rule is same-value, and that lr's rule is undefined (or is
register r0, if you like).  That is not entirely precise, since it says the
caller's value of r0 was knowable when in fact it's undefined.  But as to
the return address and lr, it's perfectly correct and precise.  In actual
fact, the compiler would never generate code like that because the hardware
makes it optimal always to use "bx lr" for return (it has a call/return
prediction stack and "bx lr" is specifically recognized as "return").
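
To make the r0 example concrete, here is a minimal sketch of one unwind
step in Python.  It is a toy model, not the elfutils API: unwind_step, the
string rule names, and the address 0x8004 are all made up for illustration.
The only outside fact it relies on is the ARM DWARF numbering, where r0..r15
are columns 0..15 (so lr is 14).

```python
# Simplified, hypothetical model of one CFI unwind step (not the elfutils API).
# ARM DWARF register numbers: r0..r15 are 0..15, so lr is 14.
R0, LR = 0, 14


def unwind_step(regs, rules, return_address_register):
    """Apply per-column rules to the callee's registers.

    Returns (caller_regs, caller_pc).  Columns whose rule is "undefined"
    are simply absent from caller_regs.
    """
    caller_regs = {}
    for column, rule in rules.items():
        if rule == "same_value":
            caller_regs[column] = regs[column]
        elif isinstance(rule, tuple) and rule[0] == "register":
            caller_regs[column] = regs[rule[1]]
        elif rule != "undefined":
            raise ValueError(f"unsupported rule: {rule!r}")
    # The caller's PC is whatever the return-address column evaluates to.
    return caller_regs, caller_regs[return_address_register]


# Machine state just before "bx r0": r0 holds the return address, lr is 0.
# The address 0x8004 is an arbitrary illustrative value.
state = {R0: 0x8004, LR: 0}

# Choice 1: return_address_register is r0 (same-value), lr is undefined.
regs_a, pc_a = unwind_step(
    state, {R0: "same_value", LR: "undefined"}, return_address_register=R0)

# Choice 2: same, but lr's rule is "register r0".
regs_b, pc_b = unwind_step(
    state, {R0: "same_value", LR: ("register", R0)}, return_address_register=R0)

print(hex(pc_a), LR in regs_a)    # caller PC found, lr reported unavailable
print(hex(pc_b), hex(regs_b[LR])) # caller PC found, lr recovered from r0
```

Either choice yields the same caller PC; they differ only in whether the
caller's lr is reported as recoverable.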

So it's not really clear which is "better", because it depends on what your
goals are.  

If you just want to unwind and that's all, then it's best to arrange that
unwinding does as little work as possible.  It's not necessary to restore
the caller's value to lr, since lr is call-clobbered anyway.  So then it's
better to use the PC number (or an unassigned number, on machines with no
number assigned for the PC) as return_address_register and an undefined
rule for lr.  Depending on the code, that may even be a more precisely
correct description, because the code might return to the caller's lr
without putting that address back into lr:

	push {..., lr}
	bl other_function @ clobber lr
	pop {..., pc} @ restore regs & return in one insn, leave lr clobbered

That is, the epilogue won't restore lr, so unwinding is closer to the
effect of a natural return if it doesn't restore lr either.

For general debugging purposes, we usually say that the compiler should
describe how to recover everything that truly can be recovered.  (This is a
different predicate from "everything that a natural return would actually
restore".)  So it's nice to describe lr with a rule other than undefined
(i.e. same-value in the trivial case), because that means that when you go
up a frame you can see the value the register truly had at the call site
rather than being told it's unavailable.  Then it just seems redundant and
wasteful to use a different value for return_address_register, because then
you need to process a pc=lr rule as well (or if lr is no longer same-value,
you need to repeat the same rule for lr and for return_address_register).
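
The trade-off between these two goals can be sketched with the same kind of
toy model, applied to the push/pop function above while it is in its body
(after the bl).  Everything here is an assumption made for illustration: the
saved register is taken to be r4 (the original elides the list), the CFA is
taken to be sp on entry so that lr sits at CFA-4 and r4 at CFA-8, and the
addresses and values are arbitrary.  Rules are reduced to "saved at
CFA+offset" or undefined (None).

```python
# Hypothetical model: three ways a CIE/FDE could describe the same frame.
# ARM DWARF numbering: r4 = 4, lr = 14, pc = 15.
R4, LR, PC = 4, 14, 15


def unwind(cfa, memory, rules, return_address_register):
    """Evaluate offset(N) rules; a rule of None means 'undefined'."""
    caller = {}
    for column, offset in rules.items():
        if offset is None:
            continue                      # undefined: not recoverable
        caller[column] = memory[cfa + offset]
    return caller, caller[return_address_register]


# Assumed stack layout after "push {r4, lr}" with CFA = sp at entry.
CFA = 0x7000
memory = {CFA - 8: 0x1111,   # caller's r4 (arbitrary value)
          CFA - 4: 0x8004}   # incoming lr, i.e. the return address

descriptions = {
    # Unwind-only: PC column carries the return address, lr is undefined.
    "unwind-only": (PC, {R4: -8, PC: -4, LR: None}),
    # Debugger-friendly: lr is recoverable and doubles as the return column.
    "debug-friendly": (LR, {R4: -8, LR: -4}),
    # Separate return column plus a recoverable lr: the rule is repeated.
    "redundant": (PC, {R4: -8, LR: -4, PC: -4}),
}

results = {}
for name, (ra_column, rules) in descriptions.items():
    caller, pc = unwind(CFA, memory, rules, ra_column)
    results[name] = (caller, pc)
    print(name, hex(pc), "lr recoverable:", LR in caller)
```

All three descriptions unwind to the same caller PC.  The first does not
claim lr is recoverable, the second recovers it with no extra rule, and the
third needs the same rule twice.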

OTOH, if return_address_register != lr and the rule for
return_address_register gives a location that is mutable (i.e. some other
register or memory location) and that location is not also used in some
other register's rule, then the CFI describes how you can change the
machine state to warp the eventual return of the frame without changing any
caller register.  From the debugger perspective that seems ideal, but a
case where you could actually do that seems extremely unlikely (unless you
give lr an undefined rule so that the stack slot where the incoming lr was
saved is described as being the return address, as in the pop-to-pc example
above).

