Quick access to stack bounds
Richard Henderson
rth@twiddle.net
Sun Dec 6 03:18:00 GMT 2009
There is a need for quick access to the bounds of the (normal) stack.
(1) The transactional memory undo-log must cancel entries that are
below the "current" stack pointer. If we fail to do so, we risk
destroying the stack frames of the transactional memory library
itself. This can happen in the following way:
    void foo(int *x)              =>    void foo_safe(int *x)
    {                                   {
      *x = stuff ();                      _ITM_WU4 (x, stuff_safe ());
    }                                   }

    void bar()                    =>    void bar_safe()
    {                                   {
      int y[10], i;
      for (i = 0; i < 10; ++i)
        foo(y+i);                           foo_safe(y+i);
    }                                   }

    void baz()                    =>    void baz()
    {                                   {
      __transaction {                     _ITM_beginTransaction ();
                                          ... and other init stuff
        bar();                            bar_safe ();
      }                                   _ITM_commitTransaction ();
    }                                   }

The translation of FOO does not know from whence its argument comes,
and so properly uses the _ITM_WU4 accessor to write to memory under
the control of the STM. However, the memory does come from BAR, and
is invalid once we return to BAZ and commit the transaction. Indeed,
the memory that we write to could be the return address of
_ITM_commitTransaction or one of its subroutines.
Now, it's possible for _ITM_commitTransaction to record its CFA on
entry and cancel everything between there and whatever value happens
to be in esp at the point we do each commit. But that would not be
as efficient (or as reliable) as being able to cancel everything
between that CFA and the absolute top of the stack.
(2) The Split Stacks feature that Google is developing requires extremely
quick access to the top of the stack minus a small buffer reserved
for allocating additional stack.
(3) An off-stack trampoline patch must determine if the current stack
pointer is above or below a previous trampoline allocation. There
are two kinks to a simple comparison: (a) signal stacks and
(b) split stacks. A quick filter for both is to know the bounds of
the active stack. E.g. one can avoid a call to sigaltstack if we
see that esp is within the normal thread stack.
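
To make the cancellation step in (1) concrete, here is a minimal sketch in C. It is not taken from any existing transactional memory library: struct undo_entry, in_range and cancel_dead_stack_entries are hypothetical names, and the code assumes a downward-growing stack, so that the "top" of the stack is its lowest address and everything in [stack_top, cfa) is dead once we are inside the commit routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical undo-log entry: the address that was written, the value
   to restore on abort, and a flag marking the entry as cancelled. */
struct undo_entry {
    uint32_t *addr;
    uint32_t  old_value;
    int       cancelled;
};

/* Nonzero if ADDR lies in the half-open range [lo, hi). */
static int in_range(const void *addr, uintptr_t lo, uintptr_t hi)
{
    uintptr_t a = (uintptr_t)addr;
    return a >= lo && a < hi;
}

/* Cancel every entry whose target lies between the top of the stack
   (lowest address, assumed downward growth) and the CFA recorded on
   entry to the commit routine.  Addresses at or above CFA belong to
   frames that are still live and are left alone.  Returns the number
   of entries newly cancelled. */
static size_t cancel_dead_stack_entries(struct undo_entry *entries, size_t n,
                                        uintptr_t stack_top, uintptr_t cfa)
{
    size_t cancelled = 0;
    for (size_t i = 0; i < n; ++i) {
        if (!entries[i].cancelled
            && in_range(entries[i].addr, stack_top, cfa)) {
            entries[i].cancelled = 1;
            ++cancelled;
        }
    }
    return cancelled;
}
```

The point of having the absolute top of the stack available is that the range test needs only two comparisons against fixed values, instead of depending on whatever esp happens to hold at each commit.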
The proposal is as follows:

  #define AT_STACK_TOP 26

      Filled in by the kernel with the top of the stack of the main
      thread. The bottom of the stack is of course __libc_stack_end.

  #define SPLIT_STACK_AVAILABLE 256

      From gcc's i386 split stack implementation. If the program is
      to use split stacks at all this needs to be part of the ABI.

  tcbhead_t->__private_tm[3]

      Redefined to be the bottom of the stack of any thread.

  tcbhead_t->__private_tm[4]

      Redefined to be the top of the stack + SPLIT_STACK_AVAILABLE.

  _dl_start and __libc_setup_tls

      Need to initialize tcbhead_t->__private_tm[3,4] from the
      stack bounds indicated by AT_STACK_TOP and __libc_stack_end.

      If the kernel doesn't provide AT_STACK_TOP, we *could* use
      getrlimit(RLIMIT_STACK) to fill in the value instead. I'm
      undecided whether it's better to simply do this once in libc
      or force all users of this interface to check for NULL and
      fill it in from getrlimit themselves.

  create_thread

      Needs to initialize tcbhead_t->__private_tm[3,4] from the
      stack bounds with which the thread is created.

  __libc_tcb_stack_bounds

      Users of tcbhead_t->__private_tm[3,4] include

          .global __libc_tcb_stack_bounds

      in their source to produce an (otherwise unused) undefined
      symbol in the object file. Given --no-undefined, we can
      notice at link time whether the installed libc supports the
      feature when the user is built. That also installs the
      appropriate glibc version definition in the binary, so that
      we can determine at dynamic link time that the feature is
      available.

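
As a rough illustration of the initialization and the getrlimit fallback described above, here is a sketch in C. Nothing here is glibc code: struct stack_bounds is a hypothetical stand-in for tcbhead_t->__private_tm[3,4], and top_from_rlimit, make_bounds and sp_in_bounds are made-up helper names. It assumes a downward-growing stack, treats a zero AT_STACK_TOP value as "not provided by the kernel", and takes the soft RLIMIT_STACK as the maximum extent of the main thread's stack, which is only an approximation.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/resource.h>

#define SPLIT_STACK_AVAILABLE 256   /* value given in the proposal */

/* Hypothetical stand-in for tcbhead_t->__private_tm[3] and [4]. */
struct stack_bounds {
    uintptr_t bottom;   /* bottom of the stack (highest address) */
    uintptr_t limit;    /* top of the stack + SPLIT_STACK_AVAILABLE,
                           or 0 if the top could not be determined */
};

/* Estimate the top (lowest address) of the stack from the soft stack
   limit.  Returns 0 when the limit is unlimited or does not fit below
   STACK_END, i.e. when no estimate is possible. */
static uintptr_t top_from_rlimit(uintptr_t stack_end, rlim_t soft_limit)
{
    if (soft_limit == RLIM_INFINITY || soft_limit > stack_end)
        return 0;
    return stack_end - (uintptr_t)soft_limit;
}

/* Build the bounds from the stack end and the AT_STACK_TOP value,
   falling back to getrlimit(RLIMIT_STACK) when the latter is 0. */
static struct stack_bounds make_bounds(uintptr_t stack_end,
                                       uintptr_t at_stack_top)
{
    struct stack_bounds b = { stack_end, 0 };
    uintptr_t top = at_stack_top;

    if (top == 0) {
        struct rlimit rl;
        if (getrlimit(RLIMIT_STACK, &rl) == 0)
            top = top_from_rlimit(stack_end, rl.rlim_cur);
    }
    if (top != 0)
        b.limit = top + SPLIT_STACK_AVAILABLE;
    return b;
}

/* Quick filter: nonzero if SP lies within the known bounds, leaving
   the SPLIT_STACK_AVAILABLE reserve untouched. */
static int sp_in_bounds(const struct stack_bounds *b, uintptr_t sp)
{
    return b->limit != 0 && sp >= b->limit && sp < b->bottom;
}
```

Doing the fallback once, as make_bounds does, corresponds to the "do this once in libc" option; under the other option each user would have to test for the zero value and call getrlimit itself.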
Have I missed anything?
r~