This is the mail archive of the
mailing list for the glibc project.
Re: [RFC] nptl: change default stack guard size of threads
On 06/12/2017 18:41, Szabolcs Nagy wrote:
> On 06/12/17 14:27, Wilco Dijkstra wrote:
>> Florian Weimer wrote:
>>> On 11/29/2017 11:28 PM, Wilco Dijkstra wrote:
>>>> It's not related to what GLIBC needs, but GLIBC, like Linux, must continue to
>>>> run old binaries so a larger guard size is definitely beneficial. No existing code
>>>> uses probing, so increasing the guard size means far fewer functions could
>>>> jump the guard. The dropoff is exponential, each doubling of guard page size
>>>> halves the number of functions with a stack larger than the guard size.
>>> That's not actually true in the sense that the math doesn't work out
>>> that way. If you have a weird function which skips by a really large
>>> amount, you can double the guard size many, many times until the number
>>> of unprotected functions drops further.
>>> And there is definitely a long tail here, due to GNU's affinity to
>>> variable-length arrays and alloca.
>> The math works fine for most of the curve. The measurements I did
>> show that the number of functions with really large stack allocations is
>> extremely small. So it's a long tail only in terms of maximum stack
>> allocation, not in number of functions.
> with 4k probe interval about 1% of functions need probes
> with 64k probe interval about 0.01% (order of magnitude,
> alloca not included), so increasing the default guard can
> be useful for existing code.
Do you mean the functions in glibc itself, or is this a generic analysis?
How did you get these numbers?
> however i have doubts about mandating 64k guard page size
> on aarch64.
> in principle 64k guard on aarch64 should not cause problems
> since there are systems with 64k page size already,
> but there are some glibc issues:
> bug 11787 needs fixing (at least the guardsize accounting
> part) to avoid wasting stack.
I do not see BZ#11787 as a prerequisite for this change, although
it does interfere with thread stack allocation. Mainly because a
potential fix for 11787 will require a different way to account for
the guard page, which will lead to its own possible issues.
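To make the accounting concern concrete, here is a minimal sketch of how
an application could request a larger guard explicitly with the POSIX
attribute calls. The helper name attr_init_with_guard is hypothetical,
and adding the guard on top of the requested stack size is only an
assumption about how one might compensate if the implementation carves
the guard out of the stack mapping; it is not what glibc itself does.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper, not a glibc interface: initialise ATTR with an
   explicit guard size and grow the requested stack by the same amount,
   so the usable stack does not shrink if the guard is taken out of the
   stack mapping (an assumption, see the text above).
   Returns 0 on success or an errno-style error number.  */
static int
attr_init_with_guard (pthread_attr_t *attr, size_t usable_stack,
                      size_t guard)
{
  assert (attr != NULL);

  int err = pthread_attr_init (attr);
  if (err != 0)
    return err;

  /* The implementation may round the guard up to a page multiple when
     the thread is created; the attribute keeps the requested value.  */
  err = pthread_attr_setguardsize (attr, guard);
  if (err == 0)
    err = pthread_attr_setstacksize (attr, usable_stack + guard);

  if (err != 0)
    pthread_attr_destroy (attr);
  return err;
}
```

A caller would pass the resulting attribute to pthread_create and then
destroy it with pthread_attr_destroy. Note that this only helps code
that sets attributes itself; it does nothing for threads created with
default attributes.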
> some applications control their own memory use via RLIMIT_AS
> which means we cannot arbitrarily waste address space either.
> the right logic for pthread_attr_init, __pthread_get_minstack,
> pthread_getattr_default_np, pthread_getattr_np, allocate_stack,
> __libc_alloca_cutoff is not obvious. and backporting a change
> with minimal impact may be difficult.
I really wish we could just get rid of the internal __libc_alloca_cutoff
usage; it is a very bad API and we have better ones to deal with
potentially unbounded data (dynarray, for instance, with which I recently
posted a glob refactor to remove alloca usage).
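As an illustration of the general pattern (not the dynarray API itself
and not the glob patch), here is a minimal sketch that replaces an
unbounded alloca with a fixed-size on-stack buffer and a heap fallback.
The names rotate_left and STACK_BUF_SIZE are hypothetical, and the 256
byte limit is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical limit for the example; not a glibc constant.  */
enum { STACK_BUF_SIZE = 256 };

/* Rotate the LEN bytes at BUF left by K positions.  The temporary copy
   needs K bytes: small requests use a bounded on-stack buffer, larger
   ones fall back to malloc, so the stack frame size never depends on
   the input.  Returns 0 on success, -1 if allocation fails.  */
static int
rotate_left (char *buf, size_t len, size_t k)
{
  assert (buf != NULL || len == 0);
  if (len == 0)
    return 0;
  k %= len;
  if (k == 0)
    return 0;

  char onstack[STACK_BUF_SIZE];
  char *tmp = onstack;
  if (k > sizeof onstack)
    {
      tmp = malloc (k);
      if (tmp == NULL)
        return -1;
    }

  memcpy (tmp, buf, k);              /* Save the first K bytes.  */
  memmove (buf, buf + k, len - k);   /* Shift the rest down.  */
  memcpy (buf + len - k, tmp, k);    /* Append the saved prefix.  */

  if (tmp != onstack)
    free (tmp);
  return 0;
}
```

The point of the design is that the frame is a known constant size, so
a function like this cannot skip over a guard page no matter what the
caller passes in, and the failure mode for huge inputs is an ordinary
allocation error.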
> but more importantly glibc cannot fix everything:
> there are user allocated stacks where the user presumably
> allocates the guard too and we cannot affect that.
> there are other libcs and all of them use single guard page now
> (bsd systems, bionic, musl, ..) other language runtimes may
> explicitly set single guard page too (openjdk, erlang,..?) and
> providing reduced security guarantees on all those systems is
> suboptimal. (can we round up the guardsize setting?)
> so the question is whether pagesize < 64k systems sometimes
> getting reduced security because of small guard is worse or
> users turning probing off because of 4k probing overhead.
I see the question as whether the slightly better security coverage of
64k guard pages pays off the potential disruption it may incur.
Your examples and Florian's points also do not give me much
confidence that such potential disruption is the best approach.
I think we can at least work from the GLIBC side by prioritizing the
removal of unbounded dynamic stack allocations and deprecating the
GNU affinity for VLAs and alloca.