This is the mail archive of the libc-hacker@sourceware.org mailing list for the glibc project.

Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.


Re: Dealing with multiple page sizes in NPTL


Roland McGrath <roland@redhat.com> wrote on 10/16/2005 07:54:15 AM:

> There are a few issues here. 
> 
> Certainly it should choose a default stack size that it will accept.
> I've committed a change to nptl/init.c so that it applies the same
> minimum (based on the page size) that the EINVAL check in
> allocatestack.c will demand.
> 
> It's not clear to me why the allocatestack.c check is as it is:
> 
>       if (__builtin_expect (size < (guardsize + __static_tls_size
>                 + MINIMAL_REST_STACK + pagesize_m1 + 1),
> 
> Requiring the guard size plus the minimum usable size plus another page
> seems odd.  Off hand, it makes more sense to require the guard size plus
> just the minimum usable size, or that rounded up to a page.  But I won't
> presume to change this calculation before Ulrich comments on it.
> Whatever the calculation is, the defaulting code in init.c should be
> updated to match.
> 
Looks like your patch still requires a minimum of 3 pages (192KB) as
written. I had hoped we could change the test in allocatestack.c to allow
a minimum two-page stack as long as the constraint pagesize >
(__static_tls_size + MINIMAL_REST_STACK) was met.
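
Roughly, I am thinking of something along these lines for the test (an
untested sketch, reusing the locals the current check already has; this is
the "guard size plus the minimum usable size, rounded up to a page"
variant you describe):

    size_t minstack = guardsize + __static_tls_size + MINIMAL_REST_STACK;
    /* Require the real minimum rounded up to a whole page, instead of
       unconditionally adding an extra page on top of it.  */
    minstack = (minstack + pagesize_m1) & ~pagesize_m1;
    if (__builtin_expect (size < minstack, 0))
      return EINVAL;

With a one-page guard and pagesize > (__static_tls_size +
MINIMAL_REST_STACK), that works out to exactly two pages.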

> A different question is PTHREAD_STACK_MIN.  When an application
> allocates its own thread stack, then 16384 may be perfectly adequate
> regardless of the page size.  One can make the case that an application
> writer might want to minimize memory consumption at the expense of
> foregoing guard pages.  A clever application writer might even do his
> own allocation with guard pages below but use the rest of the page above
> the stack for other purposes, when the page size is much larger than the
> actual stack requirements.
> 
> However, the standard would seem to indicate that an application can use
> pthread_attr_setstacksize without pthread_attr_setstackaddr (and rather
> than pthread_attr_setstack), passing PTHREAD_STACK_MIN, and expect to
> create a thread with a guard page.  For that to work, PTHREAD_STACK_MIN
> has to be at least a page over the minimum usable stack (for the guard
> page).
> The standard explicitly mentions that the effective guardsize may be
> page-rounded.  I don't think the standard clearly says whether the
> guardsize is included in the stacksize, but that's what we do.  That
> being the case, PTHREAD_STACK_MIN not being more than a page just makes
> no sense.
> There is an argument to be made that when pthread_attr_setstacksize
> succeeds without complaint (because the PTHREAD_STACK_MIN minimum was met),
> then pthread_create should not then fail because that size is too small.
> The only way to oblige that is to round up the requested size when
> allocating, to big enough for the guard page and the minimum usable stack.
> But then that runs afoul of the specification that it's "the size of the
> thread stack", meaning that the application might expect it to be exact
> (with reliable overflow behavior).

It seems that stack size, allocation size, and reliable overflow behavior
are separable issues. The stack may start PTHREAD_STACK_MIN bytes into a
larger page. As long as the guard page is an appropriate distance from the
initial stack pointer, exact overflow behavior is maintained. This may
waste part of a larger page, but the user delegated the details of storage
allocation to the run-time, and the run-time is doing the best it can for
the conditions it is given.
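
To make that concrete, here is a rough illustration (not the actual
allocatestack.c code; the function name and the specific sizes are just
placeholders) of how a 16K request could be placed on a 64K-page system
while keeping exact overflow behavior:

    #include <sys/mman.h>
    #include <unistd.h>

    /* Illustration only: returns the initial stack pointer for a
       downward-growing stack of 'size' usable bytes with a one-page
       guard below it.  Most error handling is omitted.  */
    static char *
    place_small_stack (size_t size)
    {
      size_t pagesize = (size_t) sysconf (_SC_PAGESIZE);   /* e.g. 65536 */
      size_t guardsize = pagesize;
      /* Whole pages for guard + stack: two pages on a 64K system even
         when 'size' is only 16K.  */
      size_t alloc = guardsize + ((size + pagesize - 1) & ~(pagesize - 1));
      char *mem = mmap (NULL, alloc, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      if (mem == MAP_FAILED)
        return NULL;
      /* The guard page at the low end stays inaccessible.  */
      mprotect (mem, guardsize, PROT_NONE);
      /* The stack starts exactly 'size' bytes above the guard; the rest
         of the 64K data page above it is simply unused, so the distance
         down to the guard -- and hence overflow behavior -- is exact.  */
      return mem + guardsize + size;
    }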

> 
> It may well be desirable to change PTHREAD_STACK_MIN.  Some other
> platforms have larger sizes (ia64 has 192k, enough for 64k pages to
> work).
> Or, POSIX allows us to omit the compile-time definition entirely, and
> oblige applications to use sysconf--then no particular value would be
> compiled into applications.  But, doing either of those is an ABI
> change.  Programs that used PTHREAD_STACK_MIN according to the proper
> API were compiled into existing binaries using the 16384 value; those
> have a right to keep working.  That's why we have that compatibility
> code in pthread_attr_setstacksize and pthread_attr_setstack, for the
> platforms that got a new ABI in the GLIBC_2.3.3 version set changing
> the PTHREAD_STACK_MIN value exposed to applications.
> 
I am reluctant to change PTHREAD_STACK_MIN for powerpc32 because, as you
say, clever applications may want the smaller stack even with the 64K page.
This is the feedback I get from the IBM Java developers and performance
teams. The concern is that forcing the minimum to 128KB (or 192KB) reduces
the max threads, max heap, or both for the 32-bit JVM.

Changing PTHREAD_STACK_MIN for powerpc64 is more viable, but as you say we
would have to version the symbols and find some way to make old
applications work with 64K pages. I would prefer that the new minimum be
128K (not 192K). This requires changing the test in allocatestack.c.
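
From memory, the versioning would look roughly like the existing
GLIBC_2.3.3 shims -- a sketch only; the 128K value, the old baseline
version, and the rounding policy in the compat path are placeholders that
would need checking against the real powerpc64 history:

    #include <errno.h>
    #include <pthread.h>
    #include <shlib-compat.h>

    #define NEW_STACK_MIN 131072   /* hypothetical new 128K minimum */

    extern int __pthread_attr_setstacksize (pthread_attr_t *, size_t);

    /* The new entry point (which enforces NEW_STACK_MIN) gets the
       GLIBC_2.4 version.  */
    versioned_symbol (libpthread, __pthread_attr_setstacksize,
                      pthread_attr_setstacksize, GLIBC_2_4);

    #if SHLIB_COMPAT (libpthread, GLIBC_2_3, GLIBC_2_4)
    /* Old binaries were compiled against the 16384 value and have a
       right to keep working; quietly round their requests up.  */
    int
    __old_pthread_attr_setstacksize (pthread_attr_t *attr, size_t stacksize)
    {
      if (stacksize < 16384)
        return EINVAL;
      if (stacksize < NEW_STACK_MIN)
        stacksize = NEW_STACK_MIN;
      return __pthread_attr_setstacksize (attr, stacksize);
    }
    compat_symbol (libpthread, __old_pthread_attr_setstacksize,
                   pthread_attr_setstacksize, GLIBC_2_3);
    #endif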

> If you want to change PTHREAD_STACK_MIN for powerpc, then you have to
> add similar compatibility code for the old ABI that will have to be
> obsoleted by a GLIBC_2.4 version of pthread_attr_setstack{,size}.  I
> would also be open to removing PTHREAD_STACK_MIN as a compile-time
> invariant.  That both requires some more futzing in libc, and has
> potential fallout in terms of
> source compatibility with not-quite-compliant applications that assume
> there is a macro giving a constant value.
> 
I would prefer to avoid that (making PTHREAD_STACK_MIN non-constant) if we
can. I fear that some of these not-quite-compliant applications might be
DB2 or Oracle...
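
For reference, the strictly portable pattern such applications would have
to adopt if the macro went away is something like this (sketch; the helper
name is made up):

    #include <unistd.h>
    #include <pthread.h>

    /* Query the minimum at run time instead of baking a compile-time
       PTHREAD_STACK_MIN constant into the binary.  */
    static int
    set_min_stacksize (pthread_attr_t *attr)
    {
      long min = sysconf (_SC_THREAD_STACK_MIN);
      if (min <= 0)
        return -1;
      return pthread_attr_setstacksize (attr, (size_t) min);
    }

Applications that instead use the macro as an array bound or other
compile-time constant are exactly the ones that would break.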

Thanks

Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center

