This is the mail archive of the libc-hacker@sourceware.org mailing list for the glibc project.

Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.



Dealing with multiple page sizes in NPTL


The recently announced POWER5+ hardware now supports 4KB, 64KB, and 16MB
page sizes. For large systems used in High Performance Computing or
large-scale database applications, a larger page size makes the TLB more
effective and boosts performance.

At this year's OLS there was discussion of adding a kernel option to
support 64KB (vs 4KB) as the base page size in these environments. I
suspect that a larger base page is an issue for IA64 as well. This raises
the possibility that the page size may change depending on which kernel was
booted for that machine. This page size will be reported via AT_PAGESZ, but
the question is how well glibc responds when that value is not a constant
4096.

Amazingly well. Most of glibc depends on GLRO(dl_pagesize), __getpagesize(),
or _sysconf(_SC_PAGESIZE), which are derived from AT_PAGESZ either directly
or indirectly. Even Linuxthreads behaves correctly because it uses the
following definition:

   /* The page size we can get from the system.  This should likely not be
      changed by the machine file but, you never know.  */
   #ifndef PAGE_SIZE
   #define PAGE_SIZE  (sysconf (_SC_PAGE_SIZE))
   #endif
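
As a rough illustration of the runtime lookup (a user-space sketch, not glibc
internals), the page size can be read through sysconf() or straight from the
auxiliary vector. The helper names below are hypothetical, and getauxval() is
an assumption here: it is a Linux/glibc interface that only exists in newer
glibc releases (2.16 and later).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/auxv.h>   /* getauxval(), AT_PAGESZ; Linux/glibc specific */
#include <unistd.h>     /* sysconf() */

/* Hypothetical helper: page size as reported by sysconf().
   Asserts that the value is a nonzero power of two, which the
   masking arithmetic in allocate_stack() relies on.  */
size_t runtime_page_size(void)
{
    long v = sysconf(_SC_PAGESIZE);
    assert(v > 0 && (v & (v - 1)) == 0);
    return (size_t) v;
}

/* Hypothetical helper: page size taken directly from the AT_PAGESZ
   entry of the auxiliary vector.  getauxval() returns 0 if the entry
   is not present.  */
size_t auxv_page_size(void)
{
    return (size_t) getauxval(AT_PAGESZ);
}
```

On a kernel booted with a 64KB base page both helpers would be expected to
return 65536 rather than 4096.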

NPTL, however, has a problem: I found pthread_create returning
EINVAL because of the following code in allocate_stack() in
glibc/nptl/allocatestack.c:

      guardsize = (attr->guardsize + pagesize_m1) & ~pagesize_m1;
      if (__builtin_expect (size < (guardsize + __static_tls_size
                                    + MINIMAL_REST_STACK + pagesize_m1 + 1),
                            0))
        /* The stack is too small (or the guard too large).  */
        return EINVAL;

I found that the minimum stack size that allowed a thread to be created on
a 64K-page kernel is 135296, which equals 65536 + 128 + 4096 + 65535 + 1.
The guardsize and pagesize_m1 are computed from __getpagesize() but the
default value of size is not.
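
The arithmetic can be reproduced with a small stand-alone sketch of the
check. The function name is hypothetical. Mapping 128 to __static_tls_size
and 4096 to MINIMAL_REST_STACK is an inference from the order of the terms in
the sum above, and a guard of exactly one page is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest stack size that passes the check quoted
   from allocate_stack(), i.e. the smallest size for which
     size < guardsize + static_tls + rest + pagesize_m1 + 1
   is false.  pagesize must be a power of two.  */
size_t min_stack_current_check(size_t pagesize, size_t attr_guardsize,
                               size_t static_tls_size,
                               size_t minimal_rest_stack)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);

    size_t pagesize_m1 = pagesize - 1;
    /* Round the requested guard size up to a whole number of pages.  */
    size_t guardsize = (attr_guardsize + pagesize_m1) & ~pagesize_m1;

    return guardsize + static_tls_size + minimal_rest_stack
           + pagesize_m1 + 1;
}
```

With the inferred values, min_stack_current_check(65536, 65536, 128, 4096)
gives 135296, well above a 16384-byte default, while the same inputs with
4096-byte pages and a 4096-byte guard give 12416.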

If the pthread_attr does not provide the stacksize attribute, the
__default_stacksize value is used:

     /* Get the stack size from the attribute if it is set.  Otherwise we
        use the default we determined at start time.  */
     size = attr->stacksize ?: __default_stacksize;

The problem is the initialization of __default_stacksize which occurs in
nptl/init.c and nptl/vars.c. It looks like vars.c handles initialization
for the static case:

   /* Default stack size.  */
   size_t __default_stacksize attribute_hidden
   #ifdef SHARED
   ;
   #else
     = PTHREAD_STACK_MIN;
   #endif

And init.c (__pthread_initialize_minimal_internal) handles initialization
for the dynamic case:

     if (getrlimit (RLIMIT_STACK, &limit) != 0
         || limit.rlim_cur == RLIM_INFINITY)
       /* The system limit is not usable.  Use an architecture-specific
          default.  */
       __default_stacksize = ARCH_STACK_DEFAULT_SIZE;
     else if (limit.rlim_cur < PTHREAD_STACK_MIN)
       /* The system limit is unusably small.
          Use the minimal size acceptable.  */
       __default_stacksize = PTHREAD_STACK_MIN;
     else
      ....

The default value of PTHREAD_STACK_MIN is 16384, which is too small for a
64KB page. The minimum needs to be at least 2 pages (128KB): one for the
guard page and one or more pages to hold the minimum stack, thread struct,
and static TLS storage.

The seemingly simple solution is to use something like:

   #define  PTHREAD_STACK_MIN  (2 * __getpagesize())

but this causes other problems. The conditional:

   #if PTHREAD_STACK_MIN == 16384
   weak_alias (__pthread_attr_setstacksize, pthread_attr_setstacksize)
   #else
   versioned_symbol (libpthread, __pthread_attr_setstacksize,
                     pthread_attr_setstacksize, GLIBC_2_3_3);
   ...
   #endif

is used in several places to determine if versioning is required. These and
the static assignment in nptl/vars.c will not compile unless
PTHREAD_STACK_MIN is constant. So there is a structural problem of how to
make PTHREAD_STACK_MIN variable, or at least how to set __default_stacksize
correctly when AT_PAGESZ > __default_stacksize. The dynamic case can be
addressed with:

     if (__default_stacksize < (2 * __getpagesize()))
       /* The default_stacksize must be at least 2 pages.  */
       __default_stacksize = (2 * __getpagesize());
      ....

in __pthread_initialize_minimal_internal. It is not clear how best to
address the static case. It also seems that the formula in allocate_stack()
needs to change to something like:

      guardsize = (attr->guardsize + pagesize_m1) & ~pagesize_m1;
      if (__builtin_expect (size < ((guardsize + __static_tls_size
                                     + MINIMAL_REST_STACK
                                     + pagesize_m1) & ~pagesize_m1),
                            0))
        /* The stack is too small (or the guard too large).  */
        return EINVAL;
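
To see why the formula matters, here is a stand-alone sketch comparing the
two thresholds. The function names are hypothetical, and the values 128 for
static TLS and 4096 for MINIMAL_REST_STACK are taken from the sum quoted
earlier (which term is which is my inference). The point is that a two-page
default (131072 bytes on a 64KB-page kernel) still fails the current check
but passes the rounded one.

```c
#include <assert.h>
#include <stddef.h>

/* Current check: the stack is rejected when size is below this value.  */
size_t threshold_current(size_t pagesize, size_t guardsize,
                         size_t static_tls_size, size_t minimal_rest_stack)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
    size_t pagesize_m1 = pagesize - 1;
    return guardsize + static_tls_size + minimal_rest_stack
           + pagesize_m1 + 1;
}

/* Proposed check: round the total requirement up to a page boundary
   instead of adding a whole extra page on top of it.  */
size_t threshold_proposed(size_t pagesize, size_t guardsize,
                          size_t static_tls_size, size_t minimal_rest_stack)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
    size_t pagesize_m1 = pagesize - 1;
    return (guardsize + static_tls_size + minimal_rest_stack
            + pagesize_m1) & ~pagesize_m1;
}

/* Nonzero if a stack of the given size would be rejected with EINVAL.  */
int rejected(size_t size, size_t threshold)
{
    return size < threshold;
}
```

For a 64KB page and a one-page guard, threshold_current() yields 135296 and
threshold_proposed() yields 131072, so a default of 2 * 65536 is rejected by
the first and accepted by the second.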

Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center

