This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: pthread wastes memory with mlockall(MCL_FUTURE)
- From: Rich Felker <dalias at libc dot org>
- To: Balazs Kezes <rlblaster at gmail dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Fri, 18 Sep 2015 15:45:21 -0400
- Subject: Re: pthread wastes memory with mlockall(MCL_FUTURE)
- References: <20150918102734 dot GA27881 at eper> <20150918143824 dot GB17773 at brightrain dot aerifal dot cx> <20150918163842 dot GB27881 at eper> <20150918170853 dot GC17773 at brightrain dot aerifal dot cx> <20150918192952 dot GC27881 at eper>
On Fri, Sep 18, 2015 at 08:29:52PM +0100, Balazs Kezes wrote:
> On 2015-09-18 13:08 -0400, Rich Felker wrote:
> > I'm talking about new PROT_NONE pages.
>
> That's not how pthread does the allocation: it mmaps read/write first,
> and then does an mprotect(..., ..., PROT_NONE).

Ah, that explains it then. I did the opposite in musl for exactly this
reason: first mmap PROT_NONE, then mprotect the non-guard part
PROT_READ|PROT_WRITE.
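
A minimal sketch of that ordering, not the actual musl code: the helper
name map_stack_guard_first is hypothetical, and both sizes are assumed
to be multiples of the page size.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper, not musl's or glibc's real allocator.
 * Reserves guard_size + stack_size bytes as PROT_NONE, then makes only
 * the part above the guard readable and writable.
 * Assumes both sizes are multiples of the page size.
 * Returns the lowest usable stack address, or NULL on failure. */
static void *map_stack_guard_first(size_t stack_size, size_t guard_size)
{
    size_t total = guard_size + stack_size;
    unsigned char *base = mmap(NULL, total, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* The guard stays PROT_NONE at the low end of the mapping,
     * since the stack grows downward toward it. */
    if (mprotect(base + guard_size, stack_size,
                 PROT_READ | PROT_WRITE) != 0) {
        munmap(base, total);
        return NULL;
    }
    return base + guard_size;
}
```

The usable region is [ret, ret + stack_size); the guard sits just below
it and is never made accessible.
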
> > The kernel certainly accounts for them differently as commit charge.
> > New PROT_NONE pages consume no commit charge. Anonymous pages with
> > data in them, which would become available again if you mprotect them
> > readable, do consume commit charge. (For this reason, you have to mmap
> > MAP_FIXED+PROT_NONE to uncommit memory rather than just using mprotect
> > PROT_NONE, even if you already used madvise MADV_DONTNEED on it.)
>
> So while working on the repro I've looked deeper and created a simple
> app which demonstrates the mmap behavior:
>
> // gcc -Wall -Wextra -std=c99 mapping.c -o mapping
> #define _GNU_SOURCE
> #include <assert.h>
> #include <stdlib.h>
> #include <stdio.h>
> #include <string.h>
> #include <sys/mman.h>
> #include <unistd.h>
>
> int main(void)
> {
>     int r;
>     // Lock current pages; also lock future mappings if M is set.
>     r = mlockall(MCL_CURRENT | (getenv("M") ? MCL_FUTURE : 0));
>     assert(r == 0);
>
>     int flags = MAP_PRIVATE | MAP_ANONYMOUS;
>     void *mem = mmap(NULL, 8LL << 30, PROT_WRITE, flags, -1, 0);
>     assert(mem != MAP_FAILED);  // mmap reports failure as MAP_FAILED, not NULL
>     sleep(100);
>
>     return 0;
> }
>
> All it does is mmap some memory, and if the envvar M is set it also
> does the mlocking part. When I run this application without mlocking
> it barely uses any RSS memory. However, when I set M I can see in htop
> that RSS is 8 GB and that "cat /proc/meminfo | grep MemAvailable"
> shows 8 GB less memory. Actually, when I look at the number of minor
> pagefaults I get this:
>
> $ /usr/bin/time -f %R ./mapping
> 102
> $ M=1 /usr/bin/time -f %R ./mapping
> 4709
>
> So I think the kernel preallocates all the memory in this case.
>
> However if I set the protection to PROT_NONE then the kernel doesn't do
> the preallocation.
>
> Interestingly, it does *not* preallocate even if I mmap with PROT_NONE
> first and then do an mprotect(mem, 8LL<<30, PROT_WRITE). I do see the
> pagefaults if I do a memset(mem, 0, 8LL<<30) afterwards though.
>
> So here's what I think pthreads should do: first mmap with PROT_NONE,
> and only then mprotect the stack pages read/write.
>
> Does that sound reasonable?
Yes.
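
The MAP_FIXED+PROT_NONE uncommit mentioned in the quoted text above
could look roughly like this. It is a sketch under assumptions:
uncommit_range is a hypothetical name, and addr and len are assumed to
be page-aligned and to lie within a private anonymous mapping that the
caller owns.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: replace [addr, addr + len) with fresh PROT_NONE
 * pages by mapping over it with MAP_FIXED. The old contents are
 * discarded. Assumes addr and len are page-aligned and that the range
 * belongs to a private anonymous mapping owned by the caller.
 * Returns 0 on success, -1 on failure. */
static int uncommit_range(void *addr, size_t len)
{
    void *p = mmap(addr, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? -1 : 0;
}
```

After this call the old data is gone; a later mprotect to
PROT_READ|PROT_WRITE on the same range yields zero-filled pages.
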
Rich