This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: RFC: replace ptmalloc2

From: Rich Felker <dalias at libc dot org>
To: JÃrn Engel <joern at purestorage dot com>
Cc: Siddhesh Poyarekar <siddhesh dot poyarekar at gmail dot com>, GNU C Library <libc-alpha at sourceware dot org>
Date: Wed, 15 Oct 2014 00:00:31 -0400
Subject: Re: RFC: replace ptmalloc2
Authentication-results: sourceware.org; auth=none
References: <20141009215447 dot GD8583 at Sligo dot logfs dot org> <CAAHN_R0JDNQkx7oV0HS9Knv7nsPZiARLeFb4zpPa+rj7cNfECg at mail dot gmail dot com> <20141010010743 dot GA15146 at Sligo dot logfs dot org> <20141010012530 dot GX23797 at brightrain dot aerifal dot cx> <20141010013302 dot GC15146 at Sligo dot logfs dot org> <20141010020229 dot GY23797 at brightrain dot aerifal dot cx> <20141014233254 dot GA1860 at Sligo dot logfs dot org>

On Tue, Oct 14, 2014 at 04:32:54PM -0700, JÃrn Engel wrote:
> On Thu, Oct 09, 2014 at 10:02:29PM -0400, Rich Felker wrote:
> > 
> > The sane behavior is to keep the same PROT_NONE/mprotect pattern, but
> > expand by exponentially increasing amounts rather than one page each
> > time. E.g. force the Nth expansion to be at least 2^N pages.
> 
> Or maybe not mprotect at all and do some slow-start algorithm for mmap.
> There are many options one can pick from.  Main question is how to keep
> the code as simple as possible while achieving the goal.

The exponential expansion approach I described is just a couple lines
of code and completely non-invasive. Yes there are other approaches
like multiple mmaps (so that you never need PROT_NONE) but they have
worse address space fragmentation properties.

> For the moment I just removed the mprotect completely for some
> benchmarks.  That brings ptmalloc2 pretty close to jemalloc.  In some
> microbenchmarks it is 30% slower, in some it is 30% faster.  Both of
> them consistently outperform tcmalloc, which came as a surprise.

This is roughly what I expected.

> And jemalloc seems to have a nasty design flaw.  It is essentially a
> buddy allocator once you cross a certain size.  Size used to be 512B in
> 2006 and is 4k for the binary I tested.  malloc(4097) will return 8k,
> causing up to 2x memory overhead.  Improving this in jemalloc seems much
> harder than improving ptmalloc2, so my quest to replace the default
> allocator is over.
> 
> Anyhow, here are some raw numbers for the curious.  Benchmark allocated
> 2GB in 8 threads in sizes between 384B and 12288B and memset the memory.
> 		runtime	VmRSS	VmData	maps	syscalls
> libc		7.165s	2107908	2590048	67	332955
> libc-mprotect	0.768s	2107944	2399808	35	4149
> jemalloc	0.962s	2652152	2695332	42	5521
> tcmalloc	1.510s	2245760	2278460	47	38766
> 
> In this particular benchmark my hacked-up ptmalloc2 is winning, while a
> standard ptmalloc2 is clearly the worst of the bunch.

What benchmark are you using? I'd like to run it on my malloc.

Rich

Follow-Ups:
- Re: RFC: replace ptmalloc2
  - From: Jörn Engel

References:
- RFC: replace ptmalloc2
  - From: Jörn Engel
- Re: RFC: replace ptmalloc2
  - From: Jörn Engel
- Re: RFC: replace ptmalloc2
  - From: Rich Felker
- Re: RFC: replace ptmalloc2
  - From: Jörn Engel
- Re: RFC: replace ptmalloc2
  - From: Rich Felker
- Re: RFC: replace ptmalloc2
  - From: Jörn Engel

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]