This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: RFC: replace ptmalloc2


On Tue, Oct 14, 2014 at 04:32:54PM -0700, Jörn Engel wrote:
> On Thu, Oct 09, 2014 at 10:02:29PM -0400, Rich Felker wrote:
> > 
> > The sane behavior is to keep the same PROT_NONE/mprotect pattern, but
> > expand by exponentially increasing amounts rather than one page each
> > time. E.g. force the Nth expansion to be at least 2^N pages.
> 
> Or maybe not mprotect at all and do some slow-start algorithm for mmap.
> There are many options one can pick from.  Main question is how to keep
> the code as simple as possible while achieving the goal.

The exponential expansion approach I described is just a couple of
lines of code and completely non-invasive. Yes, there are other
approaches, like multiple mmaps (so that you never need PROT_NONE), but
they have worse address space fragmentation properties.
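
To make the idea concrete, here is a minimal sketch, not the actual
ptmalloc2 code. It assumes a single region reserved up front with
PROT_NONE and grown from the front with mprotect, where the Nth
expansion (counting from N = 0) covers at least 2^N pages. The names
struct arena, arena_init and arena_grow are hypothetical, and the cap
on the shift is an arbitrary choice for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical arena: a PROT_NONE reservation whose usable prefix grows. */
struct arena {
    char    *base;       /* start of the reserved mapping */
    size_t   reserved;   /* total bytes reserved, a multiple of the page size */
    size_t   committed;  /* bytes already made readable and writable */
    unsigned grows;      /* number of expansions so far (the "N") */
};

/* Reserve address space without making any of it accessible yet. */
static int arena_init(struct arena *a, size_t reserve_bytes)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = (reserve_bytes + page - 1) / page * page;
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;
    a->base = p;
    a->reserved = len;
    a->committed = 0;
    a->grows = 0;
    return 0;
}

/*
 * Make at least `need` more bytes usable. The step is the larger of the
 * request and 2^N pages, clamped to what is left of the reservation.
 * Returns the number of bytes newly committed, or 0 on failure.
 */
static size_t arena_grow(struct arena *a, size_t need)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t avail = a->reserved - a->committed;
    unsigned shift = a->grows < 20 ? a->grows : 20;  /* arbitrary cap */
    size_t step, min_step;

    assert(a->committed % page == 0);
    if (avail == 0 || need > avail)
        return 0;

    step = (need + page - 1) / page * page;  /* round request up to pages */
    min_step = page << shift;                /* 2^N pages */
    if (step < min_step)
        step = min_step;
    if (step > avail)
        step = avail;

    if (mprotect(a->base + a->committed, step, PROT_READ | PROT_WRITE) != 0)
        return 0;
    a->committed += step;
    a->grows++;
    return step;
}
```

With small requests the steps are 1, 2, 4, 8, ... pages, so the number
of mprotect calls grows with the logarithm of the committed size
instead of linearly, while the mapping stays a single contiguous range.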

> For the moment I just removed the mprotect completely for some
> benchmarks.  That brings ptmalloc2 pretty close to jemalloc.  In some
> microbenchmarks it is 30% slower, in some it is 30% faster.  Both of
> them consistently outperform tcmalloc, which came as a surprise.

This is roughly what I expected.

> And jemalloc seems to have a nasty design flaw.  It is essentially a
> buddy allocator once you cross a certain size.  Size used to be 512B in
> 2006 and is 4k for the binary I tested.  malloc(4097) will return 8k,
> causing up to 2x memory overhead.  Improving this in jemalloc seems much
> harder than improving ptmalloc2, so my quest to replace the default
> allocator is over.
> 
> Anyhow, here are some raw numbers for the curious.  Benchmark allocated
> 2GB in 8 threads in sizes between 384B and 12288B and memset the memory.
>                 runtime  VmRSS    VmData   maps  syscalls
> libc            7.165s   2107908  2590048  67    332955
> libc-mprotect   0.768s   2107944  2399808  35    4149
> jemalloc        0.962s   2652152  2695332  42    5521
> tcmalloc        1.510s   2245760  2278460  47    38766
> 
> In this particular benchmark my hacked-up ptmalloc2 is winning, while a
> standard ptmalloc2 is clearly the worst of the bunch.

What benchmark are you using? I'd like to run it on my malloc.

Rich

