This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: RFC: replace ptmalloc2


On 10 October 2014 06:37, Jörn Engel <joern@purestorage.com> wrote:
> Again I have to disagree.  When running enough threads, practically all
> memory comes from mmap.  Only one arena can use sbrk, all others have
> to grow using mmap - independently of using mmap for large allocations.

Each non-main arena is allocated using mmap, but it's allocated 128MB
at a time, so you should not be seeing a *lot* of mappings for a finite
number of threads.  If you are, then your allocations are typically
larger than 32MB and are being served only by mmap, i.e. one mmap for
each allocation.
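
One way to check which case applies is to ask glibc how many chunks it
is currently serving with a dedicated mmap.  The following is a minimal
sketch, assuming glibc on Linux and a single-threaded caller;
mmapped_chunks() and served_by_mmap() are hypothetical helper names, not
glibc functions.  It relies on the hblks field of mallinfo()/mallinfo2(),
which counts blocks currently allocated using mmap.

```c
#include <assert.h>
#include <malloc.h>
#include <stdlib.h>

/* Number of chunks glibc is currently serving with their own mmap. */
size_t mmapped_chunks(void)
{
#if __GLIBC_PREREQ(2, 33)
    struct mallinfo2 mi = mallinfo2();   /* mallinfo2() exists since glibc 2.33 */
#else
    struct mallinfo mi = mallinfo();     /* older interface with int fields */
#endif
    return (size_t)mi.hblks;
}

/* Returns 1 if a request of `size` bytes got a dedicated mmap, else 0.
   The block is freed again before returning. */
int served_by_mmap(size_t size)
{
    assert(size > 0);

    size_t before = mmapped_chunks();
    void *p = malloc(size);
    if (p == NULL)
        return 0;

    size_t during = mmapped_chunks();
    free(p);
    return during == before + 1;
}
```

With default settings, a call such as served_by_mmap(64u << 20) should
return 1, while a small request such as served_by_mmap(64) should
return 0 because it is carved out of the arena instead.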

The other possibility is that you're using an older glibc that did not
fix the problem with arena thresholds not being honoured, causing
memory usage to spiral out of control.
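
On a glibc where the limits are honoured, the number of arenas can be
capped explicitly.  A minimal sketch, assuming the thresholds in
question are the ones exposed through mallopt() as M_ARENA_TEST and
M_ARENA_MAX; cap_arenas() is a hypothetical wrapper name.

```c
#include <assert.h>
#include <malloc.h>

/* Caps the number of arenas glibc may create.
   Returns 1 on success and 0 on error, as mallopt() does. */
int cap_arenas(int max_arenas)
{
    assert(max_arenas > 0);
    return mallopt(M_ARENA_MAX, max_arenas);
}
```

Calling cap_arenas(2) early in main(), before any threads are started,
is the intended use.  The same limit can be set without recompiling by
exporting MALLOC_ARENA_MAX=2 in the environment.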

> Clearly you get a performance benefit when using per-thread structures
> for the hot path.  But arenas are not a good fit for those per-thread
> structures.  They have a tendency to stay near their high-watermark for
> memory consumption, independent of current memory consumption.  If a
> thread peaks to, say, 1GB, then shrinks down to 1MB and remains low for
> several months, the delta is missing from the system.

If that's happening then something is clearly wrong: either there is a
bug that ought to be fixed, or your allocations are all large and hence
never actually hit the arenas.

> Having a smallish per-thread structure that could push memory back to a
> common pool (like tcmalloc and jemalloc seem to) would fix that problem.
> Staying lockless through the per-thread structure 99% of the time gives
> you 99% of the performance benefit of per-thread arenas from the locking
> perspective.  Add cache locality effects and the remaining 1% is lost in
> the noise.

There were discussions in the past about making the current malloc
implementation (or at least the fast path) lockless too.  There hasn't
been a lot of progress beyond discussions AFAIR.
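
For reference, the scheme described in the quoted paragraph can be
sketched roughly as follows.  This is only an illustration under stated
assumptions, not how glibc, tcmalloc or jemalloc implement it: a single
fixed block size, a per-thread free list that needs no lock, and a
mutex-protected common pool that absorbs the overflow.  BLOCK_SIZE,
CACHE_LIMIT and all function names are hypothetical, and the sketch
never returns memory to the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE  64   /* hypothetical fixed block size in bytes */
#define CACHE_LIMIT 8    /* hypothetical cap on blocks kept per thread */

struct block { struct block *next; };

/* Common pool shared by all threads, guarded by pool_lock. */
static struct block *shared_pool;
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread cache (__thread is the GCC/Clang TLS extension);
   only the owning thread touches it, so no lock is needed. */
static __thread struct block *local_cache;
static __thread size_t local_count;

void *cache_alloc(void)
{
    struct block *b = local_cache;
    if (b != NULL) {                    /* fast path: no lock taken */
        local_cache = b->next;
        local_count--;
        return b;
    }

    pthread_mutex_lock(&pool_lock);     /* slow path: try the common pool */
    b = shared_pool;
    if (b != NULL)
        shared_pool = b->next;
    pthread_mutex_unlock(&pool_lock);

    /* Fall back to the system allocator when both lists are empty. */
    return b != NULL ? (void *)b : malloc(BLOCK_SIZE);
}

void cache_free(void *p)
{
    struct block *b = p;
    assert(b != NULL);

    if (local_count < CACHE_LIMIT) {    /* fast path: keep it thread-local */
        b->next = local_cache;
        local_cache = b;
        local_count++;
        return;
    }

    pthread_mutex_lock(&pool_lock);     /* cache full: push back to the pool */
    b->next = shared_pool;
    shared_pool = b;
    pthread_mutex_unlock(&pool_lock);
}

/* Number of blocks currently held in the calling thread's cache. */
size_t cache_local_count(void)
{
    return local_count;
}
```

The point of CACHE_LIMIT is that a thread which peaks and then shrinks
keeps at most that many blocks to itself; everything beyond it becomes
available to other threads through the common pool.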

> Is there any prior art I could copy for the multi-allocator framework?
> It would be much nicer to steal someone else's good ideas than having to
> come up with my own.

Not that I know of.

Siddhesh
-- 
http://siddhesh.in

