This is the mail archive of the
mailing list for the glibc project.
Re: improving malloc
On Sun, Jan 06, 2013 at 11:02:33AM -0500, KOSAKI Motohiro wrote:
> On Sun, Jan 6, 2013 at 8:45 AM, Rich Felker <firstname.lastname@example.org> wrote:
> > On Sun, Jan 06, 2013 at 10:56:18AM +0100, Ondřej Bílka wrote:
> >> > Other considerations are memory fragmentation, how quickly
> >> > it can give back unused memory to the OS, etc. etc.
> >> >
> >> As for giving memory back to the OS: once Linux gets volatile ranges,
> >> we will finally no longer have to defer returning memory because
> >> zeroing pages is expensive.
> >> I wanted to suggest on linux-kernel that pages returned to the kernel
> >> be kept on a linked list and preferred for later allocations, since
> >> they do not have to be zeroed.
> > Preferring them is backwards; it will cause more page faults and use
> > more memory. Once you've returned memory to the kernel, you should
> > avoid using it again unless absolutely necessary. It's _possible_
> > that, for some usage cases, calloc would want to use this memory, but
> > for malloc it's always a pessimization.
> I'm one of the kernel memory folks and I'd like to explain what mvolatile() does.
> It gives the kernel a hint that the given ranges are discardable. Thus, under
> memory pressure, the kernel just drops such memory instead of swapping it out.
> This helps to minimize the cost of wrongly returning memory.
> see https://lwn.net/Articles/531305/
When these pages are the result of fragmentation caused by a large malloc, it
is wasteful to swap them out.
A possible alternative would be to implement most of this in userspace, with
a callback that reports which pages can be zeroed.
> However, of course, there is no free lunch. 1) A minor page fault is
> still expensive for the allocator fast path. 2) mvolatile is less
> multi-thread friendly than the current approach, because
> madvise(MADV_DONTNEED) doesn't need to take mmap_sem in the kernel and
> has excellent multi-threaded performance when memory is not wrongly
> returned.
> I hope mvolatile finally replaces MADV_DONTNEED; however, it needs
> careful evaluation.
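For reference, here is a minimal sketch of the MADV_DONTNEED behaviour under
discussion, assuming Linux and a private anonymous mapping. The helper name
first_byte_after_dontneed is made up for illustration; only mmap, madvise,
memset and munmap are real calls. It shows why a wrong return is costly: the
old contents are gone and the next touch faults in a freshly zeroed page.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes madvise() and MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not from glibc): map `len` bytes of private
   anonymous memory, dirty every page, hand the pages back to the kernel
   with MADV_DONTNEED, then report what the first byte reads as.
   Returns -1 on any failure. */
int first_byte_after_dontneed(size_t len)
{
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAB, len);        /* touch the pages so they are backed by RAM */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* This read takes a minor page fault; on Linux the kernel supplies a
       zero-filled page, so the 0xAB pattern written above is lost. */
    int value = p[0];

    munmap(p, len);
    return value;
}
```

Calling it with a length of a few pages should yield 0 on Linux, not 0xAB.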
> btw, I don't understand which mechanism Ondrej's "linked list"
> refers to. Can anyone clarify?
Requests larger than 10 pages can be served with nearly zero fragmentation
(on 64-bit systems, where address-space exhaustion is not a problem), though
quite slowly, by calling mmap/munmap instead of malloc/free.
Zeroing memory on that mmap (with some new flag) could be avoided
by having the kernel track and reuse unmapped memory.
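
A rough sketch of what I mean by going straight to mmap/munmap for large
requests. The names big_alloc and big_free are hypothetical, and the new
mmap flag mentioned above does not exist, so this version still pays for
zeroed pages on every call.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round `size` up to a whole number of pages; returns 0 on overflow. */
static size_t round_up_to_page(size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    if (size > SIZE_MAX - (page - 1))
        return 0;
    return (size + page - 1) / page * page;
}

/* Hypothetical allocator entry point: every request gets its own mapping,
   so freeing it later leaves no hole inside a shared heap. */
void *big_alloc(size_t size)
{
    size_t len = round_up_to_page(size);
    if (size == 0 || len == 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;   /* fresh pages read as zero */
}

/* The caller must pass the same size it gave to big_alloc, because
   munmap needs the length. Returns 0 on success, -1 on failure. */
int big_free(void *p, size_t size)
{
    size_t len = round_up_to_page(size);
    if (p == NULL || len == 0)
        return -1;
    return munmap(p, len);
}
```

Each big_alloc/big_free pair costs two system calls plus page faults on
first touch, which is the "quite slowly" part.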