This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: improving malloc

From: KOSAKI Motohiro <kosaki dot motohiro at gmail dot com>
To: Ondřej Bílka <neleai at seznam dot cz>
Cc: Rich Felker <dalias at aerifal dot cx>, libc-alpha <libc-alpha at sourceware dot org>
Date: Sun, 6 Jan 2013 17:13:06 -0500
Subject: Re: improving malloc
References: <20130105090242.GA4490@domone.kolej.mff.cuni.cz><m238yedc66.fsf@firstfloor.org> <20130106095618.GA23604@domone.kolej.mff.cuni.cz><20130106134545.GX20323@brightrain.aerifal.cx> <CAHGf_=qS+=cTLt7QPLFnYJMEjbVjVzMfnjp0=0_t+5v3ACkCqw@mail.gmail.com><20130106184009.GA25828@domone.kolej.mff.cuni.cz>

>> I'm one of the kernel memory folks and I'd like to explain what mvolatile()
>> does. It gives the kernel a hint that the given ranges are discardable. Thus,
>> under memory pressure, the kernel just drops such memory instead of swapping
>> it out. It helps to minimize the cost of wrongly returning memory.
>>
>> see https://lwn.net/Articles/531305/
>>       http://lwn.net/Articles/518130/
>
> When these pages are the result of fragmentation caused by a large malloc,
> it is wasteful to swap them.

The current approach (using madvise(MADV_DONTNEED)) also avoids swapping
these pages out. So mvolatile() makes no difference from a swapping point
of view.
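
For readers following along, here is a minimal sketch of what
madvise(MADV_DONTNEED) does to a private anonymous mapping on Linux: the
contents are dropped rather than swapped, and the next access sees
zero-filled pages. The helper name dontneed_discards_contents() is made up
for illustration and is not part of glibc or the kernel API.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not a glibc function): dirty `npages` anonymous
 * pages, tell the kernel they are no longer needed, then check what a
 * later read sees.  Returns 1 if every page reads back as zero, 0 if
 * old data survived, -1 on error. */
int dontneed_discards_contents(size_t npages)
{
    assert(npages > 0);

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Touch every page so the kernel really has to back them. */
    memset(p, 0xAB, len);

    /* The kernel may now free the pages immediately; nothing is written
     * to swap.  The mapping itself stays valid. */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* On Linux, private anonymous pages are zero-filled on next access. */
    int zeroed = 1;
    for (size_t i = 0; i < len; i += page)
        if (p[i] != 0)
            zeroed = 0;

    munmap(p, len);
    return zeroed;
}
```

The zero-fill on the next touch is the cost that the rest of this thread
is trying to avoid.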


> A possible alternative could be to implement most of this in userspace
> with a callback that tells which pages can be zeroed.

No comment. As you know, userspace implementations are generally prone to
races. I'm not sure it can give a practical performance improvement;
however, I cannot comment on it until I see actual code.


>> btw, I don't understand which mechanism Ondrej's "linked list" refers
>> to. Can anyone clarify?
>
> One can serve requests larger than 10 pages with nearly zero
> fragmentation (on 64-bit systems, where address-space exhaustion is not
> a problem), though quite slowly, by calling mmap/munmap instead of
> malloc/free.
>
> Zeroing memory on that mmap (with some new flag) could be avoided
> by the kernel tracking and reusing unmapped memory.

So, I don't disagree that the kernel can implement a per-process cache of
unmapped memory. However, I don't see any advantage, because 1) it also
needs to take mmap_sem, so it may be slower than madvise(MADV_DONTNEED),
and 2) as you know, using M_TRIM_THRESHOLD=-1 can avoid zeroing memory
completely. That is more efficient than any kernel mechanism.
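
As a rough sketch of point 2, glibc's mallopt() can be used to set
M_TRIM_THRESHOLD to -1, which disables trimming of the top of the heap, so
freed heap memory is kept and reused without a round trip through the
kernel. The helper names below are made up for illustration. Note that
requests above M_MMAP_THRESHOLD are still served by mmap and are not
covered by this setting.

```c
#include <assert.h>
#include <malloc.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: ask glibc never to trim the top of the heap.
 * mallopt() returns 1 on success and 0 on error. */
int disable_heap_trimming(void)
{
    return mallopt(M_TRIM_THRESHOLD, -1) == 1;
}

/* Hypothetical demo: repeatedly allocate, dirty, and free a block that is
 * small enough to come from the heap rather than from mmap.  With
 * trimming disabled, the freed memory stays in the process and later
 * iterations can reuse it without the kernel zeroing fresh pages.
 * Returns the number of completed iterations. */
size_t alloc_free_cycle(size_t size, size_t iterations)
{
    size_t done = 0;
    for (size_t i = 0; i < iterations; i++) {
        char *p = malloc(size);
        assert(p != NULL);
        memset(p, 0x5A, size);   /* touch the memory so it is really used */
        free(p);
        done++;
    }
    return done;
}
```

The same setting is available without recompiling through the
MALLOC_TRIM_THRESHOLD_ environment variable.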

Maybe I'm overlooking something. Once actual code is posted, I can probably
give more productive comments.

Follow-Ups:
- Re: improving malloc
  - From: Ondřej Bílka

References:
- improving malloc
  - From: Ondřej Bílka
- Re: improving malloc
  - From: Andi Kleen
- Re: improving malloc
  - From: Ondřej Bílka
- Re: improving malloc
  - From: Rich Felker
- Re: improving malloc
  - From: KOSAKI Motohiro
- Re: improving malloc
  - From: Ondřej Bílka
