This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
improving malloc
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: libc-alpha at sourceware dot org
- Date: Sat, 5 Jan 2013 10:02:42 +0100
- Subject: improving malloc
As far as malloc is concerned, I have several ideas for how to improve it.
For simplicity I will consider only 64-bit systems here.
My profiling shows that most malloc allocations are at most 64 bytes
large.
The most effective approach I could think of is to use memory pools for
requests of at most 64 bytes.
A draft of my ideas is here.
http://kam.mff.cuni.cz/~ondra/small_malloc.c
Pools have several advantages.
The first is small memory overhead (at most 1/10).
If we relax the alignment guarantee, we can reduce memory consumption further
by returning 8-byte requests aligned only to 8 bytes.
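To make the pool idea concrete, here is a minimal single-threaded sketch. It is not taken from small_malloc.c; the names (struct pool, POOL_SLOTS, slot_size_for) and the choice of size classes are assumptions made for illustration. Free slots are linked through their own storage, so there is no per-object header, and requests of at most 8 bytes get 8-byte slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define POOL_SLOTS 256          /* hypothetical: slots per pool */

/* One pool serves one slot size.  A free slot stores the pointer to the
   next free slot in its own first 8 bytes, so no header is needed. */
struct pool {
    size_t slot_size;           /* 8, 16, 32, 48 or 64 in this sketch */
    void *free_list;            /* first free slot, or NULL */
    unsigned char *base;        /* start of the slot area */
};

/* Map a request of 1..64 bytes to a slot size.  Requests of at most
   8 bytes use 8-byte slots (the relaxed-alignment case); larger ones
   are rounded up to a multiple of 16. */
static size_t slot_size_for(size_t n)
{
    assert(n >= 1 && n <= 64);
    if (n <= 8)
        return 8;
    return (n + 15) & ~(size_t)15;
}

static int pool_init(struct pool *p, size_t slot_size)
{
    assert(slot_size >= sizeof(void *));
    p->slot_size = slot_size;
    p->free_list = NULL;
    p->base = aligned_alloc(64, slot_size * POOL_SLOTS);
    if (p->base == NULL)
        return -1;
    /* Thread every slot onto the free list, lowest address first. */
    for (size_t i = POOL_SLOTS; i-- > 0; ) {
        void *slot = p->base + i * slot_size;
        *(void **)slot = p->free_list;
        p->free_list = slot;
    }
    return 0;
}

/* Pop the first free slot; returns NULL when the pool is exhausted. */
static void *pool_alloc(struct pool *p)
{
    void *slot = p->free_list;
    if (slot != NULL)
        p->free_list = *(void **)slot;
    return slot;
}

/* Push a slot back; the caller must pass a slot from this pool. */
static void pool_free(struct pool *p, void *slot)
{
    *(void **)slot = p->free_list;
    p->free_list = slot;
}
```

A real allocator would also need a way to find the owning pool from a pointer passed to free, which this sketch leaves out.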
The second is speed.
I can make malloc and free use only one atomic compare-and-swap (CAS).
This is the best possible when one does not use thread-local storage.
It is a bottleneck on Core 2, where a compare-and-swap takes 80 cycles.
The trend is that CAS is getting faster on modern processors; on Sandy Bridge
it is only 30 cycles.
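As an illustration of a free list that needs one CAS per operation when there is no contention, here is a sketch using C11 atomics. It is an assumption about how such a list could look, not the code from the draft. All names (cas_alloc, cas_free, NSLOTS, next_idx) are hypothetical. The list head packs a slot index and a tag into one 64-bit word; the tag changes on every update so that a stale CAS fails, which guards against the ABA problem (short of a 32-bit tag wraparound). The links are kept in a side array rather than inside the slots, which costs 4 bytes per slot but keeps the sketch free of races with user data.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NSLOTS    1024u         /* hypothetical pool capacity */
#define SLOT_SIZE 64u           /* one size class only in this sketch */
#define NIL       UINT32_MAX    /* "no slot" marker */

static _Alignas(64) unsigned char slots[NSLOTS][SLOT_SIZE];
static _Atomic uint32_t next_idx[NSLOTS];   /* free-list links */
static _Atomic uint64_t head;               /* (tag << 32) | index */

static uint64_t pack(uint32_t tag, uint32_t idx)
{
    return ((uint64_t)tag << 32) | idx;
}

/* Must run once, before any other thread uses the pool. */
static void cas_pool_init(void)
{
    for (uint32_t i = 0; i < NSLOTS; i++)
        atomic_store(&next_idx[i], i + 1 < NSLOTS ? i + 1 : NIL);
    atomic_store(&head, pack(0, 0));
}

/* Pop a slot: one successful CAS; retries only under contention. */
static void *cas_alloc(void)
{
    uint64_t old = atomic_load(&head);
    for (;;) {
        uint32_t idx = (uint32_t)old;
        if (idx == NIL)
            return NULL;                    /* pool exhausted */
        uint32_t next = atomic_load(&next_idx[idx]);
        uint64_t desired = pack((uint32_t)(old >> 32) + 1, next);
        if (atomic_compare_exchange_weak(&head, &old, desired))
            return slots[idx];
        /* CAS failed: `old` now holds the current head, try again. */
    }
}

/* Push a slot back: one successful CAS; retries only under contention. */
static void cas_free(void *ptr)
{
    uint32_t idx = (uint32_t)(((unsigned char *)ptr - &slots[0][0]) / SLOT_SIZE);
    assert(idx < NSLOTS);
    uint64_t old = atomic_load(&head);
    for (;;) {
        /* Relaxed is enough: the CAS below publishes this store. */
        atomic_store_explicit(&next_idx[idx], (uint32_t)old,
                              memory_order_relaxed);
        uint64_t desired = pack((uint32_t)(old >> 32) + 1, idx);
        if (atomic_compare_exchange_weak(&head, &old, desired))
            return;
    }
}
```

On x86-64 the plain atomic loads compile to ordinary moves, so the CAS is the only locked instruction on the uncontended path.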
I plan to optimize this more. My idea is that malloc and free from the same
thread need no atomic operations at all. When free is called from a different
thread (less than 1% of cases according to my profiling), I put the request
on a free queue for the owner thread.
The owner thread checks its queue once every 10 free/malloc calls.
The main technical obstacle here is that we must deal with canceled threads.
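The owner-thread scheme could be sketched roughly as follows. This is only an illustration under assumptions: the names (struct thread_heap, heap_free_remote, DRAIN_INTERVAL) are hypothetical, the caller is assumed to know which heap owns a block, and the sketch does nothing about canceled threads. The owner's list is touched without atomics; other threads push onto a separate queue with a CAS, and the owner takes the whole queue with one atomic exchange every 10 operations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define DRAIN_INTERVAL 10       /* check the queue every 10 free/malloc */

struct block { struct block *next; };

struct thread_heap {
    struct block *local_free;             /* touched only by the owner */
    _Atomic(struct block *) remote_free;  /* pushed to by other threads */
    unsigned ops;                         /* owner operations since drain */
};

/* Owner only: take the whole remote queue and splice it onto the
   local list. */
static void heap_drain(struct thread_heap *h)
{
    struct block *b = atomic_exchange_explicit(&h->remote_free, NULL,
                                               memory_order_acquire);
    while (b != NULL) {
        struct block *next = b->next;
        b->next = h->local_free;
        h->local_free = b;
        b = next;
    }
}

/* Owner only: count an operation and drain on every 10th one. */
static void heap_tick(struct thread_heap *h)
{
    if (++h->ops >= DRAIN_INTERVAL) {
        h->ops = 0;
        heap_drain(h);
    }
}

/* Owner only: no atomic operation unless a drain is due. */
static void *heap_alloc(struct thread_heap *h)
{
    heap_tick(h);
    struct block *b = h->local_free;
    if (b != NULL)
        h->local_free = b->next;
    return b;
}

/* Owner only: free a block back to its own heap. */
static void heap_free_local(struct thread_heap *h, void *p)
{
    heap_tick(h);
    struct block *b = p;
    b->next = h->local_free;
    h->local_free = b;
}

/* Any other thread: push the block onto the owner's queue with a CAS. */
static void heap_free_remote(struct thread_heap *h, void *p)
{
    struct block *b = p;
    b->next = atomic_load_explicit(&h->remote_free, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(&h->remote_free,
                                                  &b->next, b,
                                                  memory_order_release,
                                                  memory_order_relaxed))
        ;   /* on failure b->next was refreshed with the current head */
}
```

Because the owner removes the queue only as a whole, the remote push does not need a tag against ABA. A fuller version would probably also drain when the local list runs empty instead of returning NULL.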