Created attachment 6957 [details] testcase for free can't give the memory back to kernel Compile and run the attached test-case. What it does: 1.Call malloc() to allocate 1024*4kB memory 2.Call mmap() 3.Call malloc() and allocate 4096kB memory again 4.Call system to inspect the MemFree 5.Call free() 6.Call system to inspect the MemFree again So,The will be free 8192kB memory,but it not. ------------------------------------------------------------- [xuzhanxin@localhost t-arena]$ gcc -o test-ok testcase.c -g -DUNMAP_OK [xuzhanxin@localhost t-arena]$ gcc -o test-err testcase.c -g [xuzhanxin@localhost t-arena]$ ./test-ok First malloc ok mmap ok Second malloc ok! MemFree: 12630976 kB After free memory! MemFree: 12638540 kB [xuzhanxin@localhost t-arena]$ ./test-err First malloc ok mmap ok Second malloc ok! MemFree: 12627628 kB After free memory! MemFree: 12627380 kB
Created attachment 6958 [details] new testcase Could you make it to leak memory repeately? I enclosed your test in loop and memory consumption remains stable.
(In reply to comment #1) > Created attachment 6958 [details] > new testcase > Could you make it to leak memory repeately? > I enclosed your test in loop and memory consumption remains stable. The memory(8k) can be used by itself,so memory exhausted not occur in loop. [@localhost t-arena]$ cat /proc/22652/smaps |grep Rss Rss: 88 kB Rss: 4 kB Rss: 4 kB Rss: 192 kB Rss: 0 kB Rss: 8 kB Rss: 4 kB Rss: 12 kB Rss: 4 kB Rss: 4 kB Rss: 4 kB Rss: 5028 kB /* can't free */ Rss: 0 kB Rss: 3208 kB /* can't free */ Rss: 4 kB Rss: 8 kB
(In reply to comment #0) > What it does: > 1.Call malloc() to allocate 1024*4kB memory > 2.Call mmap() > 3.Call malloc() and allocate 4096kB memory again > 4.Call system to inspect the MemFree > 5.Call free() > 6.Call system to inspect the MemFree again I'm not sure this test case is valid. Without compaction (which tends to break C semantics), you can always find a sequence of allocations that result in fragmentation. As Ondrej said, as long as the test reaches a steady state when executed in a loop, this is not a bug.
Closing as NOTABUG based on comment 3. malloc/free indeed cannot give address space back to the kernel when main arena is discontinuous and that is expected behaviour.
(In reply to Siddhesh Poyarekar from comment #4) > Closing as NOTABUG based on comment 3. malloc/free indeed cannot give > address space back to the kernel when main arena is discontinuous and that > is expected behaviour. Hi guys, I'm porting a patch about this problem to newest glibc. I am afraid this bug were probably more serious than discussed here. In short, glibc just do not free any memory in the main heap when it become discoutinuous. As a result, a process may keep the peak physical memory usage as long as it is alive. This is really bad as glibc are widely used in server programs... This bug is not about fragmentation, but more serious. Once all memory are freed, fragmentation will disappear and all memory could be reused again. But if the bug occurred, no memory could be reused(by other processes) even all of them were freed. In fact, fix to this bug is quite straightforward. Just port the mechanism used by sub heaps to the main arena should be enough. We have made such a patch several years ago, and have applied it on production environments since then. I'll post the patch later after I port it to the new glibc release.
Please reopen if you can share a test case that demonstrates that memory is not reclaimed if it has been fully freed.
Created attachment 10003 [details] test_case to show the peak memory usage retain after memory all freed(use write to avoid mallocs in printf) Hi all, I have post a test to show the most dangerous impact of the bug(please make sure you have more than 4G memory). In short, a process may keep the peak physical memory usage when the bug occurred. Imagine there were three daemon servers on a 8G-memory machine, the three guys work for on same data pipeline to provide useful information. Assume each of them use up to 4G memory, we know they will run happily as they were designed to work serially. However, if this bug occurred unfortunately , one of the three could not get enough memory, which lead to OOM or unacceptably slow swap usage.
Comment on attachment 10003 [details] test_case to show the peak memory usage retain after memory all freed(use write to avoid mallocs in printf) #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/mman.h> #include <assert.h> #define MEM_SIZE 1024*512 char print_buf[2048]; const char msg1[1024] = "now we will see the peak physical memory usage\n"; const char msg2[1024] = "I have freed all memory, now I'm using \n"; int main(int argc,char *argv[]) { int i = 0; void **mem = (void**) malloc(sizeof(void*) * MEM_SIZE * 2);; for (; i < MEM_SIZE;i++) { mem[i]=malloc(4096); if(mem[i] == NULL) { printf("malloc failed.\n"); exit(1); } } void *tt = mmap(mem[i-1]+1024*1024, 1024*213, PROT_READ | PROT_WRITE,MAP_SHARED | MAP_ANONYMOUS, -1, 0); for (; i < MEM_SIZE * 2; i++) { mem[i]=malloc(4096); if(mem[i] == NULL) { printf("malloc failed.\n"); exit(1); } } char cmd[1024]; write(1, msg1, strlen(msg1)); snprintf(cmd, sizeof(cmd), "cat /proc/%d/status |grep VmRSS:", getpid()); system(cmd); while(i--) { free(mem[i]); mem[i]=NULL; } free(mem); //assert(malloc_trim(0)==1); write(1, msg2, strlen(msg2)); system(cmd); return 0; }
(In reply to ma.jiang from comment #8) > void *tt = mmap(mem[i-1]+1024*1024, 1024*213, PROT_READ | > PROT_WRITE,MAP_SHARED | MAP_ANONYMOUS, -1, 0); This corrupts the arena and the chunk metadata by writing zero (if the kernel honours the address hint). You can't place a mapping at a fixed address that may already contain data. You must know that address is empty or you're going to have serious problems, possibly overwriting whatever is already at that address. Corruption of metadata my indeed result in all memory not being freed.
(In reply to Carlos O'Donell from comment #9) > (In reply to ma.jiang from comment #8) > > void *tt = mmap(mem[i-1]+1024*1024, 1024*213, PROT_READ | > > PROT_WRITE,MAP_SHARED | MAP_ANONYMOUS, -1, 0); > > This corrupts the arena and the chunk metadata by writing zero (if the > kernel honours the address hint). If it does, it's a kernel bug. If the break point overlaps with the mapping, the mmap call must fail because MAP_FIXED hasn't been specified. Even without the hint, something like that could happen under virtual address space pressure. I think this needs further analysis.
(In reply to Florian Weimer from comment #10) > (In reply to Carlos O'Donell from comment #9) > > (In reply to ma.jiang from comment #8) > > > void *tt = mmap(mem[i-1]+1024*1024, 1024*213, PROT_READ | > > > PROT_WRITE,MAP_SHARED | MAP_ANONYMOUS, -1, 0); > > > > This corrupts the arena and the chunk metadata by writing zero (if the > > kernel honours the address hint). > > > If it does, it's a kernel bug. If the break point overlaps with the > mapping, the mmap call must fail because MAP_FIXED hasn't been specified. You are absolutely right, I missed that MAP_FIXED wasn't in the flags. > Even without the hint, something like that could happen under virtual > address space pressure. > > I think this needs further analysis. We already support foreign sbrk in the main arena, thus we support non-contiguous cases. This looks like one of those such cases, but instead of a foreign sbrk, we have a foreign mmap. I'm reopening and looking at it under the debugger to see what really changes if anything.
> > We already support foreign sbrk in the main arena, thus we support > non-contiguous cases. > > This looks like one of those such cases, but instead of a foreign sbrk, we > have a foreign mmap. > > I'm reopening and looking at it under the debugger to see what really > changes if anything. Hi all, It's happy to see your replies. The mmap in the test code seems confusing, but we are just trying to make a fence in front of brk(to make the main arena discoutinuous). This action can be replaced by a normal mmap(which is normal in user progarms), providing result address of the mmap is in front of brk. I have ported our local patch to glibc-2.24(and it works as expected :) ). I'll post it later if you were intereted.
(In reply to ma.jiang from comment #12) > > > > We already support foreign sbrk in the main arena, thus we support > > non-contiguous cases. > > > > This looks like one of those such cases, but instead of a foreign sbrk, we > > have a foreign mmap. > > > > I'm reopening and looking at it under the debugger to see what really > > changes if anything. > Hi all, > It's happy to see your replies. The mmap in the test code seems confusing, > but we are just trying to make a fence in front of brk(to make the main > arena discoutinuous). This action can be replaced by a normal mmap(which is > normal in user progarms), providing result address of the mmap is in front > of brk. > I have ported our local patch to glibc-2.24(and it works as expected :) ). > I'll post it later if you were intereted. If you are going to post patches please follow the contribution checklist, particularly the copyright section: https://sourceware.org/glibc/wiki/Contribution%20checklist I can confirm that sbrk will fail when it runs across the mmap'd region, and malloc will fall back to using mmap as MORECORE and set the main_arena as non-contiguous. The question which remains is to double check the trimming logic to make sure it could be capable of undoing that transition to start freeing back to the original sbrk region. I don't think it can because this code triggers: /* Only proceed if end of memory is where we last set it. This avoids problems if there were foreign sbrk calls. */ current_brk = (char *) (MORECORE (0)); if (current_brk == (char *) (av->top) + top_size) The discontinuity is as-if a foreign sbrk had been applied, and that means we won't free any memory. If mmap's keep intersecting the heap of the main_arena, we will see continued increases in RSS that we can't free without calling malloc_trim() which trims not just from the top but from all coalesced chunks everywhere.
Thinking more deeply about this the only way we are going to solve this is by having main_arena gain a full heap_info, and when sbrk fails, allocate a second heap with mmap. The heap_info structure is what we need to track and trim top-of-heap properly. So we will have moved main_arena closer to what the other arenas are like with the capability to have multiple heaps. Though in the case of main_arena it would only happen when the traditional heap fails.
(In reply to Carlos O'Donell from comment #14) > Thinking more deeply about this the only way we are going to solve this is > by having main_arena gain a full heap_info, and when sbrk fails, allocate a > second heap with mmap. The heap_info structure is what we need to track and > trim top-of-heap properly. So we will have moved main_arena closer to what > the other arenas are like with the capability to have multiple heaps. Though > in the case of main_arena it would only happen when the traditional heap > fails. So to be clear, this is the kind of patch I'd be looking for if someone were going to work on a solution for this. I'm not going to tackle this right now.
(In reply to Carlos O'Donell from comment #15) > (In reply to Carlos O'Donell from comment #14) > > Thinking more deeply about this the only way we are going to solve this is > > by having main_arena gain a full heap_info, and when sbrk fails, allocate a > > second heap with mmap. The heap_info structure is what we need to track and > > trim top-of-heap properly. So we will have moved main_arena closer to what > > the other arenas are like with the capability to have multiple heaps. Though > > in the case of main_arena it would only happen when the traditional heap > > fails. > > So to be clear, this is the kind of patch I'd be looking for if someone were > going to work on a solution for this. > > I'm not going to tackle this right now. Thanks, our solution are the same as you described above. I'll try to post it as soon as possible.
Created attachment 11415 [details] fix the discontinuous main arena
*** Bug 24422 has been marked as a duplicate of this bug. ***