|Summary:||free() doesn't honor M_TRIM_THRESHOLD|
|Product:||glibc||Reporter:||Sérgio Martins <iamsergio>|
|Component:||malloc||Assignee:||Not yet assigned to anyone <unassigned>|
|Severity:||normal||CC:||bharath.vegito, bvbfan, carlos, kuraga333, leonard, mail|
Description Sérgio Martins 2012-11-10 23:53:57 UTC
Created attachment 6725: test-case

free() isn't calling brk() to give memory back to the kernel when M_TRIM_THRESHOLD is passed.

Run the attached test-case. What it does:
1. Calls malloc() 2800000 times
2. Calls free() 2800000 times
3. Pauses, so you can inspect the heap size.

You'll see that the heap size is around 250 MB. Manually calling malloc_trim() through gdb decreases the heap size to 4 KB.

----------------------------------------------------
How I measured heap size:

$ cat /proc/12345/maps | grep heap
01bc6000-0f180000 rw-p 00000000 00:00 0 [heap]

$ python
> (0x0f180000-0x01bc6000) / (1024*1024)
> 213
213 megabytes

$ top -p12345 # tested with top too
227m 214m for VIRT and RES respectively

$ gdb -pid 12345 # Let's attach gdb and call malloc_trim()
> call malloc_trim(0)

$ top -p12345
14492 1076 for VIRT and RES respectively

$ cat /proc/12345/maps | grep heap
01bc6000-01bc7000 rw-p 00000000 00:00 0 [heap]

$ python
> (0x01bc7000-0x01bc6000) / (1024*1024)
> 0.00390625 // 4KB
------------------------------------------------------------
I'm on Linux 3.6.5 with glibc-2.16
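For readers without access to the attachment, the steps above can be approximated with the sketch below. It is not the attached test-case: the function names (mapping_size, heap_size, fill_and_free, run_demo) and the 32-byte block size are assumptions made for illustration. It reads the [heap] line from /proc/self/maps instead of using cat, python and gdb by hand, and it relies on the glibc-specific malloc_trim() from <malloc.h>.

```c
#include <malloc.h>   /* malloc_trim (glibc extension) */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Size in bytes of the mapping described by one /proc/<pid>/maps line,
   or -1 if the line does not start with "start-end" in hex. */
long long mapping_size(const char *maps_line)
{
    unsigned long long start, end;

    if (sscanf(maps_line, "%llx-%llx", &start, &end) != 2 || end < start)
        return -1;
    return (long long)(end - start);
}

/* Size of this process's [heap] mapping: 0 if there is none,
   -1 if /proc/self/maps cannot be read or parsed. */
long long heap_size(void)
{
    char line[512];
    long long size = 0;
    FILE *maps = fopen("/proc/self/maps", "r");

    if (maps == NULL)
        return -1;
    while (fgets(line, sizeof line, maps) != NULL) {
        if (strstr(line, "[heap]") != NULL) {
            size = mapping_size(line);
            break;
        }
    }
    fclose(maps);
    return size;
}

/* Allocate `count` blocks of `size` bytes, then free them all.
   Returns 0 on success, -1 if an allocation failed. */
int fill_and_free(size_t count, size_t size)
{
    void **blocks = malloc(count * sizeof *blocks);
    size_t i, n;
    int status = 0;

    if (blocks == NULL)
        return -1;
    for (n = 0; n < count; n++) {
        blocks[n] = malloc(size);
        if (blocks[n] == NULL) {
            status = -1;
            break;
        }
    }
    for (i = 0; i < n; i++)
        free(blocks[i]);
    free(blocks);
    return status;
}

/* Steps 1-3 of the report, with malloc_trim(0) called in-process
   rather than through gdb. The block size of 32 is an assumption. */
int run_demo(void)
{
    if (fill_and_free(2800000, 32) != 0)
        return -1;
    printf("heap after free():        %lld bytes\n", heap_size());
    malloc_trim(0);
    printf("heap after malloc_trim(): %lld bytes\n", heap_size());
    return 0;
}
```

Calling run_demo() from main() prints the heap size before and after the trim. The exact numbers depend on the glibc version and on the block size chosen, so they will not necessarily match the figures quoted above.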
Comment 1 Sérgio Martins 2012-11-16 17:03:23 UTC
This seems to be caused by the "fastbins" feature. free() doesn't trim fastbin chunks because the size passed to malloc() was less than M_MXFAST. But there really should be a limit on the number of fastbin chunks that we keep around. In KDE we've seen 600MB of memory being freed after attaching gdb and calling malloc_trim(0).
Comment 2 Milian Wolff 2012-11-19 17:04:08 UTC
I can reproduce this issue and think it's also an issue for KDevelop and similar apps. What else is needed to improve the situation here?
Comment 3 Carlos O'Donell 2016-11-28 13:41:59 UTC
The way we are going to improve this situation is IMO by moving from fastbins to per-thread caches, and those caches will have a size limit to limit RSS growth. The fastbins are not as fast as lockless per-thread caches which is the common implementation across tcmalloc and jemalloc. We already have an implementation in dj/malloc which has been proposed and posted.
Comment 4 Anthony Fieroni 2016-12-28 17:34:32 UTC
I looked at the dj/malloc branch and I don't think it can fix the issue. The main problem is in _int_free (mstate av, mchunkptr p, int have_lock):

if ((unsigned long)(size) <= (unsigned long)(get_max_fast ()) <-------------

If all chunk sizes are lower than M_MXFAST, this code *never* releases consolidated memory. This is a serious issue. In my opinion the rule should be: if all_unused_size >= 2*FASTBIN_CONSOLIDATION_THRESHOLD, we must free one FASTBIN_CONSOLIDATION_THRESHOLD worth of memory. I can make a patch once I work out how to get the total size of the consolidated chunks.
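To make the point about the fast path concrete, here is a toy model of the decision being described. It is not glibc source: classify_free, the enum names and CONSOLIDATION_THRESHOLD are hypothetical, the 64 KiB value is an assumption standing in for FASTBIN_CONSOLIDATION_THRESHOLD, and mmapped chunks are ignored entirely.

```c
#include <stddef.h>

/* Assumed stand-in for glibc's FASTBIN_CONSOLIDATION_THRESHOLD. */
#define CONSOLIDATION_THRESHOLD (64UL * 1024UL)

enum free_path {
    FREE_TO_FASTBIN,           /* chunk parked in a fastbin, nothing else happens */
    FREE_COALESCE,             /* merged with free neighbours, no trim check */
    FREE_CONSOLIDATE_AND_TRIM  /* fastbins consolidated, trim threshold checked */
};

/* chunk_size:     size of the chunk being freed
   coalesced_size: size after merging with adjacent free chunks
   max_fast:       current M_MXFAST limit (0 means fastbins are disabled) */
enum free_path classify_free(size_t chunk_size, size_t coalesced_size,
                             size_t max_fast)
{
    /* Small chunks return here, so the trim check below is never
       reached if every freed chunk is at or under max_fast. */
    if (chunk_size <= max_fast)
        return FREE_TO_FASTBIN;
    if (coalesced_size >= CONSOLIDATION_THRESHOLD)
        return FREE_CONSOLIDATE_AND_TRIM;
    return FREE_COALESCE;
}
```

In this model, classify_free(48, 48, 128) always yields FREE_TO_FASTBIN no matter how many such chunks have piled up, which is the behaviour the comment objects to. With max_fast set to 0 the same call falls through to the coalescing path instead. The value 128 is only an example limit.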
Comment 5 Aleksandr Kurakin 2018-06-15 14:32:02 UTC
Any news on this? This issue affects many applications and libraries. Moreover, Anthony has pointed out where the bug is.
Comment 6 Aleksandr Kurakin 2018-10-29 09:51:46 UTC
Ok, how about making M_MXFAST tunable and adding an M_MXFAST_ environment variable?
Comment 7 Carlos O'Donell 2019-12-19 14:40:39 UTC
(In reply to Aleksandr Kurakin from comment #6)
> Ok, let's make M_MXFAST tunable and add M_MXFAST_ environment variable?

We now have the glibc.malloc.mxfast tunable, so you can test this out without adding code or preloading a library that calls mallopt with M_MXFAST set to 0.
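For glibc versions without the tunable, the preload approach mentioned above might look like the following sketch. The function names and the file name used in the build note are hypothetical. mallopt() and M_MXFAST come from glibc's <malloc.h>, and the constructor attribute is a GCC/Clang extension.

```c
#include <malloc.h>   /* mallopt, M_MXFAST (glibc extensions) */

/* Ask malloc to stop using fastbins by setting the fastbin size
   limit to 0. Returns 1 if mallopt accepted the setting, 0 otherwise. */
int disable_fastbins(void)
{
    return mallopt(M_MXFAST, 0);
}

/* Runs when the shared object is loaded, i.e. before main() of the
   program it is preloaded into. */
__attribute__((constructor))
static void disable_fastbins_on_load(void)
{
    (void)disable_fastbins();
}
```

Assuming the file is saved as nofastbin.c, it could be built with gcc -shared -fPIC -o libnofastbin.so nofastbin.c and loaded via LD_PRELOAD. Where the tunable is available, setting GLIBC_TUNABLES=glibc.malloc.mxfast=0 in the environment should have the same effect without any extra library.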
Comment 8 Aleksandr Kurakin 2019-12-19 20:01:26 UTC
(In reply to Carlos O'Donell from comment #7)
> We now have glibc.malloc.mxfast tunable so you can test this out without
> adding code or preloading a library that calls mallopt with M_MXFAST setting
> to 0.

Thanks very much!
Comment 9 Milian Wolff 2019-12-20 12:59:38 UTC
What version of glibc does this require? The new tunable isn't yet documented on https://www.gnu.org/software/libc/manual/html_node/Memory-Allocation-Tunables.html