This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] malloc: Fix missing accounting of top chunk in malloc_info [BZ #24026]
* Carlos O'Donell:
> On 8/1/19 8:12 AM, Florian Weimer wrote:
>> Author: Niklas Hambüchen <mail@nh2.me>
>>
>> Fixes `<total type="rest" size="..."> incorrectly showing as 0 most
>> of the time.
>>
>> The rest value being wrong is significant because to compute the
>> actual amount of memory handed out via malloc, the user must subtract
>> it from <system type="current" size="...">. That result being wrong
>> makes investigating memory fragmentation issues like
>> https://bugzilla.redhat.com/show_bug.cgi?id=843478 close to
>> impossible.
>>
>> [Patch from Bugzilla. Reformatted for GNU style.]
>>
>> 2019-08-01 Niklas Hambüchen <mail@nh2.me>
>>
>> [BZ #24026]
>> * malloc/malloc.c (__malloc_info): Account for top chunk.
>>
>> diff --git a/malloc/malloc.c b/malloc/malloc.c
>> index 00ce48cf58..1083cd3ef2 100644
>> --- a/malloc/malloc.c
>> +++ b/malloc/malloc.c
>> @@ -5406,6 +5406,10 @@ __malloc_info (int options, FILE *fp)
>> __libc_lock_lock (ar_ptr->mutex);
>> + /* Account for top chunk. */
>> + avail = chunksize (ar_ptr->top);
>> + nblocks = 1; /* Top always exists. */
>> +
>> for (size_t i = 0; i < NFASTBINS; ++i)
>> {
>> mchunkptr p = fastbin (ar_ptr, i);
>>
>
> The malloc accounting is quiet complex.
>
> To review this I'd basically have to do a bunch of debugging on
> my own to prove this is correct since your proposed patch doesn't
> have the required details to validate the change.
>
> Can you post some analysis of a trivial heap and show how this is
> a correct fix with before and after numbers?
I think this matches what I saw when I wrote the Emacs heap rewriter.
DJ's writeup also says this:
| Each arena keeps track of a special "top" chunk, which is typically the
| biggest available chunk, and also refers to the most recently allocated
| heap.
<https://sourceware.org/glibc/wiki/MallocInternals>
malloc_stats has the same logic.
All this is why I think it's the right fix.
Before the fix, I see:
$ grep rest malloc/tst-malloc_info.out
<total type="rest" count="0" size="0"/>
<total type="rest" count="0" size="0"/>
<total type="rest" count="0" size="0"/>
<total type="rest" count="0" size="0"/>
<total type="rest" count="0" size="0"/>
<total type="rest" count="0" size="0"/>
<total type="rest" count="0" size="0"/>
<total type="rest" count="1" size="657"/>
<total type="rest" count="1" size="657"/>
<total type="rest" count="1" size="657"/>
<total type="rest" count="1" size="657"/>
<total type="rest" count="4" size="2628"/>
After:
$ grep rest malloc/tst-malloc_info.out
<total type="rest" count="1" size="133360"/>
<total type="rest" count="1" size="2656"/>
<total type="rest" count="1" size="2656"/>
<total type="rest" count="1" size="2656"/>
<total type="rest" count="1" size="2656"/>
<total type="rest" count="5" size="143984"/>
<total type="rest" count="1" size="133360"/>
<total type="rest" count="2" size="3313"/>
<total type="rest" count="2" size="3313"/>
<total type="rest" count="2" size="3313"/>
<total type="rest" count="2" size="3313"/>
<total type="rest" count="9" size="146612"/>
These numbers look much more reasonable to me. (The test is inherently
racy, so there is no exact match.)
Writing a test case for this would unfortunately need an XML parser.
I don't know if we can use the one that comes with Python.
Thanks,
Florian