Summary: | Poor threaded application performance when using malloc | ||
---|---|---|---|
Product: | glibc | Reporter: | Steven Munroe <sjmunroe> |
Component: | libc | Assignee: | Ulrich Drepper <drepper.fsp> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | fweimer, glibc-bugs, roland |
Priority: | P2 | Flags: | fweimer: security- |
Version: | unspecified | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Last reconfirmed: | ||
Attachments: |
Threaded malloc test with MMAP_THRESHOLD options
Oprofile of malloc-test 128000 1000 8 on Dual PPC64 G5
profile from similar run but with MMAP_THRESHOLD increased to 16M |
Description
Steven Munroe
2005-10-25 14:11:50 UTC
Created attachment 724 [details]
Threaded malloc test with MMAP_THRESHOLD options
To build use:
gcc -g -O2 malloc-test.c -lpthread -o malloc-test
or
gcc -g -O2 -DMAP_THRESHOLD=16777216 malloc-test.c -lpthread -o malloc-test_16M
To run the testcase single threaded:
./malloc-test 128000 10000
...
Average : 0.718383 seconds for 10000 requests of 128000 bytes, 491MB concurrent.

To run with 16 threads:
./malloc-test 128000 10000 16
...
Average : 1.280583 seconds for 10000 requests of 128000 bytes, 490MB concurrent.

These run quickly because 128000 is less than the default MMAP_THRESHOLD. Now try with a malloc size larger than the MMAP_THRESHOLD:
./malloc-test 1280000 10000 16
...
Average : 227.594933 seconds for 10000 requests of 421006 bytes, 488MB concurrent.

Notice the huge jump from 1.28 to 227 seconds, while the total concurrent storage remained constant at around 490MB!

Now try a version of malloc-test that changes the MMAP_THRESHOLD to 16M:
./malloc-test_16M 1280000 10000 16
...
Average : 7.473701 seconds for 10000 requests of 421006 bytes, 488MB concurrent.

The time comes down to a more reasonable 7.47 seconds. Finally, to verify that a larger MMAP_THRESHOLD does not negatively impact smaller allocations, try:
./malloc-test_16M 128000 10000 16
...
1.066022 seconds for 10000 requests of 128000 bytes, 490MB concurrent.

In this case that is faster than with the smaller default MMAP_THRESHOLD.

All runs were on my dual 2GHz G5 (PPC64/970) system, but I see similar results on my dual Athlon system, so I suspect this is a common problem across SMP platforms.

Have you done any profiling to substantiate your analysis of why it is slower? I see nothing in the kernel to suggest that brk preallocates zero-fill pages. I would suspect that your test program preallocates them in its early iterations and then reuses those pages by freeing and allocating repeatedly. Profiling would show, for example, the time spent in mmap/munmap syscalls versus the time spent faulting in pages.

Created attachment 733 [details]
Oprofile of malloc-test 128000 1000 8 on Dual PPC64 G5
This profile shows that when the MMAP_THRESHOLD is exceeded we see a big
increase in kernel time. The kernel time is associated with locking,
scheduling, and page faults.
I don't have access to an i386 SMP box at the moment, but I suspect the
profile there will be similar.
Created attachment 734 [details]
profile from similar run but with MMAP_THRESHOLD increased to 16M
Increasing the MMAP_THRESHOLD improved performance, so I had to increase the
number of iterations to get the test to run long enough to profile. The profile
shows most of the time (92%) in the test application (run_test) and a few
percent in the malloc runtime. The first kernel contribution starts at 0.2% for
schedule.
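
For readers without oprofile, the page-fault part of the kernel time can be roughly approximated by counting minor page faults around an allocate/touch/free loop. The following is a minimal sketch, not the attached malloc-test.c: the function name count_minor_faults is hypothetical, it is single threaded, and it assumes a POSIX system that provides getrusage and sysconf.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: allocate, touch, and free `size` bytes `iterations`
   times. Returns the number of minor page faults the process took during
   the loop, or -1 on failure. */
long count_minor_faults(size_t size, int iterations)
{
    struct rusage before, after;
    long page = sysconf(_SC_PAGESIZE);

    assert(size > 0 && iterations > 0);
    if (page <= 0 || getrusage(RUSAGE_SELF, &before) != 0)
        return -1;

    for (int i = 0; i < iterations; i++) {
        /* volatile keeps the compiler from optimizing the stores away */
        volatile char *p = malloc(size);
        if (p == NULL)
            return -1;
        /* write one byte per page so every page is actually faulted in */
        for (size_t off = 0; off < size; off += (size_t)page)
            p[off] = 1;
        free((void *)p);
    }

    if (getrusage(RUSAGE_SELF, &after) != 0)
        return -1;
    return after.ru_minflt - before.ru_minflt;
}
```

Comparing the result for a request size below the threshold (for example 128000) with one above it (for example 1280000) gives a rough view of how many pages are re-faulted per iteration. The numbers depend on the glibc version and its threshold settings, so no particular values are claimed here.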
Yes, arenas allocated in brk storage page fault once but are efficiently reused. The problem with large allocations is that the storage allocated with mmap is unmapped by free(), so each new allocation that exceeds the MMAP_THRESHOLD has to be faulted in again. The mmap syscall itself does not do much work. Most of the effort of allocating the page and zeroing it out is deferred until the page is actually touched for the first time. This is reflected in the profiles attached above.

This should have been dealt with in a malloc patch which went in some time ago. Verify and close or elaborate.

The adaptive mmap threshold should have fixed this; no response, so it's probably safe to assume so.

*** Bug 260998 has been marked as a duplicate of this bug. ***
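
The workaround discussed in this report, raising the mmap threshold so that large blocks stay in the heap and are reused instead of being mapped and unmapped on every request, can also be applied at run time with glibc's mallopt. This is a minimal sketch under stated assumptions: run_with_threshold is a hypothetical name, the code is glibc-specific (it needs <malloc.h>), and it is not taken from the attached test case. Note that setting M_MMAP_THRESHOLD explicitly turns off glibc's dynamic adjustment of the threshold.

```c
#include <assert.h>
#include <malloc.h>   /* glibc-specific: mallopt, M_MMAP_THRESHOLD */
#include <stdlib.h>

/* Hypothetical helper: set the mmap threshold to `threshold` bytes, then
   allocate, touch, and free `size` bytes `iterations` times.
   Returns 0 on success, -1 if mallopt or malloc fails. */
int run_with_threshold(int threshold, size_t size, int iterations)
{
    assert(threshold > 0 && size > 0 && iterations > 0);

    /* mallopt returns 1 on success and 0 on error */
    if (mallopt(M_MMAP_THRESHOLD, threshold) != 1)
        return -1;

    for (int i = 0; i < iterations; i++) {
        /* volatile keeps the compiler from optimizing the stores away */
        volatile char *p = malloc(size);
        if (p == NULL)
            return -1;
        p[0] = 1;          /* touch the first page */
        p[size - 1] = 1;   /* touch the last page */
        free((void *)p);
    }
    return 0;
}
```

A call such as run_with_threshold(16 * 1024 * 1024, 1280000, 10000) mirrors the 16M setting used in the runs above. glibc rejects threshold values above its compiled-in maximum, which is lower on 32-bit systems, in which case the helper returns -1.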