This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v2][malloc] Use relaxed atomics for malloc have_fastchunks
- From: DJ Delorie <dj at redhat dot com>
- To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- Cc: libc-alpha at sourceware dot org, nd at arm dot com
- Date: Tue, 26 Sep 2017 13:06:24 -0400
- Subject: Re: [PATCH v2][malloc] Use relaxed atomics for malloc have_fastchunks
Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
>> Workload            Pristine        Patched    Ratio
>> 389ds          9,121,687,695  8,017,021,813   87.89%
>> dj2            7,901,004,232  8,277,784,940  104.77%
>> ...
>> okular-1       3,648,656,309  3,220,751,900   88.27%
>> oocalc         1,053,984,703  1,009,859,213   95.81%
>> qemu-virtio      781,260,028    766,458,246   98.11%
>> qemu-win7        655,497,193    626,270,566   95.54%
>> proprietary-2  2,112,159,165  1,977,684,058   93.63%
> Btw have you tried running these traces with say:
> export GLIBC_TUNABLES=glibc.malloc.tcache_count=100
389ds          7,381,521,147  7,318,122,451
dj2            7,385,497,424  7,271,176,052
...
okular-1       3,658,050,304  3,590,747,773
oocalc         1,151,498,086  1,088,218,893
qemu-virtio      788,062,420    732,039,964
qemu-win7        686,729,339    710,123,160
proprietary-2  2,192,659,157  2,069,942,722
So, helps some, hurts others. The default is a compromise, and this is
why it's a tunable :-)
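
To make "helps some, hurts others" concrete, here is a minimal sketch that compares the first numeric column of the two tables above, workload by workload. It assumes that column is the same build in both runs (the second table has no header, so that is a guess), and the names `default_run`, `tcache_100_run`, `relative_cost` and `classify` are made up for illustration.

```python
# Counts copied from the first numeric column of the two tables in this
# thread. Assumption: that column is the same build in both runs.
default_run = {
    "389ds": 9_121_687_695,
    "dj2": 7_901_004_232,
    "okular-1": 3_648_656_309,
    "oocalc": 1_053_984_703,
    "qemu-virtio": 781_260_028,
    "qemu-win7": 655_497_193,
    "proprietary-2": 2_112_159_165,
}
tcache_100_run = {
    "389ds": 7_381_521_147,
    "dj2": 7_385_497_424,
    "okular-1": 3_658_050_304,
    "oocalc": 1_151_498_086,
    "qemu-virtio": 788_062_420,
    "qemu-win7": 686_729_339,
    "proprietary-2": 2_192_659_157,
}


def relative_cost(workload):
    """Count with tcache_count=100 divided by the count with the default."""
    return tcache_100_run[workload] / default_run[workload]


def classify(workload):
    """Label a workload by whether the larger tcache lowered its count."""
    return "helps" if relative_cost(workload) < 1.0 else "hurts"


if __name__ == "__main__":
    for name in default_run:
        print(f"{name:15s}{relative_cost(name):8.2%}  {classify(name)}")
```

Given the run-to-run noise, differences of a percent or so in either direction should not be read as significant.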
Plus, these results don't show how much memory (RSS) is used by the
cache for each app. The default is intentionally on the low side to
minimize RSS use.
(the usual caveat about benchmark results applies; I don't have a
dedicated machine to run these on, so "results vary, sometimes a lot" ;)
> It would be interesting to find out whether that (or even larger values)
> helps your traces too like it does the benchmarks I've tried.
If you could capture traces from those benchmarks, I'd appreciate
getting a copy so I can add them to my corpus. Especially if it's a benchmark
of a real-world app :-)