This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH] Remove atomic operations from malloc.c
- From: Rich Felker <dalias at libc dot org>
- To: Siddhesh Poyarekar <siddhesh at redhat dot com>
- Cc: Torvald Riegel <triegel at redhat dot com>, Leonhard Holz <leonhard dot holz at web dot de>, libc-alpha at sourceware dot org
- Date: Wed, 18 Feb 2015 21:07:29 -0500
- Subject: Re: [PATCH] Remove atomic operations from malloc.c
- Authentication-results: sourceware.org; auth=none
- References: <54DB130F dot 9070300 at web dot de> <1423652468 dot 9778 dot 250 dot camel at triegel dot csb> <20150218130009 dot GJ1594 at spoyarek dot pnq dot redhat dot com>
On Wed, Feb 18, 2015 at 06:30:10PM +0530, Siddhesh Poyarekar wrote:
> On Wed, Feb 11, 2015 at 12:01:08PM +0100, Torvald Riegel wrote:
> > If your machine has just two cores, then at the very least you should
> > measure for just two threads too; a bigger number of threads is not
> > putting more contention on any of the synchronization bits, there's just
> > a greater likelihood of having to wait for a thread that isn't running.
> > Also, to really assess performance, this has to be benchmarked on a
> > machine with more cores. Additionally, you could argue why it should
> > not make a difference, and if that's a compelling argument, we could
> > follow it instead of the benchmark (which, as Will mentions, is hard to
> > make representative of real-world workloads).
> The default malloc implementation creates 8 * n arenas on a system
> with n cores, so for anything up to 8 * n threads, you're just
> measuring contention between threads for the CPU since they're all
> working on different arenas.
> Maybe one way to guarantee such contention is a test with one thread
> that allocates on an arena and another thread that frees from the same
> arena. I don't think the current benchmark does that.
I would really like to see more attention to this use case (allocate
in one thread, free in another). It's an idiomatic msg/data-passing
strategy and probably the least complex in most cases, and it's a
shame if people are avoiding it for performance reasons.
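
For illustration, here is a minimal sketch of the pattern under discussion:
one thread calls malloc, hands the pointer over, and a second thread calls
free. It is not taken from the glibc benchmark suite. The names
run_handoff, struct handoff and QUEUE_SLOTS are hypothetical, and the
mutex-plus-condvar ring buffer is just one simple way to pass the pointers.
The sketch assumes, as I understand glibc's behavior, that a freed chunk
goes back to the arena it was allocated from, so the freeing thread ends up
working on the allocating thread's arena. Timing code is deliberately left
out; a real benchmark would need to add it and to separate queue overhead
from allocator overhead.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define QUEUE_SLOTS 64  /* hypothetical ring-buffer capacity */

/* Shared state for one producer and one consumer (illustrative only). */
struct handoff {
    void *slot[QUEUE_SLOTS];
    size_t head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
    size_t total;       /* number of blocks to pass across threads */
    size_t block_size;  /* size of each allocation, must be > 0 */
    size_t freed;       /* written only by the consumer thread */
};

/* Allocates every block and pushes it onto the queue. */
static void *producer(void *arg)
{
    struct handoff *h = arg;
    for (size_t i = 0; i < h->total; i++) {
        void *p = malloc(h->block_size);
        assert(p != NULL);
        memset(p, 0xab, h->block_size);  /* touch the memory */

        pthread_mutex_lock(&h->lock);
        while (h->count == QUEUE_SLOTS)
            pthread_cond_wait(&h->not_full, &h->lock);
        h->slot[h->tail] = p;
        h->tail = (h->tail + 1) % QUEUE_SLOTS;
        h->count++;
        pthread_cond_signal(&h->not_empty);
        pthread_mutex_unlock(&h->lock);
    }
    return NULL;
}

/* Pops every block and frees it in a different thread than allocated it. */
static void *consumer(void *arg)
{
    struct handoff *h = arg;
    for (size_t i = 0; i < h->total; i++) {
        pthread_mutex_lock(&h->lock);
        while (h->count == 0)
            pthread_cond_wait(&h->not_empty, &h->lock);
        void *p = h->slot[h->head];
        h->head = (h->head + 1) % QUEUE_SLOTS;
        h->count--;
        pthread_cond_signal(&h->not_full);
        pthread_mutex_unlock(&h->lock);

        free(p);  /* cross-thread free, done outside the queue lock */
        h->freed++;
    }
    return NULL;
}

/* Runs one producer/consumer pair; returns how many blocks were freed. */
size_t run_handoff(size_t total, size_t block_size)
{
    struct handoff h = { .total = total, .block_size = block_size };
    pthread_t prod, cons;

    pthread_mutex_init(&h.lock, NULL);
    pthread_cond_init(&h.not_empty, NULL);
    pthread_cond_init(&h.not_full, NULL);

    pthread_create(&prod, NULL, producer, &h);
    pthread_create(&cons, NULL, consumer, &h);
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);

    pthread_cond_destroy(&h.not_full);
    pthread_cond_destroy(&h.not_empty);
    pthread_mutex_destroy(&h.lock);
    return h.freed;
}

#ifdef HANDOFF_DEMO_MAIN
int main(void)
{
    printf("freed %zu blocks\n", run_handoff(1000000, 64));
    return 0;
}
#endif
```

To try it standalone, something like "cc -O2 -pthread -DHANDOFF_DEMO_MAIN
handoff.c" should work (the file name is arbitrary). The free is kept
outside the queue lock on purpose, so that any contention measured later
comes from the allocator rather than from the hand-off queue.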