This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH][malloc] Avoid atomics in have_fastchunks


Markus Trippelsdorf <markus@trippelsdorf.de> writes:
> The reason why nobody uses your trace/simulation patches is because they
> generate way too much data and are just too complex/invasive for most
> users.

Well, the reason people don't use trace is because it isn't a built-in
easy button labelled "send performance data upstream".  We're trying to
make it easier to use.

However, I agree that it can generate a LOT of data.  My record so far
is 4.5 TERABYTES of trace.  But... the applications that generate that
much data are the applications that are WAY too complex to analyze by
review, and they're exceptions.  Also, the trace data can be converted
into benchmarks (its original purpose).  My original intention was to
build up a corpus of trace-generated benchmarks ("workloads") that we
can use to "represent" various applications in our malloc performance
testing.  It's very difficult to get reliable malloc performance data
out of, say, LibreOffice or qemu, or any other app which (1) requires
user interactions, (2) requires an active network, (3) is difficult to
configure, (4) is part of a suite of applications that must run
together, or (5) might be proprietary or otherwise difficult to run
locally.

Having a corpus of *real world* benchmarks, not just synthetic ones, is
IMHO very important for something like malloc, where complexity of API
usage has significant effects on performance.  Using workload
simulations allows us to remove many variables and get more consistent
before/after comparisons.  Representing many real applications means we
don't unknowingly regress performance for any of them.

> And someone would have to analyze all this data.

We've done that internally, with the 4.5TB trace data.  It took weeks.
We didn't have any other way of solving the customer's problem, but we
*were* able to solve their problem.  Will trace solve all our problems?
No.  Can trace solve some of our problems?  Yes.

Mostly, though, malloc trace isn't about analyzing the trace, it's about
capturing a reproducible workload for future benchmarking.

For example, Florian posted a malloc patch and thought it might affect
performance.  I was able to script up a before/after test of ALL my
workloads and show that the effect was negligible.  I'll probably do it
for Wilco's patch as well.

> So it is natural to look for other metrics like code complexity instead.

We should, of course, pursue all practical options :-)

And we can use the various tools together - patches motivated by
complexity arguments still need to be proven, and workload simulations
are how we prove them.

Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> Indeed, I can generate hundreds of gigabytes of malloc traces in a few hours...
> But what's the point?

As noted above, they represent real-world applications in benchmarks.

> Traces are many orders of magnitude too large to share (let alone
> commit),

Carlos and I discussed this many times, and there isn't a good solution
other than a git-annex tool that lets you store them across the
internet.  I have a collection of workloads I can share, but yeah,
moving them around takes time.  They average in the tens-of-megabytes
size, with a few in the hundreds-of-megs range.  The few of us who work
on malloc would have them pre-copied locally, I suppose.

> So I think we'll need to add microbenchmarks that test various aspects
> of memory allocation.

Note that the simulation framework lets us do microbenchmarks as well :-)

