This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [patch] malloc per-thread cache ready for review

From: DJ Delorie <dj at redhat dot com>
To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
Cc: libc-alpha at sourceware dot org
Date: Thu, 02 Feb 2017 14:43:09 -0500
Subject: Re: [patch] malloc per-thread cache ready for review
Authentication-results: sourceware.org; auth=none

Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> There are several issues: the preload binary reports errors when it is
> linked with other GLIBCs (so you can't easily use it unless you've
> installed the malloc tracing GLIBC)

Correct; the trace infrastructure is in glibc itself.  The preload is
just for turning it on and off.  So, what you do is capture the trace
with the instrumented glibc, convert the trace to a workload for the
simulator, then simulate the workload under various non-trace versions
of glibc.

> Also it writes 64-byte records for every malloc/free, so the trace
> files are absolutely huge, ~60GB for the one benchmark I tried.

Yup.  We gather a lot of info that isn't always needed, and favored API
speed over memory usage.  Our record so far is a 4.5 TB trace file.

> It seems to me LEB128 compression of the address/size differences
> should make it more reasonable - you could even remove the address if
> there is a count field in each allocated block.

We've talked with the various trace working groups about this, and the
longer term goal is to integrate one of the existing frameworks into the
system rather than continue to develop ours.  At the time I wrote it,
though, I needed to have the absolute minimum timing impact on the
running program.

> Not sure now what we can do with these traces, since trace_run doesn't
> like being linked with different GLIBCs, so I cannot use it to check
> the replay time of old vs new GLIBC malloc...

trace_run doesn't need to be linked against anything special, or even
built specially.  I built mine with the F24 system libraries.  Once you
have a workload, you no longer need the trace infrastructure to
benchmark it.

To summarize:

* Use the dj/malloc glibc with its preload to capture a trace
* Convert the trace to a workload file
* Run the simulator with the workload under various glibc versions

The trace-enabled glibc is only needed for the first step, and using
testrun.sh is sufficient for that purpose (although I hack mine to set
LD_PRELOAD inside testrun.sh so it doesn't affect the bash that runs the
script itself).

Follow-Ups:
- Re: [patch] malloc per-thread cache ready for review
  - From: Steve Vormwald

References:
- Re: [patch] malloc per-thread cache ready for review
  - From: Wilco Dijkstra
