This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.


Re: [PATCH 2/2] libdw: Rewrite the memory handler to be more robust.

From: Jonathon Anderson <jma14 at rice dot edu>
To: Mark Wielaard <mark at klomp dot org>
Cc: elfutils-devel at sourceware dot org
Date: Fri, 08 Nov 2019 11:41:20 -0600
Subject: Re: [PATCH 2/2] libdw: Rewrite the memory handler to be more robust.
References: <1572380520.19948.0@rice.edu> <20191029211437.3268-1-mark@klomp.org> <20191029211437.3268-2-mark@klomp.org> <93a4d8983b6ec43c09c6c3b3f6ed8d358321bb9d.camel@klomp.org> <1573152030.2173.2@rice.edu> <9caa7eee4a810e3f26cb2cb4ec4046e15d7dad4e.camel@klomp.org>



On Fri, Nov 8, 2019 at 17:22, Mark Wielaard <mark@klomp.org> wrote:

On Thu, 2019-11-07 at 12:40 -0600, Jonathon Anderson wrote:
I haven't benchmarked this version, but I did benchmark the equivalent earlier version (this version is almost quite literally a rebase of the other). I don't have the exact results on hand, what I remember is that
 the pthread_key method was faster (and handled the many-thread case
 better), by maybe a factor of 1.5x-2x in parallel. In serial the
overhead was minimal (just an extra pointer indirection on allocations).
I just tested the single-threaded case a bit and it is not measurably
slower than the previous version, and compared to 0.177 things are
maybe ~1% slower (so probably in the noise).
A factor 1.5x-2.0x slower in parallel does seem significant. Is that in
the case of many threads that are colliding a lot, or in general?

I believe it was 64 threads colliding a lot (on the reader side of mem_rwl). That said, this is all based on my memory from before the semester started. (They may also be numbers munged out of a larger benchmark, so don't trust them too much).

As it happens, on our end any slowdown is entirely hidden by all the other work we do while reading DIEs, so it's not a critical concern. Our code opens a Dwarf and then uses a #pragma parallel for across the CUs (using a serial recursion to read the DIEs); if you want to benchmark it, that should suffice on a large enough example.


Thanks,

Mark

