This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.



Re: [PATCH] lib + libdw: Add and use a concurrent version of the dynamic-size hash table.




On Fri, Nov 8, 2019 at 15:07, Mark Wielaard <mark@klomp.org> wrote:
> Hi,
>
> On Thu, 2019-11-07 at 09:24 -0600, Jonathon Anderson wrote:
> > On Thu, Nov 7, 2019 at 12:07, Mark Wielaard <mark@klomp.org> wrote:
> > > Looking at the difference between the previous version and this
> > > one, it incorporates my simplification of the FIND and lookup
> > > functions. And fixes it by making insert_helper consistently
> > > return -1 when the value was already inserted (hash == hval). And
> > > it fixes an issue where we were using the table entry val_ptr
> > > instead of the hashval as hash (was that the typo? it didn't look
> > > harmless).
> >
> > Yep, those are the changes (plus the Sig8 patch). That typo was
> > harmless because hash would be overwritten before its next use (or
> > just unused); now with the (hash == hval) clause it's always read,
> > so the typo is fixed.

> Thanks for explaining. I have now finally pushed this to master.

> > Regarding the Sig8 table: I took a close look, and at the moment its
> > use is in an area that isn't thread-safe anyway (in particular,
> > __libdw_intern_next_unit). Depending on how that is parallelized
> > there might be an issue (if it's just wrapped with a separate mutex
> > a thread might "miss" a CU if it's not already in the table), but
> > since that region would need inspection at that time anyway I'm fine
> > with either option.

> I still kept the code to handle the Sig8 table with the new
> concurrent-safe code, since I think it is better if we use the new
> code always (even in the single-threaded case).
>
> So to fix this we do need some mutex to protect the binary search tree
> when calling tsearch/tfind? Or do you see other issues too?

The search tree can be handled with a mutex; the issue is with next_{tu,cu}_offset and the general logic of the function. As an example: suppose two threads look up A in the Sig8 table and both see that it's not currently present. They'll both use __libdw_intern_next_unit to load CUs (or units, I suppose) until they either find it or run out of units.

If the entirety of intern_next_unit is wrapped in a mutex, one of the two will load in the missing entry and finish, while the other has "missed" it and will keep going until no units remain. The easy solution is to have the threads check the table again on next_unit failure to see whether the entry has been added, but that incurs a large-ish overhead for the constant reprobing. The easiest way around that is to add an interface to the Sig8 table that, on lookup failure, returns a "probe" which can be continued later with only a handful of atomics (essentially saving state from the first find). The downside to this whole approach is that unit parsing is fully serialized.

If next_*_offset is handled with a separate mutex or as an atomic (say, using fetch_add), the same issue occurs, but without the mutex there's no guarantee that another thread isn't currently parsing and about to write the entry, so the easy solution doesn't work. Since the Sig8 key is only known after the parsing is complete, we can't even insert an "in progress" entry. One solution is to allow for duplicate parsing (but then next_*_offset would have to be updated *after* Sig8_Hash_insert); another is to use a condition variable on whether all the units have been parsed (so threads that don't find what they're looking for would block until it's certain that it doesn't exist).

Both are viable directions, but neither is trivial.


> > This isn't an issue for Dwarf_Abbrev; the worst that can happen
> > there is a duplicate alloc and parsing (as long as the DWARF doesn't
> > have erroneous abbrev entries; if it does we would need to
> > thread-safe the error handling too).

> Unfortunately we don't always control the data, so bad abbrev entries
> could happen. The extra alloc wouldn't really "leak" because it would
> be freed with the Dwarf. So I am not too concerned about that. Is that
> the worst that can happen in __libdw_getabbrev? When we goto invalid
> the Dwarf_Abbrev would indeed "leak", but it isn't really lost; it
> will get cleaned up when the Dwarf is destroyed.

It wouldn't "leak," but it would be taking up space until the dwarf_end. Not that I mind (they're really small).

I'm thinking more of the case where Abbrev_insert returns -1 (entry collision); in that case the new Abbrev would stick around until the dwarf_end.


> Thanks,
>
> Mark

