[PATCH 4/4] gdb: change regcache list to be a map
Simon Marchi
simark@simark.ca
Thu Jul 30 16:58:13 GMT 2020
On 2020-07-23 9:53 p.m., Pedro Alves wrote:
> On 7/20/20 9:41 PM, Simon Marchi via Gdb-patches wrote:
>> One regcache object is created for each stopped thread and is stored in
>> the regcache::regcaches linked list. Looking up a regcache for a given
>> thread is therefore in O(number of threads). Stopping all threads then
>> becomes O((number of threads) ^ 2). It becomes noticeable when
>> debugging thousands of threads, which is typical with GPU targets. This
>> patch replaces the linked list with an std::unordered_multimap, indexed
>> by (target, ptid).
>>
>> I originally designed it using an std::unordered_map with (target, ptid,
>> arch) as the key, because that's how lookups are done (in
>> get_thread_arch_aspace_regcache). However, the registers_changed_ptid
>> function, also somewhat on the hot path (it is used when resuming
>> threads), needs to delete all regcaches associated to a given (target,
>> ptid) tuple. Using (target, ptid) as a key allows to do this more
>> efficiently (see exception below). If the key of the map was (target,
>> ptid, arch), we'd have to walk all items of the map.
>>
>> The lookup (in get_thread_arch_aspace_regcache), walks over all existing
>> regcaches belonging to this (target, ptid), looking to find the one with
>> the right arch. This is ok, as there will be very few regcaches for a
>> given key (typically one). Lookups become faster when the number of
>> threads grows, compared to the linked list. With a small number of
>> threads, it will probably be a bit slower to do a map lookup than to
>> walk a few linked list nodes, but I don't think it will be noticeable in
>> practice.
>>
>> The function registers_changed_ptid deletes all regcaches related to a
>> given (target, ptid). We must now handle the different cases
>> separately:
>>
>> - NULL target and minus_one_ptid: we delete all the entries
>> - NULL target and non-minus_one_ptid: invalid (checked by assert)
>> - non-NULL target and non-minus_one_ptid: we delete all the entries
>> associated to that tuple, this is done efficiently
>> - a non-NULL target and minus_one_ptid: we delete all the entries
>> associated to that target, whatever the ptid. This is the slightly
>> annoying case, as we can't easily look up all items having this target
>> in their key. I implemented it by walking the list, which is not
>> ideal.
>
> Given the last case, did you consider using a two-level map instead?
>
> using ptid_regcache_map
> = std::unordered_multimap<ptid_t, regcache *>;
>
> using target_ptid_regcache_map
> = std::unordered_map<process_stratum_target *, ptid_regcache_map>;
>
> static target_ptid_regcache_map regcaches;
>
> The result for registers_changed_ptid would be:
>
> - NULL target and minus_one_ptid: we delete all the REGCACHES entries,
>
> - NULL target and non-minus_one_ptid: invalid (checked by assert)
>
> - non-NULL target and non-minus_one_ptid: we lookup the target
> in REGCACHES, and then delete all the entries in that map
> associated to that ptid.
>
> - a non-NULL target and minus_one_ptid: we delete the map entry
> in REGCACHES associated to that target.
I thought about it early on, but dismissed it thinking it would be too heavy
for nothing, having a map of maps. But I didn't think it would help that last
case, that might be a good reason to go with that.
> Looking up a regcache does two map lookups instead of one, but
> that is 2x amortized O(1), so should be a constant hit.
>
> I gave this a try, to see how feasible it would be. See attached
> patches on top of yours (the first two address comments I'll make below).
> I've also put these in the users/palves/regcache-map branch on
> sourceware.org.
Thanks, I'll run with that.
> I did not run performance tests, though. It may be that in
> registers_changed_ptid, it would be more efficient to
> not clear the first level map entry for a given target (i.e.,
> destroy the second level map object completely), but instead
> clear the second level map entry. I.e., avoid destroying the
> unordered_multimap for a given target, only to recreate it soon
> enough. But I kept it simple in case that isn't noticeable.
Will try.
> A similar two-level data structure would be to put an instance of:
>
> using ptid_regcache_map
> = std::unordered_multimap<ptid_t, regcache *>;
>
> in process_stratum_target directly.
>
> IOW, target-connection.c:process_targets has a list of open targets,
> which could be used to walk over all targets in the
> "NULL target and minus_one_ptid" scenario.
Ok, that would be an extra step, I suggest we keep it in mind for a subsequent
refactor.
>> The function regcache_thread_ptid_changed is called when a thread
>> changes ptid. It is implemented efficiently using the map, although
>> that's not very important: it is not called often, mostly when creating
>> an inferior, on some specific platforms.
>>
>> Note: In hash_target_ptid, I am combining hash values from std::hash by
>> summing them. I don't think it's ideal, since std::hash is just the
>> identity function for base types. But I don't know what would be better
>> to reduce the change of collisions. If anybody has a better idea, I'd
>> be interested.
>
> I'd maybe look in some kernel sources, e.g., Linux or BSD, what they
> use as hash function for pids.
Ok.
>> +/* Hash function for target_ptid. */
>> +
>> +struct hash_target_ptid
>> +{
>> + size_t operator() (const target_ptid &val) const
>> + {
>> + std::hash<long> h_long;
>> + hash_ptid h_ptid;
>> +
>> + return h_long ((long) val.target) + h_ptid (val.ptid);
>
> Note that on 64-bit Windows, long is 32-bit, so pointers don't
> fit. It just means you're only hashing half the bits.
> I wonder whether using libiberty's htab_hash_pointer would be
> a good idea here.
Huh, I did that because there's no std::hash overload for uintptr_t. But
I just saw there is:
template< class T > struct hash<T*>;
which seems to be for hashing pointers. It is implemented as:
return reinterpret_cast<size_t>(__p);
So I'll just use that. Otherwise, htab_hash_pointer would be good too.
>> @@ -417,10 +458,25 @@ static void
>> regcache_thread_ptid_changed (process_stratum_target *target,
>> ptid_t old_ptid, ptid_t new_ptid)
>> {
>> - for (auto ®cache : regcaches)
>> + /* Find all the regcaches to updates. */
>
> typo: to update.
Fixed.
>> + std::vector<regcache *> regcaches_to_update;
>> + target_ptid old_key (target, old_ptid);
>> + auto range = regcaches.equal_range (old_key);
>> + for (auto it = range.first; it != range.second; it++)
>> + regcaches_to_update.push_back (it->second);
>> +
>> + /* Remove all entries associated to OLD_KEY. We could have erased the items
>> + in the previous `for`, but it is only safe to erase items while iterating
>> + starting with C++14. */
>> + regcaches.erase (old_key);
>
> I don't think that's really true in practice. The idiom used pretty
> much by everyone is:
>
> for (auto it = range.first; it != range.second;)
> {
> regcaches_to_update.push_back (it->second);
> it = regcaches.erase (it);
> }
>
> Note that even the resolution of the C++ issue at:
>
> https://wg21.cmeerw.net/lwg/issue2356
>
> says:
>
> "not that any actual implementation does that, anyway"
>
> There's no need to be super pedantic wrt to an issue
> that no compiler is going to miscompile.
Ok, didn't know about that.
> See here as well:
> https://stackoverflow.com/questions/25047241/c11-is-it-safe-to-remove-individual-elements-from-stdunordered-map-while-it
>
> And then, given we know that there are no entries for
> new_ptid in the map, I think we could instead do it all in
> one iteration, like:
>
> target_ptid old_key (target, old_ptid);
> target_ptid new_key (target, new_ptid);
>
> auto range = regcaches.equal_range (old_key);
> for (auto it = range.first; it != range.second;)
> {
> regcache *rc = it->second;
> rc->set_ptid (new_ptid);
>
> /* Remove old before inserting new, to avoid rehashing, which
> would invalidate iterators. */
> it = regcaches.erase (it);
> regcaches.insert (std::make_pair (new_key, rc));
> }
Thanks, I used this.
>> +
>> + /* Update the regcaches' ptid, insert them back in the map with an updated
>> + key. */
>> + target_ptid new_key (target, new_ptid);
>> + for (regcache *rc : regcaches_to_update)
>> {
>> - if (regcache->ptid () == old_ptid && regcache->target () == target)
>> - regcache->set_ptid (new_ptid);
>> + rc->set_ptid (new_ptid);
>> + regcaches.insert (std::make_pair (new_key, rc));
>> }
>> }
>>
>> @@ -438,19 +494,53 @@ regcache_thread_ptid_changed (process_stratum_target *target,
>> void
>> registers_changed_ptid (process_stratum_target *target, ptid_t ptid)
>> {
>> - for (auto oit = regcaches.before_begin (), it = std::next (oit);
>> - it != regcaches.end (); )
>> + if (target == nullptr)
>> {
>> - struct regcache *regcache = *it;
>> - if ((target == nullptr || regcache->target () == target)
>> - && regcache->ptid ().matches (ptid))
>> - {
>> - delete regcache;
>> - it = regcaches.erase_after (oit);
>> - }
>> - else
>> - oit = it++;
>> + /* Since there can be ptid clashes between targets, it's not valid to
>> + pass a ptid without saying to which target it belongs. */
>> + gdb_assert (ptid == minus_one_ptid);
>> +
>> + /* Delete all the regcaches. */
>> + for (auto pair : regcaches)
>> + delete pair.second;
>
> We could store a std::unique_ptr<regcache> in the map instead
> to avoid all these manual deletes.
Indeed, I'll do that.
>> +
>> + regcaches.clear ();
>> }
>> + else if (ptid != minus_one_ptid)
>> + {
>> + /* Non-NULL target and non-minus_one_ptid, delete all regcaches belonging
>> + to this (TARGET, PTID). */
>> + target_ptid key (target, ptid);
>> + auto range = regcaches.equal_range (key);
>> + for (auto it = range.first; it != range.second; it++)
>> + delete it->second;
>
> Like this one.
Done.
>> +
>> + regcaches.erase (key);
>
> Here as well could do:
>
> for (auto it = range.first; it != range.second;)
> {
> delete it->second;
> it = regcaches.erase (it);
> }
With the unique_ptr, it's now just:
regcaches.erase (target_ptid (target, ptid));
>> + }
>> + else
>> + {
>> + /* Non-NULL target and minus_one_ptid, delete all regcaches associated
>> + to this target.
>> +
>> + We unfortunately don't have an efficient way to do this. Fall back
>> + to iterating all items to find all those belonging to TARGET.
>> +
>> + Note that in C++11, it's not safe to erase map entries while
>> + iterating, so we keep track of them and delete them at the end. */
>
> Here as well.
It's down to:
for (auto it = regcaches.begin (); it != regcaches.end ();)
{
if (it->second->target () == target)
it = regcaches.erase (it);
else
it++;
}
Thanks for the comments. I'll try the multi-level map thing and come back with
another version.
Simon
More information about the Gdb-patches
mailing list