[PATCH 4/4] gdb: change regcache list to be a map
Pedro Alves
pedro@palves.net
Fri Jul 24 01:53:27 GMT 2020
On 7/20/20 9:41 PM, Simon Marchi via Gdb-patches wrote:
> One regcache object is created for each stopped thread and is stored in
> the regcache::regcaches linked list. Looking up a regcache for a given
> thread is therefore in O(number of threads). Stopping all threads then
> becomes O((number of threads) ^ 2). It becomes noticeable when
> debugging thousands of threads, which is typical with GPU targets. This
> patch replaces the linked list with an std::unordered_multimap, indexed
> by (target, ptid).
>
> I originally designed it using an std::unordered_map with (target, ptid,
> arch) as the key, because that's how lookups are done (in
> get_thread_arch_aspace_regcache). However, the registers_changed_ptid
> function, also somewhat on the hot path (it is used when resuming
> threads), needs to delete all regcaches associated to a given (target,
> ptid) tuple. Using (target, ptid) as a key allows to do this more
> efficiently (see exception below). If the key of the map was (target,
> ptid, arch), we'd have to walk all items of the map.
>
> The lookup (in get_thread_arch_aspace_regcache), walks over all existing
> regcaches belonging to this (target, ptid), looking to find the one with
> the right arch. This is ok, as there will be very few regcaches for a
> given key (typically one). Lookups become faster when the number of
> threads grows, compared to the linked list. With a small number of
> threads, it will probably be a bit slower to do a map lookup than to
> walk a few linked list nodes, but I don't think it will be noticeable in
> practice.
>
> The function registers_changed_ptid deletes all regcaches related to a
> given (target, ptid). We must now handle the different cases
> separately:
>
> - NULL target and minus_one_ptid: we delete all the entries
> - NULL target and non-minus_one_ptid: invalid (checked by assert)
> - non-NULL target and non-minus_one_ptid: we delete all the entries
> associated to that tuple, this is done efficiently
> - a non-NULL target and minus_one_ptid: we delete all the entries
> associated to that target, whatever the ptid. This is the slightly
> annoying case, as we can't easily look up all items having this target
> in their key. I implemented it by walking the list, which is not
> ideal.
Given the last case, did you consider using a two-level map instead?
using ptid_regcache_map
= std::unordered_multimap<ptid_t, regcache *>;
using target_ptid_regcache_map
= std::unordered_map<process_stratum_target *, ptid_regcache_map>;
static target_ptid_regcache_map regcaches;
The result for registers_changed_ptid would be:
- NULL target and minus_one_ptid: we delete all the REGCACHES entries,
- NULL target and non-minus_one_ptid: invalid (checked by assert)
- non-NULL target and non-minus_one_ptid: we lookup the target
in REGCACHES, and then delete all the entries in that map
associated to that ptid.
- a non-NULL target and minus_one_ptid: we delete the map entry
in REGCACHES associated to that target.
Looking up a regcache does two map lookups instead of one, but
that is 2x amortized O(1), so should be a constant hit.
I gave this a try, to see how feasible it would be. See attached
patches on top of yours (the first two address comments I'll make below).
I've also put these in the users/palves/regcache-map branch on
sourceware.org.
I did not run performance tests, though. It may be that in
registers_changed_ptid, it would be more efficient to
not clear the first level map entry for a given target (i.e.,
destroy the second level map object completely), but instead
clear the second level map entry. I.e., avoid destroying the
unordered_multimap for a given target, only to recreate it soon
enough. But I kept it simple in case that isn't noticeable.
A similar two-level data structure would be to put an instance of:
using ptid_regcache_map
= std::unordered_multimap<ptid_t, regcache *>;
in process_stratum_target directly.
IOW, target-connection.c:process_targets has a list of open targets,
which could be used to walk over all targets in the
"NULL target and minus_one_ptid" scenario.
>
> The function regcache_thread_ptid_changed is called when a thread
> changes ptid. It is implemented efficiently using the map, although
> that's not very important: it is not called often, mostly when creating
> an inferior, on some specific platforms.
>
> Note: In hash_target_ptid, I am combining hash values from std::hash by
> summing them. I don't think it's ideal, since std::hash is just the
> identity function for base types. But I don't know what would be better
> to reduce the change of collisions. If anybody has a better idea, I'd
> be interested.
I'd maybe look in some kernel sources, e.g., Linux or BSD, what they
use as hash function for pids.
>
> This patch is a tiny bit from ROCm GDB [1] we would like to merge
> upstream. Laurent Morichetti gave be these performance numbers:
>
> The benchmark used is:
>
> time ./gdb --data-directory=data-directory /extra/lmoriche/hip/samples/0_Intro/bit_extract/bit_extract -ex "set pagination off" -ex "set breakpoint pending on" -ex "b bit_extract_kernel if \$_thread == 5" -ex run -ex c -batch
>
> It measures the time it takes to continue from a conditional breakpoint with
> 2048 threads at that breakpoint, one of them reporting the breakpoint.
>
> baseline:
> real 0m10.227s
> real 0m10.177s
> real 0m10.362s
>
> with patch:
> real 0m8.356s
> real 0m8.424s
> real 0m8.494s
>
> [1] https://github.com/ROCm-Developer-Tools/ROCgdb
>
> gdb/ChangeLog:
>
> * regcache.c (struct target_ptid): New struct.
> (hash_target_ptid): New struct.
> (target_ptid_regcache_map): New type.
> (regcaches): Change type to target_ptid_regcache_map.
> (get_thread_arch_aspace_regcache): Update to regcaches' new
> type.
> (regcache_thread_ptid_changed): Likewise.
> (registers_changed_ptid): Likewise.
> (regcaches_size): Likewise.
> (regcaches_test): Update.
> (regcache_thread_ptid_changed): Update.
> * gdbsupport/ptid.h (hash_ptid): New struct.
>
> Change-Id: Iabb0a1111707936ca111ddb13f3b09efa83d3402
> ---
> gdb/regcache.c | 192 ++++++++++++++++++++++++++++++++--------------
> gdbsupport/ptid.h | 16 ++++
> 2 files changed, 152 insertions(+), 56 deletions(-)
>
> diff --git a/gdb/regcache.c b/gdb/regcache.c
> index fb20250d20f0..eed8a8bde6e5 100644
> --- a/gdb/regcache.c
> +++ b/gdb/regcache.c
> @@ -29,7 +29,7 @@
> #include "reggroups.h"
> #include "observable.h"
> #include "regset.h"
> -#include <forward_list>
> +#include <unordered_map>
>
> /*
> * DATA STRUCTURE
> @@ -313,32 +313,73 @@ reg_buffer::assert_regnum (int regnum) const
> gdb_assert (regnum < gdbarch_num_regs (arch ()));
> }
>
> -/* Global structure containing the current regcache. */
> +/* Key type for target_ptid_regcache_map. */
> +
> +struct target_ptid
> +{
> + target_ptid (process_stratum_target *target, ptid_t ptid)
> + : target (target), ptid (ptid)
> + {}
> +
> + process_stratum_target *target;
> + ptid_t ptid;
> +
> + bool operator== (const target_ptid &other) const
> + {
> + return (this->target == other.target
> + && this->ptid == other.ptid);
> + }
> +};
> +
> +/* Hash function for target_ptid. */
> +
> +struct hash_target_ptid
> +{
> + size_t operator() (const target_ptid &val) const
> + {
> + std::hash<long> h_long;
> + hash_ptid h_ptid;
> +
> + return h_long ((long) val.target) + h_ptid (val.ptid);
Note that on 64-bit Windows, long is 32-bit, so pointers don't
fit. It just means you're only hashing half the bits.
I wonder whether using libiberty's htab_hash_pointer would be
a good idea here.
> + }
> +};
> +
> +/* Type to map a (target, ptid) tuple to a regcache. */
> +
> +using target_ptid_regcache_map
> + = std::unordered_multimap<target_ptid, regcache *, hash_target_ptid>;
> +
> +/* Global structure containing the existing regcaches. */
>
> /* NOTE: this is a write-through cache. There is no "dirty" bit for
> recording if the register values have been changed (eg. by the
> user). Therefore all registers must be written back to the
> target when appropriate. */
> -static std::forward_list<regcache *> regcaches;
> +static target_ptid_regcache_map regcaches;
>
> struct regcache *
> get_thread_arch_aspace_regcache (process_stratum_target *target,
> - ptid_t ptid, struct gdbarch *gdbarch,
> + ptid_t ptid, gdbarch *arch,
> struct address_space *aspace)
> {
> gdb_assert (target != nullptr);
>
> - for (const auto ®cache : regcaches)
> - if (regcache->target () == target
> - && regcache->ptid () == ptid
> - && regcache->arch () == gdbarch)
> - return regcache;
> -
> - regcache *new_regcache = new regcache (target, gdbarch, aspace);
> + /* Look up the regcache for this (target, ptid, arch). */
> + target_ptid key (target, ptid);
> + auto range = regcaches.equal_range (key);
> + for (auto it = range.first; it != range.second; it++)
> + {
> + if (it->second->arch () == arch)
> + return it->second;
> + }
>
> - regcaches.push_front (new_regcache);
> + /* It does not exist, create it. */
> + regcache *new_regcache = new regcache (target, arch, aspace);
> new_regcache->set_ptid (ptid);
>
> + /* Insert it in the map. */
> + regcaches.insert (std::make_pair (key, new_regcache));
> +
> return new_regcache;
> }
>
> @@ -417,10 +458,25 @@ static void
> regcache_thread_ptid_changed (process_stratum_target *target,
> ptid_t old_ptid, ptid_t new_ptid)
> {
> - for (auto ®cache : regcaches)
> + /* Find all the regcaches to updates. */
typo: to update.
> + std::vector<regcache *> regcaches_to_update;
> + target_ptid old_key (target, old_ptid);
> + auto range = regcaches.equal_range (old_key);
> + for (auto it = range.first; it != range.second; it++)
> + regcaches_to_update.push_back (it->second);
> +
> + /* Remove all entries associated to OLD_KEY. We could have erased the items
> + in the previous `for`, but it is only safe to erase items while iterating
> + starting with C++14. */
> + regcaches.erase (old_key);
I don't think that's really true in practice. The idiom used pretty
much by everyone is:
for (auto it = range.first; it != range.second;)
{
regcaches_to_update.push_back (it->second);
it = regcaches.erase (it);
}
Note that even the resolution of the C++ issue at:
https://wg21.cmeerw.net/lwg/issue2356
says:
"not that any actual implementation does that, anyway"
There's no need to be super pedantic wrt to an issue
that no compiler is going to miscompile.
See here as well:
https://stackoverflow.com/questions/25047241/c11-is-it-safe-to-remove-individual-elements-from-stdunordered-map-while-it
And then, given we know that there are no entries for
new_ptid in the map, I think we could instead do it all in
one iteration, like:
target_ptid old_key (target, old_ptid);
target_ptid new_key (target, new_ptid);
auto range = regcaches.equal_range (old_key);
for (auto it = range.first; it != range.second;)
{
regcache *rc = it->second;
rc->set_ptid (new_ptid);
/* Remove old before inserting new, to avoid rehashing, which
would invalidate iterators. */
it = regcaches.erase (it);
regcaches.insert (std::make_pair (new_key, rc));
}
> +
> + /* Update the regcaches' ptid, insert them back in the map with an updated
> + key. */
> + target_ptid new_key (target, new_ptid);
> + for (regcache *rc : regcaches_to_update)
> {
> - if (regcache->ptid () == old_ptid && regcache->target () == target)
> - regcache->set_ptid (new_ptid);
> + rc->set_ptid (new_ptid);
> + regcaches.insert (std::make_pair (new_key, rc));
> }
> }
>
> @@ -438,19 +494,53 @@ regcache_thread_ptid_changed (process_stratum_target *target,
> void
> registers_changed_ptid (process_stratum_target *target, ptid_t ptid)
> {
> - for (auto oit = regcaches.before_begin (), it = std::next (oit);
> - it != regcaches.end (); )
> + if (target == nullptr)
> {
> - struct regcache *regcache = *it;
> - if ((target == nullptr || regcache->target () == target)
> - && regcache->ptid ().matches (ptid))
> - {
> - delete regcache;
> - it = regcaches.erase_after (oit);
> - }
> - else
> - oit = it++;
> + /* Since there can be ptid clashes between targets, it's not valid to
> + pass a ptid without saying to which target it belongs. */
> + gdb_assert (ptid == minus_one_ptid);
> +
> + /* Delete all the regcaches. */
> + for (auto pair : regcaches)
> + delete pair.second;
We could store a std::unique_ptr<regcache> in the map instead
to avoid all these manual deletes.
> +
> + regcaches.clear ();
> }
> + else if (ptid != minus_one_ptid)
> + {
> + /* Non-NULL target and non-minus_one_ptid, delete all regcaches belonging
> + to this (TARGET, PTID). */
> + target_ptid key (target, ptid);
> + auto range = regcaches.equal_range (key);
> + for (auto it = range.first; it != range.second; it++)
> + delete it->second;
Like this one.
> +
> + regcaches.erase (key);
Here as well could do:
for (auto it = range.first; it != range.second;)
{
delete it->second;
it = regcaches.erase (it);
}
> + }
> + else
> + {
> + /* Non-NULL target and minus_one_ptid, delete all regcaches associated
> + to this target.
> +
> + We unfortunately don't have an efficient way to do this. Fall back
> + to iterating all items to find all those belonging to TARGET.
> +
> + Note that in C++11, it's not safe to erase map entries while
> + iterating, so we keep track of them and delete them at the end. */
Here as well.
> + std::vector<target_ptid> keys;
> +
> + for (auto pair : regcaches)
> + {
> + if (pair.second->target () == target)
> + {
> + keys.push_back (pair.first);
> + delete pair.second;
> + }
> + }
> +
> + for (auto key : keys)
> + regcaches.erase (key);
> + }
>
> if ((target == nullptr || current_thread_target == target)
> && current_thread_ptid.matches (ptid))
> @@ -1431,13 +1521,6 @@ register_dump::dump (ui_file *file)
>
> namespace selftests {
>
> -static size_t
> -regcaches_size ()
> -{
> - return std::distance (regcaches.begin (),
> - regcaches.end ());
> -}
> -
> /* Wrapper around get_thread_arch_aspace_regcache that does some self checks. */
>
> static void
> @@ -1457,7 +1540,7 @@ static void
> regcaches_test (void)
> {
> /* It is empty at the start. */
> - SELF_CHECK (regcaches_size () == 0);
> + SELF_CHECK (regcaches.size () == 0);
>
> ptid_t ptid1 (1), ptid2 (2), ptid3 (3);
>
> @@ -1469,52 +1552,52 @@ regcaches_test (void)
> test_get_thread_arch_aspace_regcache (&test_target1, ptid1,
> target_gdbarch (),
> NULL);
> - SELF_CHECK (regcaches_size () == 1);
> + SELF_CHECK (regcaches.size () == 1);
>
> /* Get regcache from (target1,ptid2), a new regcache is added to
> REGCACHES. */
> test_get_thread_arch_aspace_regcache (&test_target1, ptid2,
> target_gdbarch (),
> NULL);
> - SELF_CHECK (regcaches_size () == 2);
> + SELF_CHECK (regcaches.size () == 2);
>
> /* Get regcache from (target1,ptid3), a new regcache is added to
> REGCACHES. */
> test_get_thread_arch_aspace_regcache (&test_target1, ptid3,
> target_gdbarch (),
> NULL);
> - SELF_CHECK (regcaches_size () == 3);
> + SELF_CHECK (regcaches.size () == 3);
>
> /* Get regcache from (target1,ptid2) again, nothing is added to
> REGCACHES. */
> test_get_thread_arch_aspace_regcache (&test_target1, ptid2,
> target_gdbarch (),
> NULL);
> - SELF_CHECK (regcaches_size () == 3);
> + SELF_CHECK (regcaches.size () == 3);
>
> /* Get regcache from (target2,ptid2), a new regcache is added to
> REGCACHES, since this time we're using a different target. */
> test_get_thread_arch_aspace_regcache (&test_target2, ptid2,
> target_gdbarch (),
> NULL);
> - SELF_CHECK (regcaches_size () == 4);
> + SELF_CHECK (regcaches.size () == 4);
>
> /* Mark that (target1,ptid2) changed. The regcache of (target1,
> ptid2) should be removed from REGCACHES. */
> registers_changed_ptid (&test_target1, ptid2);
> - SELF_CHECK (regcaches_size () == 3);
> + SELF_CHECK (regcaches.size () == 3);
>
> /* Get the regcache from (target2,ptid2) again, confirming the
> registers_changed_ptid call above did not delete it. */
> test_get_thread_arch_aspace_regcache (&test_target2, ptid2,
> target_gdbarch (),
> NULL);
> - SELF_CHECK (regcaches_size () == 3);
> + SELF_CHECK (regcaches.size () == 3);
>
> /* Confirm that marking all regcaches of all targets as changed
> clears REGCACHES. */
> registers_changed_ptid (nullptr, minus_one_ptid);
> - SELF_CHECK (regcaches_size () == 0);
> + SELF_CHECK (regcaches.size () == 0);
> }
>
> class target_ops_no_register : public test_target_ops
> @@ -1847,27 +1930,24 @@ regcache_thread_ptid_changed ()
> get_thread_arch_aspace_regcache (&target1.mock_target, old_ptid, arch, NULL);
> get_thread_arch_aspace_regcache (&target2.mock_target, old_ptid, arch, NULL);
>
> - /* Return whether a regcache for (TARGET, PTID) exists in REGCACHES. */
> - auto regcache_exists = [] (process_stratum_target *target, ptid_t ptid)
> + /* Return the count of regcaches for (TARGET, PTID) in REGCACHES. */
> + auto regcache_count = [] (process_stratum_target *target, ptid_t ptid)
> {
> - for (regcache *rc : regcaches)
> - {
> - if (rc->target () == target && rc->ptid () == ptid)
> - return true;
> - }
> + target_ptid key (target, ptid);
>
> - return false;
> + auto range = regcaches.equal_range (key);
> + return std::distance (range.first, range.second);
> };
>
> - gdb_assert (regcaches_size () == 2);
> - gdb_assert (regcache_exists (&target1.mock_target, old_ptid));
> - gdb_assert (regcache_exists (&target2.mock_target, old_ptid));
> + gdb_assert (regcaches.size () == 2);
> + gdb_assert (regcache_count (&target1.mock_target, old_ptid) == 1);
> + gdb_assert (regcache_count (&target2.mock_target, old_ptid) == 1);
>
> thread_change_ptid (&target1.mock_target, old_ptid, new_ptid);
>
> - gdb_assert (regcaches_size () == 2);
> - gdb_assert (regcache_exists (&target1.mock_target, new_ptid));
> - gdb_assert (regcache_exists (&target2.mock_target, old_ptid));
> + gdb_assert (regcaches.size () == 2);
> + gdb_assert (regcache_count (&target1.mock_target, new_ptid) == 1);
> + gdb_assert (regcache_count (&target2.mock_target, old_ptid) == 1);
> }
>
> } // namespace selftests
> diff --git a/gdbsupport/ptid.h b/gdbsupport/ptid.h
> index ef52d5551260..a528312bad5e 100644
> --- a/gdbsupport/ptid.h
> +++ b/gdbsupport/ptid.h
> @@ -32,6 +32,8 @@
> thread_stratum target that might want to sit on top.
> */
>
> +#include <functional>
> +
> class ptid_t
> {
> public:
> @@ -143,6 +145,20 @@ class ptid_t
> long m_tid;
> };
>
> +/* Functor to hash a ptid. */
> +
> +struct hash_ptid
> +{
> + size_t operator() (const ptid_t &ptid) const
> + {
> + std::hash<long> long_hash;
> +
> + return (long_hash (ptid.pid ())
> + + long_hash (ptid.lwp ())
> + + long_hash (ptid.tid ()));
> + }
> +};
> +
> /* The null or zero ptid, often used to indicate no process. */
>
> extern const ptid_t null_ptid;
>
Thanks,
Pedro Alves
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-no-std-vector.patch
Type: text/x-patch
Size: 2999 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/gdb-patches/attachments/20200724/92102a78/attachment-0003.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-std-unique_ptr.patch
Type: text/x-patch
Size: 2794 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/gdb-patches/attachments/20200724/92102a78/attachment-0004.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Two-level-map.patch
Type: text/x-patch
Size: 8520 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/gdb-patches/attachments/20200724/92102a78/attachment-0005.bin>
More information about the Gdb-patches
mailing list