Bug 29391 - [gdb/symtab] Parallelize process_queue
Summary: [gdb/symtab] Parallelize process_queue
Status: NEW
Alias: None
Product: gdb
Classification: Unclassified
Component: symtab
Version: HEAD
Importance: P2 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks: 29366
Reported: 2022-07-21 13:11 UTC by Tom de Vries
Modified: 2022-12-25 20:04 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Comment 1 Tom Tromey 2022-07-22 20:46:20 UTC
It's an interesting idea but as you found there are some issues.

The main issue behind a lot of the allocation problems is that
gdb has a few per-objfile data structures that can't easily be
used from multiple threads: the obstack but also the bcache
and the demangled hash table.

Maybe these problems could all be solved by sharding, or maybe
by heap allocation.  Also I think a couple patches in the series
introduce locks where something like compare-and-swap would work
just as well.
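
As a rough illustration of the sharding idea, here is a minimal sketch of
a string cache split into independently locked shards.  It is not gdb's
bcache: the class name, the shard count and the use of std::unordered_set
are all assumptions made for the example.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_set>

// Hypothetical sharded string cache, for illustration only.  This is
// not gdb's bcache and the names are invented.
class sharded_string_cache
{
public:
  // Return a pointer to the single stored copy of STR.  The pointer
  // stays valid for the lifetime of the cache, because rehashing an
  // unordered_set does not move its elements.
  const std::string *intern (const std::string &str)
  {
    // The hash picks the shard, so threads interning different
    // strings usually take different locks.
    shard &s = m_shards[std::hash<std::string> () (str) % n_shards];
    std::lock_guard<std::mutex> guard (s.lock);
    return &*s.strings.insert (str).first;
  }

private:
  // Arbitrary shard count chosen for the sketch.
  static constexpr std::size_t n_shards = 16;

  struct shard
  {
    std::mutex lock;
    std::unordered_set<std::string> strings;
  };

  std::array<shard, n_shards> m_shards;
};
```

Each shard still takes a lock, but contention is spread over the shards
rather than concentrated on one per-objfile structure.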

However, I tend to think there's a better approach overall.

The way I see it, there are two main issues with CU expansion.

One is that sometimes gdb decides to expand too many CUs in
response to a request.  This is maybe covered by one of the 
dependencies of bug #29366.  I am not sure yet (haven't looked
in detail) but I suspect the fix will be something like
short-circuiting expansion for certain kinds of queries.
Like, if gdb is looking for a type, just expand the first CU
that matches.
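
A toy model of that short-circuit, assuming a hypothetical toy_cu
structure and query_kind enum (neither exists in gdb), might look like
this.  A type query stops at the first matching CU; other queries
expand every match.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a compilation unit; not a gdb structure.
struct toy_cu
{
  std::vector<std::string> names;   // Names the CU defines.
  bool expanded = false;
};

// Hypothetical query classification, invented for the sketch.
enum class query_kind { type, function };

// Expand the CUs in CUS that define NAME and return how many were
// expanded by this call.  For a type query, stop at the first
// matching CU; any other query expands every match.
std::size_t
expand_matching (std::vector<toy_cu> &cus, const std::string &name,
                 query_kind kind)
{
  std::size_t count = 0;
  for (toy_cu &cu : cus)
    {
      if (std::find (cu.names.begin (), cu.names.end (), name)
          == cu.names.end ())
        continue;
      if (!cu.expanded)
        {
          cu.expanded = true;   // Stand-in for the real expansion work.
          ++count;
        }
      if (kind == query_kind::type)
        break;
    }
  return count;
}
```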

The second problem is that CU expansion can be slow.  Here I think
gdb could do a lot better, the basic idea being lazy CU expansion.
In response to a CU expansion request, the DWARF reader would
create the symtab / compunit_symtab structures and it would also
create some "outline" struct symbols -- one for each cooked_index_entry.
Then when some attribute of a symbol is needed (say, the type),
the DWARF reader would read the rest of the symbol at that moment.
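
To make the "outline" idea concrete, here is a minimal sketch in which a
symbol knows its name up front and reads its type only on first use.
The outline_symbol class and the read_type callback are hypothetical
stand-ins for whatever the DWARF reader would actually provide, and the
sketch assumes C++17 for std::optional.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Hypothetical "outline" symbol; not gdb's struct symbol.
class outline_symbol
{
public:
  // NAME is known immediately (for example from an index entry).
  // READ_TYPE stands in for the deferred read of the full DIE.
  outline_symbol (std::string name, std::function<std::string ()> read_type)
    : m_name (std::move (name)), m_read_type (std::move (read_type))
  {}

  const std::string &name () const { return m_name; }

  // Read the type the first time it is requested, then reuse it.
  const std::string &type ()
  {
    if (!m_type.has_value ())
      m_type = m_read_type ();
    return *m_type;
  }

private:
  std::string m_name;
  std::function<std::string ()> m_read_type;
  std::optional<std::string> m_type;
};
```

A symbol whose type is never requested never triggers the deferred
read, which is where the savings would come from.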

The major advantage of this approach is that most data in a CU
is not needed at all.  So, much less work would need to be done in
general.  I think it would be possible to avoid reading every DIE.

A secondary advantage is that, because the symbols are created directly
from the cooked index, we would avoid the situation where the
two readers could diverge.  That would no longer be possible at all.

There are some downsides.  The design is more complex, and it would be
harder to implement and test.  Also I think it would require fixing the
.debug_names bug, and also probably removing .gdb_index support.  Finally,
we'd have to change the blockvector to be expandable.
Comment 2 Tom de Vries 2022-07-24 08:26:36 UTC
(In reply to Tom Tromey from comment #1)
> The second problem is that CU expansion can be slow.  Here I think
> gdb could do a lot better, the basic idea being lazy CU expansion.

Filed as PR29398 - [gdb/symtab] lazy CU expansion.