30826 – [gdb/symtab] Review PU presence in objfile->compunits ()

Status:	NEW

Alias:	None

Product:	gdb
Classification:	Unclassified
Component:	symtab
Version:	HEAD

Importance:	P2 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

Reported:	2023-09-06 10:23 UTC by Tom de Vries
Modified:	2023-09-06 10:23 UTC
CC List:	0 users

Description Tom de Vries 2023-09-06 10:23:32 UTC

[ First discussed here ( https://sourceware.org/pipermail/gdb-patches/2023-September/202161.html ) ]

The PUs are part of the linked list objfile->compunits (), and I've recently committed a few fixes that skip PUs when walking the linked list:
...
$ git log --pretty=commit HEAD^^..HEAD
commit e061219f5d6 ("[gdb/symtab] Fix too many symbols in gdbpy_lookup_static_symbols")
commit 7023b8d86c6 ("[gdb/symtab] Handle PU in iterate_over_some_symtabs")
...

The question is whether we can just drop the PUs from the linked list.

[ On a related note, I've been wondering about adding a vector alongside objfile->compunits (), sorted by file offset, to address the "insertion-order is search-order" problem. If we added such a vector and did all searches on it instead of on the linked list, then rather than deleting PUs from the linked list, we could simply not add them to the vector, which is slightly easier. ]
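
[ A minimal sketch of the sorted-vector idea, as a standalone C++ model rather than gdb code. The names unit, unit_table, file_offset and by_offset are hypothetical and do not correspond to gdb's actual data structures. The list keeps every unit in insertion order, while the vector holds only CUs and stays sorted by offset. ]

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <forward_list>
#include <vector>

// Hypothetical stand-in for a compunit; not gdb's compunit_symtab.
struct unit
{
  std::uint64_t file_offset;  // Offset of the unit in the debug info.
  bool is_partial;            // True for a PU, false for a CU.
};

// Hypothetical container pairing the existing list with a sorted vector.
struct unit_table
{
  std::forward_list<unit> all;           // Insertion order, CUs and PUs.
  std::vector<const unit *> by_offset;   // CUs only, sorted by file_offset.

  void add (std::uint64_t offset, bool is_partial)
  {
    // forward_list never relocates elements, so pointers into it stay
    // valid as long as the table itself is not copied.
    all.push_front (unit {offset, is_partial});
    if (is_partial)
      return;  // PUs stay in the list but are never added to the vector.

    const unit *u = &all.front ();
    auto pos = std::lower_bound (by_offset.begin (), by_offset.end (), u,
                                 [] (const unit *a, const unit *b)
                                 { return a->file_offset < b->file_offset; });
    by_offset.insert (pos, u);
  }
};
```

[ In this model a search over by_offset sees CUs in file-offset order regardless of the order in which they were added, and never encounters a PU. ]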

Testing a WIP patch (see the gdb-patches thread mentioned above) revealed regressions in gdb.cp/m-static.exp, where lookup_symbol_in_objfile_symtabs walks the linked list but doesn't walk the corresponding includes for each CU.

However, it could be argued that the current way of visiting PUs is more efficient. It guarantees that PUs are visited only once, and if we visit all includes for each CU, then a PU might be visited twice.

Note how the efficient approach differs depending on whether we look up one match or all matches:
- if we look up one match, we want to not skip PUs, but skip includes, and
- if we look up all matches, we want to skip PUs, but not skip includes.

In the former case, when finding the match in a PU, it's virtually a match in the top-level canonical includer CU (the one you get when walking the user chain to the top).
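
The two strategies can be sketched in a standalone C++ model. This is not gdb code: the unit struct and the functions defines, top_level_user, lookup_one and lookup_all are hypothetical, and each unit is assumed to record a single user.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a unit; not gdb's compunit_symtab.
struct unit
{
  std::string name;
  bool is_partial = false;             // True for a PU.
  std::vector<std::string> symbols;    // Symbols defined in this unit.
  std::vector<const unit *> includes;  // Units imported by this unit.
  const unit *user = nullptr;          // An includer; null at top level.
};

bool
defines (const unit &u, const std::string &sym)
{
  return std::find (u.symbols.begin (), u.symbols.end (), sym)
         != u.symbols.end ();
}

// Walk the user chain to the top-level includer.
const unit *
top_level_user (const unit *u)
{
  while (u->user != nullptr)
    u = u->user;
  return u;
}

// One match: visit every list entry, PUs included, and ignore includes.
// A match found in a PU is reported as its top-level includer.
const unit *
lookup_one (const std::vector<const unit *> &list, const std::string &sym)
{
  for (const unit *u : list)
    if (defines (*u, sym))
      return top_level_user (u);
  return nullptr;
}

// Search U and, recursively, its includes, counting every visit.
void
collect (const unit *u, const std::string &sym,
         std::vector<const unit *> &out, int &visits)
{
  ++visits;
  if (defines (*u, sym))
    out.push_back (u);
  for (const unit *inc : u->includes)
    collect (inc, sym, out, visits);
}

// All matches: skip PUs in the list, but descend into each CU's includes.
std::vector<const unit *>
lookup_all (const std::vector<const unit *> &list, const std::string &sym,
            int &visits)
{
  std::vector<const unit *> out;
  for (const unit *u : list)
    if (!u->is_partial)
      collect (u, sym, out, visits);
  return out;
}
```

In this model, a PU included by two CUs is visited exactly once by lookup_one, but once per includer by lookup_all, so its matches are also reported once per includer.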

In the latter case, a PU can be visited more than once, which is potentially inefficient. This could be addressed with caching; perhaps this is already done.
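
One possible form of such caching is a per-walk visited set. The following is a minimal standalone C++ sketch under that assumption, not a description of what gdb does; the unit struct and the functions visit_once and walk_all are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical model of a unit; not gdb's compunit_symtab.
struct unit
{
  std::string name;
  bool is_partial = false;             // True for a PU.
  std::vector<const unit *> includes;  // Units imported by this unit.
};

// Visit U and its includes, skipping any unit already in VISITED.
void
visit_once (const unit *u, std::unordered_set<const unit *> &visited,
            std::vector<const unit *> &order)
{
  if (!visited.insert (u).second)
    return;  // Already seen through another includer.
  order.push_back (u);
  for (const unit *inc : u->includes)
    visit_once (inc, visited, order);
}

// Walk all CUs in LIST and their includes, visiting each unit once.
// PUs in LIST itself are skipped; they are reached through their includers.
std::vector<const unit *>
walk_all (const std::vector<const unit *> &list)
{
  std::unordered_set<const unit *> visited;
  std::vector<const unit *> order;
  for (const unit *u : list)
    if (!u->is_partial)
      visit_once (u, visited, order);
  return order;
}
```

With two CUs that both include the same PU, this walk reaches the PU through the first CU only. The cost is one set insertion per visit attempt.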