In https://sourceware.org/pipermail/gdb-patches/2022-April/188400.html, Pedro pointed out that, even with the patch in that thread, the new DWARF reader still generates a larger index:

In the old indexer, with an index of a gdb from before the indexer rewrite, we had:

    [369902] intrusive_list<inferior, intrusive_base_node<inferior> >::begin:
        9 [global, function]
        255 [global, function]

while in the new indexer, for the same function, we have:

    [369902] intrusive_list<inferior, intrusive_base_node<inferior> >::begin:
        9 [global, function]
        79 [global, function]
        165 [global, function]
        167 [global, function]
        [ ... many more instances ... ]

I suspect this is due to the old handling of DW_AT_inline. See https://sourceware.org/pipermail/gdb-patches/2022-April/188557.html
In a follow-up email, Pedro said:

    Hmm, inlines was my original suspicion, but seeing that setting a breakpoint at "intrusive_list<inferior, intrusive_base_node<inferior> >::begin", even with -readnow gives you 2 locations, same number of locations as entries in the old index, could it be that this is more about declarations vs definitions? There's that check for has_pc_info or has_range_info too.

So this ought to be investigated.
It does seem that the declarations end up in the symbol table. In one CU, there's a concrete instance like:

    <1><28afba>: Abbrev Number: 40 (DW_TAG_subprogram)
       <28afbb>   DW_AT_specification: <0x27d5e2>
    ...

pointing back at the method declaration. Both the old and new gdb will emit an index entry for this.

However, in another CU, there's the same declaration but with no concrete instance -- and here, the old gdb does not emit an entry, but the new one does.

I suspect these declarations shouldn't be in the cooked index at all.
Further searching shows that the second CU in question has:

    <1><d94e58>: Abbrev Number: 24 (DW_TAG_subprogram)
       <d94e59>   DW_AT_specification: <0xd5c48b>
       <d94e5d>   DW_AT_object_pointer: <0xd94e77>
       <d94e61>   DW_AT_low_pc      : 0x0
       <d94e69>   DW_AT_high_pc     : 0x28
       <d94e71>   DW_AT_frame_base  : 1 byte block: 9c (DW_OP_call_frame_cfa)
       <d94e73>   DW_AT_call_all_tail_calls: 1
       <d94e73>   DW_AT_sibling     : <0xd94e84>

The new index seems more correct to me, and in particular the old gdb index does:

    (gdb) break intrusive_list<inferior, intrusive_base_node<inferior> >::begin() const
    Breakpoint 1 at 0x4ac474: intrusive_list<inferior, intrusive_base_node<inferior> >::begin() const. (4 locations)

Whereas with the new index we get:

    (gdb) break intrusive_list<inferior, intrusive_base_node<inferior> >::begin() const
    Breakpoint 1 at 0x4ac474: intrusive_list<inferior, intrusive_base_node<inferior> >::begin() const. (58 locations)

So I tend to think the old gdb was missing some locations.
FWIW with -readnow I get 58 locations, but with ordinary "gdb" (a /bin/gdb from before the new DWARF reader), I just get 4 locations. So maybe the older index was a side effect of some kind of erroneous duplication in partial symtabs.
As part of another bug, I looked into this some more:

https://sourceware.org/pipermail/gdb-patches/2022-October/192662.html

In particular I generated a .gdb_index for a copy of gdb, using both the old and new gdb (that is, before and after the new DWARF reader). Then I compared the list of symbols. In every case I checked, the new gdb was correct. (Well, to be clear, there was one case that was incorrect, but I landed a patch to fix this.)

So, I tend to think that the old index was simply missing some entries, and nobody noticed.
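For anyone wanting to repeat this kind of comparison, here is a minimal sketch of diffing two textual symbol-table dumps. It is not the procedure used in this bug. It assumes the dumps look like the "[369902] name: 9 [global, function]" entries quoted earlier in this thread (for example as printed by readelf --debug-dump=gdb_index), with CU entries either on the header line or on following lines. The names parse_index and extra_entries are hypothetical helpers, not part of gdb or binutils.

```python
import re
from collections import defaultdict

# Assumed entry shape, taken from the dumps quoted in this thread:
#   [369902] some::symbol: 9 [global, function]
# with further "<cu> [<attrs>]" entries possibly on following lines.
HEADER = re.compile(r"^\[\s*\d+\]\s+(.+?):(?:\s+(\d+\s+\[.*))?$")
ENTRY = re.compile(r"(\d+)\s+\[([^\]]*)\]")


def parse_index(text):
    """Map each symbol name to a set of (cu_index, attributes) pairs."""
    table = defaultdict(set)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = HEADER.match(line)
        if header:
            # New symbol; CU entries may follow on the same line.
            current = header.group(1)
            rest = header.group(2) or ""
        elif current is not None and ENTRY.match(line):
            # Continuation line belonging to the current symbol.
            rest = line
        else:
            # Anything else (section banners, elisions) ends the symbol.
            current = None
            continue
        for cu, attrs in ENTRY.findall(rest):
            table[current].add((int(cu), attrs))
    return dict(table)


def extra_entries(old, new):
    """Return, per symbol, the entries found in `new` but not in `old`."""
    result = {}
    for name, entries in new.items():
        added = entries - old.get(name, set())
        if added:
            result[name] = sorted(added)
    return result
```

Feeding parse_index the old and new dumps and passing the results to extra_entries lists, per symbol, the CUs that only the new index mentions. Each of those can then be checked against the DWARF by hand, as was done above for the second CU.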