29105 – new DWARF reader still slow

Bug 29105 - new DWARF reader still slow

Summary: new DWARF reader still slow

Status:	RESOLVED FIXED

Alias:	None

Product:	gdb
Classification:	Unclassified
Component:	symtab (show other bugs)
Version:	HEAD

Importance:	P2 normal
Target Milestone:	13.1
Assignee:	Tom Tromey

URL:
Keywords:

Depends on:
Blocks:	29366
	Show dependency tree / graph

Reported:	2022-04-29 19:51 UTC by Tom Tromey
Modified:	2022-12-14 10:23 UTC (History)
CC List:	4 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Tom Tromey 2022-04-29 19:51:00 UTC

Pedro pointed out that using the index cache is still faster than
the new DWARF reader.  I think I found why:

What happens here is that gdb does the finalization step in the
background.  Interactively, this helps make it seem faster.  However,
it's not really faster, it is just deferring some work.

You can see it with an invocation like:

   gdb -q -batch -iex 'set debug timestamp 1' -iex 'set debug dwarf-read 1' -ex start ./gdb

Here for gdb-with-an-index I see:

    0.820423 [dwarf-read] dwarf2_initialize_objfile: found gdb index from file
    1.699175 [dwarf-read] process_queue: Expanding one or more symtabs of objfile /tmp/gdb.idx ...

But for ordinary gdb:

    1.310033 [dwarf-read] dwarf2_build_psymtabs_hard: Done building psymtabs of /tmp/gdb
    4.666649 [dwarf-read] process_queue: Expanding one or more symtabs of objfile /tmp/gdb ...

This is disappointing of course.  It might be possible to reduce this
time at the cost of a bit more complexity in the lookup code and perhaps
a bit more memory use.  For example, right now finalization merges the
results from all the readers into a single entry table -- but maybe the
table could remain sharded instead.  Another possibility might be to do
the work in worker threads and, instead of sharding the result, pre-sort
the vectors and use a sorted merge operation to combine them.


See the last messages in this thread:

https://sourceware.org/pipermail/gdb-patches/2022-April/188194.html

Comment 1 Tom Tromey 2022-05-13 20:56:35 UTC

I have a patch to "shard" the index.
The idea is to do each finalization step locally in a given
index, and never combine them into the cooked vector.

This seems better for performance, at least from a
time-to-first-prompt perspective.  And, the memory overhead
doesn't seem so bad, either.

Before:

(gdb) file /tmp/gdb
2022-05-13 14:54:41.981 - command started
Reading symbols from /tmp/gdb...
2022-05-13 14:54:42.391 - command finished
Command execution time: 2.202096 (cpu), 0.410121 (wall)
Space used: 10010624 (+6594560 for this command)


After:

(gdb) file /tmp/gdb
2022-05-13 14:54:19.907 - command started
Reading symbols from /tmp/gdb...
2022-05-13 14:54:20.125 - command finished
Command execution time: 1.025581 (cpu), 0.218469 (wall)
Space used: 10014720 (+6598656 for this command)

Comment 2 Tom Tromey 2022-05-13 21:01:35 UTC

... however there is a new crash somewhere so take the
numbers with a grain of salt.

Comment 3 Tom Tromey 2022-05-16 18:51:07 UTC

https://sourceware.org/pipermail/gdb-patches/2022-May/189116.html

Comment 4 Simon Marchi 2022-05-18 16:12:14 UTC

Hi Tom,

I'm not sure why I am starting to see this only today, but I see a lot of tests timing out.  gdb.base/sizeof.exp is one of them.  It looks like it's due to a slowdown that started with the new DWARF reader patches.

My build uses -O0 and all the sanitizer and glibcxx debug stuff, so it doesn't help, but I see the same relative slowdown without that.

I copied the test program of gdb.base/sizeof.exp to /tmp/sizeof so it could be reused easily across builds.

The command I use is:

  time ./gdb -q --data-directory=data-directory -nx /tmp/sizeof -ex "start"  -iex "set debug dwarf-read 0" -ex "p/d sizeof(short)" -batch

This replicates the first steps of the test.

Using 3d20b8d99a5:
./gdb -q --data-directory=data-directory -nx /tmp/sizeof -ex "start" -ex  -e  1.18s user 0.17s system 116% cpu 1.157 total

Using 3d20b8d99a5^:
./gdb -q --data-directory=data-directory -nx /tmp/sizeof -ex "start" -ex  -e  0.40s user 0.07s system 109% cpu 0.427 total

By toggling "set debug dwarf-read" to 1, I noticed that we are now expanding a lot of CUs from libc's debug info when doing the "print sizeof":

[dwarf-read] process_queue: Expanding one or more symtabs of objfile /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.31.so ...
[dwarf-read] process_queue: Expanding symtab of CU at offset 0xd3cbe
[dwarf-read] process_queue: Done expanding CU at offset 0xd3cbe
[dwarf-read] process_queue: Done expanding symtabs of /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.31.so.
[dwarf-read] process_queue: Expanding one or more symtabs of objfile /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.31.so ...
[dwarf-read] process_queue: Expanding symtab of CU at offset 0x22bf6f
[dwarf-read] process_queue: Done expanding CU at offset 0x22bf6f
[dwarf-read] process_queue: Done expanding symtabs of /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.31.so.
[dwarf-read] process_queue: Expanding one or more symtabs of objfile /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.31.so ...
[dwarf-read] process_queue: Expanding symtab of CU at offset 0x7ef56e
[dwarf-read] process_queue: Done expanding CU at offset 0x7ef56e
[dwarf-read] process_queue: Done expanding symtabs of /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.31.so.
...

We expand ~1500 CUs.

Before, we didn't do that.  So having debug info for libc is important for reproducing this, and a GDB configured with --prefix/usr or some other switch that makes it find separate debug info in /usr/lib/debug (by default a local build would look in /usr/local/lib/debug).  I am reproducing this on Ubuntu 20.04.

Comment 5 Simon Marchi 2022-05-20 01:54:22 UTC

I investigated the problem reported in my last message a little bit.  What I saw was that the cooked index contains entries for base types, like "int".  When doing "print sizeof(short)", it eventually looks up the "int" type, (since short means "short int"), and all CUs match and get expanded, because they all have an int type.

Previously with partial symbols, I don't think we had entries for those.

What is the problem here?

- Do we want entries for base types in the index?
- Should the type search algorithm expand all CUs that contain a matching type?  Couldn't it stop at the first CU that has a matching type?

Comment 6 Tom Tromey 2022-05-20 15:13:29 UTC

> So having debug info for libc is important for reproducing this

I suppose it has to be a libc that has debug info but does not
use .gdb_index.

> When doing "print sizeof(short)", it eventually looks up the "int" type,
> (since short means "short int")

That seems peculiar.

> Previously with partial symbols, I don't think we had entries for those.

Probably they got unified by the bcache.
 
> What is the problem here?
> 
> - Do we want entries for base types in the index?
> - Should the type search algorithm expand all CUs that contain a matching
> type?  Couldn't it stop at the first CU that has a matching type?

I think you can't really avoid base types in the index.
That seems like it would open up a number of problems.

Maybe just stopping on the first type would be ok.  We'll have to investigate.
Otherwise some kind of de-duplication can be done; though I'd prefer
to push this kind of thing from read-time to lookup-time.

Comment 7 Simon Marchi 2022-05-20 15:36:44 UTC

I'm currently debugging objfile::lookup_symbol, and it seems the intention to stop expanding when the symbol is found.  What I see is that we are looking for the "signed int" type:

(top-gdb) frame
#6  0x0000555555c5dd69 in objfile::lookup_symbol (this=0x555556807040, kind=STATIC_BLOCK, name=0x7fffffffaca0 "signed int", domain=VAR_DOMAIN) at symfile-debug.c:276
276           if (!iter->expand_symtabs_matching (this,

But in cooked_index_functions::expand_symtabs_matching, we decided to match index entries with "int" instead of "signed int" (see the name_vec vector):

(top-gdb) frame
#5  0x00005555558e1d07 in cooked_index_functions::expand_symtabs_matching(objfile*, gdb::function_view<bool (char const*, bool)>, lookup_name_info const*, gdb::function_view<bool (char const*)>, gdb::function_view<bool (compunit_symtab*)>, enum_flags<block_search_flag_values>, domain_enum_tag, search_domain) (this=0x555556809360, objfile=0x555556807040, file_matcher=..., lookup_name=0x7fffffffa760, symbol_matcher=..., expansion_notify=..., search_flags=..., domain=VAR_DOMAIN, kind=ALL_DOMAIN) at dwarf2/read.c:20288
20288             if (!dw2_expand_symtabs_matching_one (entry->per_cu, per_objfile,
(top-gdb) p name_vec
$13 = std::vector of length 1, capacity 1 = {"int"}

Most CUs have a type called "int", so that matches a lot of them.

Back in the expansion notifier in objfile::lookup_symbol, we search the just expanded CUs for the "signed int" type, and we don't find it, so we return false, meaning "not found, please expand the next CU".

I'm not too familiar with this code, but the discrepancy between the criteria on which we choose the CUs to expand (those that have the "int" type) and then the type we actually look for ("signed int") is odd.

In the previous psymtabs code, I think we looked for partial symbols named "signed int", and didn't find any, so it didn't cause expansion of these CUs.

Comment 8 Simon Marchi 2022-05-20 15:48:30 UTC

"signed int" becomes "int" when lookup_name_info::split_name is called with lang == language_cplus:

#0  cp_canonicalize_string (string=0x7fffffffaca0 "signed int") at cp-support.c:654                                                                                                                                                    
#1  0x0000555555c7dab4 in demangle_for_lookup (name=0x7fffffffaca0 "signed int", lang=language_cplus, storage=...) at symtab.c:1913                                                                                                    
#2  0x0000555555c7d913 in demangle_for_lookup_info::demangle_for_lookup_info (this=0x7fffffffa4b8, lookup_name=..., lang=language_cplus) at symtab.c:1864                                                                              
#3  0x00005555558255d8 in gdb::optional<demangle_for_lookup_info>::emplace<lookup_name_info const&, language> (this=0x7fffffffa4b8) at ./../gdbsupport/gdb_optional.h:155                                                              
#4  0x0000555555824fc3 in lookup_name_info::maybe_init<gdb::optional<demangle_for_lookup_info>, language> (this=0x7fffffffa470, field=...) at /tmp/apres/gdb/symtab.h:350                                                              
#5  0x0000555555824d8d in lookup_name_info::cplus (this=0x7fffffffa470) at /tmp/apres/gdb/symtab.h:322                                                                                                                                 
#6  0x00005555558468f2 in lookup_name_info::language_lookup_name (this=0x7fffffffa470, lang=language_cplus) at /tmp/apres/gdb/symtab.h:281
#7  0x00005555558f918e in lookup_name_info::split_name (this=0x7fffffffa470, lang=language_cplus) at ./symtab.h:309
#8  0x00005555558e198f in cooked_index_functions::expand_symtabs_matching(objfile*, gdb::function_view<bool (char const*, bool)>, lookup_name_info const*, gdb::function_view<bool (char const*)>, gdb::function_view<bool (compunit_sy
mtab*)>, enum_flags<block_search_flag_values>, domain_enum_tag, search_domain) (this=0x5555567721a0, objfile=0x555556593400, file_matcher=..., lookup_name=0x7fffffffa6a0, symbol_matcher=..., expansion_notify=..., search_flags=..., 
domain=VAR_DOMAIN, kind=ALL_DOMAIN) at dwarf2/read.c:20223
#9  0x0000555555c5dd69 in objfile::lookup_symbol (this=0x555556593400, kind=GLOBAL_BLOCK, name=0x7fffffffaca0 "signed int", domain=VAR_DOMAIN) at symfile-debug.c:276
#10 0x0000555555c7f0fc in lookup_symbol_via_quick_fns (objfile=0x555556593400, block_index=GLOBAL_BLOCK, name=0x7fffffffaca0 "signed int", domain=VAR_DOMAIN) at symtab.c:2451
#11 0x0000555555c7f5b9 in lookup_symbol_in_objfile (During symbol reading: cannot get low and high bounds for subprogram DIE at 0x3956955
objfile=0x555556593400, block_index=GLOBAL_BLOCK, name=0x7fffffffaca0 "signed int", domain=VAR_DOMAIN) at symtab.c:2599
#12 0x0000555555c7f7b6 in lookup_symbol_global_or_static_iterator_cb (objfile=0x555556593400, cb_data=0x7fffffffaa30) at symtab.c:2663
#13 0x0000555555c24b70 in svr4_iterate_over_objfiles_in_search_order (gdbarch=0x5555567b34a0, cb=0x555555c7f722 <lookup_symbol_global_or_static_iterator_cb(objfile*, void*)>, cb_data=0x7fffffffaa30, current_objfile=0x0) at solib-sv
r4.c:3168
#14 0x00005555556d319e in gdbarch_iterate_over_objfiles_in_search_order (gdbarch=0x5555567b34a0, cb=0x555555c7f722 <lookup_symbol_global_or_static_iterator_cb(objfile*, void*)>, cb_data=0x7fffffffaa30, current_objfile=0x0) at /tmp/
apres/gdb/gdbarch.c:4937
#15 0x0000555555c7f965 in lookup_global_or_static_symbol (name=0x7fffffffaca0 "signed int", block_index=GLOBAL_BLOCK, objfile=0x0, domain=VAR_DOMAIN) at symtab.c:2709
#16 0x0000555555c7faf7 in lookup_global_symbol (name=0x7fffffffaca0 "signed int", block=0x0, domain=VAR_DOMAIN) at symtab.c:2760
#17 0x0000555555c7f2ff in language_defn::lookup_symbol_nonlocal (this=0x5555564d2fc0 <c_language_defn>, name=0x7fffffffaca0 "signed int", block=0x0, domain=VAR_DOMAIN) at symtab.c:2520
#18 0x0000555555c7e5bd in lookup_symbol_aux (name=0x7fffffffaca0 "signed int", match_type=symbol_name_match_type::FULL, block=0x0, domain=VAR_DOMAIN, language=language_c, is_a_field_of_this=0x0) at symtab.c:2170
#19 0x0000555555c7dd18 in lookup_symbol_in_language (name=0x7fffffffaca0 "signed int", block=0x0, domain=VAR_DOMAIN, lang=language_c, is_a_field_of_this=0x0) at symtab.c:1962
#20 0x00005555559884d3 in lookup_typename (language=0x5555564d2fc0 <c_language_defn>, name=0x7fffffffaca0 "signed int", block=0x0, noerr=1) at gdbtypes.c:1694
#21 0x00005555559886ae in lookup_signed_typename (language=0x5555564d2fc0 <c_language_defn>, name=0x555555fdb81d "int") at gdbtypes.c:1723

Comment 9 Sourceware Commits 2022-05-26 13:49:39 UTC

The master branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=20a26f4e018cd0cb40a3b81568be0d4f164eed0f

commit 20a26f4e018cd0cb40a3b81568be0d4f164eed0f
Author: Tom Tromey <tromey@adacore.com>
Date:   Fri May 13 13:46:52 2022 -0600

    Finalize each cooked index separately
    
    After DWARF has been scanned, the cooked index code does a
    "finalization" step in a worker thread.  This step combines all the
    index entries into a single master list, canonicalizes C++ names, and
    splits Ada names to synthesize package names.
    
    While this step is run in the background, gdb will wait for the
    results in some situations, and it turns out that this step can be
    slow.  This is PR symtab/29105.
    
    This can be sped up by parallelizing, at a small memory cost.  Now
    each index is finalized on its own, in a worker thread.  The cost
    comes from name canonicalization: if a given non-canonical name is
    referred to by multiple indices, there will be N canonical copies (one
    per index) rather than just one.
    
    This requires changing the users of the index to iterate over multiple
    results.  However, this is easily done by introducing a new "chained
    range" class.
    
    When run on gdb itself, the memory cost seems rather low -- on my
    current machine, "maint space 1" reports no change due to the patch.
    
    For performance testing, using "maint time 1" and "file" will not show
    correct results.  That approach measures "time to next prompt", but
    because the patch only affects background work, this shouldn't (and
    doesn't) change.  Instead, a simple way to make gdb wait for the
    results is to set a breakpoint.
    
    Before:
    
        $ /bin/time -f%e ~/gdb/install/bin/gdb -nx -q -batch \
            -ex 'break main' /tmp/gdb
        Breakpoint 1 at 0x43ec30: file ../../binutils-gdb/gdb/gdb.c, line 28.
        2.00
    
    After:
    
        $ /bin/time -f%e ./gdb/gdb -nx -q -batch \
            -ex 'break main' /tmp/gdb
        Breakpoint 1 at 0x43ec30: file ../../binutils-gdb/gdb/gdb.c, line 28.
        0.65
    
    Regression tested on x86-64 Fedora 34.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29105

Comment 10 Tom Tromey 2022-05-26 13:50:35 UTC

That patch solved one problem but not all the problems
pointed out in this bug, so I'm leaving this open.

Comment 11 Tom Tromey 2022-09-23 20:23:23 UTC

I managed to reproduce this with the 'sizeof' test program
(after hacking gdb to ignore .gdb_index, since the Fedora
libc debuginfo uses that).

I couldn't reproduce it when running on gdb itself,
and I discovered that I can avoid the problem by
using C++.  That is, when debugging sizeof:

(gdb) set debug dwarf-read 1
(gdb) set lang c++
Warning: the current language does not match this frame.
(gdb) print sizeof(short)
$1 = 2

So I tend to think this probably a bug elsewhere in gdb.

In C++, gdb canonializes inputs so that lookups will succeed.
This isn't done for C, I think, probably because it seemed
like there was no need.  But given that the DWARF says
"signed short" as a symbol name, perhaps we need some special
cases for this sort of thing.

Comment 12 Joel Brobecker 2022-10-02 00:19:41 UTC

Hi Tom,

I'm wondering whether this PR is still about the performance of the new DWARF reader, or is it now about some bug(s?). As this PR is marked as targeting GDB 13, would you mind telling us more about what's left to do for it?

Thank you!

Comment 13 Tom Tromey 2022-10-06 20:44:08 UTC

(In reply to Joel Brobecker from comment #12)

> I'm wondering whether this PR is still about the performance of the new
> DWARF reader, or is it now about some bug(s?). As this PR is marked as
> targeting GDB 13, would you mind telling us more about what's left to do for
> it?

With the new reader, gdb seems to expand too many CUs in some cases,
and we should probably fix this before 13.  Comment #5 describes the
scenario but I imagine it can be reproduced in other ways as well.

Comment 14 Simon Marchi 2022-10-24 19:27:30 UTC

I was looking at another timeout failure that happens randomly on my system, it points to this.  It's from test gdb.mi/mi-var-cmd.exp:

The test tries to create a varobj with expression "int".  It expects GDB to error out with "Attempt to use a type name as an expression."
^error,msg="-var-create: unable to create variable object"

I can replicate this with:

$ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.mi/mi-var-cmd/mi-var-cmd -ex start
(gdb) set debug dwarf-read 1
(gdb) print int
...
[dwarf-read] process_queue: Expanding one or more symtabs of objfile /usr/lib/debug/.build-id/69/389d485a9793dbe873f0ea2c93e02efaa9aa3d.debug ...
[dwarf-read] process_queue: Expanding symtab of CU at offset 0x480bd4                     
[dwarf-read] process_queue: Done expanding CU at offset 0x480bd4                                                                                                                     
[dwarf-read] process_queue: Done expanding symtabs of /usr/lib/debug/.build-id/69/389d485a9793dbe873f0ea2c93e02efaa9aa3d.debug.                  
[dwarf-read] process_queue: Expanding one or more symtabs of objfile /usr/lib/debug/.build-id/69/389d485a9793dbe873f0ea2c93e02efaa9aa3d.debug ...
[dwarf-read] process_queue: Expanding symtab of CU at offset 0x48cd5e                     
[dwarf-read] process_queue: Done expanding CU at offset 0x48cd5e                                                                                                                     
[dwarf-read] process_queue: Done expanding symtabs of /usr/lib/debug/.build-id/69/389d485a9793dbe873f0ea2c93e02efaa9aa3d.debug.   
[dwarf-read] process_queue: Expanding one or more symtabs of objfile /usr/lib/debug/.build-id/69/389d485a9793dbe873f0ea2c93e02efaa9aa3d.debug ...
[dwarf-read] process_queue: Expanding symtab of CU at offset 0x498c49                     
[dwarf-read] process_queue: Done expanding CU at offset 0x498c49                                                                                                                     
[dwarf-read] process_queue: Done expanding symtabs of /usr/lib/debug/.build-id/69/389d485a9793dbe873f0ea2c93e02efaa9aa3d.debug.
...
Attempt to use a type name as an expression


The post-indexer-changes version takes a long time, expanding pretty much all CUs, while the pre-indexer-changes version finishes almost instantly.

Comment 15 Tom Tromey 2022-11-01 20:42:05 UTC

I've been debugging this a bit.

For the sizeof test, gdb makes an index entry for "short int".
This comes directly from the DWARF:

 <1><bc>: Abbrev Number: 2 (DW_TAG_base_type)
    <bd>   DW_AT_byte_size   : 2
    <be>   DW_AT_encoding    : 5	(signed)
    <bf>   DW_AT_name        : (indirect string, offset: 0x120): short int

lookup_signed_typename first looks for "signed short" though.

(top-gdb) up
#15 0x00000000007bb408 in lookup_typename (language=0x24c0da0 <c_language_defn>, name=0x7fffffffab50 "signed short", block=0x0, 
    noerr=1) at ../../binutils-gdb/gdb/gdbtypes.c:1700
(top-gdb) up
#16 0x00000000007bb5ac in lookup_signed_typename (language=0x24c0da0 <c_language_defn>, name=0x12528f8 "short")
    at ../../binutils-gdb/gdb/gdbtypes.c:1729


Now, trying this with an older gdb shows that psyms also make these
same symbol names:

(gdb) maint print psymbols 
[...]
    `short int', type, 0x0

I'm not sure why it works in this case and not in the new code.

I added a 'maint canonicalize' command and it shows:

(gdb) maint canonicalize short
No change.
(gdb) maint canonicalize short int
canonical = short
(gdb) maint canonicalize signed short int
canonical = short
(gdb) maint canonicalize signed short 
canonical = short

That is, one oddity here is that the C++ canonical form for these names
differs from what's in the DWARF and what things like
lookup_signed_typename expect.

Next I'm going to look at canonicalization for C, at least for those
types that have multiple spellings.

Comment 16 Tom Tromey 2022-11-04 20:05:43 UTC

Name canonicalization for C seems to work fine.

I applied a patch to disable the gdb-index (Fedora debuginfo ships
with this) and then I tried the "sizeof" test with DWARF
debugging enabled, and "print sizeof(short)" doesn't expand
anything.

Only the symbol readers have to be updated to canonicalize
for C, not lookup, because the C parser arranges to only
look up the canonical names anyway.  (Well, lookup_signed_typename
did need a small tweak.)

I am not sure how to write a test for this.

Comment 17 Simon Marchi 2022-11-04 20:10:39 UTC

Not sure how to help you, I don't know anything about all of this.

In the mean time, I stumbled on another testsuite timeout probably caused by this:

$ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.base/info-os/info-os -ex start -ex 'call ((int (*) (void)) getpid) ()' -batch

The call takes a long time, many CUs are expanded, which probably shouldn't.  Probably because of the "int" in the function pointer type.  I'm noting it here so I can test it again once we have a patch to test.

Comment 18 Tom Tromey 2022-11-04 21:27:36 UTC

dw2-unusual-field-names.exp points out that the canonicalization
should probably depend on the DWARF tag, which is pretty
unfortunate.  According to the history, a type like this
is only defined by gdb itself, but of course the test invalidates
this by writing explicit DWARF to this end, necessitating
a workaround in the reader :(

Comment 19 Tom Tromey 2022-11-04 21:28:08 UTC

Also:

ptype enum EU
type = enum EU : unsigned int integer {TWO = 2}
(gdb) FAIL: gdb.dwarf2/enum-type.exp: ptype EU in C++

though maybe this can be fixed in the test.

Comment 20 Tom Tromey 2022-11-04 21:28:47 UTC

(In reply to Simon Marchi from comment #17)
> The call takes a long time, many CUs are expanded, which probably shouldn't.
> Probably because of the "int" in the function pointer type.  I'm noting it
> here so I can test it again once we have a patch to test.

Try my branch "pr-29105-signed-index"

Comment 21 Simon Marchi 2022-11-05 01:28:04 UTC

(In reply to Tom Tromey from comment #20)
> (In reply to Simon Marchi from comment #17)
> > The call takes a long time, many CUs are expanded, which probably shouldn't.
> > Probably because of the "int" in the function pointer type.  I'm noting it
> > here so I can test it again once we have a patch to test.
> 
> Try my branch "pr-29105-signed-index"

All the cases I have reported appear fixed (in that they don't expand a ton of CUs) with that branch.

Comment 22 Tom Tromey 2022-11-07 16:28:25 UTC

https://sourceware.org/pipermail/gdb-patches/2022-November/193530.html

Comment 23 Tom Kacvinsky 2022-11-16 00:09:18 UTC

I can confirm this solves at least one of the problems I have seen with slow debugging.  Those problems are sluggishness stepping into functions and program startup.  I believe the latter is due to having the glibc debug packages installed on my CentOS 7 system.  The former - slow stepping into functions - appears to be related to the LLVM linker.  If use the BFD or gold linker to generate my executables and shared libraries, stepping into function is pretty snappy.

Comment 24 Simon Marchi 2022-11-16 00:56:37 UTC

(In reply to Tom Kacvinsky from comment #23)
> I can confirm this solves at least one of the problems I have seen with slow
> debugging.  Those problems are sluggishness stepping into functions and
> program startup.  I believe the latter is due to having the glibc debug
> packages installed on my CentOS 7 system.  The former - slow stepping into
> functions - appears to be related to the LLVM linker.  If use the BFD or
> gold linker to generate my executables and shared libraries, stepping into
> function is pretty snappy.

You last problem sounds a lot like:

https://sourceware.org/bugzilla/show_bug.cgi?id=27754

Comment 25 Tom Kacvinsky 2022-11-16 15:07:33 UTC

(In reply to Simon Marchi from comment #24)
> (In reply to Tom Kacvinsky from comment #23)
> > I can confirm this solves at least one of the problems I have seen with slow
> > debugging.  Those problems are sluggishness stepping into functions and
> > program startup.  I believe the latter is due to having the glibc debug
> > packages installed on my CentOS 7 system.  The former - slow stepping into
> > functions - appears to be related to the LLVM linker.  If use the BFD or
> > gold linker to generate my executables and shared libraries, stepping into
> > function is pretty snappy.
> 
> You last problem sounds a lot like:
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=27754

Indeed it is.  I see a lot of

DW_MACRO_import - offset : 0x0

in the ELF files when compiling with -ggdb3 and using lld as the linker.  If I switch to the gold or bfd linker, the myriad "offset : 0x0" go away.  So, my suspicion is that this is not gdb's fault, but something undesirable that lld is doing.

Comment 26 Sourceware Commits 2022-12-01 16:44:55 UTC

The master branch has been updated by Simon Marchi <simark@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=00a5867df72983c3f8a11c9955c5032d6f601b70

commit 00a5867df72983c3f8a11c9955c5032d6f601b70
Author: Simon Marchi <simon.marchi@efficios.com>
Date:   Thu Dec 1 11:44:41 2022 -0500

    gdb/dwarf: add some QUIT macros
    
    While testing the fix for PR 29105, I noticed I couldn't ctrl-C my way
    out of GDB expanding many symtabs.  GDB was busy in a loop in
    cooked_index_functions::expand_symtabs_matching.  Add a QUIT there.  I
    also happened to see a spot in
    cooked_index_functions::expand_matching_symbols where a QUIT would be
    useful too, since we iterate over a potentially big number of index
    entries and expand CUs in the loop.  Add one there too.
    
    Change-Id: Ie1d650381df7f944c16d841b3e592d2dce7306c3
    Approved-By: Kevin Buettner <kevinb@redhat.com>

Comment 27 Sourceware Commits 2022-12-01 18:18:53 UTC

The master branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=55fc1623f942fba10362cb199f9356d75ca5835b

commit 55fc1623f942fba10362cb199f9356d75ca5835b
Author: Tom Tromey <tromey@adacore.com>
Date:   Thu Nov 3 13:49:17 2022 -0600

    Add name canonicalization for C
    
    PR symtab/29105 shows a number of situations where symbol lookup can
    result in the expansion of too many CUs.
    
    What happens is that lookup_signed_typename will try to look up a type
    like "signed int".  In cooked_index_functions::expand_symtabs_matching,
    when looping over languages, the C++ case will canonicalize this type
    name to be "int" instead.  Then this method will proceed to expand
    every CU that has an entry for "int" -- i.e., nearly all of them.  A
    crucial component of this is that the caller, objfile::lookup_symbol,
    does not do this canonicalization, so when it tries to find the symbol
    for "signed int", it fails -- causing the loop to continue.
    
    This patch fixes the problem by introducing name canonicalization for
    C.  The idea here is that, by making C and C++ agree on the canonical
    name when a symbol name can have multiple spellings, we avoid the bad
    behavior in objfile::lookup_symbol (and any other such code -- I don't
    know if there is any).
    
    Unlike C++, C only has a few situations where canonicalization is
    needed.  And, in particular, due to the lack of overloading (thus
    avoiding any issues in linespec) and due to the way c-exp.y works, I
    think that no canonicalization is needed during symbol lookup -- only
    during symtab construction.  This explains why lookup_name_info is not
    touched.
    
    The stabs reader is modified on a "best effort" basis.
    
    The DWARF reader needed one small tweak in dwarf2_name to avoid a
    regression in dw2-unusual-field-names.exp.  I think this is adequately
    explained by the comment, but basically this is a scenario that should
    not occur in real code, only the gdb test suite.
    
    lookup_signed_typename is simplified.  It used to search for two
    different type names, but now gdb can search just for the canonical
    form.
    
    gdb.dwarf2/enum-type.exp needed a small tweak, because the
    canonicalizer turns "unsigned integer" into "unsigned int integer".
    It seems better here to use the correct C type name.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29105
    Tested-by: Simon Marchi <simark@simark.ca>
    Reviewed-by: Andrew Burgess <aburgess@redhat.com>

Comment 28 Tom Tromey 2022-12-01 18:22:43 UTC

Fixed.  (Note the macro thing is a different bug.)

Comment 29 Sourceware Commits 2022-12-14 10:23:37 UTC

The master branch has been updated by Andrew Burgess <aburgess@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=9f50fe0835850645bd8ea9bb1efe1fe6c48dfb12

commit 9f50fe0835850645bd8ea9bb1efe1fe6c48dfb12
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Wed Dec 7 15:55:25 2022 +0000

    gdb/testsuite: new test for recent dwarf reader issue
    
    This commit provides a test for this commit:
    
      commit 55fc1623f942fba10362cb199f9356d75ca5835b
      Date:   Thu Nov 3 13:49:17 2022 -0600
    
          Add name canonicalization for C
    
    Which resolves PR gdb/29105.  My reason for writing this test was a
    desire to better understand the above commit, my process was to study
    the commit until I thought I understood it, then write a test to
    expose the issue.  As the original commit didn't have a test, I
    thought it wouldn't hurt to commit this upstream.
    
    The problem tested for here is already described in the above commit,
    but I'll give a brief description here.  This description describes
    GDB prior to the above commit:
    
      - Builtin types are added to GDB using their canonical name,
        e.g. "short", not "signed short",
    
      - When the user does something like 'p sizeof(short)', then this is
        handled in c-exp.y, and results in a call to lookup_signed_type
        for the name "int".  The "int" here is actually being looked up as
        the type for the result of the 'sizeof' expression,
    
      - In lookup_signed_type GDB first adds a 'signed' and looks for that
        type, so in this case 'signed int', and, if that lookup fails, GDB
        then looks up 'int',
    
      - The problem is that 'signed int' is not the canonical name for a
        signed int, so no builtin type with that name will be found, GDB
        will then go to each object file in turn looking for a matching
        type,
    
      - When checking each object file, GDB will first check the partial
        symtab to see if the full symtab should be expanded or not.
        Remember, at this point GDB is looking for 'signed int', there
        will be no partial symbols with that name, so GDB will not expand
        anything,
    
      - However, GDB checks each partial symbol using multiple languages,
        not just the current language (C in this case), so, when GDB
        checks using the C++ language, the symbol name is first
        canonicalized (the code that does this can be found
        lookup_name_info::language_lookup_name).  As the canonical form of
        'signed int' is just 'int', GDB then looks for any symbols with
        the name 'int', most partial symtabs will contain such a symbol,
        so GDB ends up expanding pretty much every symtab.
    
    The above commit fixes this by avoiding the use of non-canonical names
    with C, now the initial builtin type lookup will succeed, and GDB
    never even considers whether to expand any additional symtabs.
    
    The test case creates a library that includes char, short, int, and
    long types, and a test program that links against the library.
    
    In the test script we start the inferior, but don't allow it to
    progress far enough that the debug information for the library has
    been fully expanded yet.
    
    Then we evaluate some 'sizeof(TYPE)' expressions.
    
    In the buggy version of GDB this would cause the debug information
    for the library to be fully expanded, while in the fixed version of
    GDB this will not be the case.
    
    We use 'info sources' to determine if the debug information has been
    fully expanded or not.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29105