| Summary: | Slow lookup_symbol_in_objfile | ||
|---|---|---|---|
| Product: | gdb | Reporter: | Dmitry Neverov <dmitry.neverov> |
| Component: | symtab | Assignee: | Not yet assigned to anyone <unassigned> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | brobecker, josh.cottingham, sam, tromey |
| Priority: | P2 | ||
| Version: | 13.1 | ||
| Target Milestone: | 15.1 | ||
| See Also: | https://sourceware.org/bugzilla/show_bug.cgi?id=31010 | ||
| Host: | Target: | ||
| Build: | 2023-12-01 0:00 | Last reconfirmed: | 2023-12-01 00:00:00 |
| Bug Depends on: | |||
| Bug Blocks: | 29366 | ||
| Attachments: | gdb.log, gdb-12.1.log | ||
Created attachment 14920 [details]
gdb-12.1.log
There seems to be no 1-minute delay in gdb-12.1.

It's hard to know for sure what causes the problem without more investigation. A typical cause is too much symtab expansion. Maybe retrying with "set debug symtab-create 1" would show the problem.

With 'set debug symtab-create 1' gdb produces ~450Mb of logs between these 2 entries:

2023-06-06 10:54:51,286 <&" [symbol-lookup] lookup_symbol_in_objfile: lookup_symbol_in_objfile (_pcbnew.kiface, GLOBAL_BLOCK, wxObjectDataPtr, VAR_DOMAIN)\n"
2023-06-06 10:55:50,418 <&" [symbol-lookup] lookup_symbol_in_objfile: lookup_symbol_in_objfile (...) = NULL\n"

All log entries are of the form like these:

<&" [symtab-create] start_subfile: name = /usr/include/x86_64-linux-gnu/c++/12/bits/c++config.h, name_for_id = /usr/include/x86_64-linux-gnu/c++/12/bits/c++config.h\n"
<&" [symtab-create] start_subfile: found existing symtab with name_for_id /usr/include/x86_64-linux-gnu/c++/12/bits/c++config.h\n"

There are 1134332 '[symtab-create] start_subfile: name = ...' entries total, 707543 of them result in 'found existing symtab', 426789 don't find an existing symtab. There are 3519 unique names for symtab-create. If needed I think I can share all the logs.

I investigated it a bit further and the slowness seems to be indeed caused by too much symtab expansion: https://sourceware.org/pipermail/gdb/2023-October/050976.html

First, thank you for your investigation. That's very helpful. This seems pretty similar to bug#31010 in that the cooked-index code is expanding too many CUs for pretty much the same reason.

Not sure if this is helpful but worth mentioning that bug#31010 does appear to be fixed with https://sourceware.org/pipermail/gdb-patches/2024-January/205924.html

*** Bug 31010 has been marked as a duplicate of this bug.
***

We have a patch series for this pending copyright assignment (and some minor formatting); I'd like to get this in gdb 15, so I'm setting the target milestone.

FTR: v3 patch series at: https://sourceware.org/pipermail/gdb-patches/2024-May/209010.html
Reviewed by Tom who said they look reasonable (so I'm assuming they are approved). I will ping the FSF copyright office.

The master branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=3a0fae312983989a33608d924ff902d7b78e8ec1

commit 3a0fae312983989a33608d924ff902d7b78e8ec1
Author: Dmitry.Neverov <dmitry.neverov@jetbrains.com>
Date: Mon May 6 17:09:17 2024 +0200

gdb/symtab: check name matches before expanding a CU

The added check fixes the case when an unqualified lookup name without template arguments causes expansion of many CUs which contain the name with template arguments.

This is similar to what dw2_expand_symtabs_matching_symbol does before expanding the CU.

In the referenced issue the lookup name was wxObjectDataPtr and many CUs had names like wxObjectDataPtr<wxBitmapBundleImpl>. This caused their expansion and the lookup took around a minute. The added check helps to avoid the expansion and makes the symbol lookup to return in a second or so.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30520

Should be fixed.

The master branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=aab2ac34d7f78f0b7a42cef0187dc6e4d7ec4f02

commit aab2ac34d7f78f0b7a42cef0187dc6e4d7ec4f02
Author: Tom Tromey <tom@tromey.com>
Date: Fri Feb 21 09:18:28 2025 -0700

Avoid excessive CU expansion on failed matches

PR symtab/31010 points out that something like "ptype INT" will expand all CUs in a typical program. The OP further points out that the original patch for PR symtab/30520:

https://sourceware.org/pipermail/gdb-patches/2024-January/205924.html

...
did solve the problem, but the patch changed after (my) review and reintroduced the bug.

In cooked_index_functions::expand_symtabs_matching, the final component of a split name is compared with the entry's name using the usual method of calling get_symbol_name_matcher. This code iterates over languages and tries to split the original name according to each style. But, the Ada splitter uses the decoded name -- "int". This causes every C or C++ CU to be expanded.

Clearly this is wrong. And, it seems to me that looping over languages and trying to guess the splitting style for the input text is probably bad. However, fixing the problem is not so easy (again due to Ada). I've filed a follow-up bug, PR symtab/32733, for this.

Meanwhile, this patch changes the code to be closer to the originally-submitted patch. This works because the comparison is now done between the full name and the "lookup_name_without_params" object, which is a less adulterated variant of the original input.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31010
Tested-By: Simon Marchi <simon.marchi@efficios.com>
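
The idea behind "check name matches before expanding a CU" can be illustrated with a toy model. This is only a sketch and not gdb's code: the `IndexEntry` type, the helper names, the CU file names, and every entry name other than `wxObjectDataPtr<wxBitmapBundleImpl>`, `wxObjectDataPtr` and `wxWindow` are hypothetical, and the matching rule (exact string equality on the full name) is a deliberate simplification of what gdb's symbol name matchers do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    name: str  # full name recorded for the symbol (toy model)
    cu: str    # hypothetical identifier of the defining compilation unit


def strip_template_args(name):
    """Toy rule: keep only the part of the name before the first '<'."""
    return name.split("<", 1)[0]


def candidates(index, lookup_name):
    """Entries whose name, ignoring template arguments, equals lookup_name."""
    return [e for e in index if strip_template_args(e.name) == lookup_name]


def cus_expanded_without_check(index, lookup_name):
    """Expand the CU of every candidate, whether or not its full name matches."""
    return {e.cu for e in candidates(index, lookup_name)}


def cus_expanded_with_check(index, lookup_name):
    """Expand a CU only if the candidate's full name matches (simplified to ==)."""
    return {e.cu for e in candidates(index, lookup_name) if e.name == lookup_name}


# Hypothetical sample index; only the symbol names come from this report.
SAMPLE_INDEX = [
    IndexEntry("wxObjectDataPtr<wxBitmapBundleImpl>", "a.cc"),
    IndexEntry("wxObjectDataPtr<SomethingElse>", "b.cc"),
    IndexEntry("wxWindow", "c.cc"),
]

if __name__ == "__main__":
    print(cus_expanded_without_check(SAMPLE_INDEX, "wxObjectDataPtr"))
    print(cus_expanded_with_check(SAMPLE_INDEX, "wxObjectDataPtr"))
```

In this model, a lookup of `wxObjectDataPtr` without the check expands both template-instantiating CUs even though the lookup finds no match in either, while the checked variant expands none of them.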
Created attachment 14919 [details]
gdb.log

I'm facing a performance problem with the `-var-list-children` GDB/MI command in gdb 13 built from commit d05c047a71374f533ed9261c6f44707285f1b302, and also in earlier gdb 13 versions. It seems to be related to a slow lookup_symbol_in_objfile. Most lookups are fast, but one takes around a minute to finish for a 449 MB shared object:

2023-06-06 10:54:51,286 <&" [symbol-lookup] lookup_symbol_in_objfile: lookup_symbol_in_objfile (_pcbnew.kiface, GLOBAL_BLOCK, wxObjectDataPtr, VAR_DOMAIN)\n"
2023-06-06 10:55:50,418 <&" [symbol-lookup] lookup_symbol_in_objfile: lookup_symbol_in_objfile (...) = NULL\n"

This is despite an earlier lookup for another symbol in the same session being fast:

2023-06-06 10:54:36,910 <&" [symbol-lookup] lookup_symbol_in_objfile: lookup_symbol_in_objfile (_pcbnew.kiface, GLOBAL_BLOCK, wxWindow, VAR_DOMAIN)\n"
2023-06-06 10:54:36,910 <&" [symbol-lookup] lookup_symbol_in_objfile: lookup_symbol_in_objfile (...) = NULL\n"

What could cause such a slowdown? Is there any way to work around it, e.g. by trading memory for speed?
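
Slow lookups like the one above can be found in a large log by pairing each lookup's start line with its result line and comparing timestamps. The following is a minimal sketch, assuming every line carries the same `YYYY-MM-DD HH:MM:SS,mmm` prefix as the excerpts in this report (the prefix appears to come from the front end that recorded the log, not from gdb itself) and that lookups are not nested; the function name `slow_lookups` and the threshold parameter are hypothetical.

```python
import re
from datetime import datetime

# A result line has the literal "(...)" followed by " = <value>".
END_RE = re.compile(r"lookup_symbol_in_objfile \(\.\.\.\) = ")
# A start line carries the actual arguments in parentheses.
START_RE = re.compile(r"lookup_symbol_in_objfile \((.+?)\)")
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TIMESTAMP_LENGTH = 23  # len("2023-06-06 10:54:51,286")


def slow_lookups(lines, threshold_seconds=1.0):
    """Return (arguments, elapsed_seconds) for lookups at or above the threshold."""
    pending = None  # (arguments, start time) of the lookup in progress
    slow = []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # line without the assumed timestamp prefix
        if END_RE.search(line):
            if pending is not None:
                arguments, started = pending
                elapsed = (stamp - started).total_seconds()
                if elapsed >= threshold_seconds:
                    slow.append((arguments, elapsed))
                pending = None
            continue
        match = START_RE.search(line)
        if match:
            pending = (match.group(1), stamp)
    return slow


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for arguments, elapsed in slow_lookups(log):
            print(f"{elapsed:8.3f}s  {arguments}")
```

Run against the four lines quoted above, this reports only the `wxObjectDataPtr` lookup, since the `wxWindow` lookup starts and finishes at the same timestamp.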