Bug 30661 - [gdb/symtab] main symbol language lookup causes symtab expansion
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: symtab
Version: HEAD
Importance: P2 normal
Target Milestone: 14.1
Assignee: Not yet assigned to anyone
 
Reported: 2023-07-21 07:34 UTC by Tom de Vries
Modified: 2023-10-30 16:14 UTC

Attachments
Tentative patch (1.80 KB, patch)
2023-07-21 14:12 UTC, Tom de Vries

Description Tom de Vries 2023-07-21 07:34:24 UTC
Consider a hello world:
...
$ gcc -g hello.c 
...

With gdb-12-branch:
...
$ gdb -q -batch a.out -ex "maint info symtabs"
$
...

With gdb-13-branch:
...
$ gdb -q -batch a.out -ex "maint info symtabs"
{ objfile /data/vries/gdb/a.out ((struct objfile *) 0x146ca90)
  { ((struct compunit_symtab *) 0x1460fc0)
    debugformat DWARF 4
    producer GNU C11 7.5.0 -mtune=generic -march=x86-64 -g
    name hello.c
  ...
...

With gdb-12-branch, no symtab is expanded after loading the file. That is a feature, added to work around slow expansion of LTO debug info, so that "file a.out" does not take too long to finish.

The feature was added in commit d3214198119 ("[gdb] Use partial symbol table to find language for main").

This seems to have regressed.
Comment 1 Tom de Vries 2023-07-21 08:16:01 UTC
(In reply to Tom de Vries from comment #0)
> This seems to have regressed.

Between gdb-13-branchpoint and gdb-13-branch, bisecting points to:
...
commit 7f4307436fdab42da2b385040b90294f301ea55b
Author: Tom Tromey <tom@tromey.com>
Date:   Mon Feb 13 17:44:54 2023 -0700

    Fix "start" for D, Rust, etc
...
Comment 2 Tom de Vries 2023-07-21 09:02:16 UTC
Using this:
...
vries@xerxes:~/gdb/src> git diff
diff --git a/gdb/dwarf2/cooked-index.c b/gdb/dwarf2/cooked-index.c
index 25635d9b72e..0b2d35f4b86 100644
--- a/gdb/dwarf2/cooked-index.c
+++ b/gdb/dwarf2/cooked-index.c
@@ -62,7 +62,7 @@ bool
 language_requires_canonicalization (enum language lang)
 {
   return (lang == language_ada
-	  || lang == language_c
+	  || lang == language_d
 	  || lang == language_cplus);
 }
 
@@ -242,6 +242,10 @@ cooked_index_shard::add (sect_offset die_offset, enum dwarf_tag tag,
      implicit "main" discovery.  */
   if ((flags & IS_MAIN) != 0)
     m_main = result;
+  else if (!language_requires_canonicalization (per_cu->lang ())
+	   && m_main == nullptr
+	   && strcmp (name, "main") == 0)
+    m_main = result;
 
   return result;
 }
...
we get the old behaviour back, without regressing on gdb.dlang/dlang-start-2.exp (can't run gdb.dlang/dlang-start.exp).
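
For readers without a gdb tree at hand, the effect of the diff above can be modelled in isolation. This is only a rough sketch: toy_shard, toy_entry, toy_language and the TOY_* flags are hypothetical stand-ins, not gdb's real declarations, and the real cooked_index_shard::add takes more arguments. It shows the intended behaviour: an entry flagged IS_MAIN always wins, and an unflagged "main" is recorded only as a fallback for languages that do not require canonicalization.

```cpp
#include <cstring>
#include <deque>
#include <string>

// Hypothetical stand-ins for gdb's language enum and index flags.
enum class toy_language { c, cplus, ada, d, rust };
enum toy_flags : unsigned { TOY_NONE = 0, TOY_IS_MAIN = 1u << 0 };

struct toy_entry
{
  std::string name;
  toy_language lang;
  unsigned flags;
};

// Mirrors the predicate as changed by the diff: Ada, D and C++ require
// canonicalization, C does not.
static bool
toy_requires_canonicalization (toy_language lang)
{
  return (lang == toy_language::ada
          || lang == toy_language::d
          || lang == toy_language::cplus);
}

class toy_shard
{
public:
  const toy_entry *add (const char *name, toy_language lang, unsigned flags)
  {
    m_entries.push_back ({name, lang, flags});
    const toy_entry *result = &m_entries.back ();

    // An explicitly flagged main entry always takes precedence.
    if ((flags & TOY_IS_MAIN) != 0)
      m_main = result;
    // Fallback from the diff: first plain "main" in a language that
    // does not require canonicalization.
    else if (!toy_requires_canonicalization (lang)
             && m_main == nullptr
             && std::strcmp (name, "main") == 0)
      m_main = result;

    return result;
  }

  const toy_entry *get_main () const { return m_main; }

private:
  // A deque keeps pointers to existing elements valid across push_back.
  std::deque<toy_entry> m_entries;
  const toy_entry *m_main = nullptr;
};
```

In this model a C "main" added without any flag becomes the recorded main entry, a D program still ends up with the flagged "D main", and an unflagged C++ "main" is not recorded at all.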
Comment 3 Tom de Vries 2023-07-21 09:53:56 UTC
(In reply to Tom de Vries from comment #2)
> Using this:
> ...
> diff --git a/gdb/dwarf2/cooked-index.c b/gdb/dwarf2/cooked-index.c
> index 25635d9b72e..0b2d35f4b86 100644
> --- a/gdb/dwarf2/cooked-index.c
> +++ b/gdb/dwarf2/cooked-index.c
> @@ -62,7 +62,7 @@ bool
>  language_requires_canonicalization (enum language lang)
>  {
>    return (lang == language_ada
> -	  || lang == language_c
> +	  || lang == language_d
>  	  || lang == language_cplus);
>  }
>  
> @@ -242,6 +242,10 @@ cooked_index_shard::add (sect_offset die_offset, enum
> dwarf_tag tag,
>       implicit "main" discovery.  */
>    if ((flags & IS_MAIN) != 0)
>      m_main = result;
> +  else if (!language_requires_canonicalization (per_cu->lang ())
> +	   && m_main == nullptr
> +	   && strcmp (name, "main") == 0)
> +    m_main = result;
>  
>    return result;
>  }
> ...
> we get the old behaviour back, without regressing on
> gdb.dlang/dlang-start-2.exp (can't run gdb.dlang/dlang-start.exp).

Hmm, on the other hand I also want this to work for c++, so I guess a different fix is needed.

Anyway, FWIW, tested on top of trunk on x86_64-linux, no regressions.

Also, I found the source of the requirement to mark C as a language that requires canonicalization: c_canonicalize_name.
Comment 4 Tom de Vries 2023-07-21 14:12:24 UTC
Created attachment 14993
Tentative patch

Currently testing.

Todo: add C and C++ test-cases.
Comment 5 Tom de Vries 2023-07-31 14:16:51 UTC
Related reading: PR30174.
Comment 7 Sourceware Commits 2023-08-05 15:57:29 UTC
The master branch has been updated by Tom de Vries <vries@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d06730bc0205f7c35bfccf057ef0ef83a12206d6

commit d06730bc0205f7c35bfccf057ef0ef83a12206d6
Author: Tom de Vries <tdevries@suse.de>
Date:   Sat Aug 5 17:57:13 2023 +0200

    [gdb/symtab] Find main language without symtab expansion
    
    When loading an executable using "file a.out", the language is set according
    to a.out, which can involve looking up the language of symbol "main", which
    will cause the symtab expansion for the containing CU.
    
    Expansion of lto debug info can be slow, so in commit d3214198119 ("[gdb] Use
    partial symbol table to find language for main") a feature was added to avoid
    the symtab expansion.
    
    This feature stopped working after commit 7f4307436fd ("Fix "start" for D,
    Rust, etc").
    
    [ The commit addresses problems related to command start, which requires finding
    the main function:
    - for language D, "main" was found instead of "D main", and
    - for Rust, the correct function was found, but attributed the wrong name
      (not fully qualified). ]
    
    Reimplement the feature by adding
    cooked_index_functions::lookup_global_symbol_language.
    
    Tested on x86_64-linux.
    
    PR symtab/30661
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30661
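
The idea in the commit message, answering "what language is main?" from the index instead of from an expanded symtab, can be illustrated with a small model. This is a sketch under stated assumptions, not the committed code: toy_objfile, toy_index_entry and both toy_* functions are hypothetical names, and the real cooked_index_functions::lookup_global_symbol_language has a different signature and additional checks. The expansions counter stands in for the expensive symtab expansion.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for gdb's language enum.
enum class toy_language { unknown, c, cplus, d, rust };

// One index entry: symbol name plus the language of its containing CU.
struct toy_index_entry
{
  std::string name;
  toy_language cu_lang;
  int cu_id;
};

struct toy_objfile
{
  std::vector<toy_index_entry> index;
  int expansions = 0;   // Counts simulated symtab expansions.
};

// Slow path: find the symbol, then "expand" its CU to learn the language.
toy_language
toy_language_via_expansion (toy_objfile &objfile, const std::string &name)
{
  for (const toy_index_entry &entry : objfile.index)
    if (entry.name == name)
      {
        ++objfile.expansions;   // Stands in for full symtab expansion.
        return entry.cu_lang;
      }
  return toy_language::unknown;
}

// Fast path: the index entry already records the CU language, so no
// expansion is needed.  The objfile is const, so the counter cannot move.
toy_language
toy_lookup_symbol_language (const toy_objfile &objfile,
                            const std::string &name)
{
  for (const toy_index_entry &entry : objfile.index)
    if (entry.name == name)
      return entry.cu_lang;
  return toy_language::unknown;
}
```

Both functions return the same language for a symbol that is present; only the slow path increments the expansion counter, which is the cost the commit avoids for "file a.out".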
Comment 8 Tom de Vries 2023-08-05 16:01:05 UTC
Fixed.