Bug 25700 - Forward-imported CU causes duplicate partial symtab
Summary: Forward-imported CU causes duplicate partial symtab
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: symtab (show other bugs)
Version: HEAD
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-03-19 13:36 UTC by Tom de Vries
Modified: 2020-04-22 06:11 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Tom de Vries 2020-03-19 13:36:52 UTC
Consider the executable generated for test-case gdb.dwarf2/imported-unit.exp.

When loading the executable using various tracing:
...
$ gdb \
  outputs/gdb.dwarf2/imported-unit/imported-unit \
  -batch \
  -iex "set verbose on" \
  -iex "set debug symtab-create 1"                                 
Reading symbols from outputs/gdb.dwarf2/imported-unit/imported-unit...
Reading minimal symbols of objfile imported-unit ...
Installing 30 minimal symbols of objfile imported-unit.
Done reading minimal symbols.
Creating one or more psymtabs for objfile imported-unit ...
Created psymtab 0x2122960 for module ../sysdeps/x86_64/start.S.
Created psymtab 0x21569f0 for module init.c.
Created psymtab 0x213d620 for module ../sysdeps/x86_64/crti.S.
Created psymtab 0x213f380 for module <artificial>@0xc7.
Created psymtab 0x20e7b00 for module imported_unit.c.
Created psymtab 0x215da20 for module imported_unit.c.
Created psymtab 0x2133630 for module elf-init.c.
Created psymtab 0x215b910 for module ../sysdeps/x86_64/crtn.S.
Reading in symbols for <artificial>@0xc7...
Created compunit symtab 0x20d6f70 for <artificial>.
Created compunit symtab 0x2177490 for imported_unit.c.
...
we notice that there are two psymtabs generated for imported_unit.c.

This seems to be related to the fact that CU <artificial> imports CU imported_unit.c, and that the import is a forward reference.

After reversing the order of two CUs, such that the import is a backward reference, we have instead:
...
Reading symbols from outputs/gdb.dwarf2/imported-unit/imported-unit...
Reading minimal symbols of objfile imported-unit ...
Installing 30 minimal symbols of objfile imported-unit.
Done reading minimal symbols.
Creating one or more psymtabs for objfile imported-unit ...
Created psymtab 0x217d410 for module ../sysdeps/x86_64/start.S.
Created psymtab 0x21d84a0 for module init.c.
Created psymtab 0x21419e0 for module ../sysdeps/x86_64/crti.S.
Created psymtab 0x21dc190 for module imported_unit.c.
Created psymtab 0x21c6fa0 for module <artificial>@0x124.
Created psymtab 0x2195e30 for module elf-init.c.
Created psymtab 0x21c2c00 for module ../sysdeps/x86_64/crtn.S.
Reading in symbols for <artificial>@0x124...
Created compunit symtab 0x214f0f0 for <artificial>.
Created compunit symtab 0x21dd6d0 for imported_unit.c.
...
Comment 1 Tom de Vries 2020-03-19 13:58:52 UTC
The test-case was added for PR25065 due to a failure with gcc-9 -flto and gdb.cp/subtypes.exp.

I've just confirmed that I see forward-imported CUs there. Same with gcc-10.
Comment 2 Tom de Vries 2020-03-19 14:06:08 UTC
And to characterize the problem more precisely: after adding two more CUs that also do that forward import, we have:
...
Created psymtab 0x1a996a0 for module ../sysdeps/x86_64/start.S.
Created psymtab 0x1ac8a70 for module init.c.
Created psymtab 0x1a9aec0 for module ../sysdeps/x86_64/crti.S.
Created psymtab 0x1a5bf00 for module <artificial>@0xc7.
Created psymtab 0x1a677a0 for module imported_unit.c.
Created psymtab 0x1abb220 for module <artificial2>.
Created psymtab 0x1abb2a0 for module <artificial3>.
Created psymtab 0x1ac0080 for module imported_unit.c.
Created psymtab 0x1abb120 for module elf-init.c.
Created psymtab 0x1abb1a0 for module ../sysdeps/x86_64/crtn.S.
...

So, the problem does not grow with the number of imports, we just keep one duplicate.
Comment 3 Tom de Vries 2020-03-24 13:01:48 UTC
Tentative patch:
...
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 0e879e08a0..97f7c8f003 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -7808,7 +7808,12 @@ dwarf2_build_psymtabs_hard (struct dwarf2_per_objfile *dwarf2_per_objfile)
                           addrmap_create_mutable (&temp_obstack));
 
   for (dwarf2_per_cu_data *per_cu : dwarf2_per_objfile->all_comp_units)
-    process_psymtab_comp_unit (per_cu, false, language_minimal);
+    {
+      if (per_cu->v.psymtab != NULL)
+       /* In case a forward DW_TAG_imported_unit has read the CU already.  */
+       continue;
+      process_psymtab_comp_unit (per_cu, false, language_minimal);
+    }
 
   /* This has to wait until we read the CUs, we need the list of DWOs.  */
   process_skeletonless_type_units (dwarf2_per_objfile);
...
Comment 4 Tom de Vries 2020-03-24 13:05:09 UTC
Using a 2.8GB libxul.so debug file from opensuse tumbleweed, build with lto, we have:
...
$ time.sh gdb -batch -iex "maint set dwarf max-cache-age 1000" -iex "set language c" usr/lib/debug/usr/lib64/firefox/libxul.so-74.0-1.1.x86_64.debug 
maxmem: 10417196
real: 116.62
user: 114.85
system: 2.37
...

and with the tentative patch:
...
$ time.sh gdb -batch -iex "maint set dwarf max-cache-age 1000" -iex "set language c" usr/lib/debug/usr/lib64/firefox/libxul.so-74.0-1.1.x86_64.debug 
maxmem: 10436372
real: 77.42
user: 75.54
system: 2.46
...
Comment 5 Tom de Vries 2020-03-24 13:36:15 UTC
(In reply to Tom de Vries from comment #3)
> Tentative patch:
> ...
> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> index 0e879e08a0..97f7c8f003 100644
> --- a/gdb/dwarf2/read.c
> +++ b/gdb/dwarf2/read.c
> @@ -7808,7 +7808,12 @@ dwarf2_build_psymtabs_hard (struct dwarf2_per_objfile
> *dwarf2_per_objfile)
>                            addrmap_create_mutable (&temp_obstack));
>  
>    for (dwarf2_per_cu_data *per_cu : dwarf2_per_objfile->all_comp_units)
> -    process_psymtab_comp_unit (per_cu, false, language_minimal);
> +    {
> +      if (per_cu->v.psymtab != NULL)
> +       /* In case a forward DW_TAG_imported_unit has read the CU already. 
> */
> +       continue;
> +      process_psymtab_comp_unit (per_cu, false, language_minimal);
> +    }
>  
>    /* This has to wait until we read the CUs, we need the list of DWOs.  */
>    process_skeletonless_type_units (dwarf2_per_objfile);
> ...

I've tested the patch using target board unix/-flto/-O0/-flto-partition=none/-ffat-lto-objects in combination with gcc-9, and I see a regression:
...
FAIL: gdb.ada/char_enum.exp: ptype pck.Global_Enum_Type
FAIL: gdb.ada/char_enum.exp: print pck.Global_Enum_Type'('x')
FAIL: gdb.ada/char_enum.exp: print pck.Global_Enum_Type'('Y')
FAIL: gdb.ada/char_enum.exp: print pck.Global_Enum_Type'('+')
...

Minimal example:
...
$ gdb outputs/gdb.ada/char_enum/foo -batch  -ex "b foo.adb:23" -ex r  -ex "ptype pck.Global_Enum_Type" 
Breakpoint 1 at 0x402c62: file /data/gdb_versions/devel/binutils-gdb.git/gdb/testsuite/gdb.ada/char_enum/foo.adb, line 23.

Breakpoint 1, foo () at /data/gdb_versions/devel/binutils-gdb.git/gdb/testsuite/gdb.ada/char_enum/foo.adb:23
23         Do_Nothing (Char'Address);  -- STOP
No definition of "pck.global_enum_type" in current context.
...
Comment 6 Tom de Vries 2020-03-24 14:17:15 UTC
(In reply to Tom de Vries from comment #5)
> (In reply to Tom de Vries from comment #3)
> > Tentative patch:
> > ...
> > diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> > index 0e879e08a0..97f7c8f003 100644
> > --- a/gdb/dwarf2/read.c
> > +++ b/gdb/dwarf2/read.c
> > @@ -7808,7 +7808,12 @@ dwarf2_build_psymtabs_hard (struct dwarf2_per_objfile
> > *dwarf2_per_objfile)
> >                            addrmap_create_mutable (&temp_obstack));
> >  
> >    for (dwarf2_per_cu_data *per_cu : dwarf2_per_objfile->all_comp_units)
> > -    process_psymtab_comp_unit (per_cu, false, language_minimal);
> > +    {
> > +      if (per_cu->v.psymtab != NULL)
> > +       /* In case a forward DW_TAG_imported_unit has read the CU already. 
> > */
> > +       continue;
> > +      process_psymtab_comp_unit (per_cu, false, language_minimal);
> > +    }
> >  
> >    /* This has to wait until we read the CUs, we need the list of DWOs.  */
> >    process_skeletonless_type_units (dwarf2_per_objfile);
> > ...
> 
> I've tested the patch using target board
> unix/-flto/-O0/-flto-partition=none/-ffat-lto-objects in combination with
> gcc-9, and I see a regression:
> ...
> FAIL: gdb.ada/char_enum.exp: ptype pck.Global_Enum_Type
> FAIL: gdb.ada/char_enum.exp: print pck.Global_Enum_Type'('x')
> FAIL: gdb.ada/char_enum.exp: print pck.Global_Enum_Type'('Y')
> FAIL: gdb.ada/char_enum.exp: print pck.Global_Enum_Type'('+')
> ...
> 
> Minimal example:
> ...
> $ gdb outputs/gdb.ada/char_enum/foo -batch  -ex "b foo.adb:23" -ex r  -ex
> "ptype pck.Global_Enum_Type" 
> Breakpoint 1 at 0x402c62: file
> /data/gdb_versions/devel/binutils-gdb.git/gdb/testsuite/gdb.ada/char_enum/
> foo.adb, line 23.
> 
> Breakpoint 1, foo () at
> /data/gdb_versions/devel/binutils-gdb.git/gdb/testsuite/gdb.ada/char_enum/
> foo.adb:23
> 23         Do_Nothing (Char'Address);  -- STOP
> No definition of "pck.global_enum_type" in current context.
> ...

I think it's an independent issue, filed as PR25718 - "[ada] symbol in imported unit not found"
Comment 7 Tom de Vries 2020-03-27 12:50:55 UTC
(In reply to Tom de Vries from comment #5)
> (In reply to Tom de Vries from comment #3)
> > Tentative patch:
> > ...
> > diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> > index 0e879e08a0..97f7c8f003 100644
> > --- a/gdb/dwarf2/read.c
> > +++ b/gdb/dwarf2/read.c
> > @@ -7808,7 +7808,12 @@ dwarf2_build_psymtabs_hard (struct dwarf2_per_objfile
> > *dwarf2_per_objfile)
> >                            addrmap_create_mutable (&temp_obstack));
> >  
> >    for (dwarf2_per_cu_data *per_cu : dwarf2_per_objfile->all_comp_units)
> > -    process_psymtab_comp_unit (per_cu, false, language_minimal);
> > +    {
> > +      if (per_cu->v.psymtab != NULL)
> > +       /* In case a forward DW_TAG_imported_unit has read the CU already. 
> > */
> > +       continue;
> > +      process_psymtab_comp_unit (per_cu, false, language_minimal);
> > +    }
> >  
> >    /* This has to wait until we read the CUs, we need the list of DWOs.  */
> >    process_skeletonless_type_units (dwarf2_per_objfile);
> > ...
> 
> I've tested the patch using target board
> unix/-flto/-O0/-flto-partition=none/-ffat-lto-objects in combination with
> gcc-9, and I see a regression:
> ...
> FAIL: gdb.ada/char_enum.exp: ptype pck.Global_Enum_Type
> FAIL: gdb.ada/char_enum.exp: print pck.Global_Enum_Type'('x')
> FAIL: gdb.ada/char_enum.exp: print pck.Global_Enum_Type'('Y')
> FAIL: gdb.ada/char_enum.exp: print pck.Global_Enum_Type'('+')
> ...
> 
> Minimal example:
> ...
> $ gdb outputs/gdb.ada/char_enum/foo -batch  -ex "b foo.adb:23" -ex r  -ex
> "ptype pck.Global_Enum_Type" 
> Breakpoint 1 at 0x402c62: file
> /data/gdb_versions/devel/binutils-gdb.git/gdb/testsuite/gdb.ada/char_enum/
> foo.adb, line 23.
> 
> Breakpoint 1, foo () at
> /data/gdb_versions/devel/binutils-gdb.git/gdb/testsuite/gdb.ada/char_enum/
> foo.adb:23
> 23         Do_Nothing (Char'Address);  -- STOP
> No definition of "pck.global_enum_type" in current context.
> ...

Tentative patch:
...
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index bdc893d7ad..a31c03fa51 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -5840,7 +5840,7 @@ struct dwarf2_include_psymtab : public partial_symtab
       return;
     /* It's an include file, no symbols to read for it.
        Everything is in the parent symtab.  */
-    read_dependencies (objfile);
+    dependencies[0]->read_symtab (objfile);
     m_readin = true;
   }
 
...
Comment 8 Sourceware Commits 2020-03-31 10:17:32 UTC
The master branch has been updated by Tom de Vries <vries@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=16b0db75af6b4b4d434aa84c74d58b7290e04143

commit 16b0db75af6b4b4d434aa84c74d58b7290e04143
Author: Tom de Vries <tdevries@suse.de>
Date:   Tue Mar 31 12:17:27 2020 +0200

    [gdb/testsuite] Fix c-linkage-name.exp with -flto
    
    When running test-case gdb.base/c-linkage-name.exp with target board
    unix/-flto/-O0/-flto-partition=none/-ffat-lto-objects, I run into:
    ...
    PASS: gdb.base/c-linkage-name.exp: maint info psymtab: c-linkage-name-2.c: no
    FAIL: gdb.base/c-linkage-name.exp: print symada__cS before partial symtab \
      expansion
    ...
    
    The test-case tries to print a symbol before and after symtab expansion.
    
    And it tries to ensure (since commit 13c3a74afb) that the symtab containing
    the symbol is not yet expanded when doing the 'before' print, by placing the
    symbol in a different CU (c-linkage-name-2.c) from the one containing main
    (c-linkage-name.c), such that when we load the exec and expand the symtab
    containing main, the symtab containing the symbol isn't.
    
    The generated debug info for the test-case when using mentioned target board
    however is structured like this:
    ...
     <0><d2>: Abbrev Number: 1 (DW_TAG_compile_unit)
        <d8>   DW_AT_name        : <artificial>
     <1><f4>: Abbrev Number: 2 (DW_TAG_imported_unit)
        <f5>   DW_AT_import      : <0x16b>  [Abbrev Number: 1]
     <1><f9>: Abbrev Number: 2 (DW_TAG_imported_unit)
        <fa>   DW_AT_import      : <0x19c>  [Abbrev Number: 1]
     <1><fe>: Abbrev Number: 3 (DW_TAG_subprogram)
        <ff>   DW_AT_abstract_origin: <0x17d>
     <1><115>: Abbrev Number: 4 (DW_TAG_variable)
        <116>   DW_AT_abstract_origin: <0x1ce>
     <0><16b>: Abbrev Number: 1 (DW_TAG_compile_unit)
        <171>   DW_AT_name        : c-linkage-name.c
     <1><17d>: Abbrev Number: 2 (DW_TAG_subprogram)
        <17e>   DW_AT_name        : main
     <0><19c>: Abbrev Number: 1 (DW_TAG_compile_unit)
        <1a2>   DW_AT_name        : c-linkage-name-2.c
     <1><1ce>: Abbrev Number: 5 (DW_TAG_variable)
        <1cf>   DW_AT_name        : mundane
        <1d6>   DW_AT_linkage_name: symada__cS
    ...
    
    So, the CU named <artificial> contains both the concrete main and the concrete
    symbol, which explains the FAIL.
    
    The first test should fail, but passes for two reasons.
    
    First of all, due to PR symtab/25700, we have two regular partial symtabs
    c-linkage-name-2.c instead of one, and one of them is expanded, the other one
    not:
    ...
      { psymtab c-linkage-name-2.c ((struct partial_symtab *) 0x38d6f60)
        readin yes
      { psymtab c-linkage-name-2.c ((struct partial_symtab *) 0x38d6fe0)
        readin no
    ...
    
    And then there's the include symtab, which is also not expanded:
    ...
      { psymtab c-linkage-name-2.c ((struct partial_symtab *) 0x38143e0)
        readin no
    ...
    
    Fix the FAIL by explicitly setting the language before load, changing the
    language setting from auto/c to manual/c, such that the symtab containing main
    is no longer expanded.
    
    And make the symtab expansion testing more robust by using the output of
    "maint info symtabs" instead of "maint info psymtabs".
    
    Tested on x86_64-linux, using native and target boards cc-with-gdb-index.exp,
    cc-with-debug-names.exp, readnow.exp and
    unix/-flto/-O0/-flto-partition=none/-ffat-lto-objects.
    
    gdb/testsuite/ChangeLog:
    
    2020-03-31  Tom de Vries  <tdevries@suse.de>
    
            * gdb.base/c-linkage-name.exp: Fix test-case comment.  Set language to
            c.  Use "maint info symtabs" to check symtab expansion.
Comment 9 Tom de Vries 2020-04-08 10:20:06 UTC
patch submitted: https://sourceware.org/pipermail/gdb-patches/2020-April/167450.html
Comment 10 Sourceware Commits 2020-04-22 06:09:50 UTC
The master branch has been updated by Tom de Vries <vries@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=3d5afab3393064e563305bc27264fde5a7598635

commit 3d5afab3393064e563305bc27264fde5a7598635
Author: Tom de Vries <tdevries@suse.de>
Date:   Wed Apr 22 08:09:45 2020 +0200

    [gdb/symtab] Don't create duplicate psymtab for forward-imported CU
    
    Consider the executable generated for test-case gdb.dwarf2/imported-unit.exp.
    
    When loading the executable using various tracing:
    ...
    $ gdb \
      outputs/gdb.dwarf2/imported-unit/imported-unit \
      -batch \
      -iex "set verbose on" \
      -iex "set debug symtab-create 1"
      ...
    Created psymtab 0x213f380 for module <artificial>@0xc7.
    Created psymtab 0x20e7b00 for module imported_unit.c.
    Created psymtab 0x215da20 for module imported_unit.c.
    Created psymtab 0x2133630 for module elf-init.c.
    Created psymtab 0x215b910 for module ../sysdeps/x86_64/crtn.S.
    ...
    we notice that there are two psymtabs generated for imported_unit.c.
    
    This is due to the following: in dwarf2_build_psymtabs_hard we loop over CUs
    and generate partial symtabs for those, and if we encounter an import of
    another CU, we also generate a partial symtab for that one, unless already
    created.
    
    This works well with backward import references:
    - the imported CU is read
    - then the importing CU is read
    - the import is encountered, but the imported CU is already read, so
      we're done.
    
    But with forward import references, we have instead:
    - the importing CU is read
    - the import is encountered, and the imported CU is read
    - the imported CU is read once more
    
    Fix this by skipping already created psymtabs in the loop in
    dwarf2_build_psymtabs_hard.
    
    Tested on x86_64-linux, with native and target board
    unix/-flto/-O0/-flto-partition=none/-ffat-lto-objects.
    
    This causes this regression with the target board:
    ...
    FAIL: gdb.ada/dgopt.exp: list x.adb:16, 16
    ...
    which I consider a seperate PR, filed as PR25801 - "Filename of shared psymtab
    is ignored".
    
    gdb/ChangeLog:
    
    2020-04-22  Tom de Vries  <tdevries@suse.de>
    
            PR symtab/25700
            * dwarf2/read.c (dwarf2_build_psymtabs_hard): Don't create psymtab for
            CU if already created.
    
    gdb/testsuite/ChangeLog:
    
    2020-04-22  Tom de Vries  <tdevries@suse.de>
    
            PR symtab/25700
            * gdb.dwarf2/imported-unit.exp: Verify that there's only one partial
            symtab for imported_unit.c.
Comment 11 Tom de Vries 2020-04-22 06:11:17 UTC
Patch with test-case committed, marking resolved-fixed.