[PATCH] gdb/dwarf: fix reading subprogram with DW_AT_specification (PR gdb/26693)

Tom de Vries tdevries@suse.de
Tue Oct 20 16:29:06 GMT 2020


On 10/20/20 5:13 PM, Simon Marchi via Gdb-patches wrote:
> Ping.  This is the last blocker for the release.
> 

Hi,

a few comments on the test-case below.

> Simon
> 
> On 2020-10-13 10:18 a.m., Simon Marchi wrote:
>> Fix a regression introduced by commit 7188ed02d2a7 ("Replace
>> dwarf2_per_cu_data::cu backlink with per-objfile map").
>>
>> This patch targets both master and gdb-10-branch, since this is a
>> regression from GDB 9.
>>
>> Analysis
>> --------
>>
>> The DWARF generated by the included test case looks like:
>>
>>     0x0000000b: DW_TAG_compile_unit
>>                   DW_AT_language [DW_FORM_sdata]    (4)
>>
>>     0x0000000d:   DW_TAG_base_type
>>                     DW_AT_name [DW_FORM_string]     ("int")
>>                     DW_AT_byte_size [DW_FORM_data1] (0x04)
>>                     DW_AT_encoding [DW_FORM_sdata]  (5)
>>
>>     0x00000014:   DW_TAG_subprogram
>>                     DW_AT_name [DW_FORM_string]     ("apply")
>>
>>     0x0000001b:   DW_TAG_subprogram
>>                     DW_AT_specification [DW_FORM_ref4]      (0x00000014 "apply")
>>                     DW_AT_low_pc [DW_FORM_addr]     (0x0000000000001234)
>>                     DW_AT_high_pc [DW_FORM_data8]   (0x0000000000000020)
>>
>>     0x00000030:     DW_TAG_template_type_parameter
>>                       DW_AT_name [DW_FORM_string]   ("T")
>>                       DW_AT_type [DW_FORM_ref4]     (0x0000000d "int")
>>
>>     0x00000037:     NULL
>>
>>     0x00000038:   NULL
>>
>> Simply loading the file in GDB makes it crash:
>>
>>     $ ./gdb -nx --data-directory=data-directory testsuite/outputs/gdb.dwarf2/pr26693/pr26693
>>     [1]    15188 abort (core dumped)  ./gdb -nx --data-directory=data-directory
>>
>> The crash happens here, where htab (a dwarf2_cu::die_hash field) is
>> unexpectedly NULL while generating partial symbols:
>>
>>     #0  0x000055555fa28188 in htab_find_with_hash (htab=0x0, element=0x7fffffffbfa0, hash=27) at /home/simark/src/binutils-gdb/libiberty/hashtab.c:591
>>     #1  0x000055555cb4eb2e in follow_die_offset (sect_off=(unknown: 27), offset_in_dwz=0, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22951
>>     #2  0x000055555cb4edfb in follow_die_ref (src_die=0x0, attr=0x7fffffffc130, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22968
>>     #3  0x000055555caa48c5 in partial_die_full_name (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8441
>>     #4  0x000055555caa4d79 in add_partial_symbol (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8469
>>     #5  0x000055555caa7d8c in add_partial_subprogram (pdi=0x621000157e70, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8737
>>     #6  0x000055555caa265c in scan_partial_symbols (first_die=0x621000157e00, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8230
>>     #7  0x000055555ca98e3f in process_psymtab_comp_unit_reader (reader=0x7fffffffc6b0, info_ptr=0x60600009650d "\003int", comp_unit_die=0x621000157d10, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7614
>>     #8  0x000055555ca9aa2c in process_psymtab_comp_unit (this_cu=0x621000155510, per_objfile=0x613000009f80, want_partial_unit=false, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7712
>>     #9  0x000055555caa051a in dwarf2_build_psymtabs_hard (per_objfile=0x613000009f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8073
>>
>> The special thing about this DWARF is that the subprogram at 0x1b is a
>> template specialization described with DW_AT_specification, and has no
>> DW_AT_name in itself.  To compute the name of this subprogram,
>> partial_die_full_name needs to load the full DIE for this partial DIE.
>> The name is generated from the templated function name and the actual
>> tempalate parameter values of the specialization.
>>
>> To load the full DIE, partial_die_full_name creates a dummy DWARF
>> attribute of form DW_FORM_ref_addr that points to our subprogram's DIE,
>> and calls follow_die_ref on it.  This eventually causes
>> load_full_comp_unit to be called for the exact same CU we are currently
>> making partial symbols for:
>>
>>     #0  load_full_comp_unit (this_cu=0x621000155510, per_objfile=0x613000009f80, skip_partial=false, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:9238
>>     #1  0x000055555cb4e943 in follow_die_offset (sect_off=(unknown: 27), offset_in_dwz=0, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22942
>>     #2  0x000055555cb4edfb in follow_die_ref (src_die=0x0, attr=0x7fffffffc130, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22968
>>     #3  0x000055555caa48c5 in partial_die_full_name (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8441
>>     #4  0x000055555caa4d79 in add_partial_symbol (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8469
>>     #5  0x000055555caa7d8c in add_partial_subprogram (pdi=0x621000157e70, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8737
>>     #6  0x000055555caa265c in scan_partial_symbols (first_die=0x621000157e00, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8230
>>     #7  0x000055555ca98e3f in process_psymtab_comp_unit_reader (reader=0x7fffffffc6b0, info_ptr=0x60600009650d "\003int", comp_unit_die=0x621000157d10, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7614
>>     #8  0x000055555ca9aa2c in process_psymtab_comp_unit (this_cu=0x621000155510, per_objfile=0x613000009f80, want_partial_unit=false, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7712
>>     #9  0x000055555caa051a in dwarf2_build_psymtabs_hard (per_objfile=0x613000009f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8073
>>
>> load_full_comp_unit creates a cutu_reader for the CU.  Since a dwarf2_cu
>> object already exists for the CU, load_full_comp_unit is expected to
>> find it and pass it to cutu_reader, so that cutu_reader doesn't create
>> a new dwarf2_cu for the CU.
>>
>> And this is the difference between before and after the regression.
>> Before commit 7188ed02d2a7, the dwarf2_per_cu_data -> dwarf2_cu link was
>> a simple pointer in dwarf2_per_cu_data.  This pointer was set up when
>> starting the read the partial symbols.  So it was already available at
>> that point where load_full_comp_unit gets called.  Post-7188ed02d2a7,
>> this link is per-objfile, kept in the dwarf2_per_objfile::m_dwarf2_cus
>> hash map.  The entry is only put in the hash map once the partial
>> symbols have been successfully read, when cutu_reader::keep is called.
>> Therefore, it is _not_ set at the point load_full_comp_unit is called.
>>
>> As a consequence, a new dwarf2_cu object gets created and initialized by
>> load_full_comp_unit (including initializing that dwarf2_cu::die_hash
>> field).  Meanwhile, the dwarf2_cu object created and used by the callers
>> up the stack does not get initialized for full symbol reading, and the
>> dwarf2_cu::die_hash field stays unexpectedly NULL.
>>
>> Solution
>> --------
>>
>> Since the caller of load_full_comp_unit knows about the existing
>> dwarf2_cu object for the CU we are reading (the one load_full_comp_unit
>> is expected to find), we can simply make it pass it down, instead of
>> having load_full_comp_unit look up the per-objfile map.
>>
>> load_full_comp_unit therefore gets a new `existing_cu` parameter.  All
>> other callers get updated to pass `per_objfile->get_cu (per_cu)`, so the
>> behavior shouldn't change for them, compared to the current HEAD.
>>
>> A test is added, which is the bare minimum to reproduce the issue.
>>
>> Notes
>> -----
>>
>> The original problem was reproduced by downloading
>>
>>     https://github.com/oneapi-src/oneTBB/releases/download/v2020.3/tbb-2020.3-lin.tgz
>>
>> and loading libtbb.so in GDB.  This code was compiled with the Intel
>> C/C++ compiler.  I was not able to reproduce the issue using GCC, I
>> think because GCC puts a DW_AT_name in the specialized subprogram, so
>> there's no need for partial_die_full_name to load the full DIE of the
>> subprogram, and the faulty code doesn't execute.
>>
>> gdb/ChangeLog:
>>
>> 	PR gdb/26693
>> 	* dwarf2/read.c (load_full_comp_unit): Add existing_cu
>> 	parameter.
>> 	(load_cu): Pass existing CU.
>> 	(process_imported_unit_die): Likewise.
>> 	(follow_die_offset): Likewise.
>>
>> gdb/testsuite/ChangeLog:
>>
>> 	PR gdb/26693
>> 	* gdb.dwarf2/template-specification-full-name.c: New test.
>> 	* gdb.dwarf2/template-specification-full-name.exp: New test.
>>
>> Change-Id: I57c8042f96c45f15797a3848e4d384181c56bb44
>> ---
>>  gdb/dwarf2/read.c                             | 15 ++--
>>  .../template-specification-full-name.c        | 22 ++++++
>>  .../template-specification-full-name.exp      | 75 +++++++++++++++++++
>>  3 files changed, 107 insertions(+), 5 deletions(-)
>>  create mode 100644 gdb/testsuite/gdb.dwarf2/template-specification-full-name.c
>>  create mode 100644 gdb/testsuite/gdb.dwarf2/template-specification-full-name.exp
>>
>> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
>> index 2ec3789135d6..a8b48374200c 100644
>> --- a/gdb/dwarf2/read.c
>> +++ b/gdb/dwarf2/read.c
>> @@ -1606,6 +1606,7 @@ static int create_all_type_units (dwarf2_per_objfile *per_objfile);
>>  
>>  static void load_full_comp_unit (dwarf2_per_cu_data *per_cu,
>>  				 dwarf2_per_objfile *per_objfile,
>> +				 dwarf2_cu *existing_cu,
>>  				 bool skip_partial,
>>  				 enum language pretend_language);
>>  
>> @@ -2385,7 +2386,8 @@ load_cu (dwarf2_per_cu_data *per_cu, dwarf2_per_objfile *per_objfile,
>>    if (per_cu->is_debug_types)
>>      load_full_type_unit (per_cu, per_objfile);
>>    else
>> -    load_full_comp_unit (per_cu, per_objfile, skip_partial, language_minimal);
>> +    load_full_comp_unit (per_cu, per_objfile, per_objfile->get_cu (per_cu),
>> +			 skip_partial, language_minimal);
>>  
>>    dwarf2_cu *cu = per_objfile->get_cu (per_cu);
>>    if (cu == nullptr)
>> @@ -9230,12 +9232,12 @@ die_eq (const void *item_lhs, const void *item_rhs)
>>  static void
>>  load_full_comp_unit (dwarf2_per_cu_data *this_cu,
>>  		     dwarf2_per_objfile *per_objfile,
>> +		     dwarf2_cu *existing_cu,
>>  		     bool skip_partial,
>>  		     enum language pretend_language)
>>  {
>>    gdb_assert (! this_cu->is_debug_types);
>>  
>> -  dwarf2_cu *existing_cu = per_objfile->get_cu (this_cu);
>>    cutu_reader reader (this_cu, per_objfile, NULL, existing_cu, skip_partial);
>>    if (reader.dummy_p)
>>      return;
>> @@ -10101,7 +10103,8 @@ process_imported_unit_die (struct die_info *die, struct dwarf2_cu *cu)
>>  
>>        /* If necessary, add it to the queue and load its DIEs.  */
>>        if (maybe_queue_comp_unit (cu, per_cu, per_objfile, cu->language))
>> -	load_full_comp_unit (per_cu, per_objfile, false, cu->language);
>> +	load_full_comp_unit (per_cu, per_objfile, per_objfile->get_cu (per_cu),
>> +			     false, cu->language);
>>  
>>        cu->per_cu->imported_symtabs_push (per_cu);
>>      }
>> @@ -22931,7 +22934,8 @@ follow_die_offset (sect_offset sect_off, int offset_in_dwz,
>>  
>>        /* If necessary, add it to the queue and load its DIEs.  */
>>        if (maybe_queue_comp_unit (cu, per_cu, per_objfile, cu->language))
>> -	load_full_comp_unit (per_cu, per_objfile, false, cu->language);
>> +	load_full_comp_unit (per_cu, per_objfile, per_objfile->get_cu (per_cu),
>> +			     false, cu->language);
>>  
>>        target_cu = per_objfile->get_cu (per_cu);
>>      }
>> @@ -22939,7 +22943,8 @@ follow_die_offset (sect_offset sect_off, int offset_in_dwz,
>>      {
>>        /* We're loading full DIEs during partial symbol reading.  */
>>        gdb_assert (per_objfile->per_bfd->reading_partial_symbols);
>> -      load_full_comp_unit (cu->per_cu, per_objfile, false, language_minimal);
>> +      load_full_comp_unit (cu->per_cu, per_objfile, cu, false,
>> +			   language_minimal);
>>      }
>>  
>>    *ref_cu = target_cu;
>> diff --git a/gdb/testsuite/gdb.dwarf2/template-specification-full-name.c b/gdb/testsuite/gdb.dwarf2/template-specification-full-name.c
>> new file mode 100644
>> index 000000000000..9d7b2f1a4c28
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.dwarf2/template-specification-full-name.c
>> @@ -0,0 +1,22 @@
>> +/* This testcase is part of GDB, the GNU debugger.
>> +
>> +   Copyright 2020 Free Software Foundation, Inc.
>> +
>> +   This program is free software; you can redistribute it and/or modify
>> +   it under the terms of the GNU General Public License as published by
>> +   the Free Software Foundation; either version 3 of the License, or
>> +   (at your option) any later version.
>> +
>> +   This program is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +   GNU General Public License for more details.
>> +
>> +   You should have received a copy of the GNU General Public License
>> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
>> +
>> +int
>> +main (void)
>> +{
>> +  return 0;
>> +}
>> diff --git a/gdb/testsuite/gdb.dwarf2/template-specification-full-name.exp b/gdb/testsuite/gdb.dwarf2/template-specification-full-name.exp
>> new file mode 100644
>> index 000000000000..87b3604e0760
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.dwarf2/template-specification-full-name.exp
>> @@ -0,0 +1,75 @@
>> +# Copyright 2020 Free Software Foundation, Inc.
>> +
>> +# This program is free software; you can redistribute it and/or modify
>> +# it under the terms of the GNU General Public License as published by
>> +# the Free Software Foundation; either version 3 of the License, or
>> +# (at your option) any later version.
>> +#
>> +# This program is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> +
>> +# Test to reproduce the crash described in PR 26693.  The following DWARF was
>> +# crashing GDB while loading partial symbols: a DW_TAG_subprogram with a
>> +# DW_AT_specification (pointing to another subprogram), without a DW_AT_name
>> +# and with a template parameter (DW_TAG_template_type_parameter).  More
>> +# precisely, the crash was happening when trying to compute the full name of the
>> +# subprogram.
>> +
>> +load_lib dwarf.exp
>> +
>> +if {![dwarf2_support]} {
>> +    return 0
>> +}
>> +
>> +standard_testfile .c .S
>> +

Consider using:
...
standard_testfile main.c .S
...
and dropping gdb.dwarf2/template-specification-full-name.c.

>> +set asm_file [standard_output_file $srcfile2]
>> +Dwarf::assemble $asm_file {
>> +    cu {} {
>> +	DW_TAG_compile_unit {
>> +	    {DW_AT_language @DW_LANG_C_plus_plus}
>> +        } {
>> +	    declare_labels templated_subprogram int
>> +
>> +	    int: DW_TAG_base_type {
>> +		{DW_AT_name "int"}
>> +		{DW_AT_byte_size 4 DW_FORM_data1}
>> +		{DW_AT_encoding @DW_ATE_signed}
>> +	    }
>> +
>> +	    # The templated subprogram.
>> +	    templated_subprogram: DW_TAG_subprogram {
>> +		{DW_AT_name "apply"}
>> +	    }
>> +
>> +	    # The template specialization (the low and high PC are phony).
>> +	    DW_TAG_subprogram {
>> +		{DW_AT_specification :$templated_subprogram}
>> +		{DW_AT_low_pc 0x1234 DW_FORM_addr}
>> +		{DW_AT_high_pc 0x20 DW_FORM_data8}

I applied the patch on trunk and ran the tests, and ran into:
...
(gdb) print apply<int>^M
Cannot access memory at address 0x1234^M
(gdb) FAIL: gdb.dwarf2/template-specification-full-name.exp: print
apply<int>
...

So I guess we'll have to fix the hardcoded high/low.

Using this instead works for me, in combination with using main.c
(otherwise the main_label is missing):
...
                {MACRO_AT_range {main}}
...

Thanks,
- Tom

>> +	    } {
>> +		DW_TAG_template_type_param {
>> +		  {DW_AT_name "T"}
>> +		  {DW_AT_type :$int DW_FORM_ref4}
>> +		}
>> +	    }
>> +	}
>> +    }
>> +}
>> +
>> +if { [prepare_for_testing "failed to prepare" ${testfile} \
>> +	  [list $srcfile $asm_file] {nodebug}] } {
>> +    return -1
>> +}
>> +
>> +if ![runto_main] {
>> +    return -1
>> +}
>> +
>> +# Just a sanity check to make sure GDB slurped the symbols correctly.
>> +gdb_test "print apply<int>" " = {void \\(void\\)} $hex <apply<int>\\(\\)>"
>> -- 
>> 2.28.0
>>
> 


More information about the Gdb-patches mailing list