Bug 27212 - ./dwz: xxx: Invalid DW_AT_decl_file file number 20
Summary: ./dwz: xxx: Invalid DW_AT_decl_file file number 20
Status: RESOLVED FIXED
Alias: None
Product: dwz
Classification: Unclassified
Component: default (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-20 08:38 UTC by Jakub Jelinek
Modified: 2021-02-08 15:51 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
tarball with gencheck genchecksum genconstants (93.52 KB, application/x-xz)
2021-01-20 08:38 UTC, Jakub Jelinek
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jakub Jelinek 2021-01-20 08:38:15 UTC
Created attachment 13136 [details]
tarball with gencheck genchecksum genconstants

./dwz -m xxx gencheck genchecksum genconstants 
./dwz: xxx: Invalid DW_AT_decl_file file number 20

This is with DWARF5 .debug_info but still DWARF3 .debug_line sections.
Comment 1 Mark Wielaard 2021-01-20 10:59:26 UTC
Replicated with just two of the input files:
./dwz -m xxx gencheck genconstants
./dwz: xxx: Invalid DW_AT_decl_file file number 20
Comment 2 Jakub Jelinek 2021-01-20 18:06:11 UTC
Even just one copied twice:
cp -a gencheck{,2}; ./dwz -m xxx gencheck gencheck2
./dwz: xxx: Invalid DW_AT_decl_file file number 20
Comment 3 Jakub Jelinek 2021-01-20 18:28:26 UTC
So we have in gencheck's .debug_abbrev:
   7      DW_TAG_variable    [no children]
    DW_AT_name         DW_FORM_strp
    DW_AT_decl_file    DW_FORM_implicit_const: 20
    DW_AT_decl_line    DW_FORM_data2
    DW_AT_decl_column  DW_FORM_implicit_const: 24
    DW_AT_type         DW_FORM_ref4
    DW_AT value: 0     DW_FORM value: 0
and in .debug_info:
 <1><1223>: Abbrev Number: 7 (DW_TAG_variable)
    <1224>   DW_AT_name        : (indirect string, offset: 0x18c8): PTA_3DNOW
    <1228>   DW_AT_decl_file   : 20
    <1228>   DW_AT_decl_line   : 2394
    <122a>   DW_AT_decl_column : 24
    <122a>   DW_AT_type        : <0x1204>
The DW_AT_decl_file uses DW_FORM_implicit_const 20, and that is fine given corresponding:
 The File Name Table (offset 0x194):
  Entry Dir     Time    Size    Name
  1     1       0       0       gencheck.c
  2     1       0       0       wide-int-bitmask.h
  3     2       0       0       stddef.h
  4     3       0       0       types.h
  5     4       0       0       struct_FILE.h
  6     4       0       0       FILE.h
  7     5       0       0       cstring
  8     5       0       0       type_traits
  9     5       0       0       cstdlib
  10    6       0       0       std_abs.h
  11    7       0       0       string.h
  12    7       0       0       stdlib.h
  13    3       0       0       stdlib-float.h
  14    3       0       0       stdlib-bsearch.h
  15    5       0       0       stdlib.h
  16    3       0       0       stdint-uintn.h
  17    8       0       0       c++config.h
  18    1       0       0       flag-types.h
  19    7       0       0       stdio.h
  20    9       0       0       i386.h
  21    1       0       0       wide-int.h
But somehow the cu we see during reading has only fewer files than that.
Comment 4 Jakub Jelinek 2021-01-20 19:40:53 UTC
So, I think part of the fix should be:
--- dwz.c.jj	2021-01-18 10:18:21.288726894 +0100
+++ dwz.c	2021-01-20 20:14:35.361320394 +0100
@@ -10456,7 +10456,9 @@ build_abbrevs_for_die (htab_t h, dw_cu_r
 		    case DW_FORM_data4: value = read_32 (ptr); break;
 		    case DW_FORM_data8: value = read_64 (ptr); break;
 		    case DW_FORM_udata: value = read_uleb128 (ptr); break;
-		    case DW_FORM_implicit_const: break;
+		    case DW_FORM_implicit_const:
+		      value = reft->values[i];
+		      break;
 		    default:
 		      error (0, 0, "Unhandled %s for %s",
 			     get_DW_FORM_str (form),
@@ -10465,10 +10467,12 @@ build_abbrevs_for_die (htab_t h, dw_cu_r
 		    }
 		  /* Note that the value is only used for calculating the
 		     DIE size and possibly change form. Doesn't change the
-		     implicit_const from or value.  */
+		     implicit_const form or value, but line_htab_lookup
+		     needs to be called anyway, so that it records the used
+		     file.  */
+		  value = line_htab_lookup (refcu, value);
 		  if (form != DW_FORM_implicit_const)
 		    {
-		      value = line_htab_lookup (refcu, value);
 		      if (value <= 0xff)
 			{
 			  form = DW_FORM_data1;
But that doesn't really help to fix this completely.
The problem is I think that write_abbrev is always called before write_info,
and the remapping of the DW_AT_decl_file/DW_AT_call_file DW_FORM_implicit_const
is done only in write_die which is called either recursively or from write_info/write_types.
Comment 5 Jakub Jelinek 2021-01-20 19:43:53 UTC
Therefore, I think the right fix is:
diff --git a/dwz.c b/dwz.c
index 308bcba..6117c70 100644
--- a/dwz.c
+++ b/dwz.c
@@ -10456,19 +10456,18 @@ build_abbrevs_for_die (htab_t h, dw_cu_ref cu, dw_die_ref die,
 		    case DW_FORM_data4: value = read_32 (ptr); break;
 		    case DW_FORM_data8: value = read_64 (ptr); break;
 		    case DW_FORM_udata: value = read_uleb128 (ptr); break;
-		    case DW_FORM_implicit_const: break;
+		    case DW_FORM_implicit_const:
+		      value = reft->values[i];
+		      break;
 		    default:
 		      error (0, 0, "Unhandled %s for %s",
 			     get_DW_FORM_str (form),
 			     get_DW_AT_str (reft->attr[i].attr));
 		      return 1;
 		    }
-		  /* Note that the value is only used for calculating the
-		     DIE size and possibly change form. Doesn't change the
-		     implicit_const from or value.  */
+		  value = line_htab_lookup (refcu, value);
 		  if (form != DW_FORM_implicit_const)
 		    {
-		      value = line_htab_lookup (refcu, value);
 		      if (value <= 0xff)
 			{
 			  form = DW_FORM_data1;
@@ -10488,7 +10487,7 @@ build_abbrevs_for_die (htab_t h, dw_cu_ref cu, dw_die_ref die,
 		  t->attr[j].attr = reft->attr[i].attr;
 		  t->attr[j].form = form;
 		  if (form == DW_FORM_implicit_const)
-		    t->values[j] = reft->values[i];
+		    t->values[j] = value;
 		  j++;
 		  continue;
 		}
@@ -12088,10 +12087,8 @@ write_die (unsigned char *ptr, dw_cu_ref cu, dw_die_ref die,
 		  update = true;
 		  break;
 		case DW_FORM_implicit_const:
-		  /* Negative means, already transformed.  */
-		  if (reft->values[i] >= 0)
-		    update = true;
-		  value = reft->values[i];
+		  /* DW_FORM_implicit_const should have been updated
+		     already when computing abbrevs.  */
 		  break;
 		default: abort ();
 		}
@@ -12104,9 +12101,6 @@ write_die (unsigned char *ptr, dw_cu_ref cu, dw_die_ref die,
 		    case DW_FORM_data2: write_16 (ptr, value); break;
 		    case DW_FORM_data4: write_32 (ptr, value); break;
 		    case DW_FORM_udata: write_uleb128 (ptr, value); break;
-		    case DW_FORM_implicit_const:
-		      reft->values[i] = -value; /* Note, negated.  */
-		      break;
 		    default: abort ();
 		    }
 		}
and this passes the #c0 testcase.
Comment 6 Mark Wielaard 2021-01-20 21:15:12 UTC
The "dance" with having the file number only translated once and then storing the negated number was to prevent translating it again when we processed the same abbrev for another DIE. I am trying to convince myself this cannot happen with this patch. But am not 100% sure.

If we go with this change then the write_die part can be simplified a little by getting rid of the bool update variable before the first switch statement (and setting it to true in all the cases) and just have case DW_FORM_implicit_const: break; in the second switch statement.
Comment 7 Jakub Jelinek 2021-01-20 21:32:22 UTC
I understood what you meant with that dance, but as the abbrev is stored first the value doesn't really make it into the abbrev.  Furthermore, the abbrev comparison/hashing needs to include those exact values, so they shouldn't be modified afterwards.
The way (I understand how) build_abbrevs_for_die works is essentially it creates a new abbrev with the exact values needed in there for each DIE and then computes hash for that abbrev and looks it up in the hash table, if we don't already have a dup.
So, if we use the remapped file number in the temporary abbrev's values that build_abbrevs_for_die is writing, all DIEs (that are otherwise same) that translate to the same final value will share the same abbrev.

You're right that the update var can be killed, instead case DW_FORM_implicit_const: can do just j++; continue;
And, we don't have DW_FORM_data8 handled after the lookup in write_die, but handle it before the lookup.  Strange inconsistency (though it is very unlikely something would be emitting 64-bit file numbers).
Comment 8 Mark Wielaard 2021-01-20 21:56:58 UTC
(In reply to Jakub Jelinek from comment #7)
> I understood what you meant with that dance, but as the abbrev is stored
> first the value doesn't really make it into the abbrev.  Furthermore, the
> abbrev comparison/hashing needs to include those exact values, so they
> shouldn't be modified afterwards.

O, you are right. So my worries about (and the code workarounds for) changing an abbrev implicit_const value multiple times were unnecessary. We can do it simply in build_abbrevs_for_die because the passed in struct abbrev_tag *t is a "scratch" one setup by the caller(s) to make sure we have something to construct the new abbrev in. Great. Then I totally agree with this patch.
Comment 9 Jakub Jelinek 2021-01-21 07:58:31 UTC
author	Jakub Jelinek <jakub@redhat.com>	
	Thu, 21 Jan 2021 07:43:08 +0000 (08:43 +0100)
committer	Jakub Jelinek <jakub@redhat.com>	
	Thu, 21 Jan 2021 07:43:08 +0000 (08:43 +0100)
commit	47af8da7c8533924e96d1dbd97625b6a4dc35855
tree	4a924fd99e9ee98dcc4125836309dff06de39b19	tree
parent	40d5efd4edc52a7d2ed02f8400a9beb129053271	commit | diff
Assorted DW_FORM_implicit_const fixes

The problem is I think that write_abbrev is always called before write_info,
and the remapping of the DW_AT_decl_file/DW_AT_call_file DW_FORM_implicit_const
is done only in write_die which is called either recursively or from
write_info/write_types.  Which means the DW_FORM_implicit_const value encoded
in the abbrev was never updated.

Other fixes include hashing the DW_FORM_implicit_const value in checksum_die
and more importantly comparing the values in die_eq_1.

2021-01-21  Jakub Jelinek  <jakub@redhat.com>

PR dwz/27212
PR dwz/27213
* dwz.c (checksum_die): For DW_FORM_implicit_const hash t->values[i].
(die_eq_1): For DW_FORM_implicit_const compare t?->values[?].
(build_abbrevs_for_die): For DW_FORM_implicit_const on DW_AT_*_file
call line_htab_lookup too and store the result into t->values[j].
Otherwise, handle even file ids larger than 32-bit.
(write_die): For DW_FORM_implicit_const on DW_AT_*_file just j++
and continue, don't call line_htab_lookup nor adjust anything.
Simplify.
(alt_clear_dups): Fix function comment.
Comment 10 Mark Wielaard 2021-02-08 15:51:13 UTC
Fixed by the commit from comment #9