Created attachment 13136 [details] tarball with gencheck genchecksum genconstants ./dwz -m xxx gencheck genchecksum genconstants ./dwz: xxx: Invalid DW_AT_decl_file file number 20 This is with DWARF5 .debug_info but still DWARF3 .debug_line sections.
Replicated with just two of the input files: ./dwz -m xxx gencheck genconstants ./dwz: xxx: Invalid DW_AT_decl_file file number 20
Even just one copied twice: cp -a gencheck{,2}; ./dwz -m xxx gencheck gencheck2 ./dwz: xxx: Invalid DW_AT_decl_file file number 20
So we have in gencheck's .debug_abbrev: 7 DW_TAG_variable [no children] DW_AT_name DW_FORM_strp DW_AT_decl_file DW_FORM_implicit_const: 20 DW_AT_decl_line DW_FORM_data2 DW_AT_decl_column DW_FORM_implicit_const: 24 DW_AT_type DW_FORM_ref4 DW_AT value: 0 DW_FORM value: 0 and in .debug_info: <1><1223>: Abbrev Number: 7 (DW_TAG_variable) <1224> DW_AT_name : (indirect string, offset: 0x18c8): PTA_3DNOW <1228> DW_AT_decl_file : 20 <1228> DW_AT_decl_line : 2394 <122a> DW_AT_decl_column : 24 <122a> DW_AT_type : <0x1204> The DW_AT_decl_file uses DW_FORM_implicit_const 20, and that is fine given corresponding: The File Name Table (offset 0x194): Entry Dir Time Size Name 1 1 0 0 gencheck.c 2 1 0 0 wide-int-bitmask.h 3 2 0 0 stddef.h 4 3 0 0 types.h 5 4 0 0 struct_FILE.h 6 4 0 0 FILE.h 7 5 0 0 cstring 8 5 0 0 type_traits 9 5 0 0 cstdlib 10 6 0 0 std_abs.h 11 7 0 0 string.h 12 7 0 0 stdlib.h 13 3 0 0 stdlib-float.h 14 3 0 0 stdlib-bsearch.h 15 5 0 0 stdlib.h 16 3 0 0 stdint-uintn.h 17 8 0 0 c++config.h 18 1 0 0 flag-types.h 19 7 0 0 stdio.h 20 9 0 0 i386.h 21 1 0 0 wide-int.h But somehow the cu we see during reading has only fewer files than that.
So, I think part of the fix should be: --- dwz.c.jj 2021-01-18 10:18:21.288726894 +0100 +++ dwz.c 2021-01-20 20:14:35.361320394 +0100 @@ -10456,7 +10456,9 @@ build_abbrevs_for_die (htab_t h, dw_cu_r case DW_FORM_data4: value = read_32 (ptr); break; case DW_FORM_data8: value = read_64 (ptr); break; case DW_FORM_udata: value = read_uleb128 (ptr); break; - case DW_FORM_implicit_const: break; + case DW_FORM_implicit_const: + value = reft->values[i]; + break; default: error (0, 0, "Unhandled %s for %s", get_DW_FORM_str (form), @@ -10465,10 +10467,12 @@ build_abbrevs_for_die (htab_t h, dw_cu_r } /* Note that the value is only used for calculating the DIE size and possibly change form. Doesn't change the - implicit_const from or value. */ + implicit_const form or value, but line_htab_lookup + needs to be called anyway, so that it records the used + file. */ + value = line_htab_lookup (refcu, value); if (form != DW_FORM_implicit_const) { - value = line_htab_lookup (refcu, value); if (value <= 0xff) { form = DW_FORM_data1; But that doesn't really help to fix this completely. The problem is I think that write_abbrev is always called before write_info, and the remapping of the DW_AT_decl_file/DW_AT_call_file DW_FORM_implicit_const is done only in write_die which is called either recursively or from write_info/write_types.
Therefore, I think the right fix is: diff --git a/dwz.c b/dwz.c index 308bcba..6117c70 100644 --- a/dwz.c +++ b/dwz.c @@ -10456,19 +10456,18 @@ build_abbrevs_for_die (htab_t h, dw_cu_ref cu, dw_die_ref die, case DW_FORM_data4: value = read_32 (ptr); break; case DW_FORM_data8: value = read_64 (ptr); break; case DW_FORM_udata: value = read_uleb128 (ptr); break; - case DW_FORM_implicit_const: break; + case DW_FORM_implicit_const: + value = reft->values[i]; + break; default: error (0, 0, "Unhandled %s for %s", get_DW_FORM_str (form), get_DW_AT_str (reft->attr[i].attr)); return 1; } - /* Note that the value is only used for calculating the - DIE size and possibly change form. Doesn't change the - implicit_const from or value. */ + value = line_htab_lookup (refcu, value); if (form != DW_FORM_implicit_const) { - value = line_htab_lookup (refcu, value); if (value <= 0xff) { form = DW_FORM_data1; @@ -10488,7 +10487,7 @@ build_abbrevs_for_die (htab_t h, dw_cu_ref cu, dw_die_ref die, t->attr[j].attr = reft->attr[i].attr; t->attr[j].form = form; if (form == DW_FORM_implicit_const) - t->values[j] = reft->values[i]; + t->values[j] = value; j++; continue; } @@ -12088,10 +12087,8 @@ write_die (unsigned char *ptr, dw_cu_ref cu, dw_die_ref die, update = true; break; case DW_FORM_implicit_const: - /* Negative means, already transformed. */ - if (reft->values[i] >= 0) - update = true; - value = reft->values[i]; + /* DW_FORM_implicit_const should have been updated + already when computing abbrevs. */ break; default: abort (); } @@ -12104,9 +12101,6 @@ write_die (unsigned char *ptr, dw_cu_ref cu, dw_die_ref die, case DW_FORM_data2: write_16 (ptr, value); break; case DW_FORM_data4: write_32 (ptr, value); break; case DW_FORM_udata: write_uleb128 (ptr, value); break; - case DW_FORM_implicit_const: - reft->values[i] = -value; /* Note, negated. */ - break; default: abort (); } } and this passes the #c0 testcase.
The "dance" with having the file number only translated once and then storing the negated number was to prevent translating it again when we processed the same abbrev for another DIE. I am trying to convince myself this cannot happen with this patch. But am not 100% sure. If we go with this change then the write_die part can be simplified a little by getting rid of the bool update variable before the first switch statement (and setting it to true in all the cases) and just have case DW_FORM_implicit_const: break; in the second switch statement.
I understood what you meant with that dance, but as the abbrev is stored first the value doesn't really make it into the abbrev. Furthermore, the abbrev comparison/hashing needs to include those exact values, so they shouldn't be modified afterwards. The way (I understand how) build_abbrevs_for_die works is essentially it creates a new abbrev with the exact values needed in there for each DIE and then computes hash for that abbrev and looks it up in the hash table, if we don't already have a dup. So, if we use the remapped file number in the temporary abbrev's values that build_abbrevs_for_die is writing, all DIEs (that are otherwise same) that translate to the same final value will share the same abbrev. You're right that the update var can be killed, instead case DW_FORM_implicit_const: can do just j++; continue; And, we don't have DW_FORM_data8 handled after the lookup in write_die, but handle it before the lookup. Strange inconsistency (though it is very unlikely something would be emitting 64-bit file numbers).
(In reply to Jakub Jelinek from comment #7) > I understood what you meant with that dance, but as the abbrev is stored > first the value doesn't really make it into the abbrev. Furthermore, the > abbrev comparison/hashing needs to include those exact values, so they > shouldn't be modified afterwards. O, you are right. So my worries about (and the code workarounds for) changing an abbrev implicit_const value multiple times were unnecessary. We can do it simply in build_abbrevs_for_die because the passed in struct abbrev_tag *t is a "scratch" one setup by the caller(s) to make sure we have something to construct the new abbrev in. Great. Then I totally agree with this patch.
author Jakub Jelinek <jakub@redhat.com> Thu, 21 Jan 2021 07:43:08 +0000 (08:43 +0100) committer Jakub Jelinek <jakub@redhat.com> Thu, 21 Jan 2021 07:43:08 +0000 (08:43 +0100) commit 47af8da7c8533924e96d1dbd97625b6a4dc35855 tree 4a924fd99e9ee98dcc4125836309dff06de39b19 tree parent 40d5efd4edc52a7d2ed02f8400a9beb129053271 commit | diff Assorted DW_FORM_implicit_const fixes The problem is I think that write_abbrev is always called before write_info, and the remapping of the DW_AT_decl_file/DW_AT_call_file DW_FORM_implicit_const is done only in write_die which is called either recursively or from write_info/write_types. Which means the DW_FORM_implicit_const value encoded in the abbrev was never updated. Other fixes include hashing the DW_FORM_implicit_const value in checksum_die and more importantly comparing the values in die_eq_1. 2021-01-21 Jakub Jelinek <jakub@redhat.com> PR dwz/27212 PR dwz/27213 * dwz.c (checksum_die): For DW_FORM_implicit_const hash t->values[i]. (die_eq_1): For DW_FORM_implicit_const compare t?->values[?]. (build_abbrevs_for_die): For DW_FORM_implicit_const on DW_AT_*_file call line_htab_lookup too and store the result into t->values[j]. Otherwise, handle even file ids larger than 32-bit. (write_die): For DW_FORM_implicit_const on DW_AT_*_file just j++ and continue, don't call line_htab_lookup nor adjust anything. Simplify. (alt_clear_dups): Fix function comment.
Fixed by the commit from comment #9