The terseness of the current "Unknown DWARF" error messages makes it very difficult to analyze what is going on in cases of not yet parseable or standard-violating DWARF data. They should give more information, namely:

- a hexdump of the data around the offending entity,
- whether the entity is invalid (get_DW_*_name()==NULL) or merely not yet handled by the code,
- the current values of DW_AT_{name,producer,comp_dir,language,...},
- and instructions to include that output in the bug report and retain a copy of the unstripped binary for upload upon request.

From my cursory understanding, the 3rd item might be easily implementable with global variables holding copies of the most recently seen values.

Regards,
Dennis.
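A minimal sketch of what such a richer message could look like, written in C because dwz is. Nothing here is taken from dwz.c: the struct diag_ctx, hexdump_around() and format_unknown_dwarf() names, the message layout, and the has_name flag (standing in for a get_DW_*_name() != NULL check) are all hypothetical, chosen only to illustrate the four requested items.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: copies of the most recently seen CU attribute values. */
struct diag_ctx
{
  const char *name;      /* last DW_AT_name, or NULL if none seen */
  const char *producer;  /* last DW_AT_producer, or NULL */
  const char *comp_dir;  /* last DW_AT_comp_dir, or NULL */
  unsigned language;     /* last DW_AT_language, 0 if none seen */
};

/* Write a hexdump of up to RADIUS bytes on either side of DATA[OFF]
   into OUT, bracketing the offending byte.  Output that does not fit
   is dropped at a whole-byte boundary.  Returns the string length.  */
static size_t
hexdump_around (const unsigned char *data, size_t size, size_t off,
                size_t radius, char *out, size_t outsz)
{
  assert (data != NULL && out != NULL && outsz > 0);
  size_t start = off > radius ? off - radius : 0;
  size_t end = (off < size && size - off > radius) ? off + radius + 1 : size;
  size_t n = 0;
  out[0] = '\0';
  for (size_t i = start; i < end; i++)
    {
      int w = snprintf (out + n, outsz - n,
                        i == off ? "[%02x] " : "%02x ",
                        (unsigned) data[i]);
      if (w < 0 || (size_t) w >= outsz - n)
        {
          out[n] = '\0';  /* discard the partially written byte */
          break;
        }
      n += (size_t) w;
    }
  return n;
}

/* Format the full diagnostic into OUT.  HAS_NAME is nonzero when the
   value has a known DW_* name but is not handled yet, zero when it is
   invalid.  Returns what snprintf returns.  */
static int
format_unknown_dwarf (char *out, size_t outsz, const char *file,
                      const char *kind, unsigned value, int has_name,
                      unsigned long offset, const struct diag_ctx *ctx,
                      const char *dump)
{
  assert (out != NULL && file != NULL && kind != NULL);
  assert (ctx != NULL && dump != NULL);
  return snprintf (out, outsz,
                   "%s: Unknown DWARF %s 0x%x at offset [%lx] (%s)\n"
                   "  DW_AT_name:     %s\n"
                   "  DW_AT_producer: %s\n"
                   "  DW_AT_comp_dir: %s\n"
                   "  DW_AT_language: 0x%x\n"
                   "  bytes: %s\n"
                   "Please include this output in your bug report and keep\n"
                   "a copy of the unstripped binary for upload upon request.\n",
                   file, kind, value, offset,
                   has_name ? "valid, not yet handled" : "invalid, no known name",
                   ctx->name ? ctx->name : "(none)",
                   ctx->producer ? ctx->producer : "(none)",
                   ctx->comp_dir ? ctx->comp_dir : "(none)",
                   ctx->language, dump);
}
```

A caller would fill the context struct whenever it parses a compile unit's attributes, call hexdump_around() on the section buffer at the failing offset, and pass both to format_unknown_dwarf(). As comment #1 notes, with bad DWARF the context values themselves may be unreliable, so a real implementation would have to treat them as best-effort.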
The following commits address some of the difficulty of tracking down where the bad DWARF was detected. They don't go as far as your suggestions, but I believe they are already useful. Trying to get more information than the current (DIE) offsets might also be tricky when we are dealing with bad DWARF.

commit 1e2beb218060515eb1e4f54a0ff6b3714b532e31 (HEAD -> master)
Author: Mark Wielaard <mark@klomp.org>
Date:   Sun Feb 21 16:55:17 2021 +0100

    Print abbrev or DIE offset for Unknown DWARF error message.

    * dwz.c (read_abbrev): Add .debug_abbrev offset to error message.
    (read_exprloc): Print DIE offset that referenced the unknown operand
    in error message.
    (read_expr_low_mem_phase1): Likewise.
    (read_debug_info): Add die_offset to error messages for unknown
    forms, attributes extending beyond end of CU or unknown block form
    attributes.

    https://sourceware.org/bugzilla/show_bug.cgi?id=27363

commit 4705796eb538761db37d5e4ef42171f08c394a65
Author: Mark Wielaard <mark@klomp.org>
Date:   Tue Jan 26 21:12:18 2021 +0100

    Add DIE offsets in error messages to make it easier to find what is wrong.

    With the following patch dwz will give a message like:

    libmozjs-78.so: Couldn't find DIE at [bd6b507] referenced by DW_AT_abstract_origin from DIE at [bd5bb9b]

    Which makes it easier to figure out what is going on. In the above
    case you can simply look up the producer of the CU for those two DIEs.
    Which turned out to be "clang LLVM (rustc version 1.49.0)" which seems
    to have gotten the abstract origin reference wrong.

    * dwz.c (read_exprloc): Add DIE offsets to error messages.
    (checksum_die): Likewise.
For some reason I added commit 2 from comment #1 but not commit 1. I have done so now:

commit beca0b4f1423f97f7a2da74a45f9f86d401e4ad2
Author: Mark Wielaard <mark@klomp.org>
Date:   Sun Feb 21 16:55:17 2021 +0100

    Print abbrev or DIE offset for Unknown DWARF error message.

    * dwz.c (read_abbrev): Add .debug_abbrev offset to error message.
    (read_exprloc): Print DIE offset that referenced the unknown operand
    in error message.
    (read_expr_low_mem_phase1): Likewise.
    (read_debug_info): Add die_offset to error messages for unknown
    forms, attributes extending beyond end of CU or unknown block form
    attributes.

    https://sourceware.org/bugzilla/show_bug.cgi?id=27363