Summary: | readelf..info misreports DW_FORM_loclistx, DW_FORM_rnglistx | ||
---|---|---|---|
Product: | binutils | Reporter: | Vsevolod Alekseyev <sevaa> |
Component: | binutils | Assignee: | Nick Clifton <nickc> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | h3xrabbit, nickc |
Priority: | P2 | ||
Version: | 2.39 | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Last reconfirmed: | 2022-06-23 00:00:00 | |
Attachments: |
Proposed Patch
Proposed Patch Proposed Patch Test binary Proposed patch take 2 |
Description
Vsevolod Alekseyev
2022-06-20 18:04:07 UTC
The culprit is, probably, the case statement at dwarf.c:2774, where fetch_indexed_addr() is being called both for DW_FORM_addrxN (where it belongs) and for DW_FORM_loclistx/DW_FORM_rnglistx (where it does not). For the latter, fetch_indexed_value should be used instead. The text description is off, too. The loclistx value doesn't represent an address index. Created attachment 14159 [details]
Proposed Patch
Based upon your observation, would you mind trying out this patch and letting me know if it works ?
Comment on attachment 14159 [details]
Proposed Patch
Does not compile, "index" is not declared.
Even declared as "dwarf_vma index", the numbers are all wrong. Looks like fetch_indexed_value is not taking into account the DW_AT_loclists_base/DW_AT_rnglists_base attribute value, which stores the address of the offset table of the current compile unit's contribution in debug_loclists/debug_rnglists, respectively. Similar issue here: #29266. Created attachment 14160 [details]
Proposed Patch
oops - I missed out a thunk...
Created attachment 14161 [details]
Proposed Patch
Ok, lets see if the third time is the charm.
I am not 100% about the offset computation inside fetch_indexed_value() though, so some tweaking may still be needed. It would be helpful if there was a test case I could examine...
Created attachment 14162 [details]
Test binary
Have a test binary.
The warnings that "readelf --debug-dump=info" is giving on that binary are all bogus; they all have to do with readelf looking for the offset in the wrong place. The patches are supposed to be cumulative, right? And the first one is already not in the bug? So there is no way to apply them on top of HEAD anymore, unless I've saved all three? There is a bit of parsing complication here that I don't think the current parser quite appreciates. The DWARF bitness may vary between CUs in indexed sections, and short of going through the headers, linked list style, there is no way to determine the bitness for any given section from the DIE data. The structure of debug_loclists goes like this: CU_header_0: length AKA bitness indicator (4 or 12) version (2) address_size (1) selector_size (1) offset_entry_count (4) offsets[0] (4 or 8) <-- DW_AT_loclists_base points here offsets[1] (4 or 8) ...more offsets ...the actual loclists The offsets table contain the offsets of the target loclists, relative to the offsets table start. The size of the offset (4 or 8) is determined by the length field in the header, the usual DWARF style. Now, from the value of DW_AT_loclists_base alone, it's pretty much impossible to tell whether the section is 32- or 64-bit DWARF (except for the 0th CU, where the DW_AT_loclists_base can be either 12 or 20). In subsequent CUs, you can't seek back from the offset table to the top of header, because it's variable length. If you look at the dword at DW_AT_loclists_base-20, it may be 0xffffffff by pure accident. Similar situation in debug_rnglists, debug_addr, debug_str_offsets. A proper parser should go through all CU headers in the section (you can sort of fast-forward through them by skipping by length), determine and store the bitness of each, and then recover the bitness of any particular CU by matching the DW_AT_rnglists_base against that. One *slightly* better alternative would be - reuse the bitness from the corresponding CU in the debug_info. While it is possible to compose a correct DWARF dataset where the bitness between contributions from the same compile unit in different section varies, it would be a pain in the neck for the compiler vendor. Or you can just assume 32 bits. The spec encourages the implementors not to use 64-bit DWARF unless absolutely necessary. How often does one see 4GB+ sections in a binary? (In reply to Vsevolod Alekseyev from comment #10) Hi Vsevolod, > The DWARF bitness may vary between CUs in indexed > sections, Presumably this would be an unusual case. At least I would hope so... I think however that would should be able to handle it. The code in binutils/dwarf.c already stores the loclists_base on a per-CU basis, so we can easily add code to store the bitness too. I am looking into creating a patch that will do this... It's not as bad; a gentleman at the DWARF mailing list pointed out there was a rule in section 7.4 near the end, that the bitness of the same CU's contributions in different sections should be the same: "The 32-bit and 64-bit DWARF format conventions must not be intermixed within a single compilation unit." So reusing the bitness from the corresponding CU in the debug_info is a proper and correct thing to do. The master branch has been updated by Nick Clifton <nickc@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=dbcbf67ca565ec29f13a2302dcdf9b01ef7832ca commit dbcbf67ca565ec29f13a2302dcdf9b01ef7832ca Author: Nick Clifton <nickc@redhat.com> Date: Tue Jun 28 12:30:19 2022 +0100 Fix the display of the idnex values for DW_FORM_loclistx and DW_FORM_rnglistx. Correct the display of .debug.loclists sections. PR 29267 * dwarf.c (display_debug_rnglists): New function, broken out of.. (display_debug_ranges): ... here. (read_and_display_attr_value): Correct calculation of index displayed for DW_FORM_loclistx and DW_FORM_rnglistx. * testsuite/binutils-all/x86-64/pr26808.dump: Update expected output. Hi Vsevolod, OK, I have checked in a patch which, combined with others now also checked in, should address this issue. Please can you give the current binutils sources a whirl and see if readelf's output is now correct. Cheers Nick I see you've changed the output format. That will take some time to catch up. The loclist entries in sections with nonblank offset tables in are dumped differently; the start/end address of location entries is not resolved relative to the corresponding CU's base PC. This is inconsistent with the past behavior, and rather misleading. Didn't check the rangelists yet. Same story in rnglists. (In reply to Vsevolod Alekseyev from comment #17) > The loclist entries in sections with nonblank offset tables in are dumped > differently; the start/end address of location entries is not resolved > relative to the corresponding CU's base PC. This is inconsistent with the > past behavior, and rather misleading. I made these changes in order to bring readelf's output more inline with the output from eu-readelf. I have been using that tool's output as a comparison for the updates for this issue, and using the same general format helps with that. But I am also willing to undo unnecessary formatting changes, so please can you provide me an example of the before and after formatting, so that I can be sure that I am changing the correct things. :-) At this juncture, we gave up on using readelf as a reference implementation of DWARF parsing. You may do with this issue as you please. Created attachment 15150 [details]
Proposed patch take 2
Tested against llvm-dwarfdump on the same corpus of binaries as #30880.
The master branch has been updated by Nick Clifton <nickc@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=8c7125feaaa495b2aa0fed6cd077e35d0d90df43 commit 8c7125feaaa495b2aa0fed6cd077e35d0d90df43 Author: Vsevolod Alekseyev <sevaa@sprynet.com> Date: Tue Oct 3 09:27:27 2023 +0100 Fix: readelf..info misreports DW_FORM_loclistx, DW_FORM_rnglistx PR 29267 * dwarf.c (fetch_indexed_value): Delete. (fetch_indexed_offset): Correct base address calculation. (read_and_display_attr_value): Replace uses of fetch_indexed_value with fetch_indexed_offset. Thanks - patch applied. Thanks. I'll take a look at 30880 next. On a side note - this one is marked as fixed, and 30792 is not, is that right? (In reply to Vsevolod Alekseyev from comment #24) > On a side note - this one is marked as fixed, and 30792 is not, is that > right? Doh - sorry - forgot to close 30792. Doing it now... |