When readelf --debug-dump=info prints attributes of form DW_FORM_loclistx or DW_FORM_rnglistx, it prints what should be the offset of the loclist/rangelist in the respective section. The raw value of this attribute is supposed to be an index into the offsets table that follows the compilation unit's header inside the debug_loclists or debug_rnglists section, respectively, while the address of the offsets table itself should be taken from DW_AT_loclists_base or DW_AT_rnglists_base, respectively, in the CU's top DIE.
But it looks like readelf looks up the offset in the debug_addr section instead of the debug_loclists/debug_rnglists section.
This even causes warnings of type "Offset into section .debug_addr too big:", because the offset was never meant to be in .debug_addr.
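For reference, the lookup described above can be sketched in a few lines of Python. This is an illustration only, not readelf's code: resolve_listx and its parameters are hypothetical names, and little-endian data is assumed.

```python
import struct

def resolve_listx(section: bytes, base: int, index: int, offset_size: int = 4) -> int:
    """Return the section-relative offset of list number `index`.

    section     -- raw bytes of .debug_loclists or .debug_rnglists
    base        -- value of DW_AT_loclists_base / DW_AT_rnglists_base
    offset_size -- 4 for 32-bit DWARF, 8 for 64-bit DWARF
    """
    fmt = "<I" if offset_size == 4 else "<Q"  # little-endian is an assumption
    entry_pos = base + index * offset_size
    (relative,) = struct.unpack_from(fmt, section, entry_pos)
    # Entries in the offsets table are relative to the start of the table.
    return base + relative
```

Note that .debug_addr is not involved at any point; only the list section itself and the base attribute are needed.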
I've got a repro for that from clang++-11, but it's big. I'll try to find a smaller one.
The culprit is, probably, the case statement at dwarf.c:2774, where fetch_indexed_addr() is being called both for DW_FORM_addrxN (where it belongs) and for DW_FORM_loclistx/DW_FORM_rnglistx (where it does not). For the latter, fetch_indexed_value should be used instead.
The text description is off, too. The loclistx value doesn't represent an address index.
Created attachment 14159
Based upon your observation, would you mind trying out this patch and letting me know if it works?
Comment on attachment 14159
Does not compile, "index" is not declared.
Even declared as "dwarf_vma index", the numbers are all wrong.
Looks like fetch_indexed_value is not taking into account the DW_AT_loclists_base/DW_AT_rnglists_base attribute value, which stores the address of the offset table of the current compile unit's contribution in debug_loclists/debug_rnglists, respectively. Similar issue here: #29266.
Created attachment 14160
oops - I missed out a hunk...
Created attachment 14161
Ok, let's see if the third time is the charm.
I am not 100% sure about the offset computation inside fetch_indexed_value() though, so some tweaking may still be needed. It would be helpful if there was a test case I could examine...
Created attachment 14162
Have a test binary.
The warnings that "readelf --debug-dump=info" is giving on that binary are all bogus; they all have to do with readelf looking for the offset in the wrong place.
The patches are supposed to be cumulative, right? And the first one is already not in the bug? So there is no way to apply them on top of HEAD anymore, unless I've saved all three?
There is a bit of parsing complication here that I don't think the current parser quite appreciates. The DWARF bitness may vary between CUs in indexed sections, and short of going through the headers, linked list style, there is no way to determine the bitness for any given section from the DIE data.
The structure of debug_loclists goes like this:
length AKA bitness indicator (4 or 12)
version (2), address size (1), segment selector size (1), offset entry count (4)
offsets (4 or 8) <-- DW_AT_loclists_base points here
offsets (4 or 8)
...the actual loclists
The offsets table contains the offsets of the target loclists, relative to the offsets table start. The size of each offset (4 or 8) is determined by the length field in the header, the usual DWARF style.
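To make the layout concrete, here is a rough Python sketch that reads one contribution header and reports where its offsets table starts. The function name and the returned keys are made up for illustration, and little-endian data is assumed.

```python
import struct

def parse_lists_header(section: bytes, pos: int = 0) -> dict:
    """Parse one DWARF 5 .debug_loclists/.debug_rnglists header at `pos`."""
    (length,) = struct.unpack_from("<I", section, pos)
    if length == 0xFFFFFFFF:
        # 64-bit DWARF: 4-byte escape value followed by an 8-byte length.
        offset_size = 8
        (length,) = struct.unpack_from("<Q", section, pos + 4)
        body = pos + 12
    else:
        offset_size = 4
        body = pos + 4
    # version (2), address size (1), segment selector size (1), entry count (4)
    version, addr_size, seg_size, count = struct.unpack_from("<HBBI", section, body)
    return {
        "offset_size": offset_size,
        "version": version,
        "offset_entry_count": count,
        "offsets_base": body + 8,       # where DW_AT_*lists_base should point
        "next_header": body + length,   # start of the following contribution
    }
```

For the first contribution in a section this gives an offsets_base of 12 in 32-bit DWARF and 20 in 64-bit DWARF.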
Now, from the value of DW_AT_loclists_base alone, it's pretty much impossible to tell whether the contribution is 32- or 64-bit DWARF (except for the 0th CU, where DW_AT_loclists_base can be either 12 or 20). In subsequent CUs, you can't seek back from the offsets table to the top of the header, because the header is variable length. If you look at the dword at DW_AT_loclists_base-20, it may be 0xffffffff by pure accident.
Similar situation in debug_rnglists, debug_addr, debug_str_offsets.
A proper parser should go through all CU headers in the section (you can sort of fast-forward through them by skipping by length), determine and store the bitness of each, and then recover the bitness of any particular CU by matching the DW_AT_rnglists_base against that.
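The header walk could look roughly like this in Python. It is a sketch, not a proposal for dwarf.c: index_contributions is a hypothetical name, little-endian data is assumed, and truncated sections are not handled.

```python
import struct

def index_contributions(section: bytes) -> dict:
    """Map each offsets-table address to its offset size (4 or 8)."""
    sizes = {}
    pos = 0
    while pos + 4 <= len(section):
        (length,) = struct.unpack_from("<I", section, pos)
        if length == 0xFFFFFFFF:
            # 64-bit contribution: the real length is the next 8 bytes.
            (length,) = struct.unpack_from("<Q", section, pos + 4)
            offset_size, body = 8, pos + 12
        else:
            offset_size, body = 4, pos + 4
        # The offsets table starts after version (2), address size (1),
        # segment selector size (1) and offset entry count (4).
        sizes[body + 8] = offset_size
        pos = body + length  # fast-forward to the next header
    return sizes
```

The bitness for a given CU is then sizes[rnglists_base], with a missing key meaning the base attribute does not point at any offsets table.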
One *slightly* better alternative would be to reuse the bitness from the corresponding CU in the debug_info. While it is possible to compose a correct DWARF dataset where the bitness between contributions from the same compile unit in different sections varies, it would be a pain in the neck for the compiler vendor.
Or you can just assume 32 bits. The spec encourages the implementors not to use 64-bit DWARF unless absolutely necessary. How often does one see 4GB+ sections in a binary?
(In reply to Vsevolod Alekseyev from comment #10)
> The DWARF bitness may vary between CUs in indexed
Presumably this would be an unusual case. At least I would hope so...
I think, however, that we should be able to handle it. The code in binutils/dwarf.c already stores the loclists_base on a per-CU basis, so we can easily add code to store the bitness too. I am looking into creating a patch that will do this...
It's not as bad; a gentleman at the DWARF mailing list pointed out there was a rule in section 7.4 near the end, that the bitness of the same CU's contributions in different sections should be the same:
"The 32-bit and 64-bit DWARF format conventions must not be intermixed within a single compilation unit."
So reusing the bitness from the corresponding CU in the debug_info is a proper and correct thing to do.
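In Python terms, taking the bitness from the CU header in debug_info amounts to something like the following. The function name is hypothetical and little-endian data is assumed.

```python
import struct

def cu_offset_size(debug_info: bytes, cu_offset: int) -> int:
    """Return 4 or 8 depending on the DWARF format of the CU at `cu_offset`."""
    (initial,) = struct.unpack_from("<I", debug_info, cu_offset)
    # An initial length of 0xffffffff marks the 64-bit DWARF format.
    return 8 if initial == 0xFFFFFFFF else 4
```

The result can then be used as the entry size when indexing the offsets table for that CU's DW_FORM_loclistx and DW_FORM_rnglistx values.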
The master branch has been updated by Nick Clifton <email@example.com>:
Author: Nick Clifton <firstname.lastname@example.org>
Date: Tue Jun 28 12:30:19 2022 +0100
Fix the display of the index values for DW_FORM_loclistx and DW_FORM_rnglistx. Correct the display of .debug.loclists sections.
* dwarf.c (display_debug_rnglists): New function, broken out of..
(display_debug_ranges): ... here.
(read_and_display_attr_value): Correct calculation of index
displayed for DW_FORM_loclistx and DW_FORM_rnglistx.
* testsuite/binutils-all/x86-64/pr26808.dump: Update expected
OK, I have checked in a patch which, combined with others now also checked in, should address this issue. Please can you give the current binutils sources a whirl and see if readelf's output is now correct.
I see you've changed the output format. That will take some time to catch up.
The loclist entries in sections with nonblank offset tables are dumped differently; the start/end addresses of location entries are not resolved relative to the corresponding CU's base PC. This is inconsistent with the past behavior, and rather misleading.
Didn't check the rangelists yet.
Same story in rnglists.
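To illustrate what "resolved relative to the CU's base PC" means, here is a small Python sketch for a single DW_RLE_offset_pair entry, whose two ULEB128 operands are offsets from the applicable base address. The function names are made up for this example; it does not reproduce readelf's old or new output.

```python
DW_RLE_offset_pair = 0x04

def read_uleb128(data: bytes, pos: int):
    """Decode one unsigned LEB128 value; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def resolve_offset_pair(data: bytes, pos: int, cu_base: int):
    """Return (start address, end address, next position) for one entry."""
    if data[pos] != DW_RLE_offset_pair:
        raise ValueError("not a DW_RLE_offset_pair entry")
    start, pos = read_uleb128(data, pos + 1)
    end, pos = read_uleb128(data, pos)
    # Both operands are offsets from the base address, not absolute addresses.
    return cu_base + start, cu_base + end, pos
```

With a base of 0x401000, an entry with operands 0x10 and 0xa4 covers 0x401010 to 0x4010a4; printing the raw 0x10 and 0xa4 instead is the difference being described.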
(In reply to Vsevolod Alekseyev from comment #17)
> The loclist entries in sections with nonblank offset tables in are dumped
> differently; the start/end address of location entries is not resolved
> relative to the corresponding CU's base PC. This is inconsistent with the
> past behavior, and rather misleading.
I made these changes in order to bring readelf's output more in line with
the output from eu-readelf. I have been using that tool's output as a
comparison for the updates for this issue, and using the same general format
helps with that.
But I am also willing to undo unnecessary formatting changes, so please
can you provide me an example of the before and after formatting, so that
I can be sure that I am changing the correct things. :-)
At this juncture, we gave up on using readelf as a reference implementation of DWARF parsing. You may do with this issue as you please.