Bug 27805 - libdwfl: Unable to extract source line information for RISC-V binary
Status: RESOLVED FIXED
Alias: None
Product: elfutils
Classification: Unclassified
Component: libdw
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2021-04-30 12:31 UTC by John Doe
Modified: 2023-10-09 16:15 UTC


Attachments
RV32IMC binary to reproduce the outlined issue (103.28 KB, application/octet-stream)
2021-04-30 12:31 UTC, John Doe

Description John Doe 2021-04-30 12:31:12 UTC
Created attachment 13416
RV32IMC binary to reproduce the outlined issue

For the attached RV32IMC (i.e. RISC-V) binary, libdwfl from elfutils 0.183 is unable to extract source line information from the provided DWARF debugging information. This can be easily reproduced by comparing the behavior of elfutils's addr2line with that of binutils's addr2line.

With binutils:

  $ addr2line --version
  GNU addr2line (GNU Binutils) 2.35.2
  $ addr2line -e /tmp/elfutils-srcline-bug.elf 20400058
  /home/john/src/RIOT/examples/hello-world/main.c:26

With elfutils:

  $ addr2line --version
  addr2line (elfutils) 0.183
  $ addr2line -e /tmp/elfutils-srcline-bug.elf 20400058
  ??:0

From brief debugging, it appears that dwfl_module_getsrc returns NULL and fails with DWARF_E_INVALID_OFFSET for this binary.
Comment 1 Jim Wilson 2021-05-01 20:48:57 UTC
Using readelf -wr to look at the debug_aranges section, I see entries like

  Length:                   44
  Version:                  2
  Offset into .debug_info:  0x987
  Pointer Size:             4
  Segment Size:             0

    Address    Length
    00000000 00000000
    20400132 00000002
    20400134 0000003a
    00000000 00000000

An address/length entry of 0/0 is supposed to mark the end of the list, but here we have one at the beginning.  This confuses elfutils, which moves byte by byte through the aranges section. libdw/dwarf_getaranges.c has
          /* Two zero values mark the end.  */
          if (range_address == 0 && range_length == 0)
            break;
and then assumes that the next entry is immediately following, which it isn't, and it ends up reading garbage.  binutils seems to be using the length field to find the last entry.  And readelf is ignoring the 0/0 end of list rule so that we can see the invalid entries.

There are a lot of aranges that have 0/0 entries not at the end of the list.
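
To make the difference concrete, here is a minimal sketch (not libdw or binutils code) that walks the four entries shown above in two ways: stopping at the first 0/0 pair, and walking the whole table while skipping 0/0 pairs. The struct and function names are hypothetical, and the entries are assumed to be already decoded into 32-bit values.

```c
#include <stddef.h>
#include <stdint.h>

/* One decoded (address, length) tuple; hypothetical type for illustration. */
struct arange {
  uint32_t addr;
  uint32_t len;
};

/* The table from the readelf output above (Length: 44, pointer size 4). */
const struct arange comment1_table[] = {
  { 0x00000000, 0x00000000 },
  { 0x20400132, 0x00000002 },
  { 0x20400134, 0x0000003a },
  { 0x00000000, 0x00000000 },
};

/* Treat the first 0/0 pair as the end of the table. */
size_t count_until_first_zero_pair(const struct arange *t, size_t n)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++) {
    if (t[i].addr == 0 && t[i].len == 0)
      break;
    count++;
  }
  return count;
}

/* Trust the table length instead: visit all n tuples, skipping 0/0 pairs. */
size_t count_skipping_zero_pairs(const struct arange *t, size_t n)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++) {
    if (t[i].addr == 0 && t[i].len == 0)
      continue;
    count++;
  }
  return count;
}
```

With this table the first function finds no usable ranges, because the leading 0/0 pair ends the walk immediately, while the second finds the two real ranges.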
Comment 2 Mark Wielaard 2021-05-05 14:36:29 UTC
(In reply to Jim Wilson from comment #1)
> Using readelf -wr to look at the debug_aranges section, I see entries like
> 
>   Length:                   44
>   Version:                  2
>   Offset into .debug_info:  0x987
>   Pointer Size:             4
>   Segment Size:             0
> 
>     Address    Length
>     00000000 00000000
>     20400132 00000002
>     20400134 0000003a
>     00000000 00000000
> 
> An address/length entry of 0/0 is supposed to mark the end of the list, but
> here we have one at the beginning.  This is confusing elfutils which is
> trying to move byte by byte through the aranges section.
> libdw/dwarf_getaranges.c has
>           /* Two zero values mark the end.  */
>           if (range_address == 0 && range_length == 0)
>             break;
> and then assumes that the next entry is immediately following, which it
> isn't, and it ends up reading garbage.  binutils seems to be using the
> length field to find the last entry.  And readelf is ignoring the 0/0 end of
> list rule so that we can see the invalid entries.
> 
> There are a lot of aranges that have 0/0 entries not at the end of the list.

Any idea where they come from?
And what does it look like when using -gdwarf-5?
You should get a .debug_rnglists section in that case which has explicit end of list markers (before DWARF5 the double zero addresses are interpreted as end of list).
Comment 3 Jim Wilson 2021-05-05 15:42:45 UTC
My first thought was linkonce/comdat, but that is used by C++ and would have shown up before.  So that leaves -gc-sections.  I can reproduce with a simple example.

rohan:2010$ cat tmp.c
extern int sub1 (int);
extern int sub2 (int);
extern int sub3 (int);
extern int sub4 (int);
int main (void) { return sub2 (sub4 (0)); }
rohan:2011$ cat tmp2.c
int sub1 (int i) {return i + 10; }
int sub2 (int i) {return i + 20; }
int sub3 (int i) {return i - 10; }
int sub4 (int i) {return i - 20; }
rohan:2012$ riscv32-unknown-elf-gcc -O2 tmp.c tmp2.c -ffunction-sections -Wl,-gc-sections -g
rohan:2013$ readelf -wr a.out
Contents of the .debug_aranges section:

  Length:                   28
  Version:                  2
  Offset into .debug_info:  0x0
  Pointer Size:             4
  Segment Size:             0

    Address    Length
    00010074 0000000e 
    00000000 00000000 
  Length:                   52
  Version:                  2
  Offset into .debug_info:  0x7c
  Pointer Size:             4
  Segment Size:             0

    Address    Length
    00000000 00000000 
    00010114 00000004 
    00000000 00000000 
    00010118 00000004 
    00000000 00000000 

rohan:2014$ 

I get the same result with an x86_64-linux compiler.  And I get the same result with -gdwarf-5.
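
As a cross-check that the Length field alone determines how many tuples each table holds, here is a small arithmetic sketch. It assumes the 32-bit DWARF format with a 12-byte header (4-byte length, 2-byte version, 4-byte .debug_info offset, 1-byte address size, 1-byte segment size), a segment size of 0, and a first tuple aligned to twice the address size measured from the start of the table. The function name is hypothetical.

```c
#include <stddef.h>

/* Number of (address, length) tuples, including any 0/0 pairs, in one
   .debug_aranges table.  unit_length is the value of the Length field,
   which does not count its own 4 bytes.  Assumes 32-bit DWARF and a
   segment size of 0.  */
size_t aranges_tuple_count(size_t unit_length, size_t address_size)
{
  size_t header = 4 + 2 + 4 + 1 + 1;   /* length, version, offset, sizes */
  size_t tuple = 2 * address_size;     /* one address plus one length */
  /* Round the header size up to a multiple of the tuple size.  */
  size_t first = (header + tuple - 1) / tuple * tuple;
  size_t total = 4 + unit_length;      /* whole table, Length field included */

  if (total < first)
    return 0;
  return (total - first) / tuple;
}
```

For the two tables above, Length 28 gives 2 tuples and Length 52 gives 5 tuples, matching the rows readelf prints, so a reader that trusts the Length field never needs the 0/0 pair to find the end.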
Comment 4 Jim Wilson 2021-05-05 15:59:24 UTC
Actually I just noticed with the x86_64-linux compiler I'm getting addresses of 0 but lengths of 4 which would be OK.

  Length:                   92
  Version:                  2
  Offset into .debug_info:  0x8c
  Pointer Size:             8
  Segment Size:             0

    Address            Length
    0000000000000000 0000000000000004 
    0000000000000620 0000000000000004 
    0000000000000000 0000000000000004 
    0000000000000630 0000000000000004 
    0000000000000000 0000000000000000 

This is an Ubuntu 18.04 gcc-7.6 toolchain.  Not clear why it is different.
Comment 5 Mark Wielaard 2021-05-06 09:43:26 UTC
The length being 4 does make some sense even if the start address is zero. A range list entry consists of a beginning address and an ending address. The ending address marks the first address past the end of the address range.

So it might be a difference in how the ending address is represented as symbol offset/relocation.
Comment 6 John Doe 2021-07-28 11:55:54 UTC
I can confirm that compiling the code without -gc-sections seems to avoid this bug, so it is probably not RISC-V specific.
Comment 7 Mark Wielaard 2023-10-06 12:10:31 UTC
Patch posted that does what comment #1 suggests (and what binutils apparently does): skip zero entries when they are not at the end of the table:

https://inbox.sourceware.org/elfutils-devel/20231006120329.340788-1-mark@klomp.org/T/#u
Comment 8 Mark Wielaard 2023-10-09 16:15:21 UTC
commit ace48815682214308d2f849f149250a6562c59fe
Author: Mark Wielaard <mark@klomp.org>
Date:   Fri Oct 6 13:56:55 2023 +0200

    libdw: Skip zero entries in aranges
    
    An address/length entry of two zeros is supposed to mark the end of a
    table. But in some cases a producer might leave zero entries in the
    table (for example when using gcc -ffunction-sections -gc-sections).
    
    Since we know the length of the table we can just skip such entries
    and continue to the end.
    
        * libdw/dwarf_getaranges.c (dwarf_getaranges): Calculate endp.
        When seeing two zero values, check we are at endp.
    
    https://sourceware.org/bugzilla/show_bug.cgi?id=27805
    
    Signed-off-by: Mark Wielaard <mark@klomp.org>
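
The idea described in the commit message can be sketched at the byte level as follows. This is an illustration only, not the actual libdw change: the function and array names are hypothetical, addresses are assumed to be 4 bytes and little-endian, and the header is assumed to have been parsed already so that readp points at the first tuple and endp at the end of the table.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value.  */
static uint32_t read_le32(const unsigned char *p)
{
  return (uint32_t) p[0] | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/* Collect up to max non-zero (address, length) tuples from [readp, endp).
   A 0/0 pair ends the walk only when it is the last tuple of the table;
   anywhere else it is skipped.  Returns the number of tuples stored.  */
size_t collect_aranges(const unsigned char *readp, const unsigned char *endp,
                       uint32_t *addrs, uint32_t *lens, size_t max)
{
  size_t n = 0;
  while (endp - readp >= 8) {
    uint32_t addr = read_le32(readp);
    uint32_t len = read_le32(readp + 4);
    readp += 8;

    if (addr == 0 && len == 0) {
      if (readp == endp)
        break;      /* genuine terminator at the end of the table */
      continue;     /* stray zero entry, e.g. left behind by -gc-sections */
    }
    if (n < max) {
      addrs[n] = addr;
      lens[n] = len;
      n++;
    }
  }
  return n;
}

/* Tuples of the second table from comment 3 (Length: 52).  */
const unsigned char comment3_tuples[] = {
  0x00, 0x00, 0x00, 0x00,  0x00, 0x00, 0x00, 0x00,
  0x14, 0x01, 0x01, 0x00,  0x04, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00,  0x00, 0x00, 0x00, 0x00,
  0x18, 0x01, 0x01, 0x00,  0x04, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00,  0x00, 0x00, 0x00, 0x00,
};
```

Called on comment3_tuples with endp set to the end of the array, this yields the two ranges 0x10114/4 and 0x10118/4 and ignores the three 0/0 pairs.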