Bug 29529 - [objdump] -l flag can't parse -gdwarf-5 file name info from clang
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: binutils
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2022-08-26 20:42 UTC by Nick Desaulniers
Modified: 2022-08-30 18:56 UTC
Attachments
clang++ -g -gdwarf-5 x.cpp (2.75 KB, application/x-sharedlib)
2022-08-26 20:42 UTC, Nick Desaulniers
Proposed Patch (1.50 KB, patch)
2022-08-30 14:57 UTC, Nick Clifton

Description Nick Desaulniers 2022-08-26 20:42:13 UTC
Created attachment 14300
clang++ -g -gdwarf-5 x.cpp

via: https://github.com/compiler-explorer/compiler-explorer/issues/3991

It looks like `objdump -dl a.out` isn't able to parse file info correctly from DWARFv5 binaries generated by Clang.

I'm not sure if Clang is perhaps generating invalid debug info, but it seems that llvm-objdump is able to parse output from either GCC or Clang.

```
$ cat x.cpp
int square(int num) {
    return num * num;
}

int main () {
    return square(3);
}
$ g++ -g -gdwarf-5 x.cpp
$ objdump -dl a.out | grep main\>: -A 13
0000000000001138 <main>:
main():
/tmp/x.cpp:5
    1138:	55                   	push   %rbp
    1139:	48 89 e5             	mov    %rsp,%rbp
/tmp/x.cpp:6
    113c:	bf 03 00 00 00       	mov    $0x3,%edi
    1141:	e8 e3 ff ff ff       	call   1129 <_Z6squarei>
    1146:	90                   	nop
/tmp/x.cpp:7
    1147:	5d                   	pop    %rbp
    1148:	c3                   	ret
    1149:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
$ llvm-objdump -dl a.out| grep main\>: -A 13
0000000000001138 <main>:
; main():
; /tmp/x.cpp:5
    1138: 55                           	pushq	%rbp
    1139: 48 89 e5                     	movq	%rsp, %rbp
; /tmp/x.cpp:6
    113c: bf 03 00 00 00               	movl	$3, %edi
    1141: e8 e3 ff ff ff               	callq	0x1129 <_Z6squarei>
    1146: 90                           	nop
; /tmp/x.cpp:7
    1147: 5d                           	popq	%rbp
    1148: c3                           	retq
    1149: 0f 1f 80 00 00 00 00         	nopl	(%rax)
$ clang++ -g -gdwarf-5 x.cpp
$ llvm-objdump -dl a.out| grep main\>: -A 13
0000000000001140 <main>:
; main():
; /tmp/x.cpp:5
    1140: 55                           	pushq	%rbp
    1141: 48 89 e5                     	movq	%rsp, %rbp
    1144: 48 83 ec 10                  	subq	$16, %rsp
    1148: c7 45 fc 00 00 00 00         	movl	$0, -4(%rbp)
; /tmp/x.cpp:6
    114f: bf 03 00 00 00               	movl	$3, %edi
    1154: e8 d7 ff ff ff               	callq	0x1130 <_Z6squarei>
    1159: 48 83 c4 10                  	addq	$16, %rsp
    115d: 5d                           	popq	%rbp
    115e: c3                           	retq
    115f: 90                           	nop
$ objdump -dl a.out | grep main\>: -A 13    
0000000000001140 <main>:
main():
<unknown>:5
    1140:	55                   	push   %rbp
    1141:	48 89 e5             	mov    %rsp,%rbp
    1144:	48 83 ec 10          	sub    $0x10,%rsp
    1148:	c7 45 fc 00 00 00 00 	movl   $0x0,-0x4(%rbp)
<unknown>:6
    114f:	bf 03 00 00 00       	mov    $0x3,%edi
    1154:	e8 d7 ff ff ff       	call   1130 <_Z6squarei>
    1159:	48 83 c4 10          	add    $0x10,%rsp
    115d:	5d                   	pop    %rbp
    115e:	c3                   	ret
    115f:	90                   	nop
```

Note the `<unknown>` entries in the last listing, where GNU objdump should have printed `/tmp/x.cpp`.

```
$ clang --version
clang version 16.0.0 (git@github.com:llvm/llvm-project.git 51a643230eadd67bf5ac18befc28d85ed4e83c81)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /android0/llvm-project/llvm/build/bin

$ g++ --version
g++ (Debian 11.3.0-5) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ objdump --version
GNU objdump (GNU Binutils for Debian) 2.38.90.20220713
Copyright (C) 2022 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.

$ llvm-objdump --version | head -n 2
LLVM (http://llvm.org/):
  LLVM version 16.0.0git
```
Comment 1 David Blaikie 2022-08-27 01:26:53 UTC
So DWARFv5 added a zeroth entry to the line table's file name list (before that, the entries started at 1).

GCC produces two entries even for a simple file - a zeroth and a first, both with the same value, and GCC always uses the first.

Clang produces one entry and uses that.

Looks like objdump is ignoring the zeroth entry and reporting it as "unknown". If you add a #line directive to the source so that Clang emits and uses a first entry (after the original zeroth entry), then objdump behaves fine.
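A minimal sketch of that #line experiment, assuming the directive goes at the top of x.cpp and the rest of the file (including main) stays as in the report. The file name `x_line.cpp` is a hypothetical placeholder, and the effect on the emitted file table has not been verified here:

```cpp
// Hypothetical file name: the directive tells the compiler to attribute the
// lines that follow to "x_line.cpp", starting again at line 1. The idea from
// the comment above is that this gives Clang a second file name to record.
#line 1 "x_line.cpp"
int square(int num) {
    return num * num;
}
```

After rebuilding with `clang++ -g -gdwarf-5`, the same `objdump -dl` command from the description can be used to check whether the `<unknown>` entries go away.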

So, yeah, handling of the zeroth entry in DWARFv5 (now that it has zeroth entries) seems to be missing somewhere in objdump.
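To make the diagnosis concrete, here is a small sketch of a reader that still treats file index 0 as "no file". It is an illustration only, not objdump's code: the `FileTable` alias, the function name, and the table contents are all hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified model of a line table's file name list.
// Element i of the vector corresponds to DWARF file index i.
using FileTable = std::vector<std::string>;

// A reader that assumes file indices start at 1 and treats index 0 as
// "no file", which was the convention before DWARF 5.
std::string lookup_one_based(const FileTable& files, std::size_t index) {
    if (index == 0 || index >= files.size()) {
        return "<unknown>";
    }
    return files[index];
}

// Tables shaped as described in this comment (contents are illustrative).
const FileTable gcc_style   = {"x.cpp", "x.cpp"};  // entries 0 and 1; rows refer to 1
const FileTable clang_style = {"x.cpp"};           // entry 0 only; rows refer to 0
```

With this model, `lookup_one_based(gcc_style, 1)` finds the file name, while `lookup_one_based(clang_style, 0)` falls into the "no file" branch, which matches the `<unknown>` lines in the report.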
Comment 2 Nick Clifton 2022-08-30 14:57:50 UTC
Created attachment 14304
Proposed Patch
Comment 3 Sourceware Commits 2022-08-30 15:02:18 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=37833b966576c5d25e797ea3b6c33d0459a71892

commit 37833b966576c5d25e797ea3b6c33d0459a71892
Author: Nick Clifton <nickc@redhat.com>
Date:   Tue Aug 30 16:01:20 2022 +0100

    BFD library: Use entry 0 in directory and filename tables of DWARF-5 debug info.
    
            PR 29529
            * dwarf2.c (struct line_info_table): Add new field:
            use_dir_and_file_0.
            (concat_filename): Use new field to help select the correct table
            slot.
            (read_formatted_entries): Do not skip entry 0.
            (decode_line_info): Set new field depending upon the version of
            DWARF being parsed.  Initialise filename based upon the setting of
            the new field.
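The idea in the commit message can be sketched roughly as follows. This is not the BFD code: only the flag name is borrowed from the commit message, the `LineTable` struct and `file_name` function are hypothetical, the flag is assumed to be set for DWARF version 5 and later, and directories are left out for brevity.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed line table; not the BFD struct.
struct LineTable {
    unsigned version;                // DWARF version of the line program
    std::vector<std::string> files;  // file names, stored from slot 0 upward

    // Assumption: entry 0 is meaningful from DWARF 5 onward.
    bool use_dir_and_file_0() const { return version >= 5; }
};

// Map a DWARF file index to a name, picking the slot based on the version.
std::string file_name(const LineTable& table, std::size_t index) {
    if (!table.use_dir_and_file_0()) {
        // Before DWARF 5, index 0 means "no file" and file 1 sits in slot 0.
        if (index == 0) {
            return "<unknown>";
        }
        --index;
    }
    if (index >= table.files.size()) {
        return "<unknown>";
    }
    return table.files[index];
}
```

Under these assumptions, a version 5 table that holds only entry 0 (the Clang case in this report) resolves index 0 to its file name, while older tables keep the 1-based behaviour.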
Comment 4 Nick Clifton 2022-08-30 15:04:09 UTC
Hi Nick,

  Thanks for reporting this problem.

  You are correct - there was an implicit assumption in the DWARF decoding
  logic in the BFD library that dir0 and file0 were unused.  I have now
  checked in a patch to correct this.

Cheers
  Nick
Comment 5 Nick Desaulniers 2022-08-30 18:56:09 UTC
Thank you Nick and David!