When researching the quality of debuginfo generated by compilers with various options, I found that stap -L returned an inordinately small number of probe points for code compiled with gcc and LTO enabled (https://github.com/wcohen/quality_info/blob/master/bin/gen_rpm_variants):

$ stap -v -L 'process("./binutils-2.31.1-29.fc30_gcc_o2_lto_g_.x86_64/usr/bin/ld").statement("*@*:*")' |wc
Pass 1: parsed user script and 575 library scripts using 1253792virt/1024272res/13080shr/1011084data kb, in 2340usr/250sys/2639real ms.
Pass 2: analyzed script: 1196 probes, 0 functions, 0 embeds, 0 globals using 1644104virt/1411320res/14128shr/1401396data kb, in 8030usr/120sys/8246real ms.
1196 16473 419224

This is compared to the ~250K for the default compile options:

$ stap -v -L 'process("./binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld").statement("*@*:*")' |wc
Pass 1: parsed user script and 575 library scripts using 1253792virt/1024476res/13280shr/1011084data kb, in 2380usr/240sys/2636real ms.
Pass 2: analyzed script: 249528 probes, 0 functions, 0 embeds, 0 globals using 1919176virt/1689856res/14280shr/1676468data kb, in 112090usr/11990sys/124827real ms.
249528 3799250 94003914

There does seem to be a significant amount of line info in the LTO version:

$ eu-readelf -S binutils-2.31.1-29.fc30_gcc_o2_lto_g_.x86_64/usr/bin/.debug/ld-2.31.1-29.fc30_gcc_o2_lto_g_.x86_64.debug
There are 39 section headers, starting at offset 0x3ba54a8:
...
[31] .debug_line PROGBITS 0000000000000000 02130bd7 001ab9ee 0 0 0 1

When compared to the non-LTO version:

$ eu-readelf -S binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/.debug/ld-2.31.1-29.fc30_gcc_o2__g_.x86_64.debug
There are 42 section headers, starting at offset 0x34b57f0:
...
[33] .debug_line PROGBITS 0000000000000000 01a50f48 0025a74e 0 0 0 1

readelf can decode the line information and shows there are a lot of lines available in both versions:

[wcohen@cervelo BUILDROOT]$ readelf -wLK ./binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld|wc
801150 1923252 41456606
[wcohen@cervelo BUILDROOT]$ readelf -wLK ./binutils-2.31.1-29.fc30_gcc_o2_lto_g_.x86_64/usr/bin/ld|wc
566402 1363476 29302199

gdb allows setting a breakpoint on main.cc:139 on both the LTO and non-LTO versions. However, systemtap does not see any of the lines for main in the LTO version:

[wcohen@cervelo BUILDROOT]$ stap -l 'process("./binutils-2.31.1-29.fc30_gcc_o2_lto_g_.x86_64/usr/bin/ld").statement("main@*:*")'|wc
0 0 0
[wcohen@cervelo BUILDROOT]$ stap -l 'process("./binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld").statement("main@*:*")'|wc
97 97 18236

How to reproduce:

wget https://kojipkgs.fedoraproject.org//packages/binutils/2.31.1/29.fc30/src/binutils-2.31.1-29.fc30.src.rpm
sudo yum builddep binutils-2.31.1-29.fc30.src.rpm -y
rpm -Uvh binutils-2.31.1-29.fc30.src.rpm
cd ~/rpmbuild/SPECS
wget https://raw.githubusercontent.com/wcohen/quality_info/systemtap_lto/bin/gen_rpm_variants
./gen_rpm_variants binutils.spec /usr/bin/ld

Wait a while for the variants to be built. Once done, the variants should be in ~/rpmbuild/BUILDROOT for further investigation of the issue.
Created attachment 12499 [details]
Much smaller reproducer example

This very small example demonstrates the problem. When -flto is enabled, no probeable lines are listed. Remove -flto and systemtap is able to list the two lines of executable code:

$ g++ -o pr25549 -g -O2 -flto pr25549.cxx
$ ~/research/profiling/systemtap_write/install/bin/stap -L 'process("./pr25549").statement("main@*:*")'
$ g++ -o pr25549 -g -O2 pr25549.cxx
$ ~/research/profiling/systemtap_write/install/bin/stap -L 'process("./pr25549").statement("main@*:*")'
process("/home/wcohen/rpmbuild/BUILDROOT/pr25549").statement("main@/home/wcohen/rpmbuild/BUILDROOT/pr25549.cxx:4") $argc:int $argv:char**
process("/home/wcohen/rpmbuild/BUILDROOT/pr25549").statement("main@/home/wcohen/rpmbuild/BUILDROOT/pr25549.cxx:6") $argc:int $argv:char**
The eu-readelf --debug-dump=info output of the non-LTO and LTO versions of the reproducer shows one difference that may be tripping up systemtap. On the non-LTO one:

[ 2f1] subprogram abbrev: 16
 external (flag_present) yes
 name (strp) "main"
 decl_file (data1) pr25549.cxx (1)
 decl_line (data1) 3
 decl_column (data1) 5
 type (ref4) [ 61]
 low_pc (addr) 0x0000000000401040 <main>
 high_pc (data8) 21 (0x0000000000401055 <.annobin_static_reloc.c_end.hot>)
 frame_base (exprloc)
 [ 0] call_frame_cfa
 GNU_all_call_sites (flag_present) yes
 sibling (ref4) [ 357]

On the LTO one there is no low_pc or high_pc:

[ 36e] subprogram abbrev: 16
 external (flag_present) yes
 name (strp) "main"
 decl_file (data1) pr25549.cxx (7)
 decl_line (data1) 3
 decl_column (data1) 5
 type (ref4) [ de]
 sibling (ref4) [ 397]

Comparing the execution of the working non-LTO and the non-working LTO versions, it looks like the following line at tapset.cxx:2403 returns nothing to iterate over, so no lines are reported for the LTO version of the code:

auto bfis = q->filtered_all();
Systemtap's logic to determine whether a function should be added to the list of filtered_functions in a query is getting confused by where the information about the function entry address is placed for the LTO binaries. query_dwarf_func (Dwarf_Die * func, dwarf_query * q) seems to miss the fact that the function entry address is stored in another DIE and never pushes the function onto the list. Here are the DIEs extracted from the reproducer by dwgrep:

$ dwgrep pr25549.lto -e 'entry ?TAG_subprogram'
[36e] subprogram
 external true
 name "main"
 decl_file "/home/wcohen/rpmbuild/BUILDROOT/pr25549.cxx"
 decl_line 3
 decl_column 5
 type [de] base_type
 sibling [397] pointer_type
[2e] subprogram
 abstract_origin [36e] subprogram
 low_pc 0x401040
 high_pc 21
 frame_base 0..0xffffffffffffffff:0 call_frame_cfa
 GNU_all_call_sites true
 sibling [7f] subprogram
It appears the root cause is a mismatch between two parts of systemtap: a srcfile enumeration pass (using dwarf_getsrcfiles() in dwflpp::collect_srcfiles_matching), and the filtering of dwarf_decl_file() results against that list. In the LTO case, the quasi-inlined copy of main() has no decl_* parts at all, since those are in the abstract_origin DIE. For some reason, dwarf_decl_file(func) in query_dwarf_func() returns 0, which will not match any of the elements in filtered_srcfiles[].

I'd expect it to return a legit value because it seems to have a dwarf_attr_integrate call inside. Maybe it's an issue because this is a cross-CU abstract_origin reference?
I cannot replicate the issue. Using g++ (GCC) 10.0.1 20200430 (Red Hat 10.0.1-0.13) the example from Comment #1 seems to work as intended, even when compiled with LTO:

$ g++ -o pr25549 -g -O2 -flto pr25549.cxx
$ stap -L 'process("./pr25549").statement("main@*:*")'
process("/opt/home/mark/systemtap/pr25549").statement("main@/home/mark/systemtap/pr25549.cxx:4") $argc:int $argv:char**
process("/opt/home/mark/systemtap/pr25549").statement("main@/home/mark/systemtap/pr25549.cxx:6") $argc:int $argv:char**

Could someone attach a pr25549 binary which fails?
Created attachment 12512 [details] fedora 31 binaries
Thanks, I can replicate it with those binaries. And the problem is precisely where you thought it was. This is a cross-CU abstract_origin reference and dwarf_decl_file was using the line table of the original DIE, not of the abstract DIE (attribute) that was eventually resolved.

The following elfutils patch fixes it:

diff --git a/libdw/dwarf_decl_file.c b/libdw/dwarf_decl_file.c
index 5657132f..d4aa0a18 100644
--- a/libdw/dwarf_decl_file.c
+++ b/libdw/dwarf_decl_file.c
@@ -55,7 +55,7 @@ dwarf_decl_file (Dwarf_Die *die)
     }
 
   /* Get the array of source files for the CU.  */
-  struct Dwarf_CU *cu = die->cu;
+  struct Dwarf_CU *cu = attr_mem.cu;
   if (cu->lines == NULL)
     {
       Dwarf_Lines *lines;

I'll audit the other code that uses dwarf_attr_integrate, to see if this is a more common mistake.
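To make the one-line patch above easier to follow, here is a toy model in Python of the lookup it changes. This is not libdw code: the CU and DIE classes, attr_integrate(), and the contents of both file tables are hypothetical stand-ins. Only the decl_file index 7 and the pr25549.cxx name come from the dumps earlier in this bug. The point is that a decl_file index is only meaningful against the file table of the CU that holds the attribute.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CU:
    """Toy compile unit: just a name and a file table (index 0 unused)."""
    name: str
    files: list


@dataclass
class DIE:
    """Toy DIE: owning CU, its own attributes, optional abstract_origin."""
    cu: CU
    attrs: dict = field(default_factory=dict)
    abstract_origin: Optional["DIE"] = None


def attr_integrate(die, name):
    """Follow abstract_origin links until `name` is found.

    Returns (value, cu_holding_the_attribute), or (None, None).
    """
    while die is not None:
        if name in die.attrs:
            return die.attrs[name], die.cu
        die = die.abstract_origin
    return None, None


def decl_file_buggy(die):
    """Pre-patch behaviour: index into the file table of the starting DIE's CU."""
    idx, _ = attr_integrate(die, "decl_file")
    if idx is None or idx >= len(die.cu.files):
        return None
    return die.cu.files[idx]


def decl_file_fixed(die):
    """Post-patch behaviour: index into the file table of the attribute's CU."""
    idx, cu = attr_integrate(die, "decl_file")
    if idx is None or idx >= len(cu.files):
        return None
    return cu.files[idx]


# Hypothetical file tables; only index 7 -> pr25549.cxx mirrors the dump.
abstract_cu = CU("abstract", [None] + [f"hdr{i}.h" for i in range(1, 7)] + ["pr25549.cxx"])
concrete_cu = CU("concrete", [None, "pr25549.cxx"])

abstract_main = DIE(abstract_cu, {"name": "main", "decl_file": 7})
concrete_main = DIE(concrete_cu, {"low_pc": 0x401040}, abstract_origin=abstract_main)

print(decl_file_buggy(concrete_main))
print(decl_file_fixed(concrete_main))
```

In this model the buggy lookup finds nothing usable for the concrete DIE, which is the situation where query_dwarf_func() could not match the function against filtered_srcfiles[].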
I locally built an elfutils-0.179-2.fc30.src.rpm with the proposed patch and installed the resulting rpms. The patch does make things work better. For the simple reproducer:

[wcohen@cervelo BUILDROOT]$ stap -L 'process("./pr25549").statement("main@*:*")'
process("/home/wcohen/rpmbuild/BUILDROOT/pr25549").statement("main@/home/wcohen/rpmbuild/BUILDROOT/pr25549.cxx:4") $argc:int $argv:char**
process("/home/wcohen/rpmbuild/BUILDROOT/pr25549").statement("main@/home/wcohen/rpmbuild/BUILDROOT/pr25549.cxx:6") $argc:int $argv:char**
[wcohen@cervelo BUILDROOT]$ stap -L 'process("./pr25549.lto").statement("main@*:*")'
process("/home/wcohen/rpmbuild/BUILDROOT/pr25549.lto").statement("main@/home/wcohen/rpmbuild/BUILDROOT/pr25549.cxx:4") $argc:int $argv:char**
process("/home/wcohen/rpmbuild/BUILDROOT/pr25549.lto").statement("main@/home/wcohen/rpmbuild/BUILDROOT/pr25549.cxx:6") $argc:int $argv:char**

For the original example that caused the PR to be filed there are many more probes (21694 vs 1196) for the LTO version, but still an order of magnitude fewer than for the non-LTO version (255842):

[wcohen@cervelo BUILDROOT]$ stap -v -L 'process("./binutils-2.31.1-29.fc30_gcc_o2_lto_g_.x86_64/usr/bin/ld").statement("*@*:*")' |wc
Pass 1: parsed user script and 577 library scripts using 1279164virt/1049920res/13264shr/1036408data kb, in 2230usr/260sys/2499real ms.
Pass 2: analyzed script: 21694 probes, 0 functions, 0 embeds, 0 globals using 2078028virt/1843376res/14204shr/1835272data kb, in 11060usr/270sys/11552real ms.
21694 477333 10389959
[wcohen@cervelo BUILDROOT]$ stap -v -L 'process("./binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld").statement("*@*:*")' |wc
Pass 1: parsed user script and 577 library scripts using 1279164virt/1049904res/13252shr/1036408data kb, in 2170usr/270sys/2459real ms.
Pass 2: analyzed script: 255842 probes, 0 functions, 0 embeds, 0 globals using 1958204virt/1729244res/14192shr/1715448data kb, in 90860usr/10320sys/101800real ms.
255842 3891737 96145934

It would be expected that the LTO version would be optimized so there are fewer probe points, but a 90% reduction in probe points seems unlikely given the relative sizes of the binaries:

[wcohen@cervelo BUILDROOT]$ ls -l ./binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld
-rwxr-xr-x. 2 wcohen wcohen 3941264 Oct 14 2019 ./binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld
[wcohen@cervelo BUILDROOT]$ ls -l ./binutils-2.31.1-29.fc30_gcc_o2_lto_g_.x86_64/usr/bin/ld
-rwxr-xr-x. 2 wcohen wcohen 2215376 Oct 14 2019 ./binutils-2.31.1-29.fc30_gcc_o2_lto_g_.x86_64/usr/bin/ld
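For reference, the arithmetic behind the comparison above can be spelled out with a few lines of Python. The numbers are copied from the stap and ls output in this comment; the helper name reduction() is just an illustrative choice.

```python
def reduction(before, after):
    """Fractional reduction when going from `before` to `after`."""
    return 1.0 - after / before


# Probe counts reported by "stap -v -L ... |wc" with the patched elfutils.
probes_base, probes_lto = 255842, 21694

# File sizes in bytes reported by "ls -l" for the two ld binaries.
size_base, size_lto = 3941264, 2215376

print(f"probe points: {reduction(probes_base, probes_lto):.1%} fewer with LTO")
print(f"file size:    {reduction(size_base, size_lto):.1%} smaller with LTO")
```

The probe count drops by roughly 91% while the file only shrinks by roughly 44%, which is why the probe-point loss looks out of proportion.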
It looks like llvm generated debuginfo doesn't have the abstract_origin, so even using the old prepatched elfutils systemtap is able to find the lines for clang -flto code. Things still work fine with the patched elfutils. Compiling the example with clang and running systemtap with the patched elfutils the results look sane: [wcohen@cervelo BUILDROOT]$ rpm -q clang clang-8.0.0-3.fc30.x86_64 [wcohen@cervelo BUILDROOT]$ clang -o pr25549_llvm -g -O2 pr25549.cxx [wcohen@cervelo BUILDROOT]$ stap -v -L 'process("./pr25549_llvm").statement("*@*:*")' Pass 1: parsed user script and 577 library scripts using 1279160virt/1049792res/13144shr/1036404data kb, in 2320usr/260sys/2582real ms. process("/home/wcohen/rpmbuild/BUILDROOT/pr25549_llvm").statement("main@/home/wcohen/rpmbuild/BUILDROOT/pr25549.cxx:4") /* pc=.absolute+0x1130 */ $argc:int $argv:char** process("/home/wcohen/rpmbuild/BUILDROOT/pr25549_llvm").statement("main@/home/wcohen/rpmbuild/BUILDROOT/pr25549.cxx:5") /* pc=.absolute+0x1131 */ $argc:int $argv:char** process("/home/wcohen/rpmbuild/BUILDROOT/pr25549_llvm").statement("main@/home/wcohen/rpmbuild/BUILDROOT/pr25549.cxx:6") /* pc=.absolute+0x113b */ Pass 2: analyzed script: 3 probes, 0 functions, 0 embeds, 0 globals using 1337580virt/1109364res/14236shr/1094824data kb, in 300usr/10sys/320real ms. [wcohen@cervelo BUILDROOT]$ clang -o pr25549_llvm_lto -g -O2 -flto pr25549.cxx [wcohen@cervelo BUILDROOT]$ stap -v -L 'process("./pr25549_llvm_lto").statement("*@*:*")' Pass 1: parsed user script and 577 library scripts using 1279164virt/1050012res/13360shr/1036408data kb, in 2280usr/290sys/2568real ms. 
process("/home/wcohen/rpmbuild/BUILDROOT/pr25549_llvm_lto").statement("main@/home/wcohen/rpmbuild/BUILDROOT/pr25549.cxx:4") /* pc=.absolute+0x1130 */ $argc:int $argv:char** process("/home/wcohen/rpmbuild/BUILDROOT/pr25549_llvm_lto").statement("main@/home/wcohen/rpmbuild/BUILDROOT/pr25549.cxx:5") /* pc=.absolute+0x1131 */ $argc:int $argv:char** process("/home/wcohen/rpmbuild/BUILDROOT/pr25549_llvm_lto").statement("main@/home/wcohen/rpmbuild/BUILDROOT/pr25549.cxx:6") /* pc=.absolute+0x113b */ Pass 2: analyzed script: 3 probes, 0 functions, 0 embeds, 0 globals using 1337584virt/1109572res/14440shr/1094828data kb, in 310usr/10sys/316re
(In reply to William Cohen from comment #8) > For the original example that caused the PR to be filed there are many more > probe (21694 vs 1196) for the lto version, but still an order of magnitude > less than the non-lto version (255842): > > [wcohen@cervelo BUILDROOT]$ stap -v -L > 'process("./binutils-2.31.1-29.fc30_gcc_o2_lto_g_.x86_64/usr/bin/ld"). > statement("*@*:*")' |wc > Pass 1: parsed user script and 577 library scripts using > 1279164virt/1049920res/13264shr/1036408data kb, in 2230usr/260sys/2499real > ms. > Pass 2: analyzed script: 21694 probes, 0 functions, 0 embeds, 0 globals > using 2078028virt/1843376res/14204shr/1835272data kb, in > 11060usr/270sys/11552real ms. > 21694 477333 10389959 > [wcohen@cervelo BUILDROOT]$ stap -v -L > 'process("./binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld"). > statement("*@*:*")' |wc > Pass 1: parsed user script and 577 library scripts using > 1279164virt/1049904res/13252shr/1036408data kb, in 2170usr/270sys/2459real > ms. > Pass 2: analyzed script: 255842 probes, 0 functions, 0 embeds, 0 globals > using 1958204virt/1729244res/14192shr/1715448data kb, in > 90860usr/10320sys/101800real ms. > 255842 3891737 96145934 Are there specific probe points that you are missing? > It would be expected that the lto version would be optimized so there are > fewer probe points, but 90% reduction in probe points seems unlikely given > the relative sizes of the binaries: > > [wcohen@cervelo BUILDROOT]$ ls -l > ./binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld > -rwxr-xr-x. 2 wcohen wcohen 3941264 Oct 14 2019 > ./binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld > [wcohen@cervelo BUILDROOT]$ ls -l > ./binutils-2.31.1-29.fc30_gcc_o2_lto_g_.x86_64/usr/bin/ld > -rwxr-xr-x. 2 wcohen wcohen 2215376 Oct 14 2019 > ./binutils-2.31.1-29.fc30_gcc_o2_lto_g_.x86_64/usr/bin/ld Are these binaries available somewhere for inspection?
I have put the f30 ld binary and associated debuginfo file in

http://people.redhat.com/wcohen/pr25549/ld/

The ld built with LTO and its debuginfo file are in:

http://people.redhat.com/wcohen/pr25549/ld.lto/

With LTO I would expect that some functions and statements get eliminated if they are unused. However, a 90% reduction of the statement probe points seems unreasonably good considering that the binary is only 50% smaller and the text sections are not that much smaller:

[wcohen@cervelo BUILDROOT]$ eu-readelf -S binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld |grep text
[15] .text PROGBITS 0000000000037410 00037410 0018fd45 0 AX 0 0 16
[wcohen@cervelo BUILDROOT]$ eu-readelf -S binutils-2.31.1-29.fc30_gcc_o2_lto_g_.x86_64/usr/bin/ld |grep text
[13] .text PROGBITS 00000000000329a0 000329a0 00156535 0 AX 0 0 16

As far as probe points go, there are probe points missing for entire files (ignore the '-./' at the beginning: the '-' is from the diff, and the "./" is a substitution for the original full path, made so that results from different build directories are easier to compare):

-./adler32.c
-./aligned_buffer.h
-./allocator.h
-./alloc_traits.h
-./argv.c
-./basic_ios.h
-./binary.h
-./char_traits.h
-./compress.c
-./copy-relocs.h
-./cp-demangle.c
-./cp-demint.c
-./cplus-dem.c
-./crc32.c
-./d-demangle.c
-./debug.h
-./deflate.c
-./defstd.cc
-./../elfcpp/elfcpp_file.h
-./../elfcpp/elfcpp.h
-./../elfcpp/elfcpp_swap.h
-./errors.h
-./fcntl2.h
-./filename_cmp.c
-./fileread.h
-./freebsd.h
-./fstream
-./functional_hash.h
-./gc.cc
-./hex.c
-./inffast.c
-./inflate.c
-./inftrees.c
-./ios_base.h
-./istream
-./lbasename.c
-./list.tcc
-./locale_facets.h
-./lrealpath.c
-./md5.c
-./move.h
-./new_allocator.h
-./predefined_ops.h
-./reloc.h
-./reloc-types.h
-./rust-demangle.c
-./script-sections.h
-./sha1.c
-./stat.h
-./stdio2.h
-./stdlib.h
-./stl_construct.h
-./stl_function.h
-./stl_iterator_base_funcs.h
-./stl_set.h
-./stl_tempbuf.h
-./stl_uninitialized.h
-./streambuf
-./string_fortified.h
-./string.h
-./stringpool.h
-./tls.h
-./trees.c
-./unistd.h
-./unlink-if-ordinary.c
-./utility
-./xexit.c
-./xmalloc.c
-./xmemdup.c
-./xstrdup.c
-./zutil.c
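A list like the one above can be produced by collecting the source file of every statement probe in each "stap -L" listing and taking the set difference. The Python below is only a sketch of that idea, not the exact commands that were used: the function names, the prefix-stripping scheme, and the sample paths are assumptions.

```python
import re

# Matches .statement("func@/path/to/file:LINE") in "stap -L" output.
STATEMENT_RE = re.compile(r'\.statement\("[^@"]*@([^":]+):\d+"\)')


def source_files(stap_output, strip_prefix=""):
    """Return the set of source files named in statement probe points.

    If strip_prefix is given, a leading build-directory prefix is replaced
    by "./" so that listings from different build trees are comparable.
    """
    files = set()
    for line in stap_output.splitlines():
        match = STATEMENT_RE.search(line)
        if not match:
            continue
        path = match.group(1)
        if strip_prefix and path.startswith(strip_prefix):
            path = "./" + path[len(strip_prefix):].lstrip("/")
        files.add(path)
    return files


def missing_files(base_output, lto_output, base_prefix="", lto_prefix=""):
    """Files with probe points in the base build but none in the LTO build."""
    base = source_files(base_output, base_prefix)
    lto = source_files(lto_output, lto_prefix)
    return sorted(base - lto)


if __name__ == "__main__":
    # Hypothetical two-line listings standing in for the real stap output.
    base = (
        'process("/b/ld").statement("main@/src/base/gold/main.cc:4")\n'
        'process("/b/ld").statement("adler32@/src/base/zlib/adler32.c:10")\n'
    )
    lto = 'process("/l/ld").statement("main@/src/lto/gold/main.cc:4")\n'
    for name in missing_files(base, lto, "/src/base", "/src/lto"):
        print("-" + name)
```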
(In reply to William Cohen from comment #11) > I have put the f30 ld binary and associated debuginfo file in > > http://people.redhat.com/wcohen/pr25549/ld/ > > The ld built with lto and its debuginfo file is in: > > http://people.redhat.com/wcohen/pr25549/ld.lto/ Thanks, but do you also have a build before debug splitting by rpmbuild? Asking because I just saw an issue reported against debugedit with LTO builds: https://github.com/rpm-software-management/rpm/issues/1207 Or do you have at least the build.log so we can see if there are any issues reported during the build? BTW. ld-2.31.1-29.fc30_gcc_o2__g_.x86_64.debug seems to use an alt (multi dwz) file (not included in the downloads), but ld-2.31.1-29.fc30_gcc_o2_lto_g_.x86_64.debug doesn't. Did the lto version not get through dwz or was there some other issue that prevented the alt file from being created? Also, is it possible to rebuild with GCC10, older gcc seem to have some lto/debuginfo quirks. > As far as probe points there are probe points missing for entire files > (ignore the '-./' at the beginning the - is from the diff and the "./" was > substitution to make the remove to original full path to make comparing > things from build directories more comparable): > > -./adler32.c > -./aligned_buffer.h > -./allocator.h > -./alloc_traits.h > -./argv.c > -./basic_ios.h > -./binary.h > -./char_traits.h > -./compress.c > -./copy-relocs.h > -./cp-demangle.c > -./cp-demint.c > -./cplus-dem.c > -./crc32.c > -./d-demangle.c > -./debug.h > -./deflate.c > -./defstd.cc > -./../elfcpp/elfcpp_file.h > -./../elfcpp/elfcpp.h > -./../elfcpp/elfcpp_swap.h > -./errors.h > -./fcntl2.h > -./filename_cmp.c > -./fileread.h > -./freebsd.h > -./fstream > -./functional_hash.h > -./gc.cc > -./hex.c > -./inffast.c > -./inflate.c > -./inftrees.c > -./ios_base.h > -./istream > -./lbasename.c > -./list.tcc > -./locale_facets.h > -./lrealpath.c > -./md5.c > -./move.h > -./new_allocator.h > -./predefined_ops.h > -./reloc.h > 
-./reloc-types.h > -./rust-demangle.c > -./script-sections.h > -./sha1.c > -./stat.h > -./stdio2.h > -./stdlib.h > -./stl_construct.h > -./stl_function.h > -./stl_iterator_base_funcs.h > -./stl_set.h > -./stl_tempbuf.h > -./stl_uninitialized.h > -./streambuf > -./string_fortified.h > -./string.h > -./stringpool.h > -./tls.h > -./trees.c > -./unistd.h > -./unlink-if-ordinary.c > -./utility > -./xexit.c > -./xmalloc.c > -./xmemdup.c > -./xstrdup.c > -./zutil.c All these seem to come from "outside" the binutils gold build dir. It might be that there is some confusion in the DWARF about the combined code. These files have most likely been built inside different working dirs (comp_dirs). So it might be that with lto/debuginfo merging they come out against the wrong (relative) build dir?
It took a bit longer to build similar binaries. binutils-2.31 doesn't compile on f32, so I used binutils-2.34-2.fc32.src.rpm on f32 with gcc-10.0.1-0.14.fc32.x86_64. I built the regular default binutils with:

rpmbuild -ba binutils.spec

The LTO-enabled one was built with:

rpmbuild --define "%optflags -flto -ffat-lto-objects -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection" -bi binutils.spec

The ld binary is ld-new in the build. The binaries have been placed in:

http://people.redhat.com/wcohen/pr25549/binutils-2.34.base/gold/ld-new
http://people.redhat.com/wcohen/pr25549/binutils-2.34.lto/gold/ld-new

The results for these binaries are similar to the earlier results, with the patched elfutils being used:

[wcohen@fedora32 BUILD]$ stap -v -L 'process("./binutils-2.34.lto/gold/ld-new").statement("*@*:*")' |wc
Pass 1: parsed user script and 502 library scripts using 442596virt/212504res/13092shr/199240data kb, in 470usr/110sys/582real ms.
Pass 2: analyzed script: 19429 probes, 0 functions, 0 embeds, 0 globals using 540720virt/308324res/14024shr/297364data kb, in 6690usr/180sys/7011real ms.
19429 454247 8913840
[wcohen@fedora32 BUILD]$ stap -v -L 'process("./binutils-2.34.base/gold/ld-new").statement("*@*:*")' |wc
Pass 1: parsed user script and 502 library scripts using 442596virt/212048res/12636shr/199240data kb, in 470usr/60sys/532real ms.
Pass 2: analyzed script: 257942 probes, 0 functions, 0 embeds, 0 globals using 1068956virt/838748res/13376shr/825600data kb, in 90030usr/10680sys/103013real ms.
257942 3926369 88009445

Part of the issue with the files might be due to the file names being mis-attributed in systemtap. I saw that in a number of places with the binutils-2.31 I was analyzing earlier.
When comparing probe points for main.cc I noticed that there were a number of lines way past the end of the 332-line main.cc file. The "-v -L" option includes a PC location, so I was able to find various references to the same instruction:

$ grep 3e9ec a
process("/home/wcohen/rpmbuild/BUILDROOT/binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld").statement("main@/usr/src/debug/binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/gold/main.cc:1445") /* pc=.dynamic+0x3e9ec */ $args:string $errors:class Errors $command_line:class Command_line $timer:class Timer $mapfile:class Mapfile* $workqueue:class Workqueue $input_objects:class Input_objects $gc:class Garbage_collection $icf:class Icf $symtab:class Symbol_table $layout:class Layout $search_path:class Dirsearch
process("/home/wcohen/rpmbuild/BUILDROOT/binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld").statement("main@/usr/src/debug/binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/gold/main.cc:223") /* pc=.dynamic+0x3e9ec */ $args:string $errors:class Errors $command_line:class Command_line $timer:class Timer $mapfile:class Mapfile* $workqueue:class Workqueue $input_objects:class Input_objects $gc:class Garbage_collection $icf:class Icf $symtab:class Symbol_table $layout:class Layout $search_path:class Dirsearch
process("/home/wcohen/rpmbuild/BUILDROOT/binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld").statement("set_gc@/usr/src/debug/binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/gold/symtab.h:1445") /* pc=.dynamic+0x3e9ec */ $this:class Symbol_table* const $gc:class Garbage_collection*
process("/home/wcohen/rpmbuild/BUILDROOT/binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/usr/bin/ld").statement("set_gc@/usr/src/debug/binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/gold/symtab.h:223") /* pc=.dynamic+0x3e9ec */ $this:class Symbol_table* const $gc:class Garbage_collection*

Looking through the readelf --debug-dump=decodedline output, it appears that readelf gets something reasonable. Comparing the readelf output with the above, it looks like systemtap uses the first file name with the view from the last line number:

./main.cc:[++]
main.cc 222 0x3e9e3 3
main.cc 223 0x3e9ec
./symtab.h:[++]
symtab.h 1444 0x3e9ec 1
symtab.h 1445 0x3e9ec 2
symtab.h 1445 0x3e9ec 3

eu-readelf also looks reasonable:

/usr/src/debug/binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/gold/main.cc (mtime: 0, length: 0)
 222:3 0 0 0 +0x000000000003e9e3 <main+0x4a3>
 223:5 S 0 0 0 +0x000000000003e9ec <main+0x4ac>
/usr/src/debug/binutils-2.31.1-29.fc30_gcc_o2__g_.x86_64/gold/symtab.h (mtime: 0, length: 0)
 1444:3 S 0 0 0 +0x000000000003e9ec <main+0x4ac>
 1445:5 S 0 0 0 +0x000000000003e9ec <main+0x4ac>
 1445:15 0 0 0 +0x000000000003e9ec <main+0x4ac>
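The grep above can be generalized: group every statement probe in the "stap -v -L" listing by its pc= comment and flag entries whose line number is past the end of the named file. The Python below is a rough sketch of that check. The function names and the file_lengths table are hypothetical; the only length taken from this comment is the 332 lines of main.cc, and the sample paths are shortened stand-ins.

```python
import re
from collections import defaultdict

# Matches: .statement("func@/path/file:LINE") /* pc=SECTION+0xOFFSET */
PROBE_RE = re.compile(
    r'\.statement\("([^@"]*)@([^":]+):(\d+)"\)\s*/\* pc=(\S+) \*/'
)


def group_by_pc(stap_output):
    """Map each pc string to the set of (function, file basename, line) seen there."""
    groups = defaultdict(set)
    for line in stap_output.splitlines():
        match = PROBE_RE.search(line)
        if match:
            func, path, lineno, pc = match.groups()
            groups[pc].add((func, path.rsplit("/", 1)[-1], int(lineno)))
    return groups


def past_end_of_file(groups, file_lengths):
    """Return (pc, function, file, line) entries whose line exceeds the file length.

    Files missing from file_lengths are skipped rather than guessed at.
    """
    bad = []
    for pc, entries in groups.items():
        for func, fname, lineno in entries:
            length = file_lengths.get(fname)
            if length is not None and lineno > length:
                bad.append((pc, func, fname, lineno))
    return sorted(bad)


if __name__ == "__main__":
    # Shortened stand-ins for the four lines found by "grep 3e9ec a".
    listing = "\n".join([
        'process("ld").statement("main@/src/gold/main.cc:1445") /* pc=.dynamic+0x3e9ec */',
        'process("ld").statement("main@/src/gold/main.cc:223") /* pc=.dynamic+0x3e9ec */',
        'process("ld").statement("set_gc@/src/gold/symtab.h:1445") /* pc=.dynamic+0x3e9ec */',
        'process("ld").statement("set_gc@/src/gold/symtab.h:223") /* pc=.dynamic+0x3e9ec */',
    ])
    for entry in past_end_of_file(group_by_pc(listing), {"main.cc": 332}):
        print(entry)
```

On the sample lines this flags the main.cc:1445 entry; telling whether symtab.h:223 is also a mis-pairing would need the decoded line table, which this sketch does not read.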
commit 143974310e8cf9a05580aee9aed6cee9db22900d seems to help