Bug 25145 - stap -L 'process("<binary>").statement("*@*:*")' very slow on aarch64 and power64le machines
Status: RESOLVED WORKSFORME
Alias: None
Product: systemtap
Classification: Unclassified
Component: translator
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Unassigned
Reported: 2019-10-29 15:36 UTC by William Cohen
Modified: 2024-04-11 18:04 UTC
Description William Cohen 2019-10-29 15:36:39 UTC
While doing some experiments to see which variables were visible at various statement lines in a program, I found that the following command was MUCH slower on aarch64 and ppc64le than on x86_64.  On fc30 x86_64 the command took less than 2 minutes to run:

$  time stap -L 'process("./usr/bin/ld").statement("*@*:*")'|wc
  71548 1016514 27678760

real	1m20.291s
user	1m11.253s
sys	0m8.840s


On fc30 ppc64le, with binaries generated with the same options, it takes almost 20 minutes:

#  time stap -L 'process("./usr/bin/ld").statement("*@*:*")'|wc
  77651 1170757 30577233

real	19m13.698s
user	19m9.413s
sys	0m4.103s



I did a "perf record -a" and "perf report" for a portion of the ppc64le run to see where the time was being spent:

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 39K of event 'cycles'
# Event count (approx.): 28867284968
#
# Children      Self  Command          Shared Object             Symbol
# ........  ........  ...............  ........................  ......
#
    98.71%     0.00%  stap             stap                      [.] query_module
            |
            ---query_module
               dwarf_query::handle_query_module
               dwarf_query::query_module_dwarf
               dwflpp::iterate_over_cus<void>
               query_cu
               dwflpp::iterate_over_srcfile_lines<void>
               |          
               |--96.60%--query_srcfile_line
               |          |          
               |          |--67.84%--query_statement
               |          |          |          
               |          |           --67.84%--dwarf_query::add_probe_point
               |          |                     |          
               |          |                     |--65.52%--dwarf_query::add_probe_point
               |          |                     |          |          
               |          |                     |           --63.80%--dwfl_module_getsym_info
               |          |                     |                     |          
               |          |                     |                     |--11.73%--0x7fff91a029ab
               |          |                     |                     |          |          
               |          |                     |                     |          |--10.02%--gelf_getsymshndx
               |          |                     |                     |          |          
               |          |                     |                     |          |--0.86%--0x7fff919c9d00
               |          |                     |                     |          |          
               |          |                     |                     |           --0.84%--0x7fff919c9d0c
               |          |                     |                     |          
               |          |                     |                     |--11.67%--0x7fff91a02a0b
               |          |                     |                     |          |          
               |          |                     |                     |          |--8.89%--gelf_getshdr
               |          |                     |                     |          |          
               |          |                     |                     |          |--2.02%--0x7fff919c9ac0
               |          |                     |                     |          |          
               |          |                     |                     |           --0.75%--0x7fff919c9acc
               |          |                     |                     |          
               |          |                     |                     |--5.80%--0x7fff91a029ff
               |          |                     |                     |          |          
               |          |                     |                     |           --5.79%--elf_getscn
               |          |                     |                     |          
               |          |                     |                     |--4.33%--0x7fff91a02f4b
               |          |                     |                     |          |          
               |          |                     |                     |          |--3.17%--0x7fff91a175a3
               |          |                     |                     |          |          |          
               |          |                     |                     |          |          |--1.08%--0x7fff8fe93ae0
               |          |                     |                     |          |          |          
               |          |                     |                     |          |           --0.76%--0x7fff8fe93ae8
               |          |                     |                     |          |          
               |          |                     |                     |           --0.59%--0x7fff91a175b0
               |          |                     |                     |          
               |          |                     |                     |--2.41%--0x7fff91a02f78
               |          |                     |                     |          
               |          |                     |                     |--2.24%--0x7fff91a02b18
               |          |                     |                     |          
               |          |                     |                     |--1.58%--0x7fff91a029d4
               |          |                     |                     |          
               |          |                     |                     |--1.37%--0x7fff91a02b14
               |          |                     |                     |          
               |          |                     |                     |--1.32%--0x7fff91a029b8
               |          |                     |                     |          
               |          |                     |                     |--1.28%--0x7fff91a029f4
               |          |                     |                     |          
               |          |                     |                     |--1.13%--0x7fff91a02f3c
               |          |                     |                     |          
               |          |                     |                     |--1.01%--0x7fff91a02a80
               |          |                     |                     |          
               |          |                     |                     |--0.77%--0x7fff91a02f54
               |          |                     |                     |          
               |          |                     |                     |--0.76%--0x7fff91a028fc
               |          |                     |                     |          
               |          |                     |                      --0.59%--0x7fff91a02f4c
               |          |                     |          
               |          |                     |--1.12%--00000021.plt_call.dwfl_module_getsym_info@@ELFUTILS_0.158
               |          |                     |          
               |          |                      --1.11%--uprobe_derived_probe::uprobe_derived_probe
               |          |                                dwarf_derived_probe::dwarf_derived_probe
               |          |                                |          
               |          |                                 --0.90%--dwarf_derived_probe::saveargs
               |          |          
               |          |--25.22%--dwflpp::die_has_pc
               |          |          |          
               |          |           --24.94%--dwflpp::die_has_pc
               |          |                     |          
               |          |                      --24.94%--dwarf_haspc
               |          |                                |          
               |          |                                 --23.93%--dwarf_ranges
               |          |                                           |          
               |          |                                           |--13.41%--dwarf_highpc
               |          |                                           |          |          
               |          |                                           |          |--7.17%--dwarf_attr
               |          |                                           |          |          |          
               |          |                                           |          |          |--2.90%--0x7fff919cf717
               |          |                                           |          |          |          |          
               |          |                                           |          |          |          |--0.86%--0x7fff919e2248
               |          |                                           |          |          |          |          
               |          |                                           |          |          |           --0.61%--0x7fff919e23a8
               |          |                                           |          |          |          
               |          |                                           |          |          |--0.78%--0x7fff919cf734
               |          |                                           |          |          |          
               |          |                                           |          |           --0.55%--0x7fff919cf4a4
               |          |                                           |          |          
               |          |                                           |          |--3.55%--dwarf_lowpc
               |          |                                           |          |          |          
               |          |                                           |          |          |--2.19%--dwarf_attr
               |          |                                           |          |          |          |          
               |          |                                           |          |          |           --0.99%--0x7fff919cf717
               |          |                                           |          |          |          
               |          |                                           |          |           --0.94%--dwarf_formaddr
               |          |                                           |          |          
               |          |                                           |          |--0.84%--dwarf_formaddr
               |          |                                           |          |          
               |          |                                           |           --0.59%--dwarf_formudata
               |          |                                           |          
               |          |                                           |--3.44%--dwarf_lowpc
               |          |                                           |          |          
               |          |                                           |          |--2.33%--dwarf_attr
               |          |                                           |          |          |          
               |          |                                           |          |           --1.00%--0x7fff919cf717
               |          |                                           |          |          
               |          |                                           |           --0.81%--dwarf_formaddr
               |          |                                           |          
               |          |                                            --2.14%--dwarf_attr
               |          |                                                      |          
               |          |                                                       --0.78%--0x7fff919cf717
               |          |          
               |           --3.23%--dwarf_query::filtered_all
               |                     |          
               |                     |--1.96%--std::vector<base_func_info, std::allocator<base_func_info> >::_M_realloc_insert<base_func_info const&>
               |                     |          |          
               |                     |           --1.46%--std::vector<base_func_info, std::allocator<base_func_info> >::_M_realloc_insert<base_func_info const&>
               |                     |                     |          
               |                     |                      --1.45%--__memcpy_power7
               |                     |          
               |                      --1.19%--dwarf_query::filtered_all
               |                                |          
               |                                 --0.97%--__memcpy_power7
               |          
                --1.59%--dwflpp::collect_all_lines
                          |          
                          |--1.03%--dwflpp::get_cu_lines_sorted_by_lineno
                          |          |          
                          |           --0.52%--?? (inlined)
                          |          
                           --0.56%--add_matching_lines_in_func (inlined)
Comment 1 William Cohen 2024-04-11 18:04:29 UTC
On a fresh ppc64le Fedora 39 install with a build of the systemtap git checkout (83ea7cbc0fcfd9caf), the reproducer runs reasonably fast:

# rpm -q kernel systemtap elfutils binutils binutils-debuginfo
kernel-6.8.4-200.fc39.ppc64le
systemtap-5.1-1.fc39.ppc64le
elfutils-0.191-2.fc39.ppc64le
binutils-2.40-14.fc39.ppc64le
binutils-debuginfo-2.40-14.fc39.ppc64le
# time stap -L 'process("/usr/bin/ld").statement("*@*:*")'|wc
  11271   75465 2338858

real	0m11.103s
user	0m5.482s
sys	0m0.094s
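
To put the three runs reported in this bug on a common footing, here is a minimal sketch that converts the "real" times to seconds and divides by the line counts printed by wc. The numbers are copied from the transcripts above; the run labels and the helper names (parse_real, summarize) are made up for this illustration and are not part of stap or perf.

```python
import re

def parse_real(text):
    """Convert a `time`-style duration such as '19m13.698s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

# (label, wall-clock "real" time, line count from `wc`), copied from this report.
# The labels are illustrative names only.
RUNS = [
    ("fc30 x86_64 (2019)", "1m20.291s", 71548),
    ("fc30 ppc64le (2019)", "19m13.698s", 77651),
    ("fc39 ppc64le (2024)", "0m11.103s", 11271),
]

def summarize(runs):
    """Return {label: (seconds, milliseconds per output line)}."""
    summary = {}
    for label, real, lines in runs:
        seconds = parse_real(real)
        summary[label] = (seconds, 1000.0 * seconds / lines)
    return summary

if __name__ == "__main__":
    summary = summarize(RUNS)
    for label, (seconds, ms_per_line) in summary.items():
        print(f"{label}: {seconds:.1f} s total, {ms_per_line:.2f} ms per line")
    slowdown = summary["fc30 ppc64le (2019)"][0] / summary["fc30 x86_64 (2019)"][0]
    print(f"2019 ppc64le vs x86_64 wall-clock ratio: {slowdown:.1f}x")
```

The per-line figure is only a rough normalization. The 2024 run examined /usr/bin/ld from binutils 2.40 rather than the ./usr/bin/ld binary used in 2019 and produced far fewer output lines, so the runs are not a like-for-like comparison.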