Bug 2335 - gprof reads executable 10x slower on opteron/x86_64
Summary: gprof reads executable 10x slower on opteron/x86_64
Status: ASSIGNED
Alias: None
Product: binutils
Classification: Unclassified
Component: gprof
Version: 2.16
Importance: P2 normal
Target Milestone: ---
Assignee: Ben Elliston
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-02-14 08:03 UTC by Dirk-Jan
Modified: 2007-03-12 03:25 UTC
CC List: 1 user

See Also:
Host: x86_64-unknown-linux-gnu
Target:
Build:
Last reconfirmed:


Attachments
a.out (5.59 KB, application/octet-stream)
2006-04-07 06:24 UTC, Dirk-Jan

Description Dirk-Jan 2006-02-14 08:03:44 UTC
 
Comment 1 Dirk-Jan 2006-02-14 08:15:14 UTC
Running gprof on a large executable takes 10 to 20 times longer on a 64-bit
executable than on a 32-bit executable. The 32-bit one is built on rh72 with
gcc3.2.3, the 64-bit one on rhel3 with gcc3.2.3. The issue is still present in
gprof from binutils 2.16.

The problem is reading the symbols. 
In elf.c, _bfd_elf_find_nearest_line first calls
_bfd_dwarf2_find_nearest_line, which resolves some of the symbols.
Then _bfd_stabs_find_nearest_line resolves most of the rest in the 32-bit case
but none in the 64-bit case.
As a result, the 64-bit case ends up in elf_find_function, which takes a lot of
time. It seems to get there eventually.
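
The fallback order described above can be modelled with a small sketch. This is only an illustration of the "try each resolver in turn" control flow, not BFD code: the types Stage and LineInfo and the function find_nearest are hypothetical names made up for this example. The per-stage hit counter shows how one could measure which stage actually answers most lookups.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of a "try each resolver in order" lookup.
// None of these names come from BFD; they only illustrate the control flow.
struct LineInfo {
    std::string function;  // resolved function name
    std::string resolver;  // name of the stage that produced the answer
};

using Resolver = std::function<std::optional<std::string>(std::uint64_t)>;

struct Stage {
    std::string name;      // e.g. "dwarf2", "stabs", "symbol scan"
    Resolver resolve;      // returns a name, or nothing if it cannot answer
    std::size_t hits = 0;  // how many addresses this stage answered
};

// Ask each stage in turn; the first one that returns a value wins.
std::optional<LineInfo> find_nearest(std::vector<Stage>& stages,
                                     std::uint64_t addr) {
    for (Stage& stage : stages) {
        if (std::optional<std::string> fn = stage.resolve(addr)) {
            ++stage.hits;
            return LineInfo{*fn, stage.name};
        }
    }
    return std::nullopt;  // no stage could resolve the address
}
```

If the early stages rarely succeed, almost every lookup pays for the last and most expensive stage, which matches the behaviour observed here for the 64-bit executable.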

I found out that adding -gstabs1 does the trick and makes gprof as fast for the
64-bit executable as for the 32-bit one. I noticed that there is always a .stab
section in the 32-bit executable. It seems gprof cannot resolve symbols very
well using dwarf2 and therefore needs stabs debug info.

Comment 2 Ben Elliston 2006-04-04 06:02:02 UTC
Can you attach some example binaries, please?  It does not make sense that
x86-64 binaries produced by a modern version of GCC would produce stabs
debugging info.  I suspect your debugging session has given you that impression,
but that it would not be the case.  Thanks.
Comment 3 Dirk-Jan 2006-04-04 07:05:20 UTC
(In reply to comment #2)
> Can you attach some example binaries, please?  It does not make sense that
> x86-64 binaries produced by a modern version of GCC would produce stabs
> debugging info.  I suspect your debugging session has given you that impression,
> but that it would not be the case.  Thanks.

I did not say that 64-bit executables produce stabs info. But forcing them to
do so fixes the slow runtime of gprof on 64-bit.

Using nm or objdump I was seeing a .stabs section on the 32-bit executable that
was not there in the 64-bit case. Also, as I said, the function
_bfd_elf_find_nearest_line in gprof succeeds via _bfd_stabs_find_nearest_line
many more times than via _bfd_dwarf2_find_nearest_line. In the 64-bit case the
stabs lookup fails as well, and it falls into the slow elf_find_function.

Adding -gstabs1 to the 64bit build makes it as fast as 32bit. It seems there is
something related to stabs info being available that makes the difference in
speed of gprof.

So no, there is by default NO stabs info in the 64-bit exe, BUT if it is there
gprof is no longer 10x to 20x slower.

Without stabs info gprof is not efficient (it almost always fails in
_bfd_dwarf2_find_nearest_line). If gprof were able to resolve as much via
_bfd_dwarf2_find_nearest_line as via _bfd_stabs_find_nearest_line, then
it might be fast on 64-bit without stabs info, which is the default.
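
To illustrate why a slow last-resort lookup could dominate the runtime on a large executable, here is a minimal sketch. It assumes, and this is an assumption rather than something confirmed in this report, that the slow path walks the whole symbol table for every address. The type Symbol and the functions find_linear and find_sorted are hypothetical and are not BFD APIs.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical symbol record; not a BFD type.
struct Symbol {
    std::uint64_t start;  // start address of the function
    std::string name;
};

// Slow-path model: every query walks the entire table.
// Returns the symbol with the greatest start <= addr, or nullptr.
const Symbol* find_linear(const std::vector<Symbol>& syms, std::uint64_t addr) {
    const Symbol* best = nullptr;
    for (const Symbol& s : syms) {
        if (s.start <= addr && (best == nullptr || s.start > best->start)) {
            best = &s;
        }
    }
    return best;
}

// Faster model: the table is sorted by start address once, then each
// query is a binary search.
// Precondition: syms is sorted ascending by start.
const Symbol* find_sorted(const std::vector<Symbol>& syms, std::uint64_t addr) {
    auto it = std::upper_bound(
        syms.begin(), syms.end(), addr,
        [](std::uint64_t a, const Symbol& s) { return a < s.start; });
    if (it == syms.begin()) {
        return nullptr;  // addr lies below the first symbol
    }
    return &*(it - 1);
}
```

With N symbols and M lookups, the first variant costs on the order of N*M comparisons while the second costs on the order of M*log N after one sort, so the gap grows quickly with executable size.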
Comment 4 Ben Elliston 2006-04-07 00:36:04 UTC
You should not be seeing a .stabs section in an x86 Linux binary, either.
Can you please attach some example binaries to this bug report?
Comment 5 Dirk-Jan 2006-04-07 06:24:53 UTC
Subject: Re:  gprof reads executable 10x slower on opteron/x86_64

Note that the real problem is not whether stabs is present or not, but the 10x
slowdown on 64-bit Opteron when gprof reads the executable. Adding -gstabs1
then solves the problem. I also noticed that the lookups in gprof are resolved
via a function for stabs and not the one for dwarf.

Using a gcc 4.0.2 compiler and binutils 2.16.1 on linux72:

main.cxx:
#include "iostream.h"

int main(){
         cout<<"Hello world";
}

g++ main.cxx
objdump a.out

....
Sections:
Idx Name          Size      VMA       LMA       File off  Algn
   0 .interp       00000013  08048114  08048114  00000114  2**0
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
   1 .note.ABI-tag 00000020  08048128  08048128  00000128  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
   2 .hash         00000048  08048148  08048148  00000148  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
   3 .dynsym       000000d0  08048190  08048190  00000190  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
   4 .dynstr       0000016d  08048260  08048260  00000260  2**0
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
   5 .gnu.version  0000001a  080483ce  080483ce  000003ce  2**1
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
   6 .gnu.version_r 00000070  080483e8  080483e8  000003e8  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
   7 .rel.dyn      00000010  08048458  08048458  00000458  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
   8 .rel.plt      00000038  08048468  08048468  00000468  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
   9 .init         00000018  080484a0  080484a0  000004a0  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
  10 .plt          00000080  080484b8  080484b8  000004b8  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
  11 .text         000001e4  08048540  08048540  00000540  2**4
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
  12 .fini         0000001e  08048724  08048724  00000724  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
  13 .rodata       00000014  08048744  08048744  00000744  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
  14 .eh_frame     000000b0  08048758  08048758  00000758  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
  15 .ctors        0000000c  08049808  08049808  00000808  2**2
                   CONTENTS, ALLOC, LOAD, DATA
  16 .dtors        0000000c  08049814  08049814  00000814  2**2
                   CONTENTS, ALLOC, LOAD, DATA
  17 .jcr          00000004  08049820  08049820  00000820  2**2
                   CONTENTS, ALLOC, LOAD, DATA
  18 .dynamic      000000e0  08049824  08049824  00000824  2**2
                   CONTENTS, ALLOC, LOAD, DATA
  19 .got          00000004  08049904  08049904  00000904  2**2
                   CONTENTS, ALLOC, LOAD, DATA
  20 .got.plt      00000028  08049908  08049908  00000908  2**2
                   CONTENTS, ALLOC, LOAD, DATA
  21 .data         0000000c  08049930  08049930  00000930  2**2
                   CONTENTS, ALLOC, LOAD, DATA
  22 .bss          000000b0  0804993c  0804993c  0000093c  2**3
                   ALLOC
  23 .stab         000007a4  00000000  00000000  0000093c  2**2
                   CONTENTS, READONLY, DEBUGGING
  24 .stabstr      00001985  00000000  00000000  000010e0  2**0
                   CONTENTS, READONLY, DEBUGGING
  25 .comment      000000e7  00000000  00000000  00002a65  2**0
                   CONTENTS, READONLY
  26 .note         0000003c  00000000  00000000  00002b4c  2**0
                   CONTENTS, READONLY


Dirk-Jan



Comment 6 Dirk-Jan 2006-04-07 06:24:53 UTC
Created attachment 958
a.out
Comment 7 H.J. Lu 2007-03-11 16:40:46 UTC
Is this the same as bug 114?
Comment 8 Ben Elliston 2007-03-12 03:25:17 UTC
It may be the same -- Dirk, were you running with --line, by chance?
Comment 9 Dirk-Jan 2007-03-12 08:32:20 UTC
Subject: Re:  gprof reads executable 10x slower on opteron/x86_64

No.

I just compile with g++ -O3 -pg and run plain gprof on the executable and
gmon.out. The executable is 100 MB in size.

Adding stabs debug info in place of dwarf2 makes it run fast. Running in
GDB with dwarf debug info, I see it not using the "dwarf"-named functions to
retrieve info.

Dirk-Jan
