Bug 747 - Profiling of nested functions
Summary: Profiling of nested functions
Status: NEW
Alias: None
Product: binutils
Classification: Unclassified
Component: binutils (show other bugs)
Version: 2.15
: P2 enhancement
Target Milestone: ---
Assignee: unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-02-16 12:16 UTC by Laurent GUERBY
Modified: 2020-01-31 02:20 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Laurent GUERBY 2005-02-16 12:16:49 UTC
It looks like nested functions are not properly reported by gprof.
The original problem was on Ada code with nested
functions. This is with GCC from CVS HEAD 20050213 and GNU gprof 2.15.91.0.2
on a x86 SuSE 9.2 system. A GCC developper suggested that the appropriate calls
seemed to be generated properly, so the problem was likely in gprof.

$ cat cn.c
#define N 10000000
static int sum=0;
void p1(void) {
  int i;
  for(i=1;i<=N;i++) sum+=(i+1)/i;
}
void p2(void) {
  int f2(int x) {
    return x;
  }
  int i;
  for(i=1;i<=2*N;i++) sum+=f2(i+1)/i;
}
void p3(void) {
  int i;
  for(i=1;i<=3*N;i++) sum+=(i+1)/i;
}

int main(void) {
  p1();
  p2();
  p3();
  return 0;
}
$ gcc -pg -g cn.c 
$ ./a.out
$ gprof ./a.out 
[...]
Flat profile:
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 49.49      8.70     8.70        1     8.70     8.70  p3
 34.13     14.70     6.00        1     6.00     6.00  p2
 16.38     17.58     2.88        1     2.88     2.88  p1
[...]
index % time    self  children    called     name
                                                 <spontaneous>
[1]    100.0    0.00   17.58                 main [1]
                8.70    0.00       1/1           p3 [2]
                6.00    0.00       1/1           p2 [3]
                2.88    0.00       1/1           p1 [4]
-----------------------------------------------
                8.70    0.00       1/1           main [1]
[2]     49.5    8.70    0.00       1         p3 [2]
-----------------------------------------------
                             20000000             p2 [3]
                6.00    0.00       1/1           main [1]
[3]     34.1    6.00    0.00       1+20000000 p2 [3]
                             20000000             p2 [3]
-----------------------------------------------
                2.88    0.00       1/1           main [1]
[4]     16.4    2.88    0.00       1         p1 [4]
-----------------------------------------------
Comment 1 Nick Clifton 2005-03-15 16:34:00 UTC
Hi Laurent,

  This is not a bug but rather a feature of gprof.  It considers nested
functions to be part of their parent function and by default it will include
their timing in their parent's total.  You can get to see the time spent in the
nested function if you really want to however - by using the "-l" switch to
gprof.  For example with the test program you supplied I get the following
results from "gprof -l -b ./a.out":

  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 38.25      0.70     0.70                             p3 (profile.c:34 @8048537)
 36.61      1.37     0.67                             p2 (profile.c:25 @80484d1)
 13.11      1.61     0.24                             p1 (profile.c:10 @8048488)
  7.10      1.74     0.13                             f2 (profile.c:19 @8048518)
  1.09      1.76     0.02        1    20.00    20.00  p3 (profile.c:30 @8048523)
  1.09      1.78     0.02                             f2 (profile.c:20 @8048521)
  0.82      1.79     0.01                             p3 (profile.c:33 @804855c)
  0.82      1.81     0.01                             p3 (profile.c:35 @804856a)
  0.55      1.82     0.01                             p1 (profile.c:9 @80484ad)
  0.55      1.83     0.01                             p2 (profile.c:24 @8048500)
  0.00      1.83     0.00 20000001     0.00     0.00  p2 (profile.c:15 @80484bd)
  0.00      1.83     0.00        1     0.00     0.00  p1 (profile.c:7 @8048474)

[...]
 
Index by function name

   [4] f2 (profile.c:19 @ 8048518) [10] p1 (profile.c:9 @ 80484ad) [5] p3
(prole.c:30 @ 8048523)
   [6] f2 (profile.c:20 @ 8048521) [12] p2 (profile.c:15 @ 80484bd) [1] p3
(profile.c:34 @ 8048537)
  [13] p1 (profile.c:7 @ 8048474) [2] p2 (profile.c:25 @ 80484d1) [8] p3
(profile.c:33 @ 804855c)
   [3] p1 (profile.c:10 @ 8048488) [11] p2 (profile.c:24 @ 8048500) [9] p3
(profile.c:35 @ 804856a)

Cheers
  Nick
Comment 2 Laurent GUERBY 2005-03-15 20:20:13 UTC
Thanks Nick for your answer. I missed the -l flag in the documentation, very
useful indeed!

The per "line" profile looks indeed correct, but
the call graph suffers from this nested=parent assimilation and shows p2 as a
recursive function (if I understand correctly):

index % time    self  children    called     name
                0.12    0.00       1/1           main (cn.c:21 @ 80485dc) [6]
[5]      0.7    0.12    0.00       1         p3 (cn.c:14 @ 804856d) [5]
-----------------------------------------------
                0.00    0.00       1/20000001     main (cn.c:21 @ 80485dc) [6]
                0.00    0.00 20000000/20000001     p2 (cn.c:12 @ 8048518) [2]
[11]     0.0    0.00    0.00 20000001         p2 (cn.c:7 @ 804850d) [11]
-----------------------------------------------
                0.00    0.00       1/1           main (cn.c:20 @ 80485d7) [38]
[12]     0.0    0.00    0.00       1         p1 (cn.c:3 @ 80484c4) [12]
-----------------------------------------------

I feel that treating nested functions as normal functions would be more helpful
for GNU Ada and Pascal for example, but this is a new feature request no longer
a bug. I don't know if it's easy to do in gprof, GCC 4.0 transforms all nested
functions to top level functions (with arguments) during code generation, it
wasn't doing so before so may be it helps.

I will try "-l" on the larger original Ada software and report back in a few days.

Thanks again for your time.

Laurent
Comment 3 Ben Elliston 2006-03-17 01:50:59 UTC
Changing to an enhancement request.