Bugzilla Bug 2060
  improve translated C code to reduce compile & run time Last modified: 2006-01-26 23:01:30
Bug#: 2060   Reporter: Martin Hunt <hunt@redhat.com>
Status: RESOLVED   Resolution: FIXED
Assigned To: Frank Ch. Eigler <fche@redhat.com>

Attachment Description Type Created Actions
sys.stp my test case text/plain 2005-12-15 18:10 Edit None

Bug 2060 blocks: 2111

Description:   Last confirmed: 2006-01-23 18:13 Opened: 2005-12-15 18:09
Typical compile times for simple scripts probing kernel.syscall.* are 1 to 2 minutes.

~> time stap -p2 sys.stp > foo

real    0m1.570s
user    0m1.518s
sys     0m0.052s
~> time stap -p3 sys.stp > foo

real    0m3.282s
user    0m2.136s
sys     0m1.148s
~> wc -l foo
183458 foo
~> time stap -p4 sys.stp

real    1m27.217s
user    1m23.365s
sys     0m4.691s

So we have a 183458 line C file to compile. The context struct itself is over
14000 lines long and includes stuff like:
struct function__module_flags_str_locals {
      int64_t f;
      union {
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
        struct {
        };
      };
      string_t __retvalue;
    } function__module_flags_str;

Everything else in the C file looks normal at first glance. Very repetitive,
obviously.

------- Additional Comment #1 From Martin Hunt 2005-12-15 18:10 -------
Created an attachment (id=804)
my test case

------- Additional Comment #2 From Frank Ch. Eigler 2005-12-15 18:16 -------
Are you sure you're running cvs systemtap?
Graydon made a big improvement in just this area of code a few days ago: bug #1931

------- Additional Comment #3 From Frank Ch. Eigler 2005-12-15 18:17 -------
Never mind, misunderstood your timings.
Needs further study.

------- Additional Comment #4 From Graydon Hoare 2005-12-21 01:36 -------
This looks decidedly wrong. Offhand I can't tell why. It's possible that we're
simply generating too much code -- maybe 200 syscalls times a handful of
parameter-accessor functions makes "too much code" -- but it also looks like
we're generating junk as well.

------- Additional Comment #5 From Frank Ch. Eigler 2006-01-04 21:45 -------
Experiments ongoing.

Counterintuitively, it seems like the probe handler bodies are *not* the
dominant factor.  With all ~500 of them commented out, the compile time is still
just as long.  Judging by the resulting function/symbol sizes, I infer that the
module_init/module_exit functions are stressing the C compiler most, and
therefore will look there first.
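
The generated module_init/module_exit code is not shown in this bug, so the following is only a hypothetical illustration of why one very large init function can be expensive to compile, and of one common way to shrink it. It assumes the init function repeats a near-identical registration stanza per probe; the sketch replaces that repetition with a data table and a single loop. All names (probe_desc, register_probe, init_all, and so on) are invented for the example and are not SystemTap's.

```c
#include <stddef.h>

/* Hypothetical probe descriptor; the real generated code differs. */
struct probe_desc {
    const char *name;   /* probe point name */
    int registered;     /* 1 once registration succeeded */
};

/* Stand-in for a real registration call; fails on a missing name. */
static int register_probe(struct probe_desc *p)
{
    if (p == NULL || p->name == NULL)
        return -1;
    p->registered = 1;
    return 0;
}

static void unregister_probe(struct probe_desc *p)
{
    p->registered = 0;
}

/* One table entry per probe instead of one copy of the code per probe. */
static struct probe_desc probes[] = {
    { "probe_a", 0 },
    { "probe_b", 0 },
    { "probe_c", 0 },
    { "probe_d", 0 },
};

#define NUM_PROBES (sizeof(probes) / sizeof(probes[0]))

/* Register every probe; on failure, undo the ones already registered. */
static int init_all(void)
{
    size_t i;

    for (i = 0; i < NUM_PROBES; i++) {
        if (register_probe(&probes[i]) != 0) {
            while (i-- > 0)
                unregister_probe(&probes[i]);
            return -1;
        }
    }
    return 0;
}

/* Unregister everything, in reverse order. */
static void exit_all(void)
{
    size_t i = NUM_PROBES;

    while (i-- > 0)
        unregister_probe(&probes[i]);
}
```

With this shape the compiler sees a few small functions and an initialized array, regardless of how many probes the script expands to; whether that matches what was actually changed for this bug is not stated in the thread.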

------- Additional Comment #6 From Frank Ch. Eigler 2006-01-10 18:52 -------
Patches just committed appear to improve this significantly.

------- Additional Comment #7 From Martin Hunt 2006-01-10 20:34 -------
BEFORE
~> time stap -p4 sys.stp
real    1m40.500s
user    1m35.334s
sys     0m5.947s

AFTER
~> time stap -p4 sys.stp
real    0m47.287s
user    0m46.979s
sys     0m1.393s

That was a big improvement. Still, I hope we can eventually improve upon this. 
I suggest keeping this open at a lowered priority.

------- Additional Comment #8 From Frank Ch. Eigler 2006-01-10 20:44 -------
Right.  I anticipate further improvements are possible along these lines:

- reducing the amount of code generated (duh), particularly:
  - collecting the activity-count additions & especially checks
  - reducing the frequency of last_stmt assignments and last_error checks
  - raising some global variable locking/unlocking code up to the outermost
    nesting level of probe/function bodies; beyond simplifying the emitted C
    code, this could reduce potential concurrency, but it would kill a bunch
    of race conditions
- adjusting the kbuild CFLAGS to lessen optimization
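
The lock-lifting item in the list above can be sketched roughly as follows. This is a minimal userspace illustration, not the translator's actual output: fake_lock is a hypothetical stand-in that only counts acquisitions, and handler_per_statement / handler_lifted are invented names for "before" and "after" versions of a probe body that updates one global.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a global-variable lock; it only counts calls. */
struct fake_lock {
    int held;            /* 1 while the lock is taken */
    long acquisitions;   /* how many times it was taken */
};

static void fake_lock_acquire(struct fake_lock *l)
{
    assert(!l->held);
    l->held = 1;
    l->acquisitions++;
}

static void fake_lock_release(struct fake_lock *l)
{
    assert(l->held);
    l->held = 0;
}

static int64_t global_count;
static struct fake_lock global_count_lock;

/* Before: every statement touching the global takes and drops the lock. */
static void handler_per_statement(int n)
{
    int i;

    for (i = 0; i < n; i++) {
        fake_lock_acquire(&global_count_lock);
        global_count++;
        fake_lock_release(&global_count_lock);
    }
}

/* After lifting: the lock is taken once at the outermost nesting level. */
static void handler_lifted(int n)
{
    int i;

    fake_lock_acquire(&global_count_lock);
    for (i = 0; i < n; i++)
        global_count++;
    fake_lock_release(&global_count_lock);
}
```

Both versions leave global_count with the same value, but the lifted form emits one acquire/release pair per handler instead of one per statement. The trade-off is the one named in the list: the lock is held across the whole body, so other handlers wait longer, while intermediate states of the global are no longer visible between statements.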

------- Additional Comment #9 From Frank Ch. Eigler 2006-01-10 20:52 -------
*** Bug 1159 has been marked as a duplicate of this bug. ***

------- Additional Comment #10 From Frank Ch. Eigler 2006-01-10 23:18 -------
*** Bug 1330 has been marked as a duplicate of this bug. ***

------- Additional Comment #11 From Frank Ch. Eigler 2006-01-23 18:13 -------
- will include lock lifting, unused $target elimination, and one or two other
optimizations

------- Additional Comment #12 From Frank Ch. Eigler 2006-01-24 17:58 -------
mostly done; need just lock lifting now

------- Additional Comment #13 From Frank Ch. Eigler 2006-01-26 23:01 -------
lock lifting done.
other future improvements are possible; will be tracked separately.
