Bug 2228 - benchmark suite
Summary: benchmark suite
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: translator (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-01-27 19:34 UTC by Frank Ch. Eigler
Modified: 2010-05-21 19:36 UTC (History)
0 users

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
a little start to -DBENCHMARK (333 bytes, patch)
2006-05-01 17:55 UTC, Frank Ch. Eigler
Details | Diff
collect and print out cycle information. (1.92 KB, patch)
2006-05-02 18:06 UTC, William Cohen
Details | Diff
Patch to map timing information back to script (785 bytes, patch)
2006-05-08 20:27 UTC, William Cohen
Details | Diff
include symbolic location information (1.08 KB, patch)
2006-05-08 20:58 UTC, William Cohen
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Frank Ch. Eigler 2006-01-27 19:34:16 UTC
To recap the discussion, we need a benchmark suite to measure both
translation-time performance and script execution performance.  Microbenchmarks
focused on particular scripting constructs (loops, conditionals, string
operations) should be included.

The suite should be in the form of a dejagnu test that anyone can run to look
for performance regressions.  Eye candy is optional.
Comment 1 William Cohen 2006-01-27 21:02:06 UTC
It would be a good idea to have the stap translator be able to report the time
for various phases, e.g. parse, elaborate, translate, compile, and run script.
Have something similar to the gcc -ftime-report.

-Will
Comment 2 Frank Ch. Eigler 2006-01-28 01:23:59 UTC
Added code to print timing information in verbose mode.
# stap -p4 -v [...] 2>&1 | grep 'Pass.*in.*ms'
Pass 1: parsed user script and 10 library script(s) in 380+10 (u+s) ms.
Pass 2: analyzed user script.  595 probe(s), 32 function(s), 2 global(s) in
4480+600 (u+s) ms.
Pass 3: translated to C into "/tmp/stapYStmvr/stap_21070.c" in 620+150 (u+s) ms.
Pass 4: compiled C into "stap_21070.ko" in 45510+1040 (u+s) ms.

Will work on making verbosity a gradated measure, so plain "-v" prints just
those lines, and more "-v"s spout more details.
Comment 3 William Cohen 2006-01-30 15:43:38 UTC
I was thinking of having a separate option "-t" for the timing information to
Minimize the amount of additional information the scripts have to parse to get
timing information. However, the current scheme with the graduated "-v" seems to
keep down the amount of extraneous information.

It would be worthwhile to include wall clock time to get some idea how long the
user would perceive things as taking.
Comment 4 William Cohen 2006-03-03 17:12:29 UTC
Now collecting latency information for the various phases of systemtap
translator. Modified tests/testsuite/lib/stap_run.exp and stap_run2.exp to make
use of the timing information generated by "stap -v". The information is written
into the systemtap.log when the dejagnutests are run. This information can be
extracted in tab separated form with the simple script in
tests/tools/stap_metrics. The resulting information can be imported into a
spreadsheet and analyzed.

There is also a tests/tools/stap_profile script that runs the dejagnu systemtap
tests under OProfile. This provides some measurements where the hotspots in the
instrumentation system are.
Comment 5 William Cohen 2006-03-03 17:13:33 UTC
Now collecting latency information for the various phases of systemtap
translator. Modified tests/testsuite/lib/stap_run.exp and stap_run2.exp to make
use of the timing information generated by "stap -v". The information is written
into the systemtap.log when the dejagnutests are run. This information can be
extracted in tab separated form with the simple script in
tests/tools/stap_metrics. The resulting information can be imported into a
spreadsheet and analyzed.

There is also a tests/tools/stap_profile script that runs the dejagnu systemtap
tests under OProfile. This provides some measurements where the hotspots in the
instrumentation system are.
Comment 6 William Cohen 2006-04-20 19:56:07 UTC
Frank asked me to add some support to get timing information about the SystemTap
probe actions and provide some statistics about the probe actions. The thought
is to have a "-t" (benchmark timing) option for stap to instrument the probe
prologues and epilogues.  As a quick sanity I quickly wrote something this
afternoon as a strictly systemtap script with no modifications to the systemtap
translator. One thing I noticed is that the @avg doe not handle no entries
gracefully. It just gives a "ERROR: divide by zero"

global _kf_sys_getuid
global _begin

probe kernel.function("sys_getuid")
{
	start = get_cycles()
	/* FIXME action goes in here */
	stop = get_cycles()
	_kf_sys_getuid <<< (stop-start)
}

probe begin
{
	start = get_cycles()
	printf( "starting\n" );
	stop = get_cycles()
	_begin <<< (stop-start)
}

probe end
{
	printf( "kernel.function(\"sys_getuid\") %d@%d\n",
	  @count(_kf_sys_getuid), @avg(_kf_sys_getuid) )
	printf( "begin %d@%d\n",  @count(_begin), @avg(_begin) )
	
}
Comment 7 Frank Ch. Eigler 2006-05-01 17:55:50 UTC
Created attachment 993 [details]
a little start to -DBENCHMARK

Additional stuff needed includes:
- code in c_unparser::emit_probe to emit a Stat object for each probe
- code in emit_probe_epilogue to add the cycles_elapsed into that Stat object
- code in c_unparser::emit_module_exit, within the unregister_probe_* loop, to
  print a brief summary of the computed Stat values for each probe.
Comment 8 William Cohen 2006-05-02 18:06:13 UTC
Created attachment 997 [details]
 collect and print out cycle information.

Have a macro STP_TIMING that could be defined to trigger the measurements
Comment 9 William Cohen 2006-05-08 20:27:36 UTC
Created attachment 1009 [details]
Patch to map timing information back to script

The attached patch provide file and line number information to make it easier
to figure out what parts of the script the timing information refers to.
Comment 10 William Cohen 2006-05-08 20:58:39 UTC
Created attachment 1011 [details]
include symbolic location information
Comment 11 Frank Ch. Eigler 2006-05-08 21:51:40 UTC
Go ahead and commit; we can tweak the format logic in plain cvs.
Comment 12 Frank Ch. Eigler 2007-05-08 20:26:44 UTC
If you don't intend to do any more work on this, please close the bug.
Comment 13 Frank Ch. Eigler 2007-08-17 21:36:56 UTC
Basic script timing code has been committed long ago.
We still lack an organized benchmark suite in dejagnu -
something to run if we make code generation changes.
Comment 14 David Smith 2007-08-20 17:50:29 UTC
One possible place to look for microbenchmark ideas is in the runtime/bench2
directory.  I used the scripts there when doing some optimization work in the past.
Comment 15 Frank Ch. Eigler 2010-05-21 19:36:00 UTC
stap -t is well baked.  Individual suspected performance bottlenecks are already
being tracked separately.