This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Controlling probe overhead


I've attached yet another version of the patch. I've renamed this to be "overload" processing (as Frank suggested in another email). See comments below.

Stone, Joshua I wrote:
> David Smith wrote:
>> Making exceptions for begin/end probe is going to be a bit difficult,
>> since the common_probe_entryfn_prologue() and
>> common_probe_entryfn_epilogue() functions don't really have any
>> context of where they are called from.
>
> It's simple to add that context -- we can add a parameter to those functions so that the default does output the throttling checks, and then modify the call site in be_derived_probe_group::emit_module_decls() to turn them off.

Yep, you were right, it wasn't hard at all to add that context. Done.
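
For illustration only, here is a minimal standalone sketch of that idea. The helper name timing_guard is hypothetical and does not appear in the patch; only the two guard strings and the default value of the flag are taken from the diff below.

```cpp
#include <string>

// Hypothetical helper (not part of the patch): returns the preprocessor
// guard line that the prologue/epilogue emitters would write out.
// The flag defaults to true, so existing call sites keep the overload
// checks; the begin/end probe call sites pass false to drop them.
std::string timing_guard(bool overload_processing = true)
{
  if (overload_processing)
    return "#if defined(STP_TIMING) || defined(STP_OVERLOAD)";
  return "#ifdef STP_TIMING";
}
```

Using a default argument means only the begin/end call sites have to change; every other caller compiles unmodified.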


>> Besides sharing code with STP_TIMING, I also added a command-line
>> switch to turn this new functionality off (but your idea of tunable
>> thresholds is probably better).

BTW, I had to rework the STP_TIMING code slightly to make it work correctly with the STP_OVERLOAD code. The STP_TIMING code was storing cycle counts as 32-bit values, whereas the STP_OVERLOAD code wants 64-bit cycle counts. The STP_TIMING code now truncates to 32 bits a little later than it did originally.
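
As a rough sketch of "truncating later": the function below keeps both samples as full 64-bit counts and narrows them only when computing the elapsed value. Note that it uses unsigned modular subtraction, which is a different technique from the signed int32_t comparison in the patch; the name cycles_elapsed32 is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper (not in the patch): both samples stay 64-bit until
// the delta is computed. Unsigned subtraction is modular, so a rollover of
// the low 32 bits between the two samples still gives the right answer,
// provided fewer than 2^32 cycles actually elapsed.
std::uint32_t cycles_elapsed32(std::uint64_t cycles_atstart,
                               std::uint64_t cycles_atend)
{
  return static_cast<std::uint32_t>(
      static_cast<std::uint32_t>(cycles_atend) -
      static_cast<std::uint32_t>(cycles_atstart));
}
```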


> An option to disable it is a good idea.  As for tuning the threshold, we
> could make a new -D option like MAXOVERHEAD.  If this is a percentage,
> then internally we can just define this:
>
> #define STP_ACCOUNTING_THRESHOLD MAX_OVERHEAD*STP_ACCOUNTING_INTERVAL/100
>
> This way we expose some control without exposing the implementation of
> the threshold and interval.

With the attached patch, you can tune (using "stap -D") both the INTERVAL and the THRESHOLD. I didn't implement your above idea of MAX_OVERHEAD (although it is certainly doable).
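
If the MAX_OVERHEAD idea were added later, it could look roughly like the sketch below. MAX_OVERHEAD is hypothetical here (the patch does not define it); the interval default and the #ifndef pattern are taken from the patch, and a 50 percent default reproduces the patch's default threshold. The helper overload_threshold exists only so the value can be inspected.

```cpp
// Hypothetical percentage knob, e.g. "stap -D MAX_OVERHEAD=25".
#ifndef MAX_OVERHEAD
#define MAX_OVERHEAD 50
#endif

// Default interval from the patch, still overridable with -D.
#ifndef STP_OVERLOAD_INTERVAL
#define STP_OVERLOAD_INTERVAL 1000000000LL
#endif

// Derive the threshold only if it was not given explicitly. Multiply
// before dividing to avoid losing precision, and parenthesize everything
// so the macro stays correct inside larger expressions.
#ifndef STP_OVERLOAD_THRESHOLD
#define STP_OVERLOAD_THRESHOLD ((MAX_OVERHEAD) * (STP_OVERLOAD_INTERVAL) / 100)
#endif

// Hypothetical accessor used to check the derived value.
inline long long overload_threshold() { return STP_OVERLOAD_THRESHOLD; }
```

An explicit "-D STP_OVERLOAD_THRESHOLD=..." would still win over the percentage in this arrangement, so both styles of tuning could coexist.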


I've been testing with the attached patch, and it works very nicely for me. I can run the test suite and several stress tests with no interference from the OVERLOAD code. I have one stress test (that Frank wrote) that will make a RHEL5 system non-responsive. The system doesn't crash; it just stops taking any input. The overload code kills that script in less than 3 minutes.
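
For readers who want to follow the accounting without reading the generated code, here is a user-space sketch of the check the patched epilogue emits. The struct name overload_state, the function account_probe, and the constants kInterval and kThreshold are hypothetical; the field names, defaults, and control flow follow the patch. Where the generated code calls _stp_error() and sets the session state to STAP_SESSION_ERROR, the sketch simply returns true.

```cpp
#include <cstdint>

// Defaults from the patch; both can be overridden at compile time.
#ifndef STP_OVERLOAD_INTERVAL
#define STP_OVERLOAD_INTERVAL 1000000000LL
#endif
#ifndef STP_OVERLOAD_THRESHOLD
#define STP_OVERLOAD_THRESHOLD 500000000LL
#endif

constexpr std::uint64_t kInterval = STP_OVERLOAD_INTERVAL;
constexpr std::uint64_t kThreshold = STP_OVERLOAD_THRESHOLD;

// Hypothetical stand-in for the two fields the patch adds to struct context.
struct overload_state {
  std::uint64_t cycles_base = 0;  // cycle count at the start of the interval
  std::uint64_t cycles_sum = 0;   // cycles spent in probes during the interval
};

// Called once per probe hit. Returns true if the interval just ended and
// the probes used more than kThreshold cycles within it.
bool account_probe(overload_state& c, std::uint64_t cycles_atend,
                   std::uint32_t cycles_elapsed)
{
  // If the counter went backwards (wrapped), pretend the interval is over
  // so that cycles_base and cycles_sum get reset.
  std::uint64_t interval = (cycles_atend > c.cycles_base)
                               ? cycles_atend - c.cycles_base
                               : kInterval + 1;
  c.cycles_sum += cycles_elapsed;
  if (interval <= kInterval)
    return false;

  bool overloaded = c.cycles_sum > kThreshold;
  c.cycles_base = cycles_atend;
  c.cycles_sum = 0;
  return overloaded;
}
```

With the defaults, a script is flagged once its probes consume more than half of the cycles in an interval of roughly one billion cycles.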

Note that I haven't implemented the new error probes you and Frank discussed. I'd like to get the current code in (since it is quite useful in its current state) before thinking about error probes.

--
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)
? stp_overload.patch
? systemtap-0.5.13.tar.gz
Index: main.cxx
===================================================================
RCS file: /cvs/systemtap/src/main.cxx,v
retrieving revision 1.66
diff -u -p -r1.66 main.cxx
--- main.cxx	9 Feb 2007 13:45:49 -0000	1.66
+++ main.cxx	14 Mar 2007 15:07:28 -0000
@@ -100,6 +100,7 @@ usage (systemtap_session& s, int exitcod
     << endl
     << "   -x PID     sets target() to PID" << endl
     << "   -t         benchmarking timing information generated" << endl
+    << "   -O         turn off automatic probe overload handling" << endl
     ;
   // -d: dump safety-related external references
 
@@ -196,6 +197,7 @@ main (int argc, char * const argv [])
   s.architecture = string (buf.machine);
   s.verbose = 0;
   s.timing = 0;
+  s.overload = true;
   s.guru_mode = false;
   s.bulk_mode = false;
   s.unoptimized = false;
@@ -266,7 +268,7 @@ main (int argc, char * const argv [])
   while (true)
     {
       // NB: also see find_hash(), help(), switch stmt below, stap.1 man page
-      int grc = getopt (argc, argv, "hVMvtp:I:e:o:R:r:m:kgPc:x:D:bs:u");
+      int grc = getopt (argc, argv, "hVMvtp:I:e:o:R:r:m:kgPc:x:D:bs:uO");
       if (grc < 0)
         break;
       switch (grc)
@@ -287,6 +289,10 @@ main (int argc, char * const argv [])
 	  s.timing ++;
 	  break;
 
+        case 'O':
+	  s.overload = false;
+	  break;
+
         case 'p':
           s.last_pass = atoi (optarg);
           if (s.last_pass < 1 || s.last_pass > 5)
Index: session.h
===================================================================
RCS file: /cvs/systemtap/src/session.h,v
retrieving revision 1.16
diff -u -p -r1.16 session.h
--- session.h	9 Feb 2007 13:45:49 -0000	1.16
+++ session.h	14 Mar 2007 15:07:28 -0000
@@ -95,6 +95,7 @@ struct systemtap_session
   unsigned perfmon;
   bool symtab;
   bool prologue_searching;
+  bool overload;
 
   // Cache data
   bool use_cache;
Index: tapsets.cxx
===================================================================
RCS file: /cvs/systemtap/src/tapsets.cxx,v
retrieving revision 1.184
diff -u -p -r1.184 tapsets.cxx
--- tapsets.cxx	7 Mar 2007 15:45:23 -0000	1.184
+++ tapsets.cxx	14 Mar 2007 15:07:29 -0000
@@ -128,16 +128,17 @@ be_derived_probe::join_group (systemtap_
 
 // ------------------------------------------------------------------------
 void
-common_probe_entryfn_prologue (translator_output* o, string statestr)
+common_probe_entryfn_prologue (translator_output* o, string statestr,
+			       bool overload_processing = true)
 {
   o->newline() << "struct context* __restrict__ c;";
   o->newline() << "unsigned long flags;";
 
-  o->newline() << "#ifdef STP_TIMING";
-  // NB: we truncate cycles counts to 32 bits.  Perhaps it should be
-  // fewer, if the hardware counter rolls over really quickly.  See
-  // also ...epilogue().
-  o->newline() << "int32_t cycles_atstart = (int32_t) get_cycles ();";
+  if (overload_processing)
+    o->newline() << "#if defined(STP_TIMING) || defined(STP_OVERLOAD)";
+  else
+    o->newline() << "#ifdef STP_TIMING";
+  o->newline() << "cycles_t cycles_atstart = get_cycles ();";
   o->newline() << "#endif";
 
 #if 0 /* XXX: PERFMON */
@@ -192,18 +193,51 @@ common_probe_entryfn_prologue (translato
 
 
 void
-common_probe_entryfn_epilogue (translator_output* o)
+common_probe_entryfn_epilogue (translator_output* o,
+			       bool overload_processing = true)
 {
-  o->newline() << "#ifdef STP_TIMING";
+  if (overload_processing)
+    o->newline() << "#if defined(STP_TIMING) || defined(STP_OVERLOAD)";
+  else
+    o->newline() << "#ifdef STP_TIMING";
   o->newline() << "{";
-  o->newline(1) << "int32_t cycles_atend = (int32_t) get_cycles ();";
-  // Handle 32-bit wraparound.
-  o->newline() << "int32_t cycles_elapsed = (cycles_atend > cycles_atstart)";
-  o->newline(1) << "? (cycles_atend - cycles_atstart)";
-  o->newline() << ": (~(int32_t)0) - cycles_atstart + cycles_atend + 1;";
+  o->newline(1) << "cycles_t cycles_atend = get_cycles ();";
+  // NB: we truncate cycles counts to 32 bits.  Perhaps it should be
+  // fewer, if the hardware counter rolls over really quickly.  We
+  // handle 32-bit wraparound here.
+  o->newline() << "int32_t cycles_elapsed = ((int32_t)cycles_atend > (int32_t)cycles_atstart)";
+  o->newline(1) << "? ((int32_t)cycles_atend - (int32_t)cycles_atstart)";
+  o->newline() << ": (~(int32_t)0) - (int32_t)cycles_atstart + (int32_t)cycles_atend + 1;";
+  o->indent(-1);
 
+  o->newline() << "#ifdef STP_TIMING";
   o->newline() << "if (likely (c->statp)) _stp_stat_add(*c->statp, cycles_elapsed);";
-  o->indent(-1);
+  o->newline() << "#endif";
+
+  if (overload_processing)
+    {
+      o->newline() << "#ifdef STP_OVERLOAD";
+      o->newline() << "{";
+      // If the cycle count has wrapped (cycles_atend <= cycles_base),
+      // let's go ahead and pretend the interval has been reached.
+      // This should reset cycles_base and cycles_sum.
+      o->newline(1) << "cycles_t interval = (cycles_atend > c->cycles_base)";
+      o->newline(1) << "? (cycles_atend - c->cycles_base)";
+      o->newline() << ": (STP_OVERLOAD_INTERVAL + 1);";
+      o->newline(-1) << "c->cycles_sum += cycles_elapsed;";
+      o->newline() << "if (interval > STP_OVERLOAD_INTERVAL) {";
+      o->newline(1) << "if (c->cycles_sum > STP_OVERLOAD_THRESHOLD) {";
+      o->newline(1) << "_stp_error (\"probe overhead exceeded threshold\");";
+      o->newline() << "atomic_set (&session_state, STAP_SESSION_ERROR);";
+      o->newline(-1) << "}";
+
+      o->newline() << "c->cycles_base = cycles_atend;";
+      o->newline() << "c->cycles_sum = 0;";
+      o->newline(-1) << "}";
+      o->newline(-1) << "}";
+      o->newline() << "#endif";
+    }
+
   o->newline(-1) << "}";
   o->newline() << "#endif";
 
@@ -238,17 +272,17 @@ be_derived_probe_group::emit_module_decl
   s.op->newline() << "/* ---- begin/end probes ---- */";
   s.op->newline() << "void enter_begin_probe (void (*fn)(struct context*)) {";
   s.op->indent(1);
-  common_probe_entryfn_prologue (s.op, "STAP_SESSION_STARTING");
+  common_probe_entryfn_prologue (s.op, "STAP_SESSION_STARTING", false);
   s.op->newline() << "c->probe_point = \"begin\";";
   s.op->newline() << "(*fn) (c);";
-  common_probe_entryfn_epilogue (s.op);
+  common_probe_entryfn_epilogue (s.op, false);
   s.op->newline(-1) << "}";
   s.op->newline() << "void enter_end_probe (void (*fn)(struct context*)) {";
   s.op->indent(1);
-  common_probe_entryfn_prologue (s.op, "STAP_SESSION_STOPPING");
+  common_probe_entryfn_prologue (s.op, "STAP_SESSION_STOPPING", false);
   s.op->newline() << "c->probe_point = \"end\";";
   s.op->newline() << "(*fn) (c);";
-  common_probe_entryfn_epilogue (s.op);
+  common_probe_entryfn_epilogue (s.op, false);
   s.op->newline(-1) << "}";
 }
 
Index: translate.cxx
===================================================================
RCS file: /cvs/systemtap/src/translate.cxx,v
retrieving revision 1.159
diff -u -p -r1.159 translate.cxx
--- translate.cxx	23 Feb 2007 15:30:15 -0000	1.159
+++ translate.cxx	14 Mar 2007 15:07:29 -0000
@@ -805,6 +805,13 @@ c_unparser::emit_common_header ()
   o->newline() << "atomic_t error_count = ATOMIC_INIT (0);";
   o->newline() << "atomic_t skipped_count = ATOMIC_INIT (0);";
   o->newline();
+  o->newline() << "#ifndef STP_OVERLOAD_INTERVAL";
+  o->newline() << "#define STP_OVERLOAD_INTERVAL 1000000000LL";
+  o->newline() << "#endif";
+  o->newline() << "#ifndef STP_OVERLOAD_THRESHOLD";
+  o->newline() << "#define STP_OVERLOAD_THRESHOLD 500000000LL";
+  o->newline() << "#endif";
+  o->newline();
   o->newline() << "struct context {";
   o->newline(1) << "atomic_t busy;";
   o->newline() << "const char *probe_point;";
@@ -822,6 +829,10 @@ c_unparser::emit_common_header ()
   o->newline() << "#ifdef STP_TIMING";
   o->newline() << "Stat *statp;";
   o->newline() << "#endif";
+  o->newline() << "#ifdef STP_OVERLOAD";
+  o->newline() << "cycles_t cycles_base;";
+  o->newline() << "cycles_t cycles_sum;";
+  o->newline() << "#endif";
   o->newline() << "union {";
   o->indent(1);
 
@@ -4091,7 +4102,10 @@ translate_pass (systemtap_session& s)
 	}
 
       if (s.timing)
-	s.op->newline() << "#define STP_TIMING" << " " << s.timing ;
+	s.op->newline() << "#define STP_TIMING" << " " << s.timing;
+
+      if (s.overload)
+	s.op->newline() << "#define STP_OVERLOAD" << " " << s.overload;
 
       if (s.perfmon)
 	s.op->newline() << "#define STP_PERFMON";
