This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [PATCH] Scheduler Tapset based on kernel tracepoints
- From: William Cohen <wcohen at redhat dot com>
- To: Kiran <kiran at linux dot vnet dot ibm dot com>
- Cc: systemtap at sources dot redhat dot com
- Date: Fri, 18 Sep 2009 09:42:05 -0400
- Subject: Re: [PATCH] Scheduler Tapset based on kernel tracepoints
- References: <1253178628.4364.18.camel@kiran-laptop>
Kiran wrote:
> > Hi,
> >
> > This patch adds kernel tracepoints based probes to the scheduler tapset
> > along with the testcase, scheduler-test-tracepoints.stp and an example
> > script, sched_switch.stp.
> >
> > Signed-off-by: Kiran Prakash <kiran@linux.vnet.ibm.com>
> >
> > diff -Naur systemtap-0.9.9-orig/tapset/scheduler.stp systemtap-0.9.9/tapset/scheduler.stp
> > --- systemtap-0.9.9-orig/tapset/scheduler.stp 2009-09-17 02:35:18.000000000 -0400
> > +++ systemtap-0.9.9/tapset/scheduler.stp 2009-09-17 02:32:49.000000000 -0400
> > @@ -33,7 +33,7 @@
> > * idle - boolean indicating whether current is the idle process
> > */
> > probe scheduler.cpu_off
> > - = kernel.function("context_switch")
> > + = kernel.trace("sched_switch")
> > {
> > task_prev = $prev
> > task_next = $next
> > @@ -124,6 +124,7 @@
> > %( arch != "x86_64" && arch != "ia64" %?
> > kernel.function("__switch_to")
> > %:
> > + kernel.trace("sched_switch") ?,
> > kernel.function("context_switch")
> > %)
> > {
Tracepoints were already added for scheduler.cpu_off and scheduler.ctxswitch a
couple of days ago in:
http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=commit;h=ef0e74fc1131f1d217c78aa839d0de731ea7c940
Note that the checked-in patch uses the "!" operator to prefer the tracepoints
if available, so the probe points still work on older kernels. The proposed
scheduler.sched_switch probe is already implemented as scheduler.ctxswitch.
Have the tapset probe points been tried out on older kernels, e.g. Fedora 10/11
or RHEL5?
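
For reference, the "!" suffix marks a probe point as optional and sufficient:
if it resolves, the alternatives listed after it are skipped. A minimal sketch
of the idea follows. It is not the checked-in tapset code; the alias name
my_ctxswitch and the variable switch_cpu are made up for illustration, and the
fallback function is simply the one used in the patch above.

```
# Hypothetical alias, for illustration only (not the real tapset definition).
probe my_ctxswitch =
        kernel.trace("sched_switch") !,     # preferred: used alone if it resolves
        kernel.function("context_switch")   # fallback when the tracepoint is absent
{
        # Shared prologue, run for whichever probe point was resolved.
        switch_cpu = cpu()
}

# A script uses the alias without caring which probe point backs it.
probe my_ctxswitch
{
        printf("context switch on cpu %d\n", switch_cpu)
}
```

With this arrangement a script written against the alias should not need to
change between kernels that have the tracepoint and kernels that do not.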
> > diff -Naur systemtap-0.9.9-orig/testsuite/systemtap.examples/profiling/sched_switch.meta systemtap-0.9.9/testsuite/systemtap.examples/profiling/sched_switch.meta
> > --- systemtap-0.9.9-orig/testsuite/systemtap.examples/profiling/sched_switch.meta 1969-12-31 19:00:00.000000000 -0500
> > +++ systemtap-0.9.9/testsuite/systemtap.examples/profiling/sched_switch.meta 2009-09-16 03:21:51.000000000 -0400
> > @@ -0,0 +1,14 @@
> > +title: Display the task switches happeningt the scheduler
> > +name: sched_switch.stp
> > +version: 1.0
> > +author: kiran
> > +keywords: profiling functions
> > +subsystem: kernel
> > +status: production
> > +exit: user-controlled
> > +output: sorted-list on-exit
> > +scope: system-wide
> > +description: The sched_switch.stp script takes two arguments, first argument can be "pid" or "name" to indicate what is being passed as second argument. The script will trace the process based on pid/name and print the scheduler switches happening with the process. If no arguments are passed, it displays all the scheduler switches.
> > +test_check: stap -p4 sched_switch.stp
> > +test_installcheck: stap sched_switch.stp -c "sleep 1"
> > +
The description in the meta file should say what the script's output is and
what it might be used for.
> > diff -Naur systemtap-0.9.9-orig/testsuite/systemtap.examples/profiling/sched_switch.stp systemtap-0.9.9/testsuite/systemtap.examples/profiling/sched_switch.stp
> > --- systemtap-0.9.9-orig/testsuite/systemtap.examples/profiling/sched_switch.stp 1969-12-31 19:00:00.000000000 -0500
> > +++ systemtap-0.9.9/testsuite/systemtap.examples/profiling/sched_switch.stp 2009-09-16 03:21:53.000000000 -0400
> > @@ -0,0 +1,71 @@
> > +/* This script works similar to ftrace's sched_switch. It displays a list of
> > + * processes which get switched in and out of the scheduler. The format of display
> > + * is PROCESS_NAME PROCESS_PID CPU TIMESTAMP PID: PRIORITY: PROCESS STATE ->/+
> > + * NEXT_PID : NEXT_PRIORITY: NEXT_STATE NEXT_PROCESS_NAME
> > + * -> indicates that previous process is scheduled out and the next process is
> > + * scheduled in.
> > + * + indicates that previous process has woken up the next process.
> > + * The usage is sched_switch.stp <"pid"/"name"> pid/name
> > + */
> > +
> > +global task_cpu_old[9999]
> > +global pids[999]
> > +global processes
> > +global prev
> > +
> > +function state_calc(state) {
> > + if(state == 0)
> > + status = "R"
> > + if(state == 1)
> > + status = "S"
> > + if(state == 2)
> > + status = "D"
> > + if(state == 4)
> > + status = "T"
> > + if(state == 8)
> > + status = "T"
> > + if(state == 16)
> > + status = "Z"
> > + if(state == 32)
> > + status = "EXIT_DEAD"
> > + return status
> > +}
> > +probe scheduler.wakeup
> > +{
> > + pids[task_pid]++
> > + processes[task_pid] = $p;
> > + prev[task_pid] = task_current()
> > +
> > +}
> > +probe scheduler.sched_switch
> > +{
Why not use the existing scheduler.ctxswitch probe point?
> > + tid = next_tid
> > + tid1 = prev_tid
> > + state = prev_state
> > + state1 = next_state
> > +
> > + %( $# == 2 %?
> > +
> > + if(@1 == "pid")
> > + if (tid != $2 && tid1 != $2)
> > + next
> > + if(@1 == "name")
> > + if (task_execname(task_current()) != @2 && task_execname($next) != @2)
> > + next
> > +
> > + foreach (name in pids-) {
> > + if ((@1 == "pid" && (name == $2 || task_pid(prev[name]) == $2)) ||
> > + (@1 == "name" && (task_execname(prev[name]) == @2 || task_execname(processes[name]) == @2)))
> > + printf("%s\t\t%d\t%d\t%d\t%d:%d:%s + %d:%d:%s %s\n",
> > + task_execname(prev[name]), task_pid(prev[name]), task_cpu(processes[name]), gettimeofday_ns(),
> > + task_pid(prev[name]), task_prio(prev[name]), state_calc(task_state(prev[name])),
> > + task_pid(processes[name]), task_prio(processes[name]), state_calc(task_state(processes[name])),
> > + task_execname(processes[name]))
> > + } %)
> > +
> > + old_cpu = task_cpu_old[tid]
> > + printf("%s\t\t%d\t%d\t%d\t%d:%d:%s ==> %d:%d:%s %s\n",task_execname(task_current()),tid1,
> > + old_cpu,gettimeofday_ns(),tid1,task_prio(task_current()),state_calc(state),next_pid,
> > + next_prio,state_calc(next_state),next_task_name )
> > + task_cpu_old[next_tid] = cpu()
> > +}
> >
> >
> > Thanks,
> > Kiran
> >
-Will