This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: [PATCH] Scheduler Tapset based on kernel tracepoints


Kiran wrote:
> > Hi,
> >
> > This patch adds kernel tracepoints based probes to the scheduler tapset
> > along with the testcase, scheduler-test-tracepoints.stp and an example
> > script, sched_switch.stp.
> >
> > Signed-off-by: Kiran Prakash <kiran@linux.vnet.ibm.com>
> >
> > diff -Naur systemtap-0.9.9-orig/tapset/scheduler.stp systemtap-0.9.9/tapset/scheduler.stp
> > --- systemtap-0.9.9-orig/tapset/scheduler.stp	2009-09-17 02:35:18.000000000 -0400
> > +++ systemtap-0.9.9/tapset/scheduler.stp	2009-09-17 02:32:49.000000000 -0400
> > @@ -33,7 +33,7 @@
> >   *  idle - boolean indicating whether current is the idle process
> >   */
> >  probe scheduler.cpu_off
> > -    = kernel.function("context_switch")
> > +    =  kernel.trace("sched_switch")
> >  {
> >      task_prev = $prev
> >      task_next = $next
> > @@ -124,6 +124,7 @@
> >  %( arch != "x86_64" && arch != "ia64" %?
> >  	kernel.function("__switch_to")
> >  %:
> > +	kernel.trace("sched_switch") ?,
> >  	kernel.function("context_switch")
> >  %)
> >  {

Tracepoints were already added for scheduler.cpu_off and scheduler.ctxswitch a
couple of days ago in:

http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=commit;h=ef0e74fc1131f1d217c78aa839d0de731ea7c940

Note that the checked-in patch uses the "!" operator to prefer the tracepoints
if available, so the probe points still work on older kernels.  The proposed
probe scheduler.sched_switch is already implemented as scheduler.ctxswitch.
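
For reference, a minimal sketch of that pattern. It assumes the alias body is
the same as in the hunk quoted above; the checked-in tapset may differ in
detail.

```
# Sketch only: the body is copied from the quoted hunk, not from the commit.
# The "!" suffix marks a probe point as optional and sufficient: if the
# tracepoint resolves, the alternatives after it are skipped; if it does
# not (older kernels), the function probe is used instead.
probe scheduler.cpu_off
    = kernel.trace("sched_switch") !,
      kernel.function("context_switch")
{
    task_prev = $prev
    task_next = $next
}
```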

Have the tapset probe points been tried out on older kernels, e.g. Fedora 10/11
or RHEL5?


> > diff -Naur systemtap-0.9.9-orig/testsuite/systemtap.examples/profiling/sched_switch.meta systemtap-0.9.9/testsuite/systemtap.examples/profiling/sched_switch.meta
> > --- systemtap-0.9.9-orig/testsuite/systemtap.examples/profiling/sched_switch.meta	1969-12-31 19:00:00.000000000 -0500
> > +++ systemtap-0.9.9/testsuite/systemtap.examples/profiling/sched_switch.meta	2009-09-16 03:21:51.000000000 -0400
> > @@ -0,0 +1,14 @@
> > +title: Display the task switches happeningt the scheduler
> > +name: sched_switch.stp
> > +version: 1.0
> > +author: kiran
> > +keywords: profiling functions
> > +subsystem: kernel
> > +status: production
> > +exit: user-controlled
> > +output: sorted-list on-exit
> > +scope: system-wide
> > +description: The sched_switch.stp script takes two arguments, first argument can be "pid" or "name" to indicate what is being passed as second argument. The script will trace the process based on pid/name and print the scheduler switches happening with the process. If no arguments are passed, it displays all the scheduler switches.
> > +test_check: stap -p4 sched_switch.stp
> > +test_installcheck: stap  sched_switch.stp -c "sleep 1"
> > +

The description in the meta file should say what the script's output is and
what the script might be used for.

> > diff -Naur systemtap-0.9.9-orig/testsuite/systemtap.examples/profiling/sched_switch.stp systemtap-0.9.9/testsuite/systemtap.examples/profiling/sched_switch.stp
> > --- systemtap-0.9.9-orig/testsuite/systemtap.examples/profiling/sched_switch.stp	1969-12-31 19:00:00.000000000 -0500
> > +++ systemtap-0.9.9/testsuite/systemtap.examples/profiling/sched_switch.stp	2009-09-16 03:21:53.000000000 -0400
> > @@ -0,0 +1,71 @@
> > +/* This script works similar to ftrace's sched_switch. It displays a list of
> > + * processes which get switched in and out of the scheduler. The format of display
> > + * is PROCESS_NAME PROCESS_PID CPU TIMESTAMP PID: PRIORITY: PROCESS STATE ->/+
> > + *    NEXT_PID : NEXT_PRIORITY: NEXT_STATE NEXT_PROCESS_NAME
> > + * -> indicates that previous process is scheduled out and the next process is
> > + *    scheduled in.
> > + * + indicates that previous process has woken up the next process.
> > + * The usage is sched_switch.stp <"pid"/"name"> pid/name
> > + */
> > +
> > +global task_cpu_old[9999]
> > +global pids[999]
> > +global processes
> > +global prev
> > +
> > +function state_calc(state) {
> > +        if(state == 0)
> > +        status = "R"
> > +        if(state == 1)
> > +        status = "S"
> > +        if(state == 2)
> > +        status = "D"
> > +        if(state == 4)
> > +        status = "T"
> > +        if(state == 8)
> > +        status = "T"
> > +        if(state == 16)
> > +        status = "Z"
> > +        if(state == 32)
> > +        status = "EXIT_DEAD"
> > +        return status
> > +}
> > +probe scheduler.wakeup
> > +{
> > +	pids[task_pid]++
> > +	processes[task_pid] = $p;
> > +	prev[task_pid] = task_current()
> > +	
> > +}
> > +probe scheduler.sched_switch
> > +{

Why not use the existing scheduler.ctxswitch probe point?

> > +	tid = next_tid
> > +	tid1 = prev_tid
> > +	state = prev_state
> > +	state1 = next_state
> > +	
> > +	%( $# == 2 %?
> > +	
> > +	if(@1 == "pid")
> > +		if (tid != $2 && tid1 != $2)
> > +			next
> > +	if(@1 == "name")
> > +		if (task_execname(task_current()) != @2 && task_execname($next) != @2)
> > +               		next
> > +	
> > +	foreach (name in pids-) {
> > +		if ((@1 == "pid" && (name == $2 || task_pid(prev[name]) == $2)) ||
> > +		   (@1 == "name" && (task_execname(prev[name]) == @2 || task_execname(processes[name]) == @2)))
> > +			printf("%s\t\t%d\t%d\t%d\t%d:%d:%s + %d:%d:%s %s\n",
> > +				task_execname(prev[name]), task_pid(prev[name]), task_cpu(processes[name]), gettimeofday_ns(),
> > +				task_pid(prev[name]), task_prio(prev[name]), state_calc(task_state(prev[name])),
> > +				task_pid(processes[name]), task_prio(processes[name]), state_calc(task_state(processes[name])),
> > +				task_execname(processes[name]))
> > +	} %)
> > +
> > +	old_cpu = task_cpu_old[tid]
> > +	printf("%s\t\t%d\t%d\t%d\t%d:%d:%s ==> %d:%d:%s %s\n",task_execname(task_current()),tid1,
> > +	old_cpu,gettimeofday_ns(),tid1,task_prio(task_current()),state_calc(state),next_pid,
> > +		next_prio,state_calc(next_state),next_task_name )
> > +	task_cpu_old[next_tid] = cpu()
> > +}
> >
> >
> > Thanks,
> > Kiran
> >

-Will

