This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v2] Scheduler Tapset based on kernel tracepoints


Kiran <kiran@linux.vnet.ibm.com> writes:

(Is that your full name, for purposes of being listed in AUTHORS?)


> [...]
> --- systemtap-orig/tapset/scheduler.stp	2009-09-19 10:27:14.000000000 -0400
> +++ systemtap/tapset/scheduler.stp	2009-09-22 07:32:04.000000000 -0400
> @@ -118,32 +118,280 @@
>   *  Arguments:
>   *    prev_pid: The pid of the process to be switched out
>   *    next_pid: The pid of the process to be switched in
> + *    prev_tid: The tid of the process to be switched out
> + *    next_tid: The tid of the process to be switched in	 
> + *    prev_task_name: The name of the process to be switched out
> + *    next_task_name: The name of the process to be switched in
>   *    prevtsk_state: the state of the process to be switched out
> + *    nexttsk_state: the state of the process to be switched in
>   */

Please recast these as kerneldoc-formatted embedded docs that can
be extracted into our man pages.


> -probe scheduler.ctxswitch =
> +
> +probe scheduler.ctxswitch.tp = kernel.trace("sched_switch") 
> +{
> +	next_pid =  $next->pid
> +        next_pid = task_tid($next)
> +        next_task = $next
> +        next_task_name = task_execname($next)
> +        nexttsk_state = $next->state
> +	prev_pid = $prev->pid
> +        prev_tid = task_tid(tid)
> +        prev_task = $prev
> +        prev_task_name = task_execname($prev)
> +        prevtsk_state = $prev->state
> +}

There's a consistent confusion between pid and tid here.
pid is $task->tgid;  tid is $task->pid.  (There is not much
need for using task_tid() function, certainly not in this
asymmetric way.)



> +/**
> + * probe scheduler.kthread_stop - Fires when a thread created by kthread_create is stopped.
> + * @thread_pid: pid of the thread being stopped.
> + * @thread_priority: priority of the thread.
> + */
> +probe scheduler.kthread_stop.kp = kernel.function("kthread_stop")
> +{
> +	thread_pid = $k->pid
> +	thread_priority = $k->priority
> +}
> +probe scheduler.kthread_stop.tp = kernel.trace("sched_kthread_stop") 
> +{
> +        thread_pid = $t->pid
> +        thread_priority = $t->prio
> +}
> +probe scheduler.kthread_stop 
> +   = scheduler.kthread_stop.tp !,
> +     scheduler.kthread_stop.kp
> +{}

This one looks just right!


> diff -Naur systemtap-orig/testsuite/buildok/scheduler-test-tracepoints.stp systemtap/testsuite/buildok/scheduler-test-tracepoints.stp
> +#! stap -up4
> +
> +//Tests if all probes in the scheduler tapset are resolvable.
> +
> +probe scheduler.kthread_stop {
> +	printf("pid = %d, priority = %d\n", thread_pid, thread_priority);
> +}
> [...]

These look good (if they print out all the variables).


> [...]
> diff -Naur systemtap-orig/testsuite/systemtap.examples/profiling/sched_switch.stp systemtap/testsuite/systemtap.examples/profiling/sched_switch.stp
> --- systemtap-orig/testsuite/systemtap.examples/profiling/sched_switch.stp	1969-12-31 19:00:00.000000000 -0500
> +++ systemtap/testsuite/systemtap.examples/profiling/sched_switch.stp	2009-09-22 02:29:16.000000000 -0400
> @@ -0,0 +1,71 @@
> +/* This script wokrs similar to ftrace's sched_switch. It displays a list of
> + * processes which get switched in and out of the scheduler. The format of display
> + * is PROCESS_NAME PROCESS_PID CPU TIMESTAMP PID: PRIORITY: PROCESS STATE ->/+
> + *    NEXT_PID : NEXT_PRIORITY: NEXT_STATE NEXT_PROCESS_NAME 
> + * -> indicates that prev process is scheduled out and the next process is 
> + *    scheduled in.
> + * + indicates that prev process has woken up the next process.
> + * The usage is sched_switch.stp <"pid"/"name"> pid/name
> + */
> +
> +global task_cpu_old[9999]
> +global pids[999]
> [...]

Please write down whatever rationale exists for those two numbers.
Perhaps they could stick with defaults after the pid/tid confusion is
cleared up, and if perhaps you add some code to delete entries
belonging to processes as they die.

It'd be good to see a statement of what OS/kernel versions this
was tested on, and to see a sample of its output.

Thank you for your efforts!


- FChE


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]