Kernel Profiling
Problem
One might wonder what the kernel is up to in general, wanting only a rough, sampled overview with minimal overhead. The following script gives such an impression.
Scripts
global profile, pcount

probe timer.profile {
  pcount <<< 1
  fn = user_mode() ? "<user>" : symname(addr())
  if (fn != "") profile[fn] <<< 1
}

probe timer.ms(4000) {
  printf ("\n--- %d samples recorded:\n", @count(pcount))
  foreach (f in profile- limit 10) {
    printf ("%s\t%d\n", f, @count(profile[f]))
  }
  delete profile
  delete pcount
}
Output
# stap pf2.stp

--- 109 samples recorded:
mwait_idle	71
check_poison_obj	4
_spin_unlock_irqrestore	2
dbg_redzone1	1
kfree	1
kmem_cache_free	1
_spin_unlock_irq	1
end_buffer_write_sync	1
lock_acquire	1

--- 108 samples recorded:
mwait_idle	91
check_poison_obj	3
_spin_unlock_irq	2
delay_tsc	1
Lessons
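This script samples from the timer interrupt, which fires perhaps tens to hundreds of times a second, depending on the kernel's tick rate. Because the timer.profile handler only adds to statistics aggregates (the <<< operator) held in globals, it can run concurrently on a multiprocessor machine without the CPUs contending for a shared counter.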
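The sampling rate on a given machine can be checked directly. The following is a minimal sketch, not part of the original script: the global name `ticks` and the one-second reporting interval are chosen here purely for illustration. It counts timer.profile hits per CPU using the same kind of aggregate global.

```systemtap
# Hypothetical helper: count profiling samples per CPU each second.
global ticks

probe timer.profile {
  # cpu() is the number of the processor the handler is running on.
  ticks[cpu()] <<< 1
}

probe timer.s(1) {
  # Iterate in ascending CPU order and print the per-second sample count.
  foreach (c+ in ticks)
    printf ("cpu %d: %d samples/s\n", c, @count(ticks[c]))
  delete ticks
}
```

An idle CPU may report fewer samples than a busy one, so the figures are best read as a rough indication rather than an exact tick rate.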
User-space functions are not resolved by name: every sample taken in user mode is lumped under the single entry "<user>". What tends to show up in detail are therefore kernel-intensive workloads, or idleness ("mwait_idle" on recent kernels). Programs that are simply busy in user space appear only as part of the "<user>" count, not individually.
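If the question is instead which programs are busy in user space, one possible variation is to key the user-mode samples by process name. This is a sketch under that assumption, not the original author's script. The global name `user_hits` is invented for the example, and it reports only per-process totals, not user-space function names.

```systemtap
# Hypothetical variant: attribute user-mode samples to process names.
global user_hits

probe timer.profile {
  # Only count samples that interrupted user-space code.
  if (user_mode())
    user_hits[execname()] <<< 1
}

probe timer.ms(4000) {
  printf ("\n--- user-mode samples by process:\n")
  # Sort by sample count, descending, and show the ten busiest processes.
  foreach (name in user_hits- limit 10)
    printf ("%s\t%d\n", name, @count(user_hits[name]))
  delete user_hits
}
```

Run alongside the kernel profile above, this gives a rough split between time spent in the kernel and time spent in individual user programs.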
See also WSFunctionCallCount.