This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



[Bug kprobes/2062] Return probes does not scale well on SMP box


------- Additional Comments From jkenisto at us dot ibm dot com  2006-07-13 19:44 -------
(In reply to comment #19)

> ====== 2.6.17.3/ppc64/8-way, without Jim's patch ================
> 
> no probe
>   Total cpus: loops = 40000000, average = 6202 ns
> 
> kretprobe using stap:
>   Total cpus: loops = 40000000, average = 43702 ns
> 
> kretprobe using getsid.c:
>   Total cpus: loops = 40000000, average = 36456 ns
> 
> 
> ======= 2.6.17.4/ppc64/8-way, with Jim's patch ===================
> 
> kretprobe using stap:
>   Total cpus: loops = 40000000, average = 26621 ns
> 
> kretprobe using getsid.c:
>   Total cpus: loops = 40000000, average = 24975 ns

This is actually pretty much what I'd expect to see.  All the CPUs are hitting
the same probepoint repeatedly and piling up on the same per-kretprobe lock. 
But the performance gain is reasonably good, and even better when stap's
involved.  The stap-generated handler takes longer than the empty kprobes
handler, so we get more benefit from not holding a lock while the handler runs.
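
To put rough numbers on the gain, here is a minimal sketch that computes the relative improvement from the averages quoted above. The function and variable names are made up for illustration. It also assumes that the "no probe" baseline measured on 2.6.17.3 is a fair baseline for the patched 2.6.17.4 runs, which the quoted results do not confirm.

```python
# Averages in ns, copied from the quoted benchmark output.
# Assumption: the no-probe baseline (measured on 2.6.17.3 only) also
# applies to the patched 2.6.17.4 kernel.
BASELINE_NS = 6202

RESULTS_NS = {
    "stap": {"unpatched": 43702, "patched": 26621},
    "getsid.c": {"unpatched": 36456, "patched": 24975},
}


def percent_reduction(before, after):
    """Return how much smaller `after` is than `before`, in percent."""
    return 100.0 * (before - after) / before


def summarize(results=RESULTS_NS, baseline=BASELINE_NS):
    """Compute the reduction in total time and in probe overhead alone."""
    summary = {}
    for name, r in results.items():
        summary[name] = {
            # Reduction in the measured average, probe cost included.
            "total": percent_reduction(r["unpatched"], r["patched"]),
            # Reduction in the probe overhead (average minus baseline).
            "overhead": percent_reduction(
                r["unpatched"] - baseline, r["patched"] - baseline
            ),
        }
    return summary


if __name__ == "__main__":
    for name, s in summarize().items():
        print(
            f"{name}: total {s['total']:.1f}% lower, "
            f"probe overhead {s['overhead']:.1f}% lower"
        )
```

Under those assumptions the stap case improves by about 39% overall (about 46% of the probe overhead), against about 31% (about 38%) for getsid.c, which matches the point that the longer-running handler benefits more.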

We should see more benefit with multiple kretprobes -- e.g.
probe syscall.*.return { ... }

Keep the comments and improvements coming.  Thanks.

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=2062


