Top network users by PID
Problem
Someone asked whether there is a way to tell how much network traffic each process on a machine is generating. Several suggestions were given, including a SystemTap script written by Jose Santos. I revised that script to use the networking tapset, track both transmits and receives, and print top-like output.
Reference this mailing list thread.
Scripts
global ifxmit, ifrecv, ifdevs, ifpid, execname, user

# Record each transmitted packet against the current pid and device.
probe netdev.transmit
{
  p = pid()
  execname[p] = execname()
  user[p] = uid()
  ifdevs[p, dev_name] = dev_name
  ifxmit[p, dev_name] <<< length
  ifpid[p, dev_name] ++
}

# Record each received packet against the current pid and device.
probe netdev.receive
{
  p = pid()
  execname[p] = execname()
  user[p] = uid()
  ifdevs[p, dev_name] = dev_name
  ifrecv[p, dev_name] <<< length
  ifpid[p, dev_name] ++
}

# Print one row per (pid, device), busiest first, then reset all counters.
function print_activity()
{
  printf("%5s %5s %-7s %7s %7s %7s %7s %-15s\n",
         "PID", "UID", "DEV", "XMIT_PK", "RECV_PK",
         "XMIT_KB", "RECV_KB", "COMMAND")

  foreach ([pid, dev] in ifpid-) {
    n_xmit = @count(ifxmit[pid, dev])
    n_recv = @count(ifrecv[pid, dev])
    # Guard @sum: it must not be applied to an empty aggregate.
    printf("%5d %5d %-7s %7d %7d %7d %7d %-15s\n",
           pid, user[pid], dev, n_xmit, n_recv,
           n_xmit ? @sum(ifxmit[pid, dev])/1024 : 0,
           n_recv ? @sum(ifrecv[pid, dev])/1024 : 0,
           execname[pid])
  }

  print("\n")

  delete execname
  delete user
  delete ifdevs
  delete ifxmit
  delete ifrecv
  delete ifpid
}

# Report every 5 seconds.
probe timer.ms(5000)
{
  print_activity()
}
Output
The original script filtered out traffic for pid 0. During testing, much of the traffic was missing from the output. Once I removed the pid 0 filter, the missing traffic showed up in the stats. It appears that the networking probes can be triggered during interrupts, so the pid() function may return whichever process was running at that moment rather than the process that caused the traffic.
# stap nettop.stp
  PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
    0     0 eth0         66     344      18      19 swapper
 2469     0 eth0        214      39     167       1 Xvnc
23470     0 eth0         24      35       5       1 firefox-bin
 2281     0 eth0          1       1       0       0 wcstatusd
22446     0 eth0          1       0       1       0 sshd
 2538     0 eth0          0       1       0       0 metacity
23557     0 eth0          0       1       0       0 sh
23559     0 eth0          0       1       0       0 lspci
23566     0 eth0          0       1       0       0 sh

  PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
    0     0 eth0         14      80       0       3 swapper
 2469     0 eth0         32       2      20       0 Xvnc
22446     0 eth0          1       0       0       0 sshd
 2052    38 eth0          1       0       0       0 ntpd
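One way to gauge how much traffic ends up attributed to pid 0 is to tally probe hits by whether pid() returned 0. The following is a minimal sketch, not part of the original script: it assumes the same netdev.transmit and netdev.receive probes and their length variable as used above, and the hits array and the "pid 0"/"other" labels are names made up for illustration.

```systemtap
# Hypothetical diagnostic: split packet and byte counts by whether
# pid() returned 0 when the probe fired.
global hits

probe netdev.transmit
{
  hits["xmit", pid() == 0 ? "pid 0" : "other"] <<< length
}

probe netdev.receive
{
  hits["recv", pid() == 0 ? "pid 0" : "other"] <<< length
}

# Every 5 seconds, print the totals for each (direction, bucket) and reset.
probe timer.s(5)
{
  printf("%-5s %-6s %8s %10s\n", "DIR", "WHO", "PACKETS", "BYTES")
  foreach ([dir, who] in hits)
    printf("%-5s %-6s %8d %10d\n", dir, who,
           @count(hits[dir, who]), @sum(hits[dir, who]))
  print("\n")
  delete hits
}
```

A large "pid 0" share on the receive side would be consistent with the swapper rows in the output above, though the exact split will depend on the kernel, the driver, and the workload.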
Lessons
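Top-like scripts are very easy to write in SystemTap. This same general script structure (probes that accumulate into global arrays, plus a timer probe that prints and resets them) can be applied to many data collection tasks. The hard part is finding the right kernel function to probe. It's also important to understand the context in which functions/probes can be triggered: as the pid 0 rows in the output show, a probe that fires during an interrupt does not necessarily run on behalf of the process responsible for the event.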
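As an illustration of reusing that structure for a different task, here is a rough sketch that counts system calls per process instead of network packets. It is an assumed example rather than anything from the original thread: the calls and execnames arrays and the print_top() function are hypothetical names, and probing every syscall.* alias can add noticeable overhead on a busy machine.

```systemtap
# Hypothetical top-like script: system calls per process every 5 seconds.
global calls, execnames

# Accumulate one count per system call entry, keyed by pid.
probe syscall.*
{
  p = pid()
  execnames[p] = execname()
  calls[p]++
}

# Print the ten busiest pids, highest count first, then reset.
function print_top()
{
  printf("%5s %8s %-15s\n", "PID", "SYSCALLS", "COMMAND")
  foreach (p in calls- limit 10)
    printf("%5d %8d %-15s\n", p, calls[p], execnames[p])
  print("\n")
  delete calls
  delete execnames
}

probe timer.s(5)
{
  print_top()
}
```

Only the probe point and the value being accumulated change; the collect, print, and delete cycle is the same as in the network script. System call probes run in the calling process's context, so pid() here should not have the interrupt-attribution problem seen with the network probes.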