Top network users by PID
Problem
Someone asked whether there is a way to tell how much network traffic each process on a machine generates. Several suggestions were given, including a SystemTap script written by Jose Santos. I revised that script to use the networking tapset, track both transmits and receives, and print top-like output.
Reference this mailing list thread.
Scripts
# Per-(pid, device) data, cleared after every report.
global ifxmit, ifrecv, ifdevs, ifpid, execname, user

# Fires for each packet queued for transmission on a network device.
probe netdev.transmit
{
    p = pid()
    execname[p] = execname()
    user[p] = uid()
    ifdevs[p, dev_name] = dev_name
    ifxmit[p, dev_name] <<< length    # packet size in bytes
    ifpid[p, dev_name] ++             # event count, used as the sort key
}

# Fires for each packet received from a network device.
probe netdev.receive
{
    p = pid()
    execname[p] = execname()
    user[p] = uid()
    ifdevs[p, dev_name] = dev_name
    ifrecv[p, dev_name] <<< length
    ifpid[p, dev_name] ++
}

function print_activity()
{
    printf("%5s %5s %-7s %7s %7s %7s %7s %-15s\n",
           "PID", "UID", "DEV", "XMIT_PK", "RECV_PK",
           "XMIT_KB", "RECV_KB", "COMMAND")

    # Busiest (pid, device) pairs first.
    foreach ([pid, dev] in ifpid-) {
        n_xmit = @count(ifxmit[pid, dev])
        n_recv = @count(ifrecv[pid, dev])
        # Only call @sum when at least one sample was recorded.
        printf("%5d %5d %-7s %7d %7d %7d %7d %-15s\n",
               pid, user[pid], dev, n_xmit, n_recv,
               n_xmit ? @sum(ifxmit[pid, dev])/1024 : 0,
               n_recv ? @sum(ifrecv[pid, dev])/1024 : 0,
               execname[pid])
    }
    print("\n")

    # Start the next interval from scratch.
    delete execname
    delete user
    delete ifdevs
    delete ifxmit
    delete ifrecv
    delete ifpid
}

# Report every 5 seconds.
probe timer.ms(5000)
{
    print_activity()
}
Output
The original script filtered out traffic for pid 0, and during testing much of the traffic was missing from the output. I removed the pid 0 filter and the missing traffic stats showed up. It appears that the networking probes can be triggered during interrupts, where pid() reports whichever task happened to be running on that CPU (pid 0 is the idle task, shown as swapper below) rather than the process that actually caused the traffic.
# stap nettop.stp
  PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
    0     0 eth0         66     344      18      19 swapper
 2469     0 eth0        214      39     167       1 Xvnc
23470     0 eth0         24      35       5       1 firefox-bin
 2281     0 eth0          1       1       0       0 wcstatusd
22446     0 eth0          1       0       1       0 sshd
 2538     0 eth0          0       1       0       0 metacity
23557     0 eth0          0       1       0       0 sh
23559     0 eth0          0       1       0       0 lspci
23566     0 eth0          0       1       0       0 sh

  PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
    0     0 eth0         14      80       0       3 swapper
 2469     0 eth0         32       2      20       0 Xvnc
22446     0 eth0          1       0       0       0 sshd
 2052    38 eth0          1       0       0       0 ntpd
Lessons
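Top-like scripts are very easy to write in SystemTap. The same general structure (accumulate data in global arrays from probe handlers, then sort, print, and clear them on a timer) can be applied to many data collection tasks. The hard part is finding the right kernel function to probe. It is also important to understand the context in which a probe can be triggered: as the pid 0 rows show, a probe that fires during an interrupt cannot rely on pid() to identify the process responsible.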
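As an illustration of how the structure carries over to another task, here is a minimal sketch that ranks processes by the number of read calls they make. It is not part of the original thread: the choice of the vfs.read probe, the reads array, the print_top() function, and the 10-row limit are all assumptions made for this example.

```systemtap
# Sketch only: same accumulate / sort / print / reset pattern as nettop.stp,
# applied to read calls instead of network packets.
global reads

# Count one event per call, keyed by pid and command name.
probe vfs.read
{
    reads[pid(), execname()] ++
}

function print_top()
{
    printf("%5s %8s %-15s\n", "PID", "READS", "COMMAND")

    # Highest counts first; show at most 10 rows (arbitrary limit).
    foreach ([p, cmd] in reads- limit 10)
        printf("%5d %8d %-15s\n", p, reads[p, cmd], cmd)
    print("\n")

    # Clear the counts so each report covers a single interval.
    delete reads
}

probe timer.s(5)
{
    print_top()
}
```

Only the probe point and the data recorded in the handler change; the reporting function and the timer probe keep the same shape as in the network script. The caveat about probe context applies here as well, so it is worth checking whether the chosen probe runs on behalf of the process being counted.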
