This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: measuring accept queueing time (revisited)


"Frank Ch. Eigler" <fche@redhat.com> wrote:
> normalperson (we'll just see about that :-) wrote:
> 
> > Hi all, I'm wondering if there's an updated/better version of
> > the [script] posted here:
> >
> >   http://sourceware.org/ml/systemtap/2006-q1/msg00385.html
> >
> > I can probably figure it out on my own, but would like to avoid
> > doing so if somebody's already done the work :)
> 
> Yeah, some changes since then:
> - data structures are more easily traversed by pure
>   script code -> and @cast type operators
> - the kernel has useful tracepoints compiled in for more events,
>   **but** this does not currently seem to include any interesting
>   tcp/network/socket stuff
> - the translator has gotten better placing probes
> - the compiler has gotten better at generating debug data to let
>   us find data 
> 
> So these days, I'd try to go for intercepting the inline functions
> 
>    probe kernel.function("sk_acceptq_added") {
>       /* record $sk, timestamp */
>    }
>    probe kernel.function("sk_acceptq_removed") { 
>       /* match $sk  */
>       /* record elapsed time, execname(), pid() */
>    }

Thank you!  With that hint, I came up with the following for TCP:
-------------------------------- 8<----------------------------------
global start

probe kernel.function("sk_acceptq_added") {
	start[$sk] = cpu_clock_ms(0)
}

probe kernel.function("sk_acceptq_removed") {
	started = start[$sk]
	if (started) {
		delete start[$sk]
		diff = cpu_clock_ms(0) - started
		printf("%s[%d] %ld\n", execname(), pid(), diff)
	}
}
-------------------------------- 8<----------------------------------

> or their (source-code-reading-determined) callers.
> 
> And perhaps try to raise a flag on lkml to get some tracepoints into
> the tcp/ip layer pretty please.

Will do if needed (not sure what to even ask for at this point, but
I'll try to figure it out).

> > [...]  I'd also like to be able to get the same accept queueing time
> > for Unix sockets, too.
> 
> There doesn't seem to be much queuing logic in UNIX sockets, for the
> connect/accept path at least.

I managed to come up with the following for UNIX sockets.  It could
probably be done better, but I'm not sure why my probe for unix_accept()
does not work.

Thus, I'm probing skb_recv_datagram() instead, but that might cast
too big of a net (I expect it to be slower, but haven't benched).

-------------------------------- 8<----------------------------------
global start

/*
 * 3.6.8 only has two unix_create1() callers:
 * - unix_stream_connect() - we care about this
 * - unix_create()         - we don't care about this
 */
probe kernel.function("unix_create1").return {
	start[$return] = cpu_clock_ms(0)
}

/*
 * we only care about unix_create1() called from unix_stream_connect(),
 * so avoid wasting space here.
 * Maybe future (or past) kernels had more unix_create1() callers,
 * that would be bad...
 */
probe kernel.function("unix_create").return {
	delete start[$return]
}

/* XXX: This doesn't seem to work, so I'm probing skb_recv_diagram instead */
probe kernel.function("unix_accept").return {
	if ($return != 0)
		next
	sk = $newsock->sk
	/* state: 3 == SS_CONNECTED, but this shows 1 (SS_UNCONNECTED) ... */
	/* printf("accept %d %lx %lx\n", $newsock->state, $newsock, sk); */
	started = start[sk]
	if (started) {
		delete start[sk]
		diff = cpu_clock_ms(0) - started
		printf("%s[%d] %ld\n", execname(), pid(), diff)
	}
}

probe kernel.function("skb_recv_datagram").return {
	if (!$return)
		next
	sk = $return->sk
	started = start[sk]
	if (started) {
		delete start[sk]
		diff = cpu_clock_ms(0) - started
		printf("%s[%d] %ld\n", execname(), pid(), diff)
	}
}
-------------------------------- 8<----------------------------------
I'm using an unpatched Linux 3.6.8 from the stable branch


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]