This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: Why did control channel closes itself?
- From: "Peter Teoh" <htmldeveloper at gmail dot com>
- To: "Masami Hiramatsu" <mhiramat at redhat dot com>
- Cc: systemtap at sources dot redhat dot com
- Date: Wed, 23 Apr 2008 10:23:28 +0800
- Subject: Re: Why did control channel closes itself?
- References: <804dabb00804210922p3cf9294ayd9234ea75e934f70@mail.gmail.com> <480CC532.2000705@redhat.com>
Thank you, Masami,
On Tue, Apr 22, 2008 at 12:47 AM, Masami Hiramatsu <mhiramat@redhat.com> wrote:
> Hi Peter,
>
>
> Peter Teoh wrote:
> > WARNING: Number of errors: 1, skipped probes: 0
> > WARNING: There were 56939 transport failures.
>
> Hmm, there should be an error message. I guess that message
> was pushed out of view by the other output.
>
> Could you run that script with the "-o outfile" option?
>
> Also, it seems that a lot of transport failures occurred.
> I think you can reduce the transport failures with the -s option.
>
stapio:stp_main_loop:282 nb=46
ERROR: probe overhead exceeded threshold
stapio:stp_main_loop:282 nb=53
WARNING: Number of errors: 1, skipped probes: 2
stapio:stp_main_loop:282 nb=51
WARNING: There were 45618 transport failures.
stapio:stp_main_loop:282 nb=4
stapio:stp_main_loop:319 got STP_EXIT
stapio:cleanup_and_exit:234 detach=0
stapio:close_relayfs:221 closing
stapio:close_relayfs:240 done
stapio:cleanup_and_exit:247 closing control channel
stapio:cleanup_and_exit:253 removing
stap_437787d15cda3e4c2d917c2b55f74c0e_108785
Notice the error "probe overhead exceeded threshold". A subsequent
Control-C did not stop the execution either; it just hung without
any response.
My version is the latest:
1482d30eb166b566e99fa21f9cd697abb711c30e branch 'master' of
git://sources.redhat.com/git/systemtap
lsmod |grep stap
stap_437787d15cda3e4c2d917c2b55f74c0e_108785 528444 1
This is most likely irrelevant, but the last few lines of the output file are:
60165
60166 857 Xorg(3118): -> __resched_task
60167 0xc041a18b : __resched_task+0x1/0x63
60168 0xc06377a6 : kretprobe_trampoline_holder+0x3/0x33
60169 0xc06377a6 : kretprobe_trampoline_holder+0x3/0x33
60170 0xc04370e2 : hrtimer_interrupt+0xe0/0x13f
60171 0xc0413206 : smp_apic_timer_interrupt+0x6c/0x82
60172 0xc0405394 : apic_timer_interrupt+0x28/0x30
60173 0xc041c237 : __wake_up_sync+0x3a/0x44
60174 0xc06377a6 : kretprobe_trampoline_holder+0x3/0x33
60175 0xc05c204c : sock_wfree+0x22/0x37
60176 0xc05c3ee4 : skb_release_all+0x69/0xbc
60177 0xc05c3804 : __kfree_skb+0xb/0x66
60178 0xc05c388d : kfree_skb+0x2e/0x30
60179 0xc061f558 : unix_stream_recvmsg+0x357/0x486
60180 0xc05be7e1 : sock_aio_read+0xed/0xfb
60181 0xc0473c52 : do_sync_read+0xab/0xe9
60182 0xc04743f8 : vfs_read+0x9b/0x131
60183 0xc047486a : sys_read+0x3b/0x60
60184 0xc04048c6 : ia32_sysenter_target+0x66/0x8c
60185
60186 880 Xorg(3118): <- __resched_task
60187
60188 883 Xorg(3118): <- task_tick_fair
60189
60190 886 Xorg(3118): <- hrtick
60191
I must emphasize that all of the function names above had already
appeared many times earlier in the output.
The odd thing is that the auto-generated stap_xxxxx.c in the /tmp/
directory contains the following lines:
      : (STP_OVERLOAD_INTERVAL + 1);
    c->cycles_sum += cycles_elapsed;
    if (interval > STP_OVERLOAD_INTERVAL) {
      if (c->cycles_sum > STP_OVERLOAD_THRESHOLD) {
        _stp_error ("probe overhead exceeded threshold");
        atomic_set (&session_state, STAP_SESSION_ERROR);
        atomic_inc (&error_count);
      }
      c->cycles_base = cycles_atend;
      c->cycles_sum = 0;
    }
Here c->cycles_sum is incremented, but I cannot find anywhere that it
is initialized before this first use. Could an uninitialized
cycles_sum be a bug?
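To make the question concrete, here is a minimal user-space sketch of the
accounting logic as I read it from the excerpt. It is not the generated
code: the struct, the function name account_probe, and the two constants
are hypothetical stand-ins, and the ternary that computes "interval" is
my guess at the statement whose tail appears in the first quoted line. The
sketch assumes the context is zero-filled before first use, which is
exactly the part I could not confirm in the generated file.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values, chosen only to keep the arithmetic easy to follow.
   They are NOT the real STP_OVERLOAD_* defaults. */
#define OVERLOAD_INTERVAL  1000ULL  /* length of one accounting window, in cycles */
#define OVERLOAD_THRESHOLD 500ULL   /* max cycles probes may use per window */

/* Hypothetical stand-in for the per-CPU context ("c") in the generated code. */
struct probe_context {
    uint64_t cycles_base;  /* cycle count at which the current window started */
    uint64_t cycles_sum;   /* cycles spent in probe handlers during this window */
};

/* Account for one probe hit. Returns 1 when the window has ended and the
   accumulated probe time exceeded the threshold, 0 otherwise. */
static int account_probe(struct probe_context *c,
                         uint64_t cycles_atstart, uint64_t cycles_atend)
{
    uint64_t cycles_elapsed = cycles_atend - cycles_atstart;
    /* Assumed form of the truncated statement: if the counter went
       backwards, force the window to be treated as expired. */
    uint64_t interval = (cycles_atend > c->cycles_base)
        ? (cycles_atend - c->cycles_base)
        : (OVERLOAD_INTERVAL + 1);
    int overloaded = 0;

    c->cycles_sum += cycles_elapsed;   /* reads cycles_sum before any store */
    if (interval > OVERLOAD_INTERVAL) {
        if (c->cycles_sum > OVERLOAD_THRESHOLD)
            overloaded = 1;
        c->cycles_base = cycles_atend; /* start a new window */
        c->cycles_sum = 0;
    }
    return overloaded;
}

/* Feed n (start, end) pairs through a zero-initialized context and count
   how many times the overload condition fires. */
static int count_overloads(const uint64_t (*hits)[2], size_t n)
{
    struct probe_context c;
    int count = 0;
    size_t i;

    memset(&c, 0, sizeof c);           /* the initialization in question */
    for (i = 0; i < n; i++)
        count += account_probe(&c, hits[i][0], hits[i][1]);
    return count;
}
```

With a zeroed context, the very first "+=" is well defined and every later
window starts from the explicit reset to 0. If the context were not zeroed,
only the first window's sum would be affected, so the answer depends on how
the generated module allocates its contexts.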
> Thank you,
>
> --
> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: mhiramat@redhat.com
>
>
--
Regards,
Peter Teoh