[root@... ~]# stap -e 'probe syscall.epoll_wait { printf("%s\n", argstr) exit() }' ERROR: kernel read fault at 0x23cf81d4 (addr) near identifier '$epfd' at /usr/share/systemtap/tapset/syscalls.stp:657:37 WARNING: Number of errors: 1, skipped probes: 1 Pass 5: run completed in 10usr/40sys/4751real ms. While... [root@... ~]# stap -ve 'probe nd_syscall.epoll_wait { printf("%s\n", argstr) exit() }' 102, 0x96a01000, 1024, 5000 The same script works on x86_64 and powerpc.
Bug #10318 comment #7 seems to contain another instance of this bug
Verified on bear Fedora11: kernel-PAE-2.6.29.6-213.fc11.i686 systemtap-0.9.8-1.fc11.i586 Here is the c code generated by stap -p3. --- c->last_stmt = "identifier '$epfd' at /usr/share/systemtap/tapset/syscalls.stp:657:37"; if (unlikely (c->nesting+2 >= MAXNESTING)) { c->last_error = "MAXNESTING exceeded"; return; } else { c->nesting ++; } l->__retvalue = 0; #define return goto out { { { intptr_t addr; { // DWARF expression: 0x71 { intptr_t s0; s0 = fetch_register (1) + 0L; addr = s0; } } { int32_t value = deref (4, addr);THIS->__retvalue = value; } --- fetch_register(1) is regs->cx. Should that be reg->sp(or fetch_register(4))? BTW, this kernel is built with ftrace(mcount).
It doesn't seem PAE, I see the same on 2.6.29.6-213.fc11.i586
Dwarfdump said: --- <1><48125> DW_TAG_subprogram DW_AT_external yes(1) DW_AT_name sys_epoll_wait DW_AT_decl_file 1 fs/eventpoll.c DW_AT_decl_line 1249 DW_AT_prototyped yes(1) DW_AT_type <172> DW_AT_low_pc 0xc04cc908 DW_AT_high_pc 0xc04ccc8b DW_AT_frame_base <loclist with 3 entries follows> [ 0]<lowpc=0xc04cc908><highpc=0xc04cc909>DW_OP_breg4+4 [ 1]<lowpc=0xc04cc909><highpc=0xc04cc90b>DW_OP_breg4+8 [ 2]<lowpc=0xc04cc90b><highpc=0xc04ccc8b>DW_OP_breg5+8 DW_AT_sibling <49424> <2><48155> DW_TAG_formal_parameter DW_AT_name epfd DW_AT_decl_file 1 fs/eventpoll.c DW_AT_decl_line 1249 DW_AT_type <96> DW_AT_location <loclist with 3 entries follows> [ 0]<lowpc=0xc04cc908><highpc=0xc04cc93f>DW_OP_breg1+0 [ 1]<lowpc=0xc04cc93f><highpc=0xc04cc96c>DW_OP_reg0 [ 2]<lowpc=0xc04cc96c><highpc=0xc04ccc8b>DW_OP_breg1+0 --- And objdump said: --- c04cc908 <sys_epoll_wait>: c04cc908: 55 push %ebp c04cc909: 89 e5 mov %esp,%ebp c04cc90b: 57 push %edi c04cc90c: 56 push %esi c04cc90d: 53 push %ebx c04cc90e: 51 push %ecx c04cc90f: 83 ec 60 sub $0x60,%esp c04cc912: 8d 4d 08 lea 0x8(%ebp),%ecx <---here! c04cc915: e8 4a d5 f3 ff call c0409e64 <mcount> c04cc91a: 8b 51 04 mov 0x4(%ecx),%edx c04cc91d: 8b 01 mov (%ecx),%eax c04cc91f: 8b 79 0c mov 0xc(%ecx),%edi c04cc922: 89 55 b0 mov %edx,-0x50(%ebp) c04cc925: 8b 71 08 mov 0x8(%ecx),%esi c04cc928: 89 f2 mov %esi,%edx c04cc92a: 4a dec %edx --- You can see that the top of stack address is assigned to ecx at c04cc912. However, dwarf has no such information. It seems gcc's bug.(maybe, the combination of -g and mcount will cause this bug)
Here is the result of stap ... -p2 (probe part) --- # probes kernel.function("sys_epoll_wait@fs/eventpoll.c:1249")? /* pc=_stext+0xca908 */ /* <- syscall.epoll_wait = kernel.function("compat_sys_epoll_wait")?, kernel.function("SyS_epoll_wait")!, kernel.function("sys_epoll_wait")? <- syscall.epoll_wait */ # locals argstr:string { (argstr) = (sprintf("%d, %p, %d, %d", _dwarf_tvar_get_epfd_4(), _dwarf_tvar_get_events_5(), _dwarf_tvar_get_maxevents_6(), _dwarf_tvar_get_timeout_7())) printf("%s\\n", argstr) exit() } --- And /proc/kallsyms said; c0402000 T _stext So, actual probing address is 0xc04cc908 (sys_epoll_wait+0).
On my i386 machine the prologue tests also fail: FAIL: prologues -P FAIL: prologues no-P This is most likely related since the probe is placed so early. Do those tests also fail for you?
(In reply to comment #6) > On my i386 machine the prologue tests also fail: > FAIL: prologues -P > FAIL: prologues no-P > > This is most likely related since the probe is placed so early. > Do those tests also fail for you? Yes, I got same results. Running /usr/share/systemtap/testsuite/systemtap.base/prologues.exp ... FAIL: prologues -P FAIL: prologues no-P
Here are the results of dwarfdump and objdump with/without mcount(ftrace). I'm compiling linux-2.6.31-rc4-tip by gcc-4.4.0-4.i586. With ftrace (-pg) ------------ <1><44129> DW_TAG_subprogram DW_AT_external yes(1) DW_AT_name sys_epoll_wait DW_AT_decl_file 1 /home/mhiramat/ksrc/linux-2.6-tip/fs/eventpoll.c DW_AT_decl_line 1321 DW_AT_prototyped yes(1) DW_AT_type <350> DW_AT_low_pc 0xc10ce507 DW_AT_high_pc 0xc10ce6ef DW_AT_frame_base <loclist with 3 entries follows> [ 0]<lowpc=0xc10ce507><highpc=0xc10ce508>DW_OP_breg4+4 [ 1]<lowpc=0xc10ce508><highpc=0xc10ce50a>DW_OP_breg4+8 [ 2]<lowpc=0xc10ce50a><highpc=0xc10ce6ef>DW_OP_breg5+8 DW_AT_sibling <44909> <2><44159> DW_TAG_formal_parameter DW_AT_name epfd DW_AT_decl_file 1 /home/mhiramat/ksrc/linux-2.6-tip/fs/eventpoll.c DW_AT_decl_line 1321 DW_AT_type <60> DW_AT_location <loclist with 3 entries follows> [ 0]<lowpc=0xc10ce507><highpc=0xc10ce53e>DW_OP_breg1+0 [ 1]<lowpc=0xc10ce53e><highpc=0xc10ce568>DW_OP_reg0 [ 2]<lowpc=0xc10ce568><highpc=0xc10ce6ef>DW_OP_breg1+0 ------------ c10ce507 <sys_epoll_wait>: c10ce507: 55 push %ebp c10ce508: 89 e5 mov %esp,%ebp c10ce50a: 57 push %edi c10ce50b: 56 push %esi c10ce50c: 53 push %ebx c10ce50d: 51 push %ecx c10ce50e: 83 ec 34 sub $0x34,%esp c10ce511: 8d 4d 08 lea 0x8(%ebp),%ecx <--!here! c10ce514: e8 9f 50 f3 ff call c10035b8 <mcount> c10ce519: bb ea ff ff ff mov $0xffffffea,%ebx c10ce51e: 8b 51 04 mov 0x4(%ecx),%edx c10ce521: 8b 01 mov (%ecx),%eax c10ce523: 8b 71 0c mov 0xc(%ecx),%esi c10ce526: 89 55 c0 mov %edx,-0x40(%ebp) c10ce529: 8b 79 08 mov 0x8(%ecx),%edi c10ce52c: 89 fa mov %edi,%edx c10ce52e: 4a dec %edx c10ce52f: 81 fa a9 aa aa 0a cmp $0xaaaaaa9,%edx c10ce535: 89 7d c4 mov %edi,-0x3c(%ebp) ------------ Without ftrace (no -pg) ---------- <1><44101> DW_TAG_subprogram DW_AT_external yes(1) DW_AT_name sys_epoll_wait DW_AT_decl_file 1 /home/mhiramat/ksrc/linux-2.6-tip/fs/eventpoll.c DW_AT_decl_line 1321 DW_AT_prototyped yes(1) DW_AT_type <350> DW_AT_low_pc 0xc10b34cd DW_AT_high_pc 0xc10b36b0 DW_AT_frame_base <loclist with 3 entries follows> [ 0]<lowpc=0xc10b34cd><highpc=0xc10b34ce>DW_OP_breg4+4 [ 1]<lowpc=0xc10b34ce><highpc=0xc10b34d0>DW_OP_breg4+8 [ 2]<lowpc=0xc10b34d0><highpc=0xc10b36b0>DW_OP_breg5+8 DW_AT_sibling <44881> <2><44131> DW_TAG_formal_parameter DW_AT_name epfd DW_AT_decl_file 1 /home/mhiramat/ksrc/linux-2.6-tip/fs/eventpoll.c DW_AT_decl_line 1321 DW_AT_type <60> DW_AT_location <loclist with 3 entries follows> [ 0]<lowpc=0xc10b34cd><highpc=0xc10b34ff>DW_OP_breg1+0 [ 1]<lowpc=0xc10b34ff><highpc=0xc10b3529>DW_OP_reg0 [ 2]<lowpc=0xc10b3529><highpc=0xc10b36b0>DW_OP_breg1+0 ---------- c10b34cd <sys_epoll_wait>: c10b34cd: 55 push %ebp c10b34ce: 89 e5 mov %esp,%ebp c10b34d0: 57 push %edi c10b34d1: 56 push %esi c10b34d2: 53 push %ebx c10b34d3: bb ea ff ff ff mov $0xffffffea,%ebx c10b34d8: 51 push %ecx c10b34d9: 8d 4d 08 lea 0x8(%ebp),%ecx <--!here! c10b34dc: 83 ec 34 sub $0x34,%esp c10b34df: 8b 51 04 mov 0x4(%ecx),%edx c10b34e2: 8b 01 mov (%ecx),%eax c10b34e4: 8b 71 0c mov 0xc(%ecx),%esi c10b34e7: 89 55 c0 mov %edx,-0x40(%ebp) c10b34ea: 8b 79 08 mov 0x8(%ecx),%edi c10b34ed: 89 fa mov %edi,%edx c10b34ef: 4a dec %edx c10b34f0: 81 fa a9 aa aa 0a cmp $0xaaaaaa9,%edx c10b34f6: 89 7d c4 mov %edi,-0x3c(%ebp) ---------- OK... Even without mcount, the dwarf information seems broken. You can see the top of stack is assigned to DW_OP_breg1(=ecx) at c10b34d9. But there is no such information (or did I miss it?). What further information do we need?
I've reported on rh bugzilla for fedora11 (because I could see this only on F11). https://bugzilla.redhat.com/show_bug.cgi?id=516300
(In reply to comment #9) > I've reported on rh bugzilla for fedora11 (because I could see this only on F11). > https://bugzilla.redhat.com/show_bug.cgi?id=516300 > The bug seems fixed in gcc-4.4.1. However, I'm still not sure stap can decode DW_OP_call_frame_cfa.
Oops, stap can't decode dwarf generated by gcc-4.4.1... $ stap --vp 02 -r 2.6.31-rc6-tip -e 'probe kernel.function("sys_epoll_wait") {print($epfd)}' WARNING: kernel release/architecture mismatch with host forces last-pass 4. probe sys_epoll_wait@/home/mhiramat/ksrc/linux-2.6-tip/fs/eventpoll.c:1321 kernel reloc=.dynamic section=.text pc=0xc10b3d0d semantic error: dwarf_getlocation_addr (form 0xa): invalid DWARF Pass 2: analyzed script: 1 probe(s), 0 function(s), 0 embed(s), 0 global(s) in 490usr/60sys/557real ms. Pass 2: analysis failed. Try again with another '--vp 01' option. Running rm -rf /tmp/stapMvMIoA
roland, could you give us any suggestion about this change? Will it be a gcc bug? or elfutils or systemtap needs to be improved?
(In reply to comment #11) > Oops, stap can't decode dwarf generated by gcc-4.4.1... > [...] > However, I'm still not sure stap can decode DW_OP_call_frame_cfa. Yes it can since: commit 00b01a991cc4300f18c747853e85841d187b1fa4 Author: Mark Wielaard <mjw@redhat.com> Date: Wed Jul 15 12:04:57 2009 +0200 PR10388 Support DW_OP_call_frame_cfa. Depends on elfutils 0.142 cfi support. But you need to build systemtap against elfutils 0.142+
Thanks, I've checked that stap works with elfutils-0.142. $ stap --vp 02 -r 2.6.31-rc6-tip -e 'probe kernel.function("sys_epoll_wait") {print($epfd)}' -p2 probe sys_epoll_wait@/home/mhiramat/ksrc/linux-2.6-tip/fs/eventpoll.c:1321 kernel reloc=.dynamic section=.text pc=0xc10b3d0d # functions _dwarf_tvar_get_epfd_0:long () %{{ { intptr_t addr; intptr_t frame_base; { // DWARF expression: 0x92(4,4) { intptr_t s0; s0 = fetch_register (4) + 4L; frame_base = s0; } } { // DWARF expression: 0x91 { intptr_t s0; s0 = frame_base + 0L; addr = s0; } } { int32_t value = deref (4, addr);THIS->__retvalue = value; } } goto out; if (0) goto deref_fault; deref_fault: goto out; } /* pure */%} # probes kernel.function("sys_epoll_wait@/home/mhiramat/ksrc/linux-2.6-tip/fs/eventpoll.c:1321") /* pc=_stext+0xb2c25 */ /* <- kernel.function("sys_epoll_wait") */ print(_dwarf_tvar_get_epfd_0()) Pass 2: analyzed script: 1 probe(s), 1 function(s), 0 embed(s), 0 global(s) in 540usr/90sys/628real ms. Running rm -rf /tmp/stapgtVMtt However, we'll need to wait for new gcc.
(In reply to comment #14) > However, we'll need to wait for new gcc. Which seems to have arrived, plus a recompiled kernel package using it. At least on Fedora 11. $ rpm -q gcc; uname -r; stap -V 2>&1 | head -1 gcc-4.4.1-2.fc11.i586 2.6.29.6-217.2.16.fc11.i586 SystemTap translator/driver (version 0.9.9/0.142 non-git sources) $ stap -e 'probe syscall.epoll_wait { printf("%s\n", argstr) exit() }' 4, 0x0253d3e0, 64, 59743