Bug 14596 - support for probing/unwinding out-of-tree modules
Summary: support for probing/unwinding out-of-tree modules
Status: RESOLVED DUPLICATE of bug 17073
Alias: None
Product: systemtap
Classification: Unclassified
Component: translator
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Unassigned
Reported: 2012-09-19 22:30 UTC by Jeff Haran
Modified: 2014-06-19 15:26 UTC



Description Jeff Haran 2012-09-19 22:30:41 UTC
Just so you know the genesis of this bug:

> -----Original Message-----
> From: Frank Ch. Eigler [mailto:fche@redhat.com]
> Sent: Friday, September 14, 2012 2:04 PM
> To: Jeff Haran
> Cc: 'systemtap@sourceware.org'
> Subject: Re: Missing unwind data for module
> 
> Hi -
> 
> > You're a genius. Copying the module into /lib/modules/`uname -r`/
> > and rerunning stap without the .ko at the end did the trick: [...]
> 
> Haha, nah, more like insane.  This -d interpretation heuristic is
> trickier than it should be.  It relates to other bugs too, like the
> inability to say
>     probe module("/path/to/your/module.ko").SOMETHING { }
> 
> If you happened to hit this and were annoyed, we'd appreciate a bug
> report in bugzilla.
> 
> - FChE

"Annoyed" is probably too strong a term, but it would be nice to be able to probe a module that is not in the standard place under /lib/modules. Since Frank was so helpful in getting me through this, I figured I'd file the Bugzilla report he asked for.

Below is the original thread:

> -----Original Message-----
> From: Frank Ch. Eigler [mailto:fche@redhat.com]
> Sent: Friday, September 14, 2012 12:51 PM
> To: Jeff Haran
> Cc: 'systemtap@sourceware.org'
> Subject: Re: Missing unwind data for module
> 
> 
> Hi, Jeff -
> 
> > [...]
> > [root@s01b06 jharan]# stap -g -v -r 2.6.32-279.5.1.el6.x86_64 neigh_create_enter.stp -m neigh_create_enter -p 4 -d insane.ko
> 
> I suspect what's missing is that when -d is used to identify kernel
> modules, names should exclude the .ko, and the modules should be found
> somewhere under -r or /lib/modules/`uname -r`/.  With the -d being
> given a file name, it'll treat it as a user-space executable/shlib,
> which won't work.  (We should make that work though as a heuristic.)
> 
> - FChE

Frank,

You're a genius. Copying the module into /lib/modules/`uname -r`/ and rerunning stap without the .ko at the end did the trick:
[root@s01b06 jharan]# stap -g -v -r 2.6.32-279.5.1.el6.x86_64 neigh_create_enter.stp -m neigh_create_enter -p 4 -d insane
Pass 1: parsed user script and 82 library script(s) using 97404virt/23028res/2960shr kb, in 120usr/0sys/131real ms.
Pass 2: analyzed script: 2 probe(s), 10 function(s), 2 embed(s), 2 global(s) using 319212virt/112628res/4248shr kb, in 1120usr/110sys/1229real ms.
Pass 3: translated to C into "/tmp/stapFTlY0i/neigh_create_enter_src.c" using 318820virt/117240res/8952shr kb, in 450usr/70sys/516real ms.
neigh_create_enter.ko
Pass 4: compiled C into "neigh_create_enter.ko" in 7830usr/490sys/7581real ms.
[root@s01b06 jharan]#

[root@s01b08 jharan]# staprun neigh_create_enter.ko
neigh_create_call began, arp_tbl 0xffffffff81b16540
neigh_create() tbl 0xffffffff81b16540, dev 0xffff8806303c5020, name bond1.3091
neigh_create() hit 0, dev 0xffff8806303c5020, name bond1.3091, addr 0xa9fe4014, addr_string 169.254.64.20
neigh_create() IP address matches
 0xffffffff814461c0 : neigh_create+0x0/0x520 [kernel]
 0xffffffff8149cdf2 : arp_find+0x1a2/0x230 [kernel]
 0xffffffff81457db3 : eth_rebuild_header+0x73/0x80 [kernel]
 0xffffffffa013a3bf : insane_rebuild_header+0x3f/0x60 [insane]
 0xffffffffa013a3bf : insane_rebuild_header+0x3f/0x60 [insane]
 0xffffffff81444d1e : neigh_compat_output+0x8e/0xa0 [kernel]
 0xffffffff81477317 : ip_finish_output+0x237/0x310 [kernel]
 0xffffffff814774a8 : ip_output+0xb8/0xc0 [kernel]
 0xffffffff814767a5 : ip_local_out+0x25/0x30 [kernel]
 0xffffffff814767cb : ip_send_skb+0x1b/0x80 [kernel]
 0xffffffff8147685b : ip_push_pending_frames+0x2b/0x30 [kernel]
 0xffffffff81496cbe : raw_sendmsg+0x4ce/0x8b0 [kernel]
 0xffffffff814a1bea : inet_sendmsg+0x4a/0xb0 [kernel]
 0xffffffff81428133 : sock_sendmsg+0x123/0x150 [kernel]
 0xffffffff81429c86 : __sys_sendmsg+0x406/0x420 [kernel]
 0xffffffff81429ea9 : sys_sendmsg+0x49/0x90 [kernel]
 0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
 0x0 (inexact)
^C[root@s01b08 jharan]#
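
For anyone else hitting this, the workaround above can be sketched as a small dry-run helper. This is only a sketch under stated assumptions: module_name and print_workaround are hypothetical helper names, not part of systemtap, and the script only prints the commands rather than running them. The other stap options used in this thread (-g, -v, -m, -p 4) are left out for brevity.

```shell
#!/bin/sh
# Dry-run sketch of the workaround from this thread: prints commands, runs nothing.
# module_name and print_workaround are hypothetical helper names.

# Strip the directory and the .ko suffix: /tmp/insane.ko -> insane
module_name() {
    basename "$1" .ko
}

# Print the two steps: copy the module under /lib/modules/<release>/,
# then pass the bare module name (no .ko) to -d.
print_workaround() {
    ko_path=$1
    release=$2
    script=$3
    echo "cp $ko_path /lib/modules/$release/"
    echo "stap -r $release $script -d $(module_name "$ko_path")"
}

print_workaround ./insane.ko 2.6.32-279.5.1.el6.x86_64 neigh_create_enter.stp
```

The point is only that -d gets the module name without the .ko suffix, and that the module file sits somewhere under the /lib/modules tree for the kernel release being targeted.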

Thanks again,

Jeff Haran

--------------------------------------------------------
Hi,

I am attempting to probe the kernel function neigh_create(). When I use staprun to run the kernel module produced by stap on the target system, I get this warning and a partial backtrace:

[root@s01b08 jharan]# staprun neigh_create_enter.ko
neigh_create_call began, arp_tbl 0xffffffff81b16540
WARNING: Missing unwind data for module, rerun with 'stap -d insane'
neigh_create() tbl 0xffffffff81b16540, dev 0xffff8806303c5020, name bond1.3091
neigh_create() hit 0, dev 0xffff8806303c5020, name bond1.3091, addr 0xa9fe4014, addr_string 169.254.64.20
neigh_create() IP address matches
 0xffffffff814461c0 : neigh_create+0x0/0x520 [kernel]
 0xffffffff8149cdf2 : arp_find+0x1a2/0x230 [kernel]
 0xffffffff81457db3 : eth_rebuild_header+0x73/0x80 [kernel]
 0xffffffffa013a3bf [insane]
 0x0 (inexact)

insane is a kernel module that I got off of a website, so it's not part of the RHEL 6.3 distro that the rest of the kernel came from. (Note: it's a virtual device driver that is used to generate packet drops.) I assume that the above is caused by some symbol information for insane.ko being missing, either when I build the module or when I run stap to build neigh_create_enter.ko.

If I follow the suggestion in the WARNING message when I build neigh_create_enter.ko on the build machine, I get this:

[root@s01b06 jharan]# stap -g -v -r 2.6.32-279.5.1.el6.x86_64 neigh_create_enter.stp -m neigh_create_enter -p 4 -d insane.ko
Pass 1: parsed user script and 82 library script(s) using 97408virt/23032res/2960shr kb, in 120usr/0sys/131real ms.
Pass 2: analyzed script: 2 probe(s), 10 function(s), 2 embed(s), 2 global(s) using 319216virt/112628res/4248shr kb, in 1150usr/160sys/1311real ms.
Pass 3: translated to C into "/tmp/stap31Gjb1/neigh_create_enter_src.c" using 318840virt/117260res/8972shr kb, in 460usr/70sys/528real ms.
neigh_create_enter.ko
Pass 4: compiled C into "neigh_create_enter.ko" in 8430usr/490sys/8909real ms.
[root@s01b06 jharan]#

No warning is generated at build time, but when I run the result on the target machine, I still get the warning and partial backtrace:

[root@s01b08 jharan]# staprun neigh_create_enter.ko
neigh_create_call began, arp_tbl 0xffffffff81b16540
WARNING: Missing unwind data for module, rerun with 'stap -d insane'
neigh_create() tbl 0xffffffff81b16540, dev 0xffff8806303c5020, name bond1.3091
neigh_create() hit 0, dev 0xffff8806303c5020, name bond1.3091, addr 0xa9fe4014, addr_string 169.254.64.20
neigh_create() IP address matches
 0xffffffff814461c0 : neigh_create+0x0/0x520 [kernel]
 0xffffffff8149cdf2 : arp_find+0x1a2/0x230 [kernel]
 0xffffffff81457db3 : eth_rebuild_header+0x73/0x80 [kernel]
 0xffffffffa013a3bf [insane]
 0x0 (inexact)
^C[root@s01b08 jharan]#

Note that in other call trees to neigh_create() I get nice clean backtraces from the system call on down:

neigh_create() hit 261, dev 0xffff88066f7dd020, name bond1.3092, addr 0xa9fe4014, addr_string 169.254.64.20
neigh_create() IP address matches
 0xffffffff814461c0 : neigh_create+0x0/0x520 [kernel]
 0xffffffff8149b8ee : arp_bind_neighbour+0xbe/0xc0 [kernel]
 0xffffffff8146d2e7 : rt_intern_hash+0x147/0x590 [kernel]
 0xffffffff8146db6d : ip_route_output_slow+0x43d/0x9a0 [kernel]
 0xffffffff8146e26b : __ip_route_output_key+0x19b/0x1b0 [kernel]
 0xffffffff81496244 : ip4_datagram_connect+0x184/0x320 [kernel]
 0xffffffff814a1c7c : inet_dgram_connect+0x2c/0x80 [kernel]
 0xffffffff81428de7 : sys_connect+0xd7/0xf0 [kernel]
 0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
 0x0 (inexact)

The problem seems to occur only when the call tree goes through the insane.ko module.

Any suggestions as to what I should do to avoid the warning and get a full stack trace would be most appreciated. My stap script follows:

[root@s01b06 jharan]# cat neigh_create_enter.stp
global arp_tbl

%{
#include <net/arp.h>
%}
function the_table_address:long () %{ /* unmangled */
        /* rely on EXPORT_SYMBOL ... to let this resolve */
        THIS->__retvalue = (int64_t) & arp_tbl;
%}
probe begin {
        arp_tbl = the_table_address()
        printf("neigh_create_call began, arp_tbl %p\n", arp_tbl)
        /* printf("%d\n", @cast(the_table_address(),"neigh_table","kernel")->family) */
}

global hit
probe kernel.function("neigh_create").call
{
        tbl = $tbl
        dev_name = kernel_string($dev->name)
        printf("neigh_create() tbl %p, dev %p, name %s\n", tbl, $dev, dev_name)
        if (tbl == arp_tbl) {
                ip_addr_network = kernel_int($pkey)
                ip_addr_string = ip_ntop(ip_addr_network)
                printf("neigh_create() hit %d, dev %p, name %s, addr 0x%x, addr_string %s\n",
                        hit++, $dev, dev_name, ntohl(ip_addr_network), ip_addr_string)
                if (ntohl(ip_addr_network) == 0xa9fe4014) {
                        printf("neigh_create() IP address matches\n")
                        print_backtrace()
                }
        }
}
[root@s01b06 jharan]#

I'm using systemtap version 1.7-5 from the RHEL 6.3 distro.

Thanks,

Jeff Haran
Comment 1 Frank Ch. Eigler 2014-06-19 15:26:40 UTC
da dupe, doopy doopy doo doo wop doo wop da-daaaah

*** This bug has been marked as a duplicate of bug 17073 ***