Bug 19300 - Many SystemTap tapsets failing on kernel.org 4.2.0 kernel
Status: RESOLVED WORKSFORME
Alias: None
Product: systemtap
Classification: Unclassified
Component: tapsets
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Unassigned
Reported: 2015-11-27 21:25 UTC by SBW
Modified: 2016-05-24 15:01 UTC
Attachments
Configuration Files and Test Results (56.06 KB, application/octet-stream)
2015-11-27 21:25 UTC, SBW
stap-report (6.13 KB, text/plain)
2015-11-28 22:36 UTC, SBW

Description SBW 2015-11-27 21:25:10 UTC
Created attachment 8816
Configuration Files and Test Results

Hello,

I recently installed SystemTap on a vanilla 4.2.0 kernel (Linux From Scratch 7.8). While going through the SystemTap Beginners Guide and testing the various scripts, I noticed that some tapsets work, but unfortunately the network-related tapsets, the ones I want to use, do not.

For example, running a tapset like:

probe tcp.ipv4.receive {
  printf(" %15s %15s  %5d  %5d  %d  %d  %d  %d  %d  %d\n",
         saddr, daddr, sport, dport, urg, ack, psh, rst, syn, fin)
}

will result in:

semantic error: while processing function __tcp_skb_ack
   thrown from: elaborate.cxx:4955
semantic error: type definition 'tcphdr' not found in 'kernel': operator '@cast' at /usr/share/systemtap/tapset/linux/tcp.stp:121:9
   thrown from: elaborate.cxx:5541
        source:         return @cast(tcphdr, "tcphdr")->ack

I know that the necessary headers are there on my box and that the kernel has been compiled with CONFIG_DEBUG_INFO and the other required CONFIG settings.

For some reason wildcards don't work either -- not a big deal to me but perhaps it may illuminate what is going on.

stap -vv -e 'probe kernel.function("*@net/socket.c") { }'

semantic error: while resolving probe point: identifier 'kernel' at <input>:1:7
   thrown from: elaborate.cxx:1069
        source: probe kernel.function("*@net/socket.c") { }
                      ^

As an example of what works -- probes like these work great:

probe syscall.open
{
  printf ("%s(%d) open\n", execname(), pid())
}

--or--

global count_jiffies, count_ms
probe timer.jiffies(100) { count_jiffies ++ }
probe timer.ms(100) { count_ms ++ }
probe timer.ms(12345)
{
  hz=(1000*count_jiffies) / count_ms
  printf ("jiffies:ms ratio %d:%d => CONFIG_HZ=%d\n",
    count_jiffies, count_ms, hz)
  exit ()
}

I am hoping that the root cause of these issues is configuration, since each Linux distro puts things in slightly different places. SystemTap seems pretty impressive and I hope to be able to fully use it on kernel.org kernels in the future, so any help on this would be greatly appreciated.

-------ATTACHMENTS-------
1) kernel .config showing modules for systemtap installed per instructions in 'Building a kernel.org kernel' under Installation -> Build Your Own
2) elfutils 0.164 config.log
3) systemtap 2.9 config.log
4) results of make installcheck
5) successful results of stap -vv variables-in-printf.stp
6) failure results of stap -vv tcpdumplike.stp
7) successful results of stap -vv timer-jiffies.stp
8) failure results of stap -vv wildcards.stp

-------SYSTEMTAP VERSION-------
Systemtap translator/driver (version 2.9/0.164, non-git sources)
Copyright (C) 2005-2015 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
enabled features: NLS TR1_UNORDERED_MAP

-------KERNEL VERSION------
Linux xyz_lfs 4.2.0 #1 SMP Thu Nov 26 22:12:49 PST 2015 i686 GNU/Linux

------DEBUGINFO PATH------- (as per 'Building a kernel.org kernel')
lrwxrwxrwx 1 root root 18 Nov 21 15:36 /lib/modules/4.2.0/build -> /sources/linux-4.2

-------GCC VERSION------
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/i686-pc-linux-gnu/5.2.0/lto-wrapper
Target: i686-pc-linux-gnu
Configured with: ../gcc-5.2.0/configure --prefix=/usr --enable-languages=c,c++ --disable-multilib --disable-bootstrap --with-system-zlib
Thread model: posix
gcc version 5.2.0 (GCC)

------ELFUTILS VERSION------
elfutils-0.164

-------ENVIRONMENT VARIABLES------
HZ=100
SHELL=/bin/bash
TERM=xterm-256color
USER=root
MAIL=/var/mail/root
PATH=/sbin:/bin:/usr/sbin:/usr/bin
PWD=/sources/linux-4.2
LANG=en_US.utf8#
SHLVL=1
HOME=/root
LOGNAME=root
_=/usr/bin/env
OLDPWD=/usr/share/systemtap/tapset/linux

------ALL SOFTWARE ON BOX------
-su-4.3# ls /sources
acl-2.2.52.src.tar.gz               gettext-0.19.5.1.tar.xz               ncurses-6.0.tar.gz
attr-2.4.47.src.tar.gz              git-2.6.0.tar.xz                      openssh-7.1p1.tar.gz
autoconf-2.69.tar.xz                glibc-2.22                            openssl-1.0.2d.tar.gz
automake-1.15.tar.xz                glibc-2.22-fhs-1.patch                patch-2.7.5.tar.xz
bash-4.3.30.tar.gz                  glibc-2.22.tar.xz                     perl-5.22.0.tar.bz2
bash-4.3.30-upstream_fixes-2.patch  glibc-2.22-upstream_i386_fix-1.patch  pkg-config-0.28.tar.gz
bc-1.06.95-memory_leak-1.patch      gmp-6.0.0a.tar.xz                     procps-ng-3.3.11.tar.xz
bc-1.06.95.tar.bz2                  gperf-3.0.4.tar.gz                    psmisc-22.21.tar.gz
binutils-2.25.1.tar.bz2             grep-2.21.tar.xz                      readline-6.3.tar.gz
bison-3.0.4.tar.xz                  groff-1.22.3.tar.gz                   readline-6.3-upstream_fixes-3.patch
blfs-bootscripts-20150924           grub-2.02~beta2.tar.xz                sed-4.2.2.tar.bz2
blfs-bootscripts-20150924.tar.bz2   gzip-1.6.tar.xz                       shadow-4.2.1.tar.xz
bzip2-1.0.6-install_docs-1.patch    iana-etc-2.30.tar.bz2                 strace-4.10.tar.xz
bzip2-1.0.6.tar.gz                  inetutils-1.9.4.tar.xz                sysklogd-1.5.1.tar.gz
check-0.10.0.tar.gz                 intltool-0.51.0.tar.gz                systemtap-2.9
config                              iproute2-4.2.0.tar.xz                 systemtap-2.9.tar.gz
coreutils-8.24-i18n-1.patch         kbd-2.0.3-backspace-1.patch           sysvinit-2.88dsf-consolidated-1.patch
coreutils-8.24.tar.xz               kbd-2.0.3.tar.xz                      sysvinit-2.88dsf.tar.bz2
curl-7.45.0.tar.lzma                kmod-21.tar.xz                        tar-1.28.tar.xz
dejagnu-1.5.3.tar.gz                less-458.tar.gz                       tcl-core8.6.4-src.tar.gz
diffutils-3.3.tar.xz                lfs-bootscripts-20150222.tar.bz2      texinfo-6.0.tar.xz
e2fsprogs-1.42.13.tar.gz            libcap-2.24.tar.xz                    tools.tar
elfutils-0.164                      libpipeline-1.4.1.tar.gz              trace-cmd
elfutils-0.164.tar.bz2              libtool-2.4.6.tar.xz                  tzdata2015f.tar.gz
eudev-3.1.2.tar.gz                  linux-4.2                             udev-lfs-20140408.tar.bz2
expat-2.1.0.tar.gz                  linux-4.2.tar.xz                      util-linux-2.27.tar.xz
expect5.45.tar.gz                   m4-1.4.17.tar.xz                      vim-7.4.tar.bz2
file-5.24.tar.gz                    make-4.1.tar.bz2                      wget-1.16.3.tar.xz
findutils-4.4.2.tar.gz              man-db-2.7.2.tar.xz                   which-2.21.tar.gz
flex-2.5.39.tar.xz                  man-pages-4.02.tar.xz                 XML-Parser-2.44.tar.gz
gawk-4.1.3.tar.xz                   md5sums                               xz-5.2.1.tar.xz
gcc-5.2.0.tar.bz2                   mpc-1.0.3.tar.gz                      zlib-1.2.8.tar.xz
gdb-7.10.tar.xz                     mpfr-3.1.3.tar.xz
gdbm-1.11.tar.gz                    mpfr-3.1.3-upstream_fixes-1.patch
Comment 1 Frank Ch. Eigler 2015-11-28 15:02:46 UTC
The definition of that tcphdr type is read out of the kernel vmlinux image, unless the tapset requests a header-file-based cast (as is done for inet_sock, e.g., on tapset/linux/tcp.stp line 80).  The problematic @cast(... "tcphdr") bits could be converted to @cast(... "tcphdr", "kernel<net/tcp.h>") or similar.  But if debuginfo is missing, other things will be broken too.
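
One way to try that conversion without editing the tapset might look like the sketch below. It builds a one-line script around @cast() and stops stap after pass 2, which is where the "type definition not found" error is raised. build_cast_probe is a hypothetical helper name, and the literal 0 is only a placeholder pointer (nothing is dereferenced, because the script never gets past elaboration). The header net/tcp.h is the one named above.

```shell
# Sketch: check whether a struct type resolves with and without a header.
# build_cast_probe is a hypothetical helper, not part of SystemTap.
build_cast_probe() {
    # $1: struct name, $2: member to read, $3: optional header (e.g. net/tcp.h)
    if [ -n "${3:-}" ]; then
        printf 'probe begin { println(@cast(0, "%s", "kernel<%s>")->%s) }\n' "$1" "$3" "$2"
    else
        printf 'probe begin { println(@cast(0, "%s")->%s) }\n' "$1" "$2"
    fi
}

# Only attempt the check when stap is installed.
if command -v stap >/dev/null 2>&1; then
    for hdr in "" "net/tcp.h"; do
        # -p2 stops after pass 2 (elaboration); the script is never run.
        if stap -p2 -e "$(build_cast_probe tcphdr ack "$hdr")" >/dev/null 2>&1; then
            echo "header='${hdr}': type resolved"
        else
            echo "header='${hdr}': pass 2 failed"
        fi
    done
fi
```

If the header-based form resolves while the plain form fails, that would point at missing vmlinux debuginfo rather than at the tapset itself.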

Could you forward the stap-report output, as requested by the [man error::pass4] and [man error::reporting] pages?  I wouldn't be surprised if the vmlinux file was not as advertised.
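
As a rough first check of the vmlinux file, something like the following could show whether it carries DWARF data at all. This is a sketch under assumptions: has_debuginfo is a hypothetical helper name, readelf is assumed to be on the PATH, and the vmlinux location is inferred from the /lib/modules/4.2.0/build symlink given in the report.

```shell
# Sketch: report whether an ELF file contains a .debug_info section.
# has_debuginfo is a hypothetical helper name.
has_debuginfo() {
    # $1: path to the ELF file, e.g. the vmlinux in the kernel build tree
    [ -r "$1" ] || return 2     # file missing or unreadable
    readelf -S "$1" 2>/dev/null | grep -q '\.debug_info'
}

# Assumed location, following the build symlink shown in the report.
vmlinux="/lib/modules/$(uname -r)/build/vmlinux"
if has_debuginfo "$vmlinux"; then
    echo "$vmlinux: .debug_info present"
else
    echo "$vmlinux: no .debug_info found (or file unreadable)"
fi
```

A missing section here would be consistent with the casts that carry no header argument failing. It does not replace the stap-report output.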
Comment 2 SBW 2015-11-28 22:36:51 UTC
Created attachment 8817
stap-report
Comment 3 SBW 2015-11-28 23:48:25 UTC
I'm using a VM, so I reinstalled elfutils and systemtap on my stock image.

The tapset problems relating to networking are the same. 

I uploaded the stap-report output.

I also tried another test:
-su-4.3# cat tcp_connections.stp
probe begin {
  printf("%6s %16s %6s %6s %16s\n",
         "UID", "CMD", "PID", "PORT", "IP_SOURCE")
}

probe kernel.function("tcp_accept").return?,
      kernel.function("inet_csk_accept").return? {
  sock = $return
  if (sock != 0)
    printf("%6d %16s %6d %6d %16s\n", uid(), execname(), pid(),
           inet_get_local_port(sock), inet_get_ip_source(sock))
}

And got these errors:

semantic error: while processing function __ip_sock_daddr
   thrown from: elaborate.cxx:4955
semantic error: unable to find member 'inet' for struct inet_sock (alternatives: pinet6, inet_id, sk, hdrincl, mc_index, min_ttl, tos, uc_index, cork, inet_opt, mc_list, freebind, is_icsk, mc_ttl, uc_ttl, mc_all, recverr, mc_addr, mc_loop, pmtudisc, rcv_tos, inet_saddr, inet_sport, nodefrag, transparent, rx_dst_ifindex, cmsg_flags, convert_csum, bind_address_no_port): operator '->' at /usr/share/systemtap/tapset/linux/ip.stp:107:48
   thrown from: dwflpp.cxx:3600
        source:                         @cast(sock, "inet_sock", "kernel<net/ip.h>")->inet->daddr)))
                                                                                    ^
        in expansion of macro: operator '@alternate' at /usr/share/systemtap/tapset/choose_defined.stpm:3:57
        source:         ( @defined(@value_if_defined) ? (@value_if_defined) : (@alternate) )
                                                                               ^
        in expansion of macro: operator '@choose_defined' at /usr/share/systemtap/tapset/linux/ip.stp:106:7
        source:                     @choose_defined(@cast(sock, "inet_sock")->daddr, # kernel >= 2.6.11
                                    ^
        in expansion of macro: operator '@alternate' at /usr/share/systemtap/tapset/choose_defined.stpm:3:57
        source:         ( @defined(@value_if_defined) ? (@value_if_defined) : (@alternate) )
                                                                               ^
        in expansion of macro: operator '@choose_defined' at /usr/share/systemtap/tapset/linux/ip.stp:105:3
        source:                 @choose_defined(@cast(sock, "inet_sock")->inet_daddr, # kernel >= 2.6.33
                                ^
        in expansion of macro: operator '@alternate' at /usr/share/systemtap/tapset/choose_defined.stpm:3:57
        source:         ( @defined(@value_if_defined) ? (@value_if_defined) : (@alternate) )
                                                                               ^
        in expansion of macro: operator '@choose_defined' at /usr/share/systemtap/tapset/linux/ip.stp:104:9
        source:         return @choose_defined(@cast(sock, "inet_sock")->sk->__sk_common->skc_daddr, # kernel >= 2.6.38
                               ^

Pass 2: analyzed script: 2 probe(s), 11 function(s), 4 embed(s), 0 global(s) using 44492virt/40256res/6408shr/34060data kb, in 1220usr/2330sys/3530real ms.
Pass 2: analysis failed.  [man error::pass2]

This behavior is the same as an earlier bug:

https://sourceware.org/bugzilla/show_bug.cgi?format=multiple&id=15171

Perhaps the above error indicates that stap can't determine my kernel version? (Some variable not set? I do know uname works fine).

Regardless I can still use some of the probes. If anything obvious is wrong please let me know.
Comment 4 Frank Ch. Eigler 2016-01-14 21:09:34 UTC
Interesting; the code works here on f22 against 4.2.6.
I wonder if the problem relates to lack of kernel debuginfo.
The tapset has this construct:

function __ip_sock_daddr:long (sock:long)
{
        return @choose_defined(@cast(sock, "inet_sock")->sk->__sk_common->skc_daddr, # kernel >= 2.6.38
                @choose_defined(@cast(sock, "inet_sock")->inet_daddr, # kernel >= 2.6.33
                    @choose_defined(@cast(sock, "inet_sock")->daddr, # kernel >= 2.6.11
                        @cast(sock, "inet_sock", "kernel<net/ip.h>")->inet->daddr)))
}

Note how only the last @cast() tries to generate debuginfo for
the struct type via a @cast(var, "type", "kernel<FOO.h>").  The
first three @casts don't have the header file parameter, so only
look in the vmlinux binary (if found).

My guess is that if the first three @cast's added a "kernel<BAR.h>",
it'd work even on your box, without kernel debuginfo.  One just
needs the right BAR.h.

And a good drink at said BAR.
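
To see which of the four member paths resolves once every cast names a header, a sketch along these lines might help. It assumes net/ip.h is a workable BAR.h, which seems plausible because the header-based cast in the earlier error did list the members of inet_sock. daddr_probe is a hypothetical helper name, and the literal 0 is only a placeholder pointer, since stap stops after pass 2 and never runs the script.

```shell
# Sketch: try each member path from the tapset's @choose_defined chain,
# always with a header-based cast. daddr_probe is a hypothetical helper.
daddr_probe() {
    # $1: member path to follow after the cast
    printf 'probe begin { println(@cast(0, "inet_sock", "kernel<net/ip.h>")->%s) }\n' "$1"
}

# Only attempt the check when stap is installed.
if command -v stap >/dev/null 2>&1; then
    for path in 'sk->__sk_common->skc_daddr' 'inet_daddr' 'daddr' 'inet->daddr'; do
        # -p2 stops after pass 2 (elaboration); the script is never run.
        if stap -p2 -e "$(daddr_probe "$path")" >/dev/null 2>&1; then
            echo "resolved: $path"
        else
            echo "not found: $path"
        fi
    done
fi
```

The alternatives list in the comment 3 error includes sk but not inet, daddr, or inet_daddr, so the first path is the one most likely to resolve on the reporter's 4.2.0 kernel.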
Comment 5 Frank Ch. Eigler 2016-05-24 15:01:43 UTC
no recent reports, newer kernels OK too