Jonathan Lebon [Wed, 8 Jan 2014 20:54:23 +0000 (15:54 -0500)]
systemtap.spec: fix %post/%postun for initscript pkg
In the systemtap-initscript pkg's %post and %postun, stap-server is
used rather than systemtap (copy/pasted from stap-server's section?). Also,
we don't need the call to systemd-tmpfiles since the initscript pkg does
not define any tmpfiles.d config file (may also just have been left-over
from the server section).
Jonathan Lebon [Wed, 27 Nov 2013 16:21:02 +0000 (11:21 -0500)]
tapset-utrace.cxx: allow pid 1 probing
We originally limited the PID of a process(PID) probe to be greater than
1 to be on the safe side. Our latest utrace poses less risk and thus
probing init should be fine.
Frank Ch. Eigler [Sat, 11 Jan 2014 00:51:37 +0000 (19:51 -0500)]
stap translator: tolerate NULLs coming from some elfutils string lookups
It was reported on the mailing list, and privately experienced, that
stap pass-2 crashes could occur due to NULL dwarf_diename or
dwarf_decl_file's being propagated rather far within stap. This
commit adds protections (of the form ?: "foo") to eliminate the
problem in a few spots. There may be others; we should not store
so many raw char*'s.
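A minimal sketch of the kind of guard described above, written in
standard C++ rather than with the GNU ?: shorthand. The helper name
safe_str and its default fallback text are hypothetical and are not
taken from the stap sources.

```cpp
#include <string>

// Hypothetical helper: convert a possibly-NULL C string (such as one
// returned by an elfutils lookup) into a std::string, substituting a
// fallback so that a NULL never reaches the std::string constructor.
inline std::string safe_str(const char* s, const char* fallback = "<unknown>")
{
    return std::string(s ? s : fallback);
}
```

Converting at the lookup site means later code only ever handles
std::string values, which is in the spirit of not storing raw char*'s.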
Robin Hack [Thu, 9 Jan 2014 21:52:55 +0000 (16:52 -0500)]
tapset: add decoded sockaddr field vars to socket-related syscalls
* tapset/linux/aux_syscalls.stp (_struct_sockaddr_u_impl): New pretty-printer
with a bitfieldful of options.
* tapset/linux/syscalls.stpm (@_af_inet_info_u): New macro to call it.
* tapset/linux/*syscalls*: Lots of calls to it.
* testsuite/systemtap.examples/network/connect_stat.stp: New sample script.
Dave Brolley [Tue, 7 Jan 2014 19:16:53 +0000 (14:16 -0500)]
Add a ".local" variant to the DNS names on compile-server certificates.
This is an attempt to reduce the number of "Host name does not match
the DNS name(s) on the server certificate" warnings, particularly
when auto-discovering servers. Avahi presents the servers as
somehost.local, so having a somehost.local variant on the
server certificate prevents generating a warning for this
harmless situation.
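As an illustration only, the name list for a certificate could be built
along these lines. The function name cert_dns_names is hypothetical, and
deriving the variant from the first label of the host name is an
assumption; the server's actual rule may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: list the DNS names to place on a certificate.
// Assumption: the ".local" variant is formed from the first label of
// the host name, since Avahi advertises hosts as <label>.local.
std::vector<std::string> cert_dns_names(const std::string& host)
{
    std::vector<std::string> names;
    if (host.empty())
        return names;
    names.push_back(host);

    // find() returns npos when there is no dot; substr then yields
    // the whole host name.
    std::string label = host.substr(0, host.find('.'));
    std::string local = label + ".local";
    if (!label.empty() && local != host)
        names.push_back(local);   // skip if host already ends up identical
    return names;
}
```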
Dave Brolley [Tue, 7 Jan 2014 19:07:02 +0000 (14:07 -0500)]
Rework compile-server discovery.
Limit unnecessary resolution of specified servers using DNS
and Avahi.
- Don't DNS-resolve Avahi-provided host names.
- Don't query Avahi when server address and port are provided directly.
- Allow unresolved host names to match an Avahi-provided host name.
Josh Stone [Mon, 6 Jan 2014 19:55:13 +0000 (11:55 -0800)]
Don't cast the pointer in STAP_RETVALUE:string
If the embedded C author uses the wrong pointer type, the compiler
should rightly complain about it when passed to strlcpy. In the rare
case a cast is truly needed, it should be manually added.
Dave Brolley [Tue, 24 Dec 2013 19:39:28 +0000 (14:39 -0500)]
Client/server related testsuite tweaks.
- in setup_server, give the server a chance to start before checking
for it.
- The result of the test "New signing servers matches" was reversed
in client.exp.
Jonathan Lebon [Fri, 20 Dec 2013 15:46:52 +0000 (10:46 -0500)]
csclient.cxx: improve hostname resolution cache
Previously, the hostname resolution cache was associated with servers,
rather than addresses and hostnames. This can cause problems since
multiple servers can reside on the same host. This patch decouples
hostname cache from servers, resulting in proper handling of multiple
servers on the same host.
Jonathan Lebon [Wed, 18 Dec 2013 22:46:21 +0000 (17:46 -0500)]
csclient.cxx: ignore hostname for server equality
We previously relied on the hostname when checking whether two
compile_server_info objects referred to the same server. However, there
are situations where Avahi can report a server under a truncated
hostname which it previously reported using its FQDN.
So instead we change the strategy. We now rely on certificates to
determine whether two servers are indeed the same when one of them has a
missing IP address. The hostname is completely disregarded.
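A rough sketch of the comparison strategy, not the csclient.cxx code
itself. The struct server_info and its fields are hypothetical stand-ins
for compile_server_info, and comparing address and port when both
addresses are known is an assumption.

```cpp
#include <string>

// Hypothetical stand-in for compile_server_info.
struct server_info {
    std::string host_name;   // never consulted for equality
    std::string ip_address;  // empty when unknown
    unsigned short port;     // 0 when unknown
    std::string certinfo;    // certificate identifier, empty when unknown
};

// Decide whether two entries describe the same server.
bool same_server(const server_info& a, const server_info& b)
{
    // Assumption: when both addresses are known, address and port decide.
    if (!a.ip_address.empty() && !b.ip_address.empty())
        return a.ip_address == b.ip_address && a.port == b.port;

    // One side lacks an address: fall back to the certificate.
    return !a.certinfo.empty() && a.certinfo == b.certinfo;
}
```

Two Avahi reports for one server, one under a truncated host name and
one under its FQDN, then compare equal as long as the certificates match.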
David Smith [Thu, 19 Dec 2013 19:05:15 +0000 (13:05 -0600)]
PR16207 partial fix: Fix the 'mount' [nd_]syscall.exp tests on rawhide.
* tapset/linux/syscalls2.stp: Add 'sys_oldumount' support to the
syscall.umount probe. Don't allow syscall nesting in syscall.umount.
* tapset/linux/nd_syscalls2.stp: Ditto.
Jonathan Lebon [Fri, 13 Dec 2013 21:37:57 +0000 (16:37 -0500)]
PR16326: fix client.exp and simplify it
The client.exp testcase can now (again) work with other stap-servers
running. This patch also significantly simplifies the code by
introducing a few array utility procedures.
Frank Ch. Eigler [Mon, 16 Dec 2013 18:38:39 +0000 (13:38 -0500)]
kernel tracepoints: bring up-to-date for 3.11
Added a whole bunch of other hidey-spots where kernel tracepoint
DEFINE_EVENT's were plopped in recent kernels, along with a few
incomplete-definition type workarounds.
For a virtio-serial port to be successfully installed, the domain also
requires a virtio-serial controller. When doing 'stapvirt port-add',
this is done automatically by libvirt. However, when using hotplugging,
stapvirt will fail if no controller is installed. Note that
virtio-serial controllers cannot be hotplugged.
Documenting this issue is a first step. Efforts are under way to make
this more transparent to the user.
Also see: https://bugzilla.redhat.com/show_bug.cgi?id=1020500#c34
David Smith [Thu, 12 Dec 2013 16:47:52 +0000 (10:47 -0600)]
Simplify sendfile.c test program for use on an NFS partition.
* testsuite/systemtap.syscall/sendfile.c: Simplify testcase. Originally,
when compiled as 32-bit on a 64-bit system and run on an NFS partition,
fstat() returned an invalid size for the newly created file. This invalid
size was verified with strace. Since we know the size of the file
anyway, just use it directly, which avoids the NFS problem.
Jonathan Lebon [Fri, 6 Dec 2013 19:50:44 +0000 (14:50 -0500)]
properly implement stapshd reload
We previously did not check the character device properly, which
resulted in all live sessions being killed during a reload. This was
especially an issue in the case of hotplugging under RHEL5/6, in which
udev can call reload multiple times even though only a single port was
hotplugged.
runtime: don't require CONFIG_KPROBES for user-space backtraces
* runtime/stack.c: Drop an unnecessary #ifdef CONFIG_KPROBES that
wrapped even pure-userspace stack-unwinding-related code, and
caused unnecessary -p4 failures.
Josh Stone [Mon, 2 Dec 2013 22:34:07 +0000 (14:34 -0800)]
Use proper set operations for symtab dupe checks
In query_symtab_func_info, rather than full set iteration to check an
address in alias_dupes, just use set::insert().second as a test. This
is what sets are designed to be algorithmically good at.
This also has the benefit of adding to alias_dupes, so duplicates within
the symbol table itself will still only be probed once. (If we didn't
want that effect, we would just use set::count() to test membership.)
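The idiom can be sketched as follows. The function unique_addrs and the
variable seen are illustrative names only; seen plays the role that
alias_dupes plays in the commit.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Return the addresses in first-seen order with duplicates dropped.
std::vector<std::uint64_t> unique_addrs(const std::vector<std::uint64_t>& addrs)
{
    std::set<std::uint64_t> seen;        // stand-in for alias_dupes
    std::vector<std::uint64_t> result;
    for (std::uint64_t addr : addrs) {
        // insert() returns a pair; .second is true only if the element
        // was newly added, so one O(log n) call both tests and records.
        if (seen.insert(addr).second)
            result.push_back(addr);
    }
    return result;
}
```

Using seen.count(addr) instead would test membership without recording
the address, which is the alternative the message mentions.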
Josh Stone [Mon, 2 Dec 2013 22:15:30 +0000 (14:15 -0800)]
Set git-describe --abbrev=12 for consistency and future-proofing
Git's default abbrev is 7, with smarts to disambiguate the SHA1 for that
given moment. Torvalds has recommended core.abbrev = 12 for kernel
developers to help avoid future as-yet-unknown collisions.
It becomes an issue to our scripts if this setting is not deterministic.
For instance, "make && sudo make install" will run git_version.sh with
$USER's git config, then root's config, but we don't want git_version.h
to be regenerated just for that difference.
Now our scripts use an explicit git-describe --abbrev=12 to be safe.
Jonathan Lebon [Mon, 2 Dec 2013 16:17:05 +0000 (11:17 -0500)]
stap-serverd: remember exact rc from spawned stap
Previously, stap-serverd used spawn_and_wait() to run stap and wait for
it to exit. However, the actual return code of stap was lost and never
bundled in the server response.
With this patch, spawn_and_wait() captures the child's exit rc in a
separate variable, so that we can differentiate between failure in
spawning and a nonzero exit code from the child.
So now the response/rc file holds the actual rc with which stap exited.
This makes a difference in the case of stap -l, in which we don't send a
script to the server and thus cannot rely on the presence or absence of
a compiled module in the server response to determine success.
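The separation between "could not spawn" and "child exited nonzero" can
be sketched like this. It is a standalone POSIX illustration, not the
stap-serverd code; the name spawn_and_wait_rc and its signature are
hypothetical.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical sketch. The return value reports only whether spawning
// and waiting worked (0) or not (-1); the child's own exit code is
// delivered separately through *child_rc.
int spawn_and_wait_rc(char* const argv[], int* child_rc)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                       // could not spawn at all
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                      // exec failed in the child
    }

    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);    // retry if interrupted by a signal
    } while (r < 0 && errno == EINTR);
    if (r < 0)
        return -1;

    if (WIFEXITED(status))
        *child_rc = WEXITSTATUS(status); // the exact rc the child returned
    else if (WIFSIGNALED(status))
        *child_rc = 128 + WTERMSIG(status);  // shell-style convention
    else
        return -1;
    return 0;
}
```

In this sketch an exec failure surfaces as a child rc of 127 rather than
as a spawn failure; a fuller version would need to report it separately.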
csclient.cxx: don't print the 'via server failed' message if we're in
listing mode
Jonathan Lebon [Sat, 30 Nov 2013 16:14:25 +0000 (11:14 -0500)]
stapsh.c: fix handling of POLLIN to indicate EOF
We previously relied on POLLHUP to indicate EOF. However, it is also
possible to receive POLLIN when EOF is reached. With this patch, upon
receiving POLLIN and reading from the associated fd, if EOF is found, we
modify the polling array to indicate we're no longer interested.
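A small standalone sketch of that pattern follows. It is not the
stapsh.c code: read_until_eof is a hypothetical name, and marking the
entry as uninteresting by setting its fd to -1 (which poll() ignores) is
one possible way to do it.

```cpp
#include <cerrno>
#include <cstddef>
#include <poll.h>
#include <string>
#include <unistd.h>

// Read everything from fd, treating a zero-length read after POLLIN as
// EOF instead of waiting for POLLHUP.
std::string read_until_eof(int fd)
{
    std::string data;
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    while (pfd.fd >= 0) {
        if (poll(&pfd, 1, -1) < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        if (pfd.revents & POLLIN) {
            char buf[256];
            ssize_t len = read(pfd.fd, buf, sizeof buf);
            if (len > 0)
                data.append(buf, static_cast<std::size_t>(len));
            else if (len == 0)
                pfd.fd = -1;             // EOF: no longer interested
            else if (errno != EINTR)
                break;
        } else if (pfd.revents & (POLLHUP | POLLERR | POLLNVAL)) {
            pfd.fd = -1;                 // hangup or error without data
        }
    }
    return data;
}
```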
Lukas Berk [Fri, 29 Nov 2013 21:34:11 +0000 (16:34 -0500)]
PR10208 Support probing weak symbols
*tapsets.cxx - Now always query the symtab (unless there is a pending interrupt
or dwarf callback error) on a function probe. We need to be careful
to check for probe points we've already resolved, which will already
have full debug information, and not to place another probe there.
We've removed the case of probing the symbol table on a statement probe,
as that code was written specifically for the kernel without userspace
in mind and was resolving the function the statement resided in (causing
errors in some cases).
*list.exp - Added testcase for weak symbols
*last_100_frees.stp - we use @defined($mem) here because on 64-bit systems the
wildcard search takes us through both the 64-bit and the 32-bit libc
(which doesn't have debuginfo). This means the probe point
resolved from the 32-bit library has no context info.
*mutex-contention.stp - ditto but for @defined($mutex) and @defined($rwlock)
Josh Stone [Tue, 26 Nov 2013 19:57:40 +0000 (11:57 -0800)]
stapdyn: Use plain CLOCK_MONOTONIC for -t timing
CLOCK_MONOTONIC_RAW has immunity to adjtime and NTP, but CLOCK_MONOTONIC
is often implemented in vdso. For simple timing, it's worth trading a
little accuracy for lower overhead.
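For reference, simple interval timing with this clock looks roughly like
the following. The helper name monotonic_seconds is made up for the
example and is not from stapdyn.

```cpp
#include <time.h>

// Current CLOCK_MONOTONIC reading in seconds, or -1.0 on failure.
// Differences between two readings give elapsed time that is not
// affected by wall-clock jumps, though the rate may be slewed by NTP.
double monotonic_seconds()
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1.0;
    return static_cast<double>(ts.tv_sec) + ts.tv_nsec / 1e9;
}
```

Swapping in CLOCK_MONOTONIC_RAW would only require changing the clock id,
which is the trade-off the message describes.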
Josh Stone [Tue, 26 Nov 2013 18:40:48 +0000 (10:40 -0800)]
stapdyn: Batch _stp_strncpy_from_user reads
In order to reduce the number of syscalls required, this strncpy now
opportunistically reads larger blocks before checking for '\0'. Reads
are kept within page boundaries to avoid running into invalid memory.
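The batching idea can be sketched as below. This is an illustration
under assumptions, not the stapdyn implementation: the names
chunk_within_page, copy_string_batched and reader_fn are hypothetical,
the reader callback stands in for whatever performs the actual remote
read, and the 4096-byte default page size is assumed.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>

// Largest read starting at addr that stays inside addr's page,
// capped at max_len.
std::size_t chunk_within_page(std::uintptr_t addr, std::size_t max_len,
                              std::size_t page_size = 4096)
{
    std::size_t to_page_end = page_size - (addr % page_size);
    return std::min(max_len, to_page_end);
}

// reader(addr, buf, len) copies len bytes from the target address space
// and returns false on a fault.
using reader_fn = std::function<bool(std::uintptr_t, char*, std::size_t)>;

// Copy a NUL-terminated string of at most n-1 bytes into dst, reading
// one page-bounded block at a time. Returns its length, or -1 on fault.
long copy_string_batched(char* dst, std::uintptr_t src, std::size_t n,
                         const reader_fn& reader, std::size_t page_size = 4096)
{
    if (n == 0)
        return -1;
    std::size_t copied = 0;
    while (copied < n - 1) {
        std::size_t len = chunk_within_page(src + copied, n - 1 - copied, page_size);
        if (!reader(src + copied, dst + copied, len))
            return -1;
        // Only now look for the terminator inside the block just read.
        const void* nul = std::memchr(dst + copied, '\0', len);
        if (nul)
            return static_cast<long>(static_cast<const char*>(nul) - dst);
        copied += len;
    }
    dst[copied] = '\0';                  // truncated: terminate explicitly
    return static_cast<long>(copied);
}
```

Because no block crosses a page boundary, a string ending just before an
unmapped page is found before that page is ever touched.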
David Smith [Tue, 26 Nov 2013 16:58:22 +0000 (10:58 -0600)]
PR16207 partial fix: Fix the 'pipe' [nd_]syscall.exp tests on rawhide.
* tapset/linux/syscalls.stp: Handle syscall nesting in syscall.pipe.
* tapset/linux/nd_syscalls.stp: Handle syscall nesting in
nd_syscall.pipe.
* runtime/linux/compat_unistd.h: Add __NR_compat_pipe2.
* tapset/linux/aux_syscalls.stp (_sys_pipe2_flag_str): Handle a flags
value of 0.
* testsuite/systemtap.syscall/pipe.c: Add a new test.
David Smith [Mon, 25 Nov 2013 20:00:51 +0000 (14:00 -0600)]
PR16207 partial fix: Fix the 'dup' [nd_]syscall.exp tests on rawhide.
* tapset/linux/syscalls.stp: Split the syscall.dup2 probe into
syscall.dup2 and syscall.dup3.
* tapset/linux/nd_syscalls.stp: Split the nd_syscall.dup2 probe into
nd_syscall.dup2 and nd_syscall.dup3.
* runtime/linux/compat_unistd.h: Added the __NR_compat_dup3 define.
* testsuite/buildok/syscalls-detailed.stp: Added dup3 test.
* testsuite/buildok/nd_syscalls-detailed.stp: Ditto.
David Smith [Mon, 25 Nov 2013 17:03:24 +0000 (11:03 -0600)]
PR16207 partial fix: Fix the link [nd_]syscall.exp tests on rawhide.
* tapset/linux/syscalls.stp: Add @__syscall_compat_gate() macro call to
syscall.linkat probe.
* tapset/linux/syscalls2.stp: Add @__syscall_compat_gate() macro call to
syscall.readlinkat and syscall.symlinkat probes.
* tapset/linux/nd_syscalls.stp: Add @__syscall_compat_gate() macro call to
nd_syscall.linkat probe.
* tapset/linux/nd_syscalls2.stp: Add @__syscall_compat_gate() macro call
to nd_syscall.readlinkat and nd_syscall.symlinkat probes.
* runtime/linux/compat_unistd.h: Added the __NR_compat_linkat,
__NR_compat_readlinkat, and __NR_compat_symlinkat defines.
* testsuite/systemtap.syscall/link.c: Updated testcase to handle syscall
probes no longer being a wrapper around other syscall probes.
David Smith [Mon, 25 Nov 2013 15:30:49 +0000 (09:30 -0600)]
PR16207 partial fix: Fix the chmod [nd_]syscall.exp tests on rawhide.
* tapset/linux/syscalls.stp: Add @__syscall_compat_gate() macro call to
syscall.fchmodat, syscall.fchmodat.return, syscall.fchownat, and
syscall.fchownat.return probes.
* tapset/linux/nd_syscalls.stp: Add @__syscall_compat_gate() macro call to
nd_syscall.fchmodat, nd_syscall.fchmodat.return, nd_syscall.fchownat,
and nd_syscall.fchownat.return probes.
* runtime/linux/compat_unistd.h: Added __NR_compat_fchmodat and
__NR_compat_fchownat defines.
* testsuite/systemtap.syscall/chmod.c: Updated testcase to handle
syscall probes no longer being a wrapper around other syscall probes.
David Smith [Fri, 22 Nov 2013 22:41:54 +0000 (16:41 -0600)]
PR16207 partial fix: Fix the access [nd_]syscall.exp tests on rawhide.
* tapset/linux/syscalls.stpm: Add @__syscall_compat_gate() macro.
* tapset/linux/syscalls.stp: Add @__syscall_compat_gate() macro call to
syscall.faccessat and syscall.faccessat.return.
* tapset/linux/nd_syscalls.stp: Add @__syscall_compat_gate() macro call to
nd_syscall.faccessat and nd_syscall.faccessat.return.
* testsuite/systemtap.syscall/access.c: Updated testcase to handle
access() no longer being a wrapper around faccessat().
* runtime/linux/compat_unistd.h: New file.
* tapset/linux/aux_syscalls.stp: Include compat_unistd.h.
David Smith [Fri, 22 Nov 2013 20:51:50 +0000 (14:51 -0600)]
PR15219 partial fix. The [nd_]syscall.timer_settime probes no longer nest.
* tapset/linux/syscalls2.stp: Add compat function support to
'syscall.timer_settime' and 'syscall.timer_settime.return' probes.
* tapset/linux/nd_syscalls2.stp: Add compat function support to
'nd_syscall.timer_settime' and 'nd_syscall.timer_settime.return'
probes.
* tapset/linux/aux_syscalls.stp (_struct_compat_itimerspec_u): New
function.
Aaron Tomlin [Fri, 22 Nov 2013 15:03:02 +0000 (15:03 +0000)]
Add STAP_ERROR macro
Instead of CONTEXT->last_error = "foo"; goto out; in an embedded-C
function, a newly defined macro STAP_ERROR(str) should be used.
The script can catch the exception with try { } catch { }.
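The pattern being replaced can be mocked up outside the runtime as
follows. Everything here is a stand-in: mock_context, MOCK_STAP_ERROR
and checked_div are hypothetical, and the real STAP_ERROR definition in
the runtime may differ from this sketch.

```cpp
// Hypothetical stand-in for the per-probe context reached via CONTEXT.
struct mock_context {
    const char* last_error;
};

// Mock of the pattern only: record the message, then jump to the
// function's "out" label. Expects a variable named ctx in scope.
#define MOCK_STAP_ERROR(str) \
    do { ctx->last_error = (str); goto out; } while (0)

// Example function body in the style of an embedded-C function.
long checked_div(mock_context* ctx, long a, long b)
{
    long result = 0;
    if (b == 0)
        MOCK_STAP_ERROR("division by zero");
    result = a / b;
out:
    return result;
}
```

Wrapping the assignment and the goto in a single macro keeps the two
steps from drifting apart across embedded-C functions.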
Josh Stone [Wed, 20 Nov 2013 21:01:10 +0000 (13:01 -0800)]
Tighten -Wno-format-nonliteral to just where it's needed
We only have one function, stap_strfloctime(), which actually requires
relaxing this warning; the rest can and should be checked. Split this
function into its own file, and give just that the relaxed option.