Jonathan Lebon [Fri, 17 Jan 2014 17:11:20 +0000 (12:11 -0500)]
stap[run/dyn]: disable colors when SYSTEMTAP_COLORS empty
The current behaviour of SYSTEMTAP_COLORS is to turn on colors if it is
not set, or set but empty, and to turn off colors when set and invalid.
With this patch, rather than having users purposely make it invalid to
turn off colors, we interpret a set but empty SYSTEMTAP_COLORS to mean
turning colors off.
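The decision described above can be sketched as follows. This is only an illustration, not the actual stap code: the function name colors_enabled and the value_is_valid parameter (standing in for whatever validation the real parser performs) are hypothetical.

```cpp
#include <string>

// Hypothetical helper: decide whether colors are enabled.
// `value` is what getenv("SYSTEMTAP_COLORS") returned (nullptr if unset).
// `value_is_valid` stands in for the result of parsing a non-empty value.
bool colors_enabled(const char* value, bool value_is_valid)
{
  if (value == nullptr)
    return true;                 // not set: colors stay on
  if (std::string(value).empty())
    return false;                // set but empty: now means "colors off"
  return value_is_valid;         // set and invalid still turns colors off
}
```

With this shape, an empty SYSTEMTAP_COLORS becomes the documented way to disable colors, so nobody has to pass a deliberately invalid value.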
Jonathan Lebon [Thu, 16 Jan 2014 22:21:41 +0000 (17:21 -0500)]
PR15781: fix suggestion logic for optional probes
We previously used the heuristic of not suggesting functions for
optional probes to get around issues with recursive calls to
derive_probes from globby probes. This had the disadvantage that
suggestions could not be made for truly (script-level) optional probes.
We now add the new field 'from_glob' to probe_point which tracks whether
the probe_point was created out of a globby pp. We can thus now easily
determine when it is correct to suggest something, and when we should
suggest nothing but rather accumulate modules to suggest from.
Jonathan Lebon [Thu, 16 Jan 2014 22:13:36 +0000 (17:13 -0500)]
elaborate.cxx: save up all errors from optional pps
We remember all semantic_error objects caught even for optional probe
points so that if we get an error for a non-optional probe point, we
also print out the info of optional probe points that failed. This
gives users a clearer picture of why the whole probe failed.
Jonathan Lebon [Thu, 16 Jan 2014 22:07:03 +0000 (17:07 -0500)]
semantic_error: let it own its chain
This patch simplifies the way semantic_error chains are used by allowing
the parent object to own its chain. Upon setting the chain, the parent
creates a copy and keeps it secret, to be de-allocated upon destruction.
The patch also modifies the semantic_error constructor to allow the
chain to be also set at the same time.
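A minimal sketch of the ownership scheme described above, assuming a simplified stand-in class. The name chained_error and its members are hypothetical; the real semantic_error carries more state.

```cpp
#include <stdexcept>
#include <string>

// Simplified stand-in for semantic_error (hypothetical names throughout).
struct chained_error : std::runtime_error
{
  // The chain can be supplied at construction time.
  explicit chained_error(const std::string& msg,
                         const chained_error* chain = nullptr)
    : std::runtime_error(msg), chain_(nullptr)
  { set_chain(chain); }

  // Copying an error deep-copies its chain, so each object owns its own.
  chained_error(const chained_error& other)
    : std::runtime_error(other), chain_(nullptr)
  { set_chain(other.chain_); }

  chained_error& operator=(const chained_error& other)
  {
    if (this != &other)
      {
        std::runtime_error::operator=(other);
        set_chain(other.chain_);
      }
    return *this;
  }

  // The private copy is released when the parent is destroyed.
  ~chained_error() noexcept override { delete chain_; }

  // The parent stores a private copy; the caller keeps ownership of `chain`.
  void set_chain(const chained_error* chain)
  {
    chained_error* copy = chain ? new chained_error(*chain) : nullptr;
    delete chain_;
    chain_ = copy;
  }

  const chained_error* get_chain() const { return chain_; }

private:
  chained_error* chain_;
};
```

Because the parent copies the chain, callers can pass the address of a local or caught exception without worrying about its lifetime.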
Lukas Berk [Mon, 20 Jan 2014 20:39:35 +0000 (15:39 -0500)]
Add java backtrace test
*java.exp - delay removal of singleparam.class so backtrace test can use
it, also add the backtrace testcase
*singleparam.java - have methods call each other in the same order
instead of having each called directly from main,
this allows for a better backtrace
*java.stp - renamed to singleparam.stp for consistency with other tests
Lukas Berk [Mon, 20 Jan 2014 20:33:25 +0000 (15:33 -0500)]
Update/correct java testcase string
A remnant of when pn() originally passed the class.method name. This was
changed before the 2.2.1 release because byteman can't (yet) properly
pass that, so the unique identifier was changed. We should update the
testcase to look for just the parameter that was passed.
* testsuite/systemtap.apps/java.exp - update search strings in test
Lukas Berk [Mon, 20 Jan 2014 20:30:54 +0000 (15:30 -0500)]
Fix how stapbm passes itself the methodname
*java/stapbm.in - we need to make sure stapbm passes the methodname with
the surrounding quotes in case there are multiple
parameters in the method call (otherwise stapbm will
call itself with up to 17 parameters and error out)
Lukas Berk [Fri, 17 Jan 2014 23:26:58 +0000 (18:26 -0500)]
Add java backtrace functionality
This commit adds two functions, sprint_java_backtrace() and
print_java_backtrace(). The former returns the java backtrace as one
string (may need to set the -DMAXSTRINGLEN var to read the entire
backtrace), and the latter prints the java backtrace one line at a time.
*java/HelperSDT.c - Add METHOD_STAP_BT and _METHOD_BT_DELETE jni
functions
*java/HelperSDT.h - ditto
*java/..../HelperSDT.java - ditto
*java/stapbm.in - add conditional calls to functions based on backtrace
flag status
*tapset-method.cxx - add the probe points for METHOD_STAP_BT and
METHOD_BT_DELETE to be handled
*tapset/java.stp - add sprint_java_backtrace and print_java_backtrace functions
David Smith [Fri, 17 Jan 2014 19:21:40 +0000 (13:21 -0600)]
Improve avahi string list handling.
* csclient.cxx (get_value_from_avahi_string_list): Rewrite
extract_field_from_avahi_txt() to use native avahi string list
functions.
(resolve_callback): Call new function.
David Smith [Fri, 17 Jan 2014 18:11:51 +0000 (12:11 -0600)]
Handle updated server messages in server_[args,concurrency].exp testcases.
* testsuite/systemtap.server/server_args.exp
(stap_direct_and_with_client): Handle more than one hostname to skip.
* testsuite/systemtap.server/server_concurrency.exp: Handle new network
port message output.
Josh Stone [Fri, 17 Jan 2014 06:10:34 +0000 (22:10 -0800)]
Remember which rpms have been checked in the session
When the kernel or any userspace file is missing debuginfo, we run an
rpm query so packages can be suggested. If the user tries many such
probes on the same target, it's a waste to repeat the same query. Now
we remember which targets have already been checked in the session.
This was seen on a simple 'stap -l syscall.*', which took much longer to
run when debuginfo was missing than when present. With this patch, the
first syscall miss will lead to an rpm query, but each following miss
will know it's already been done.
Jonathan Lebon [Wed, 8 Jan 2014 20:54:23 +0000 (15:54 -0500)]
systemtap.spec: fix %post/%postun for initscript pkg
In the systemtap-initscript pkg's %post and %postun, stap-server is
used rather than systemtap (copy/pasted from stap-server's section?). Also,
we don't need the call to systemd-tmpfiles since the initscript pkg does
not define any tmpfiles.d config file (may also just have been left-over
from the server section).
Jonathan Lebon [Wed, 27 Nov 2013 16:21:02 +0000 (11:21 -0500)]
tapset-utrace.cxx: allow pid 1 probing
We originally limited the PID of a process(PID) probe to be greater than
1 to be on the safe side. Our latest utrace poses less risk and thus
probing init should be fine.
Frank Ch. Eigler [Sat, 11 Jan 2014 00:51:37 +0000 (19:51 -0500)]
stap translator: tolerate NULLs coming from some elfutils string lookups
It was reported on the mailing list, and privately experienced, that
stap pass-2 crashes could occur due to NULL dwarf_diename or
dwarf_decl_file's being propagated rather far within stap. This
commit adds protections (of the form ?: "foo") to eliminate the
problem in a few spots. There may be others; we should not store
so many raw char*'s.
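The kind of protection mentioned above can be sketched like this. The commit uses the GNU `?:` shorthand; the sketch spells out a portable equivalent. The helper names and fallback strings are hypothetical, and the two parameters merely stand in for values returned by dwarf_diename and dwarf_decl_file.

```cpp
#include <string>

// Portable equivalent of the GNU `ptr ?: "fallback"` idiom.
inline const char* or_default(const char* s, const char* fallback)
{
  return s ? s : fallback;
}

// Build a description from two possibly-NULL C strings without ever
// constructing a std::string from a null pointer.
std::string describe_die(const char* name, const char* decl_file)
{
  return std::string(or_default(name, "<unknown>"))
         + " at " + or_default(decl_file, "<unknown source>");
}
```

Substituting a placeholder at the lookup site keeps the NULL from travelling further into the translator, which is where the reported crashes occurred.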
Robin Hack [Thu, 9 Jan 2014 21:52:55 +0000 (16:52 -0500)]
tapset: add decoded sockaddr field vars to socket-related syscalls
* tapset/linux/aux_syscalls.stp (_struct_sockaddr_u_impl): New pretty-printer
with a bitfieldful of options.
* tapset/linux/syscalls.stpm (@_af_inet_info_u): New macro to call it.
* tapset/linux/*syscalls*: Lots of calls to it.
* testsuite/systemtap.examples/network/connect_stat.stp: New sample script.
Dave Brolley [Tue, 7 Jan 2014 19:16:53 +0000 (14:16 -0500)]
Add a ".local" variant to the DNS names on compile-server certificates.
This is an attempt to reduce the number of "Host name does not match
the DNS name(s) on the server certificate" warnings, particularly
when auto-discovering servers. Avahi presents the servers as
somehost.local, so having a somehost.local variant on the
server certificate prevents generating a warning for this
harmless situation.
Dave Brolley [Tue, 7 Jan 2014 19:07:02 +0000 (14:07 -0500)]
Rework compile-server discovery.
Limit unnecessary resolution of specified servers using DNS
and Avahi.
- Don't dns-resolve avahi-provided host names.
- Don't query avahi when server address and port are provided directly.
- Allow unresolved host names to match an avahi-provided host name.
Josh Stone [Mon, 6 Jan 2014 19:55:13 +0000 (11:55 -0800)]
Don't cast the pointer in STAP_RETVALUE:string
If the embedded C author uses the wrong pointer type, the compiler
should rightly complain about it when passed to strlcpy. In the rare
case a cast is truly needed, it should be manually added.
Dave Brolley [Tue, 24 Dec 2013 19:39:28 +0000 (14:39 -0500)]
Client/server related testsuite tweaks.
- in setup_server, give the server a chance to start before checking
for it.
- The result of the test "New signing servers matches" was reversed
in client.exp.
Jonathan Lebon [Fri, 20 Dec 2013 15:46:52 +0000 (10:46 -0500)]
csclient.cxx: improve hostname resolution cache
Previously, the hostname resolution cache was associated with servers,
rather than addresses and hostnames. This can cause problems since
multiple servers can reside on the same host. This patch decouples the
hostname cache from servers, resulting in proper handling of multiple
servers on the same host.
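One way to picture a cache keyed by hostname rather than by server is the sketch below. It is an assumption-laden illustration, not the csclient.cxx code: the class name, the string-to-string mapping, and the injected resolver callback are all hypothetical.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical cache: one entry per hostname, shared by every server
// that happens to live on that host.
class hostname_cache_sketch
{
public:
  using resolver_t = std::function<std::string(const std::string&)>;

  explicit hostname_cache_sketch(resolver_t resolver)
    : resolve_(std::move(resolver)) {}

  // Resolve `host` at most once; later lookups reuse the stored result.
  const std::string& lookup(const std::string& host)
  {
    auto it = cache_.find(host);
    if (it == cache_.end())
      it = cache_.emplace(host, resolve_(host)).first;
    return it->second;
  }

private:
  resolver_t resolve_;
  std::map<std::string, std::string> cache_;
};
```

Two servers on the same host then share a single cache entry instead of each carrying its own, possibly inconsistent, resolution state.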
Jonathan Lebon [Wed, 18 Dec 2013 22:46:21 +0000 (17:46 -0500)]
csclient.cxx: ignore hostname for server equality
We previously relied on the hostname when checking whether two
compile_server_info objects referred to the same server. However, there
are situations where Avahi can report a server under a truncated
hostname which it previously reported using its FQDN.
So instead we change the strategy. We now rely on certificates to
determine whether two servers are indeed the same when one of them has a
missing IP address. The hostname is completely disregarded.
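The comparison strategy might look roughly like the following. The struct and its field names are hypothetical, and the branch for two servers that both have an address (comparing address and port) is an assumption; the commit message only describes the missing-address case.

```cpp
#include <string>

// Hypothetical, reduced view of a compile_server_info object.
struct server_info_sketch
{
  std::string host_name;   // deliberately never consulted below
  std::string ip_address;  // empty when the address is unknown
  unsigned short port;
  std::string certinfo;    // identifies the server's certificate
};

bool same_server(const server_info_sketch& a, const server_info_sketch& b)
{
  // When either side lacks an address, fall back to the certificate.
  if (a.ip_address.empty() || b.ip_address.empty())
    return !a.certinfo.empty() && a.certinfo == b.certinfo;
  // Assumption: with both addresses known, address and port decide.
  return a.ip_address == b.ip_address && a.port == b.port;
}
```

Since host_name is never read, a server first reported under its FQDN and later under a truncated name still compares equal.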
David Smith [Thu, 19 Dec 2013 19:05:15 +0000 (13:05 -0600)]
PR16207 partial fix: Fix the 'mount' [nd_]syscall.exp tests on rawhide.
* tapset/linux/syscalls2.stp: Add 'sys_oldumount' support to the
syscall.umount probe. Don't allow syscall nesting in syscall.umount.
* tapset/linux/nd_syscalls2.stp: Ditto.
Jonathan Lebon [Fri, 13 Dec 2013 21:37:57 +0000 (16:37 -0500)]
PR16326: fix client.exp and simplify it
The client.exp testcase can now (again) work with other stap-servers
running. This patch also significantly simplifies the code by
introducing a few array utility procedures.
Frank Ch. Eigler [Mon, 16 Dec 2013 18:38:39 +0000 (13:38 -0500)]
kernel tracepoints: bring up-to-date for 3.11
Added a whole bunch of other hidey-spots where kernel tracepoint
DEFINE_EVENT's were plopped in recent kernels, along with a few
incomplete-definition type workarounds.
For a virtio-serial port to be successfully installed, the domain also
requires a virtio-serial controller. When doing 'stapvirt port-add',
this is done automatically by libvirt. However, when using hotplugging,
stapvirt will fail if no controller is installed. Note that
virtio-serial controllers cannot be hotplugged.
Documenting this issue is a first step. Efforts are under way to make
this more transparent to the user.
Also see: https://bugzilla.redhat.com/show_bug.cgi?id=1020500#c34
David Smith [Thu, 12 Dec 2013 16:47:52 +0000 (10:47 -0600)]
Simplify sendfile.c test program for use on an NFS partition.
* testsuite/systemtap.syscall/sendfile.c: Simplify testcase. Originally,
when compiled as 32-bit on a 64-bit system and run on an NFS partition,
fstat() returned an invalid size for the newly created file. This invalid
size was verified with strace. Since we know the size of the file
anyway, just use it directly, which avoids the NFS problem.
Jonathan Lebon [Fri, 6 Dec 2013 19:50:44 +0000 (14:50 -0500)]
properly implement stapshd reload
We previously did not check the character device properly, which
resulted in all live sessions being killed during a reload. This was
especially an issue in the case of hotplugging under RHEL5/6, in which
udev can call reload multiple times even though only a single port was
hotplugged.
runtime: don't require CONFIG_KPROBES for user-space backtraces
* runtime/stack.c: Drop an unnecessary #ifdef CONFIG_KPROBES that
wrapped even pure-userspace stack-unwinding-related code, and
caused unnecessary -p4 failures.
Josh Stone [Mon, 2 Dec 2013 22:34:07 +0000 (14:34 -0800)]
Use proper set operations for symtab dupe checks
In query_symtab_func_info, rather than full set iteration to check an
address in alias_dupes, just use set::insert().second as a test. This
is what sets are designed to be algorithmically good at.
This also has the benefit of adding to alias_dupes, so duplicates within
the symbol table itself will still only be probed once. (If we didn't
want that effect, we would just use set::count() to test membership.)
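The idiom can be sketched as below. The function name and the address_t alias are hypothetical stand-ins; only alias_dupes and the use of set::insert().second come from the commit message.

```cpp
#include <cstdint>
#include <set>
#include <vector>

using address_t = std::uint64_t;  // hypothetical stand-in for the address type

// Return the candidates that were not yet in alias_dupes, recording each
// one as it is seen so that repeats inside `candidates` are also dropped.
std::vector<address_t>
addresses_to_probe(const std::vector<address_t>& candidates,
                   std::set<address_t>& alias_dupes)
{
  std::vector<address_t> fresh;
  for (address_t addr : candidates)
    if (alias_dupes.insert(addr).second)  // true only on first insertion
      fresh.push_back(addr);
  return fresh;
}
```

Each insert is a logarithmic-time operation that both tests and records membership; swapping it for alias_dupes.count(addr) == 0 would test membership without recording it.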
Josh Stone [Mon, 2 Dec 2013 22:15:30 +0000 (14:15 -0800)]
Set git-describe --abbrev=12 for consistency and future-proofing
Git's default abbrev is 7, with smarts to disambiguate the SHA1 for that
given moment. Torvalds has recommended core.abbrev = 12 for kernel
developers to help avoid future as-yet-unknown collisions.
It becomes an issue to our scripts if this setting is not deterministic.
For instance, "make && sudo make install" will run git_version.sh with
$USER's git config, then root's config, but we don't want git_version.h
to be regenerated just for that difference.
Now our scripts use an explicit git-describe --abbrev=12 to be safe.
Jonathan Lebon [Mon, 2 Dec 2013 16:17:05 +0000 (11:17 -0500)]
stap-serverd: remember exact rc from spawned stap
Previously, stap-serverd used spawn_and_wait() to run stap and wait for
it to exit. However, the actual return code of stap was lost and never
bundled in the server response.
With this patch, spawn_and_wait() captures the child's exit rc in a
separate variable, so that we can differentiate between failure in
spawning and a nonzero exit code from the child.
So now the response/rc file holds the actual rc with which stap exited.
This makes a difference in the case of stap -l, in which we don't send a
script to the server and thus cannot rely on the presence or absence of
a compiled module in the server response to determine success.
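Keeping the spawn status and the child's exit code apart can be sketched with plain POSIX calls. This is an illustration under stated assumptions, not the stap-serverd implementation: the function name and signature are hypothetical, and fork/execvp are used here only to keep the example short.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns 0 if the child was spawned and reaped, -1 if spawning or
// waiting failed. The child's own exit code is stored in *child_rc.
int spawn_and_wait_sketch(char* const argv[], int* child_rc)
{
  pid_t pid = fork();
  if (pid < 0)
    return -1;                        // could not spawn at all
  if (pid == 0)
    {
      execvp(argv[0], argv);
      _exit(127);                     // exec failed; surfaces as child rc 127
    }

  int status = 0;
  pid_t reaped;
  do
    reaped = waitpid(pid, &status, 0);
  while (reaped < 0 && errno == EINTR);   // retry if interrupted by a signal
  if (reaped < 0)
    return -1;

  if (WIFEXITED(status))
    *child_rc = WEXITSTATUS(status);      // the exact rc the child exited with
  else if (WIFSIGNALED(status))
    *child_rc = 128 + WTERMSIG(status);   // shell-style convention for signals
  return 0;
}
```

A caller can then write *child_rc into the response even when no module was produced, which is the situation that arises with stap -l.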
csclient.cxx: don't print the 'via server failed' message if we're in
listing mode