David Smith [Tue, 22 Apr 2014 15:32:05 +0000 (10:32 -0500)]
PR16716 partial fix: Better types in 'syscall.{access,faccessat}'.
* tapset/linux/syscalls.stp: Fix 'mode' variable type in
'syscall.{access,faccessat}'.
* tapset/linux/aux_syscalls.stp: Convert to use _stp_lookup_or_str().
* testsuite/systemtap.syscall/access.c (main): Add tests.
David Smith [Tue, 22 Apr 2014 13:42:59 +0000 (08:42 -0500)]
PR16716 partial fix: Better types in 'syscall.{swapon,syslog}'.
* tapset/linux/syscalls2.stp: Fix types in 'syscall.syslog' and
'syscall.swapon'. In 'syscall.swapon', decode the flags in the new
'swapflags' variable.
* tapset/linux/nd_syscalls2.stp: In 'nd_syscall.swapon', decode the flags
in the new 'swapflags' variable.
* tapset/linux/aux_syscalls.stp (_swapon_flags_str): New function.
* tapset/uconversions.stp (user_string_n2_quoted): If we're in a compat
task, when printing the pointer value as a number, don't expand it to
64 bits.
* testsuite/systemtap.syscall/swap.c: Added more tests.
* testsuite/systemtap.syscall/syslog.c: New test case.
* testsuite/buildok/syscalls2-detailed.stp: Added new 'swapflags' variable.
* testsuite/buildok/nd_syscalls2-detailed.stp: Ditto.
Stan Cox [Tue, 22 Apr 2014 13:37:16 +0000 (09:37 -0400)]
Add python support tapset example.
* systemtap.examples/general/tapset/(python2.stp,python3.stp,python.stpm): Python support tapset
* systemtap.examples/general/(pyexample.meta,pyexample.stp,pyexample.py): Use it.
David Smith [Mon, 21 Apr 2014 18:43:33 +0000 (13:43 -0500)]
PR16716 partial fix: Better types in 'syscall.{getpriority,setpriority}'.
* tapset/linux/syscalls.stp: Fix types in 'syscall.getpriority'.
* tapset/linux/syscalls2.stp: Fix types in 'syscall.setpriority'.
* tapset/linux/aux_syscalls.stp (_priority_which_str): Convert to use
_stp_lookup_str().
* testsuite/systemtap.syscall/getpriority.c: New test case.
* testsuite/systemtap.syscall/setpriority.c: Ditto.
Josh Stone [Fri, 18 Apr 2014 01:14:06 +0000 (18:14 -0700)]
tapset: mark _sched_policy_str() as pure and :string
This was causing issues in any script that pulled in aux_syscalls, but
didn't use _sched_policy_str(). Since it wasn't pure, it couldn't be
elided, and since it wasn't explicitly declared :string, there was no
type deduction to define its STAP_RETVALUE.
Josh Stone [Fri, 18 Apr 2014 00:54:01 +0000 (17:54 -0700)]
testsuite: don't mix -p2 and -l in labels.exp
Since commit bba368c5c5fea, -p2 and -l are exclusive options, even
though internally they are similar.
Furthermore, "labels exe .statement" and "labels so .statement" had this
failure masked, because they were looking for the absence of the error
"semantic error: no match" rather than the presence of actual matches.
Thus the option-validating error didn't trigger failure before.
Josh Stone [Thu, 17 Apr 2014 21:32:40 +0000 (14:32 -0700)]
Prevent lock-recursion in _stp_ctl_send
In rare cases, we may hit a probe while the transport layer is holding a
spinlock, and that probe may call _stp_ctl_send, which tries to grab the
same lock and deadlocks. This is a bit easier to trigger on lockdep-enabled
kernels with the lock_acquired tracepoint.
This patch refactors the context->busy state management into get/put
context, and those areas which grab probe-sensitive locks now wrap
themselves with a context to stay comfortably free of probes.
Tangentially, the dyninst side abandons the busy flag, as it already had
a tls_context pointer to prevent direct recursion and a mutex for
exclusive access across all processes.
DEBUG_REENTRANCY is an unfortunate casualty, because we can't safely
call _stp_warn when the busy context may be from those held locks.
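The get/put pattern described above can be sketched roughly as follows. This is an illustration only, not the runtime's code: context_sketch, get_context, put_context and send_guarded are hypothetical names, and a single atomic flag stands in for the per-CPU busy state.

```cpp
#include <atomic>

// Hypothetical context; the real runtime structure carries far more state.
struct context_sketch {
  std::atomic<int> busy{0};
};

// Claim the context. Returns nullptr if it is already busy, which is how a
// probe firing inside a protected region learns that it must back off.
inline context_sketch* get_context(context_sketch& c)
{
  int expected = 0;
  return c.busy.compare_exchange_strong(expected, 1) ? &c : nullptr;
}

// Release a context obtained from get_context(); a null pointer is ignored.
inline void put_context(context_sketch* c)
{
  if (c)
    c->busy.store(0);
}

// Hypothetical sender: it takes the context before the lock, so a nested
// call made while the lock is held is refused instead of deadlocking.
template <typename Lock, typename Fn>
bool send_guarded(context_sketch& c, Lock& lock, Fn send)
{
  context_sketch* held = get_context(c);
  if (!held)
    return false;  // already inside a probe or another send
  lock.lock();
  send();
  lock.unlock();
  put_context(held);
  return true;
}
```

In this sketch a nested send_guarded() call made from inside send() returns false rather than trying to take the lock a second time.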
Jonathan Lebon [Thu, 17 Apr 2014 21:34:24 +0000 (17:34 -0400)]
stmt_rel.exp: improve coverage
The testcase previously only tested that specific relative linenos in
bio_init() were valid and that there were at least 3 linenos available
for probing.
We now improve this test by checking that probes listed by the wildcard
lineno are all accessible by relative numbering as well. As a sanity
check, we check that bio_init() has at least 3 linenos available, as
before.
Jonathan Lebon [Thu, 17 Apr 2014 16:21:25 +0000 (12:21 -0400)]
add DWARF_LINE* macros to help diagnosis
Similarly to the previous commit, we modify the new safe_dwarf_line*()
functions so that they carry __FILE__ and __LINE__ information into the
error. This new information is filled in when using the new DWARF_LINE*
macros.
Before:
semantic error: libdw failure (dwarf_lineaddr): no error
After:
semantic error: libdw failure (dwarf_lineaddr): no error
thrown from: ../systemtap/dwflpp.cxx:2242
Jonathan Lebon [Thu, 17 Apr 2014 16:08:00 +0000 (12:08 -0400)]
add DWFL_ASSERT and DWARF_ASSERT to help diagnosis
Semantic errors thrown from dwfl_assert() and dwarf_assert() lacked any
positional information to help track down where the assertion failed. We
create two new macros, DWFL_ASSERT and DWARF_ASSERT, which carry down
the __FILE__ and __LINE__ information so that the semantic_error created
contains that information, which can be printed out using -vv.
Before:
semantic error: libdwfl failure (asserting!): no error
After:
semantic error: libdwfl failure (asserting!): no error
thrown from: ../systemtap/tapsets.cxx:7183
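A minimal sketch of the macro technique, assuming a simplified error type: located_error, checked_assert and CHECKED_ASSERT are hypothetical stand-ins for semantic_error, dwfl_assert() and DWFL_ASSERT, and do not reproduce their real signatures.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical error type that remembers where it was thrown from.
struct located_error : std::runtime_error {
  std::string where;  // "file:line" of the call site
  located_error(const std::string& msg, const std::string& where_)
    : std::runtime_error(msg), where(where_) {}
};

// Throws when rc is nonzero, recording the position passed in by the caller.
inline void checked_assert(const std::string& desc, int rc,
                           const char* file, int line)
{
  if (rc != 0)
    throw located_error("failure (" + desc + ")",
                        std::string(file) + ":" + std::to_string(line));
}

// The macro expands at the call site, so __FILE__ and __LINE__ name the
// caller rather than the helper function.
#define CHECKED_ASSERT(desc, rc) \
  checked_assert((desc), (rc), __FILE__, __LINE__)
```

A handler that catches located_error can then print e.where next to e.what(), much like the "thrown from:" line shown above.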
Jonathan Lebon [Wed, 16 Apr 2014 18:02:21 +0000 (14:02 -0400)]
statement.exp: rework and expand
The statement.exp test case previously only tested a few specific cases.
We now introduce a new test program, 'statement.c', on which we can test
for all the things we previously tested, allowing us to remove the other
test programs. Furthermore, we extend coverage to test many other
possible combinations.
Jonathan Lebon [Wed, 16 Apr 2014 14:40:16 +0000 (10:40 -0400)]
dwflpp: implement new iterate_over_srcfile_lines()
We finally implement the new iterate_over_srcfile_lines(). The basic
strategy is to look at each matching DIE, rather than just the line
records matching the linenos so that we properly match, for example,
functions inlined multiple times (which can yield multiple sets of line
records for the same lineno but at the various addresses where inlined).
Jonathan Lebon [Wed, 16 Apr 2014 20:03:07 +0000 (16:03 -0400)]
add dwarf_query::filtered_all
In dwarf_query-related functions, we very often need to carry out the
same operation on both filtered_functions and filtered_inlines. Rather
than duplicating code, create a new dwarf_query function which creates a
temporary vector containing all of them.
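As a rough illustration of the helper described above, assuming simplified element types (func_info and query_sketch are hypothetical; the real members hold richer records):

```cpp
#include <string>
#include <vector>

// Hypothetical record for a matched function or inline instance.
struct func_info {
  std::string name;
  bool inlined;
};

// Hypothetical slice of a query object with the two filtered lists.
struct query_sketch {
  std::vector<func_info> filtered_functions;
  std::vector<func_info> filtered_inlines;

  // Returns a temporary vector holding the functions followed by the
  // inlines, so callers can run one loop instead of two.
  std::vector<func_info> filtered_all() const
  {
    std::vector<func_info> all;
    all.reserve(filtered_functions.size() + filtered_inlines.size());
    all.insert(all.end(), filtered_functions.begin(), filtered_functions.end());
    all.insert(all.end(), filtered_inlines.begin(), filtered_inlines.end());
    return all;
  }
};
```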
Jonathan Lebon [Wed, 16 Apr 2014 18:08:47 +0000 (14:08 -0400)]
dwflpp: add CU line caching
The upcoming patches re-implementing iterate_over_srcfile_lines() will
depend on the use of CU lines in lineno order. Since dwarf_getsrclines()
outputs them in addr order, it greatly helps performance to cache the
sorted version.
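The caching idea can be sketched as below. This is an assumption-laden illustration: line_record and cu_lines_cache are hypothetical, the CU is keyed by a plain string, and the loader callback stands in for the call that fetches the CU's lines in address order.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical line record: a source line number and its address.
struct line_record {
  int lineno;
  unsigned long addr;
};

// Caches, per CU, the line records sorted by line number.
class cu_lines_cache {
  std::map<std::string, std::vector<line_record>> cache_;

public:
  int sorts_performed = 0;  // exposed only to show that the cache is hit

  // 'load' returns the CU's records in address order; it is called and the
  // result sorted only the first time a given CU is requested.
  template <typename Loader>
  const std::vector<line_record>& get(const std::string& cu, Loader load)
  {
    auto it = cache_.find(cu);
    if (it != cache_.end())
      return it->second;

    std::vector<line_record> lines = load();
    // stable_sort keeps address order among records sharing a line number.
    std::stable_sort(lines.begin(), lines.end(),
                     [](const line_record& a, const line_record& b) {
                       return a.lineno < b.lineno;
                     });
    ++sorts_performed;
    return cache_.emplace(cu, std::move(lines)).first->second;
  }
};
```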
Jonathan Lebon [Wed, 16 Apr 2014 14:47:30 +0000 (10:47 -0400)]
dwarf_wrappers: remove dwarf_line_t class
In the coming patches, we will make liberal use of Dwarf_Line. Rather
than requiring conversion to dwarf_line_t, which is very often overkill
and too verbose, we introduce new helper functions which are safe
versions of their dwarf equivalent.
Jonathan Lebon [Wed, 16 Apr 2014 14:51:24 +0000 (10:51 -0400)]
gut out dwflpp::iterate_over_srcfile_lines()
To prepare for the new code, we empty out iterate_over_srcfile_lines()
and remove associated functions. This is also where we break the link
between dwflpp and dwarf_query (the original issue mentioned in
PR16615).
Jonathan Lebon [Fri, 4 Apr 2014 17:47:54 +0000 (13:47 -0400)]
dwarf_query: rename line to linenos
Rename the dwarf_query 'line' member to 'linenos' and the enum type
'line_t' to 'lineno_t'. The new names more accurately reflect line
numbers, as opposed to Dwarf_Line or dwarf_line_t objects, which are
often simply named 'line' in other contexts.
Jonathan Lebon [Fri, 4 Apr 2014 17:25:07 +0000 (13:25 -0400)]
tapsets.cxx: simplify query_srcfile_label
The query_srcfile_line() callback checked if the query had a
statement(str). This could have evaluated to false in the past (when
query_cu() treated both .statement(str) and .statement(num)), but now
query_srcfile_line() is only used for statement/function(func@file:N)
probes, so we can simplify it.
David Smith [Thu, 17 Apr 2014 19:39:39 +0000 (14:39 -0500)]
Fixed PR16806 by improving task_finder/utrace shutdown.
* runtime/stp_utrace.c (utrace_init): Clear out the kmem cache pointers
after destroying the caches.
(utrace_exit): Ditto.
(utrace_shutdown): Updated comments.
(utrace_free): Lock the utrace structure while cleaning up.
* runtime/linux/task_finder2.c (stap_task_finder_post_init): If the
task_finder state isn't 'running', quit early.
(stap_stop_task_finder): Call stp_task_work_exit() to wait on any
remaining task_work items.
(utrace_report_exec): If the utrace state isn't registered, quit.
(utrace_report_syscall_entry): Ditto.
(utrace_report_syscall_exit): Ditto.
(utrace_report_clone): Ditto.
(utrace_report_death): Ditto.
Lukas Berk [Thu, 17 Apr 2014 15:17:32 +0000 (11:17 -0400)]
PR16829 rework, have staprun export verbosity flag
*java/stapbm.in - rename STAPBM_VERBOSE to general SYSTEMTAP_VERBOSE
flag
*staprun/staprun.8 - note new SYSTEMTAP_VERBOSE env variable
*staprun/staprun.c - set SYSTEMTAP_VERBOSE env var from -v's passed to
staprun
*tapset-method.cxx - revert leftbits string to previous assignment
Victor Kamensky [Tue, 8 Apr 2014 05:23:39 +0000 (22:23 -0700)]
runtime: linux 3.14 porting: case when CONFIG_USER_NS not defined
Fix a build problem on linux-3.14 with configs where
CONFIG_USER_NS is not defined. With CONFIG_UIDGID_STRICT_TYPE_CHECKS
removed (261000a56b6382f597bcb12000f55c9ff26a1efb), access to
kuid_t and kgid_t should happen through the from_k?uid_munged calls.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Josh Stone [Thu, 10 Apr 2014 00:27:13 +0000 (17:27 -0700)]
PR16719: Fix a couple leaked Dwfl instances
* setupdwfl.cxx (setup_dwfl_kernel): When recursing into another round
after downloading, call dwfl_end on the Dwfl that we already started.
* tapsets.cxx (tracepoint_builder::init_dw): Call dwfl_end on the Dwfl
used to fill in the s.kernel_source_tree.
* testsuite/systemtap.base/pr16719.exp: Add a tracepoint subtest.
David Smith [Wed, 9 Apr 2014 17:00:05 +0000 (12:00 -0500)]
PR16716: Fix types in syscall.sched_{getscheduler,setscheduler,rr_get_interval}
* tapset/linux/syscalls2.stp (syscall.sched_getscheduler): Fixed types.
(syscall.sched_setscheduler): Ditto.
(syscall.sched_rr_get_interval): Fixed nesting and types. Also change
'argstr' to just have a pointer to the 'struct timespec' value, since
that is an output parameter and decoding it on input won't produce
anything of value.
* tapset/linux/nd_syscalls2.stp: Ditto.
* tapset/linux/aux_syscalls.stp (_sched_policy_str): Updated to handle new
values, including the new SCHED_RESET_ON_FORK flag.
* testsuite/systemtap.syscall/test.tcl (run_one_test): Since execname()
only returns the first 15 characters of the test program name, truncate
it.
* testsuite/systemtap.syscall/sched_getscheduler.c: New testcase.
* testsuite/systemtap.syscall/sched_rr_get_interval.c: Ditto.
* testsuite/systemtap.syscall/sched_setscheduler.c: Ditto.
Some funky errors occur on some fedora installations featuring perhaps
only partial publican setup, or some other mysterious causes. Add
some xml tags to hit the default ="common" case, and add a Makefile
conditional to have publican force --pdftool=fop rather than
wkhtmltopdf, which fails in entertaining ways sometimes.
Josh Stone [Fri, 4 Apr 2014 21:14:41 +0000 (14:14 -0700)]
testsuite: perf counter test improvements
- Use anchored -re patterns for more precise matching.
- Remove unused counter_a/b that caused unmatched warnings.
- Allow "max towers" to be one digit less.
Jonathan Lebon [Wed, 2 Apr 2014 20:24:08 +0000 (16:24 -0400)]
PR16307: testsuite: use new kill proc
Replace 'exec kill' by a call to the new kill proc, which accounts for
double-dashing as necessary. Where it makes sense, the timeout argument
was also used so that a SIGKILL was also sent after a few seconds.
Jonathan Lebon [Wed, 2 Apr 2014 16:31:36 +0000 (12:31 -0400)]
PR16307: proc kill: new proc for safer killing
During setup, check what kind of kill executable we're dealing with to
find out whether we'll need to use double dashes when calling it. We
also create a new kill proc that takes this into account when calling
kill.
David Smith [Thu, 3 Apr 2014 20:08:56 +0000 (15:08 -0500)]
PR16716 partial fix: Better types in 'syscall.shutdown'.
* tapset/linux/syscalls2.stp: Fix types in syscall.shutdown.
* tapset/linux/aux_syscalls.stp: Convert _shutdown_how_str() to use
_stp_lookup_str().
* runtime/linux/compat_net.h: Define SHUT_* for RHEL5.
* testsuite/systemtap.syscall/shutdown.c: New testcase.
William Cohen [Wed, 2 Apr 2014 20:16:21 +0000 (16:16 -0400)]
Make vm.pagefault and vm.pagefault.return probe only one real function
Newer versions of the kernel have both __handle_mm_fault and
handle_mm_fault functions. __handle_mm_fault may be inlined on
some kernels, making some arguments and the return probe
unavailable. The memory tapset should instrument just one of these
functions.
Changed all remaining "sleep 0.2" test cases to "sleep 1", so that the
check.exp mapping to "/usr/bin/stress ...." is triggered. Further,
made the stress a little more stressy with more i/o workload.
William Cohen [Wed, 2 Apr 2014 19:50:23 +0000 (15:50 -0400)]
Adjust the output of sched_switch.stp to be more readable
Several numerical values were printed as one string of digits. Placed
spaces between the numbers and adjusted the formatting so that the output
is easier to read.
Jonathan Lebon [Mon, 31 Mar 2014 20:11:27 +0000 (16:11 -0400)]
stap: unite all dumping modes into s.dump_mode
Clean up the way dumping modes are implemented. We create a new enum
member which tracks the type of dumping wanted. This enum handles all
stap invocations which do not directly handle a user-script:
-l/-L/--dump-probe-types/--dump-probe-aliases/--dump-functions. It also
allows us to clean up the cmdline_script/have_script hacks previously
used in switch handling.
The trickiest part of this patch is allowing s.user_file to be NULL
throughout passes 1 and 2, which previously always assumed a script
was present.
Jonathan Lebon [Fri, 28 Mar 2014 22:02:20 +0000 (18:02 -0400)]
stap: add --dump-probe-aliases
We add a new --dump-probe-aliases switch, which dumps all the aliases
picked up in library files after pass 1 and then exits. Aliases whose
names start with '_' are hidden behind a -v.
Also change probe_alias::printsig() so that epilogue-style aliases are
printed properly.
David Smith [Tue, 1 Apr 2014 19:01:16 +0000 (14:01 -0500)]
Make _struct_sockaddr_u_impl() tapset function output more consistent.
* tapset/linux/aux_syscalls.stp (_struct_sockaddr_u_impl): Make output
more consistent. When _struct_sockaddr_u_impl() succeeds, it returns
'{INFO}'. When it failed, it returned '[...]'. Now when it fails it
returns '{...}'.
* testsuite/systemtap.syscall/bind.c: Updated expected test output.
* testsuite/systemtap.syscall/connect.c: Updated expected test output.
* testsuite/systemtap.syscall/sendto.c: Updated expected test output.
David Smith [Tue, 1 Apr 2014 18:30:06 +0000 (13:30 -0500)]
PR16716 partial fix: Better types in 'syscall.{send,sendto}'.
* tapset/linux/syscalls2.stp: Fixed types and nesting in
'syscall.{send,sendto}'. Fixed a few more types in
'syscall.{recv,recvfrom}'.
* tapset/linux/nd_syscalls2.stp: Ditto.
* runtime/linux/compat_unistd.h: Added __NR_sendto.
* tapset/linux/aux_syscalls.stp: Be sure we have the SYS_* defines by
including '<linux/net.h>'.
* testsuite/systemtap.syscall/send.c: New testcase.
* testsuite/systemtap.syscall/sendto.c: New testcase.
* testsuite/systemtap.syscall/recv.c: Added more testing of the 'flags'
parameter.
* testsuite/systemtap.syscall/recvfrom.c: Ditto.
* testsuite/systemtap.syscall/recvmmsg.c: Ditto.
* testsuite/systemtap.syscall/recvmsg.c: Ditto.
Frank Ch. Eigler [Sun, 30 Mar 2014 12:17:23 +0000 (08:17 -0400)]
PR16766 cont'd: tolerate STAP_SESSION_ERROR in module_refresh callback
If a systemtap module is in error state (but not yet shut down via
message from stapio/staprun), it may harmlessly continue receiving
and processing module-notification callbacks.
Josh Stone [Fri, 28 Mar 2014 20:53:15 +0000 (13:53 -0700)]
runtime: improve the preempt_enable_no_resched copy
The variant introduced in commit 651a87924c22 only handled the
CONFIG_PREEMPT_COUNT side of things, which causes mayhem when that's not
set, as with kernel-3.14.0-0.rc8.git0.1.fc21.x86_64. The #else case just
needs a barrier(). The former "TODO rethink" still stands.
David Smith [Fri, 28 Mar 2014 19:10:01 +0000 (14:10 -0500)]
PR16716 partial fix: Fix 'syscall.{recv,recvfrom}' on 32-bit platforms.
* tapset/uconversions.stp (user_ulong): New function.
(user_ulong_warn): Ditto.
* tapset/linux/syscalls2.stp: Change all calls to user_uint64() to
user_ulong(), so that 32-bit platforms are handled correctly.
* tapset/linux/nd_syscalls2.stp: Ditto. Also added asmlinkage() calls to
'nd_syscall.{recv,recvfrom}'.
* testsuite/buildok/conversions-embedded.stp: Added build tests for
user_ulong() and user_ulong_warn().
* testsuite/buildok/conversions.stp: Ditto.
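To illustrate why the read width matters here, the following sketch reads an 'unsigned long' from a plain buffer using the width of the task being traced. It is not the tapset implementation: read_user_ulong is a hypothetical helper, and a local byte buffer stands in for user memory.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: read an 'unsigned long' as the traced task sees it.
// A 32-bit task stores 4 bytes; reading 8 would pull in whatever follows.
inline std::uint64_t read_user_ulong(const unsigned char* p, bool task_is_32bit)
{
  if (task_is_32bit) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;  // zero-extended to 64 bits
  }
  std::uint64_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}
```

With a 4-byte value followed by unrelated data in the buffer, the 32-bit read returns just the value, while the 64-bit read also picks up the neighbouring bytes.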
David Smith [Fri, 28 Mar 2014 15:49:04 +0000 (10:49 -0500)]
Remove outdated comments from nd_syscalls.stp and nd_syscalls2.stp.
* tapset/linux/nd_syscalls.stp: No actual code change. Remove old comments
showing the syscall variant of the probe. These comments have gotten out
of date, and it isn't that hard to look up the syscall variant of the
probe. In addition, having these comments present prevents easily
looking for actual '$var' references that shouldn't be there.
* tapset/linux/nd_syscalls2.stp: Ditto.