Jonathan Lebon [Thu, 17 Apr 2014 16:21:25 +0000 (12:21 -0400)]
add DWARF_LINE* macros to help diagnosis
Similarly to the previous commit, we modify the new safe_dwarf_line*()
functions so that they carry __FILE__ and __LINE__ information into the
error. This new information is filled in when using the new DWARF_LINE*
macros.
Before:
semantic error: libdw failure (dwarf_lineaddr): no error
After:
semantic error: libdw failure (dwarf_lineaddr): no error
thrown from: ../systemtap/dwflpp.cxx:2242
Jonathan Lebon [Thu, 17 Apr 2014 16:08:00 +0000 (12:08 -0400)]
add DWFL_ASSERT and DWARF_ASSERT to help diagnosis
Semantic errors thrown from dwfl_assert() and dwarf_assert() lacked any
positional information to help track down where the assertion failed. We
create two new macros, DWFL_ASSERT and DWARF_ASSERT, which carry down
the __FILE__ and __LINE__ information so that the semantic_error created
contains that information, which can be printed out using -vv.
Before:
semantic error: libdwfl failure (asserting!): no error
After:
semantic error: libdwfl failure (asserting!): no error
thrown from: ../systemtap/tapsets.cxx:7183
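The macro technique described above can be sketched roughly as follows. This is a minimal illustration, not the actual SystemTap code: located_error, check_rc, CHECK_RC and thrown_from are hypothetical names standing in for semantic_error, dwfl_assert() and DWFL_ASSERT.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical error type; it only mimics the idea of an error object
// that remembers where it was thrown from.
struct located_error : std::runtime_error {
  std::string errsrc;  // "file:line" of the throw site
  located_error(const std::string& msg, const std::string& src)
    : std::runtime_error(msg), errsrc(src) {}
};

// Function form: the caller has to pass its location explicitly.
inline void check_rc(const std::string& what, int rc,
                     const char* file, int line)
{
  if (rc != 0)
    throw located_error("failure (" + what + ")",
                        std::string(file) + ":" + std::to_string(line));
}

// Macro form: __FILE__ and __LINE__ expand at the call site, so each
// caller reports its own position without extra typing.
#define CHECK_RC(what, rc) check_rc((what), (rc), __FILE__, __LINE__)

// Demonstration helper: returns the recorded throw site, or "" if
// nothing was thrown.
inline std::string thrown_from(int rc)
{
  try { CHECK_RC("asserting!", rc); }
  catch (const located_error& e) { return e.errsrc; }
  return "";
}
```

The wrapper has to be a macro rather than an inline function because __FILE__ and __LINE__ are substituted where they textually appear; inside a function they would always name the helper itself.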
Jonathan Lebon [Wed, 16 Apr 2014 18:02:21 +0000 (14:02 -0400)]
statement.exp: rework and expand
The statement.exp test case previously only tested a few specific cases.
We now introduce a new test program, 'statement.c', on which we can test
for all the things we previously tested, allowing us to remove the other
test programs. Furthermore, we extend coverage to test many other
possible combinations.
Jonathan Lebon [Wed, 16 Apr 2014 14:40:16 +0000 (10:40 -0400)]
dwflpp: implement new iterate_over_srcfile_lines()
We finally implement the new iterate_over_srcfile_lines(). The basic
strategy is to look at each matching DIE, rather than just the line
records matching the linenos so that we properly match, for example,
functions inlined multiple times (which can yield multiple sets of line
records for the same lineno but at the various addresses where inlined).
Jonathan Lebon [Wed, 16 Apr 2014 20:03:07 +0000 (16:03 -0400)]
add dwarf_query::filtered_all
In dwarf_query-related functions, we very often need to carry out the
same operation on both filtered_functions and filtered_inlines. Rather
than duplicating code, create a new dwarf_query function which creates a
temporary vector containing all of them.
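A rough sketch of the idea, assuming both containers hold types derived from a common base. The names base_info, func_info, inline_info and query_sketch are hypothetical and do not reproduce the real dwarf_query layout.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the function and inline records.
struct base_info { std::string name; };
struct func_info : base_info {};
struct inline_info : base_info {};

struct query_sketch {
  std::vector<func_info> filtered_functions;
  std::vector<inline_info> filtered_inlines;

  // Build a temporary vector holding the common part of every entry,
  // functions first and inlines second.
  std::vector<base_info> filtered_all() const
  {
    std::vector<base_info> all;
    all.reserve(filtered_functions.size() + filtered_inlines.size());
    all.insert(all.end(), filtered_functions.begin(),
               filtered_functions.end());
    all.insert(all.end(), filtered_inlines.begin(),
               filtered_inlines.end());
    return all;
  }
};
```

Callers can then loop once over filtered_all() instead of writing the same loop twice. The copy is only worthwhile when the shared fields are all the caller needs, since derived-only members are sliced away.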
Jonathan Lebon [Wed, 16 Apr 2014 18:08:47 +0000 (14:08 -0400)]
dwflpp: add CU line caching
The upcoming patches re-implementing iterate_over_srcfile_lines() will
depend on the use of CU lines in lineno order. Since dwarf_getsrclines()
outputs them in addr order, it greatly helps performance to cache the
sorted version.
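The caching idea can be sketched like this. It is an illustration under assumptions: line_rec and line_cache are hypothetical, and the loader callback stands in for whatever fetches the address-ordered records (dwarf_getsrclines() in the real code).

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical line record: a source line number and its address.
struct line_rec { int lineno; unsigned long addr; };

class line_cache {
  std::map<std::string, std::vector<line_rec>> by_cu_;
  int sorts_ = 0;  // counts how often a sort was actually performed
public:
  // Return the records of one CU in lineno order, sorting only on the
  // first request; 'load' yields them in address order.
  template <typename Loader>
  const std::vector<line_rec>& lines_for(const std::string& cu, Loader load)
  {
    auto it = by_cu_.find(cu);
    if (it == by_cu_.end()) {
      std::vector<line_rec> v = load();
      // stable_sort keeps address order among records of the same line
      std::stable_sort(v.begin(), v.end(),
                       [](const line_rec& a, const line_rec& b)
                       { return a.lineno < b.lineno; });
      ++sorts_;
      it = by_cu_.emplace(cu, std::move(v)).first;
    }
    return it->second;
  }
  int sorts() const { return sorts_; }
};
```

A stable sort is used so that several records for the same line number, as happens with repeated inlining, stay in ascending address order.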
Jonathan Lebon [Wed, 16 Apr 2014 14:47:30 +0000 (10:47 -0400)]
dwarf_wrappers: remove dwarf_line_t class
In the coming patches, we will make liberal use of Dwarf_Line. Rather
than requiring conversion to dwarf_line_t, which is very often overkill
and too verbose, we introduce new helper functions which are safe
versions of their dwarf equivalent.
Jonathan Lebon [Wed, 16 Apr 2014 14:51:24 +0000 (10:51 -0400)]
gut out dwflpp::iterate_over_srcfile_lines()
To prepare for the new code, we empty out iterate_over_srcfile_lines()
and remove associated functions. This is also where we break the link
between dwflpp and dwarf_query (the original issue mentioned in
PR16615).
Jonathan Lebon [Fri, 4 Apr 2014 17:47:54 +0000 (13:47 -0400)]
dwarf_query: rename line to linenos
Rename the dwarf_query 'line' member to 'linenos' and the enum type
'line_t' to 'lineno_t'. This more accurately reflects line numbers, as
opposed to Dwarf_Line or dwarf_line_t objects, which are often simply
named 'line' in other contexts.
Jonathan Lebon [Fri, 4 Apr 2014 17:25:07 +0000 (13:25 -0400)]
tapsets.cxx: simplify query_srcfile_label
The query_srcfile_line() callback checked if the query had a
statement(str). This could have evaluated to false in the past (when
query_cu() treated both .statement(str) and .statement(num)), but now
query_srcfile_line() is only used for statement/function(func@file:N)
probes, so we can simplify it.
David Smith [Thu, 17 Apr 2014 19:39:39 +0000 (14:39 -0500)]
Fixed PR16806 by improving task_finder/utrace shutdown.
* runtime/stp_utrace.c (utrace_init): Clear out the kmem cache pointers
after destroying the caches.
(utrace_exit): Ditto.
(utrace_shutdown): Updated comments.
(utrace_free): Lock the utrace structure while cleaning up.
* runtime/linux/task_finder2.c (stap_task_finder_post_init): If the
task_finder state isn't 'running', quit early.
(stap_stop_task_finder): Call stp_task_work_exit() to wait on any
remaining task_work items.
(utrace_report_exec): If the utrace state isn't registered, quit.
(utrace_report_syscall_entry): Ditto.
(utrace_report_syscall_exit): Ditto.
(utrace_report_clone): Ditto.
(utrace_report_death): Ditto.
Lukas Berk [Thu, 17 Apr 2014 15:17:32 +0000 (11:17 -0400)]
PR16829 rework, have staprun export verbosity flag
*java/stapbm.in - rename STAPBM_VERBOSE to general SYSTEMTAP_VERBOSE
flag
*staprun/staprun.8 - note new SYSTEMTAP_VERBOSE env variable
*staprun/staprun.c - set SYSTEMTAP_VERBOSE env var from -v's passed to
staprun
*tapset-method.cxx - revert leftbits string to previous assignment
Victor Kamensky [Tue, 8 Apr 2014 05:23:39 +0000 (22:23 -0700)]
runtime: linux 3.14 porting: case when CONFIG_USER_NS not defined
Fix build problem for linux-3.14 case with config where
CONFIG_USER_NS is not defined. With CONFIG_UIDGID_STRICT_TYPE_CHECKS
removed (261000a56b6382f597bcb12000f55c9ff26a1efb) access to
kuid_t and kgid_t should happen through from_k?uid_munged call.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Josh Stone [Thu, 10 Apr 2014 00:27:13 +0000 (17:27 -0700)]
PR16719: Fix a couple leaked Dwfl instances
* setupdwfl.cxx (setup_dwfl_kernel): When recursing into another round
after downloading, call dwfl_end on the Dwfl that we already started.
* tapsets.cxx (tracepoint_builder::init_dw): Call dwfl_end on the Dwfl
used to fill in the s.kernel_source_tree.
* testsuite/systemtap.base/pr16719.exp: Add a tracepoint subtest.
David Smith [Wed, 9 Apr 2014 17:00:05 +0000 (12:00 -0500)]
PR16716: Fix types in syscall.sched_{getscheduler,setscheduler,rr_get_interval}
* tapset/linux/syscalls2.stp (syscall.sched_getscheduler): Fixed types.
(syscall.sched_setscheduler): Ditto.
(syscall.sched_rr_get_interval): Fixed nesting and types. Also change
'argstr' to just have a pointer to the 'struct timespec' value, since
that is an output parameter and decoding it on input won't produce
anything of value.
* tapset/linux/nd_syscalls2.stp: Ditto.
* tapset/linux/aux_syscalls.stp (_sched_policy_str): Updated to handle new
values, including the new SCHED_RESET_ON_FORK flag.
* testsuite/systemtap.syscall/test.tcl (run_one_test): Since execname()
only returns the first 15 characters of the test program name, truncate
it.
* testsuite/systemtap.syscall/sched_getscheduler.c: New testcase
* testsuite/systemtap.syscall/sched_rr_get_interval.c: Ditto.
* testsuite/systemtap.syscall/sched_setscheduler.c: Ditto.
Some funky errors occur on some fedora installations featuring perhaps
only partial publican setup, or some other mysterious causes. Add
some xml tags to hit the default ="common" case, and add a Makefile
conditional to have publican force --pdftool=fop rather than
wkhtmltopdf, which fails in entertaining ways sometimes.
Josh Stone [Fri, 4 Apr 2014 21:14:41 +0000 (14:14 -0700)]
testsuite: perf counter test improvements
- Use anchored -re patterns for more precise matching.
- Remove unused counter_a/b that caused unmatched warnings.
- Allow "max towers" to be one digit less.
Jonathan Lebon [Wed, 2 Apr 2014 20:24:08 +0000 (16:24 -0400)]
PR16307: testsuite: use new kill proc
Replace 'exec kill' by a call to the new kill proc, which accounts for
double-dashing as necessary. Where it makes sense, the timeout argument
is also used so that a SIGKILL is sent after a few seconds.
Jonathan Lebon [Wed, 2 Apr 2014 16:31:36 +0000 (12:31 -0400)]
PR16307: proc kill: new proc for safer killing
During setup, check what kind of kill executable we're dealing with to
find out whether we'll need to use double dashes when calling it. We
also create a new kill proc that takes this into account when calling
kill.
David Smith [Thu, 3 Apr 2014 20:08:56 +0000 (15:08 -0500)]
PR16716 partial fix: Better types in 'syscall.shutdown'.
* tapset/linux/syscalls2.stp: Fix types in syscall.shutdown.
* tapset/linux/aux_syscalls.stp: Convert _shutdown_how_str() to use
_stp_lookup_str().
* runtime/linux/compat_net.h: Define SHUT_* for RHEL5.
* testsuite/systemtap.syscall/shutdown.c: New testcase.
William Cohen [Wed, 2 Apr 2014 20:16:21 +0000 (16:16 -0400)]
Make vm.pagefault and vm.pagefault.return probe only one real function
Newer versions of the kernel have both __handle_mm_fault and
handle_mm_fault functions. The __handle_mm_fault may be inlined for
some kernels, causing some arguments and the return probe to be
unavailable. The memory tapset should just watch one of these
functions to instrument.
Changed all remaining "sleep 0.2" test cases to "sleep 1", so that the
check.exp mapping to "/usr/bin/stress ...." is triggered. Further,
made the stress a little more stressy with more i/o workload.
William Cohen [Wed, 2 Apr 2014 19:50:23 +0000 (15:50 -0400)]
Adjust the output of sched_switch.stp to be more readable
Several numerical values were printed as one string of digits. Placed
spaces between the numbers and adjusted the formatting so that the output
is easier to read.
Jonathan Lebon [Mon, 31 Mar 2014 20:11:27 +0000 (16:11 -0400)]
stap: unite all dumping modes into s.dump_mode
Clean up the way dumping modes are implemented. We create a new enum
member which tracks the type of dumping wanted. This enum handles all
stap invocations which do not directly handle a user-script:
-l/-L/--dump-probe-types/--dump-probe-aliases/--dump-functions. It also
allows us to clean up the cmdline_script/have_script hacks previously
used in switch handling.
The trickiest part about this patch is to now allow for the possibility
of s.user_file to be NULL throughout passes 1 and 2, which previously
always assumed a script was present.
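The shape of such a change might look like the sketch below. All names here (dump_mode_t and its enumerators, session_sketch, inputs_ok, script_name) are hypothetical and chosen for illustration; they are not the identifiers used in the stap sources.

```cpp
#include <string>

// Hypothetical enum: one value per dumping mode, plus 'none' for a
// normal run that executes a user script.
enum class dump_mode_t {
  none, matched_probes, probe_types, probe_aliases, functions
};

struct session_sketch {
  dump_mode_t dump_mode = dump_mode_t::none;
  const char* user_file = nullptr;  // may stay null while dumping
};

// A user script is only mandatory when no dumping mode was requested.
inline bool inputs_ok(const session_sketch& s)
{
  if (s.dump_mode != dump_mode_t::none)
    return true;
  return s.user_file != nullptr;
}

// Later passes must tolerate a missing script instead of assuming one.
inline std::string script_name(const session_sketch& s)
{
  return s.user_file ? std::string(s.user_file) : std::string("<none>");
}
```

A single enum makes the modes mutually exclusive by construction, which a set of independent boolean flags would not.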
Jonathan Lebon [Fri, 28 Mar 2014 22:02:20 +0000 (18:02 -0400)]
stap: add --dump-probe-aliases
We add a new --dump-probe-aliases switch, which dumps all the aliases
picked up in library files after pass 1 and then exits. Aliases whose
names don't start with '_' are hidden behind a -v.
Also change probe_alias::printsig() so that epilogue-style aliases are
printed properly.
David Smith [Tue, 1 Apr 2014 19:01:16 +0000 (14:01 -0500)]
Make _struct_sockaddr_u_impl() tapset function output more consistent.
* tapset/linux/aux_syscalls.stp (_struct_sockaddr_u_impl): Make output
more consistent. When _struct_sockaddr_u_impl() succeeds, it returns
'{INFO}'. When it failed, it returned '[...]'. Now when it fails it
returns '{...}'.
* testsuite/systemtap.syscall/bind.c: Updated expected test output.
* testsuite/systemtap.syscall/connect.c: Updated expected test output.
* testsuite/systemtap.syscall/sendto.c: Updated expected test output.
David Smith [Tue, 1 Apr 2014 18:30:06 +0000 (13:30 -0500)]
PR16716 partial fix: Better types in 'syscall.{send,sendto}'.
* tapset/linux/syscalls2.stp: Fixed types and nesting in
'syscall.{send,sendto}'. Fixed a few more types in
'syscall.{recv,recvfrom}'.
* tapset/linux/nd_syscalls2.stp: Ditto.
* runtime/linux/compat_unistd.h: Added __NR_sendto.
* tapset/linux/aux_syscalls.stp: Be sure we have the SYS_* defines by
including '<linux/net.h>'.
* testsuite/systemtap.syscall/send.c: New testcase.
* testsuite/systemtap.syscall/sendto.c: New testcase.
* testsuite/systemtap.syscall/recv.c: Added more testing of the 'flags'
parameter.
* testsuite/systemtap.syscall/recvfrom.c: Ditto.
* testsuite/systemtap.syscall/recvmmsg.c: Ditto.
* testsuite/systemtap.syscall/recvmsg.c: Ditto.
Frank Ch. Eigler [Sun, 30 Mar 2014 12:17:23 +0000 (08:17 -0400)]
PR16766 cont'd: tolerate STAP_SESSION_ERROR in module_refresh callback
If a systemtap module is in error state (but not yet shut down via
message from stapio/staprun), it may harmlessly continue receiving
and processing module-notification callbacks.
Josh Stone [Fri, 28 Mar 2014 20:53:15 +0000 (13:53 -0700)]
runtime: improve the preempt_enable_no_resched copy
The variant introduced in commit 651a87924c22 only handled the
CONFIG_PREEMPT_COUNT side of things, which causes mayhem when that's not
set, as with kernel-3.14.0-0.rc8.git0.1.fc21.x86_64. The #else case just
needs a barrier(). The former "TODO rethink" still stands.
David Smith [Fri, 28 Mar 2014 19:10:01 +0000 (14:10 -0500)]
PR16716 partial fix: Fix 'syscall.{recv,recvfrom}' on 32-bit platforms.
* tapset/uconversions.stp (user_ulong): New function.
(user_ulong_warn): Ditto.
* tapset/linux/syscalls2.stp: Change all calls to user_uint64() to
user_ulong(), so that 32-bit platforms are handled correctly.
* tapset/linux/nd_syscalls2.stp: Ditto. Also added asmlinkage() calls to
'nd_syscall.{recv,recvfrom}'.
* testsuite/buildok/conversions-embedded.stp: Added build tests for
user_ulong() and user_ulong_warn().
* testsuite/buildok/conversions.stp: Ditto.
David Smith [Fri, 28 Mar 2014 15:49:04 +0000 (10:49 -0500)]
Remove outdated comments from nd_syscalls.stp and nd_syscalls2.stp.
* tapset/linux/nd_syscalls.stp: No actual code change. Remove old comments
showing the syscall variant of the probe. These comments have gotten out
of date, and it isn't that hard to lookup the syscall variant of the
probe. In addition, having these comments present prevents easily
looking for actual '$var' references that shouldn't be there.
* tapset/linux/nd_syscalls2.stp: Ditto.
Frank Ch. Eigler [Fri, 28 Mar 2014 01:29:04 +0000 (21:29 -0400)]
PR16766: kernel crash for failed-init module-notification
Suppress the module_notifier callback for cases of failure of the
main generated systemtap module-initialization code, which checks
build-ids, privileges, etc. etc.; we don't want any module-notifier
callbacks after an error.
* runtime/transport/transport.c: Don't call module-notifier stuff
if initialization failed.
* translate.cxx (emit_module_refresh): Emit code to suppress callback
payload if somehow the notifier got activated anyway.
Jonathan Lebon [Wed, 26 Mar 2014 17:56:20 +0000 (13:56 -0400)]
initscript: skip dracut stap module by default
We previously always enabled the dracut stap module as long as there
were scripts to include. This can lead to issues since the params.conf
file may be obsolete/not in sync e.g. during a kernel update. We now
make the module an opt-in feature, and make the initscript explicitly
specify its inclusion.
Jonathan Lebon [Wed, 26 Mar 2014 15:39:11 +0000 (11:39 -0400)]
initscript: use new-kernel-pkg after dracut
With this patch, we now also call new-kernel-pkg --update after creating
the new image so that the bootloader is updated if need be (see also
BZ1051649#c9).
This patch also includes some polishing re. console log output.
David Smith [Wed, 26 Mar 2014 14:43:23 +0000 (09:43 -0500)]
Replace _sendflags_str() and _recvflags_str() with _msg_flags_str().
* tapset/linux/aux_syscalls.stp (_msg_flags_str): New function to replace
_sendflags_str() and _recvflags_str().
(_sendflags_str): Deprecated and reimplemented with _msg_flags_str().
(_recvflags_str): Ditto.
* tapset/linux/syscalls2.stp: Replace _sendflags_str() and
_recvflags_str() with _msg_flags_str().
* tapset/linux/nd_syscalls2.stp: Ditto.
* testsuite/buildok/aux_syscalls-embedded.stp: Test _msg_flags_str().
* NEWS: Mention deprecations.
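A single table-driven decoder shared by the send and receive paths could look roughly like this. It is a sketch only: flag_name, demo_flags and flags_str are hypothetical, the flag values are hard-coded for the example rather than taken from system headers, and the fallback formatting for unknown bits and for zero is an assumption, not the behaviour of _msg_flags_str().

```cpp
#include <cstdio>
#include <string>

// One table entry: a flag bit and the name to print for it.
struct flag_name { unsigned long bit; const char* name; };

// Values hard-coded for the sketch; real code should use the
// definitions from the system headers.
static const flag_name demo_flags[] = {
  {0x1,  "MSG_OOB"},
  {0x2,  "MSG_PEEK"},
  {0x40, "MSG_DONTWAIT"},
};

// Decode a flag word into "NAME|NAME|0x<leftover bits>".
inline std::string flags_str(unsigned long flags)
{
  std::string out;
  for (const flag_name& f : demo_flags) {
    if (flags & f.bit) {
      if (!out.empty()) out += "|";
      out += f.name;
      flags &= ~f.bit;  // mark this bit as handled
    }
  }
  if (flags) {  // bits without a table entry are printed in hex
    if (!out.empty()) out += "|";
    char buf[32];
    std::snprintf(buf, sizeof buf, "0x%lx", flags);
    out += buf;
  }
  return out.empty() ? std::string("0x0") : out;
}
```

With one table, a newly supported flag only has to be added in a single place to show up in both the send and receive probes.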
Josh Stone [Wed, 26 Mar 2014 00:04:00 +0000 (17:04 -0700)]
Minor testsuite tweaks for stapdyn
- global_end: Allow for zero time (i.e. less than 1 microsecond), and
don't double the _dyninst test suffix.
- process_by_cmd: Add an "int rc" just to give distinct IPs between
function start and the first mark, so we don't need to worry about
determinism of the probe order. (stapdyn was hitting the mark first.)
- suppress-time-limit: Expect the WARNING messages too, because stapdyn
often has them mixed with the normal script output.
Josh Stone [Tue, 25 Mar 2014 20:46:57 +0000 (13:46 -0700)]
socktop: Make sure *_str are always known as strings
The new removal of unreachable code meant socktop might remove its whole
filter setup, and all the related variables are left with unknown type.
They would typically be pruned away just fine, but socktop also has a
'probe never' with print statements to avoid autoprint. Changing those to
log() lets stap always know they are strings.
Josh Stone [Tue, 25 Mar 2014 20:43:29 +0000 (13:43 -0700)]
Tighten a semok/doubleglob.stp pattern
With 't**es(1)', we only intended to match 'timer.jiffies(1)', but we
also matched a 'tcp[...].callees(1)' by accident. This odd instance has
some issues in variable expansion, so we should just avoid it. Using
'ti**es(1)' gets back to the original intent of the test.
Josh Stone [Tue, 25 Mar 2014 18:22:43 +0000 (11:22 -0700)]
Remove code that follows unconditional control statements
When a block contains return, next, break, or continue, any following
statements are unreachable. Warn and remove them.
This replaces tapset-perfmon's statement_counter, which didn't know
enough to deal with blocks like '{ { } }' that arise from perf aliases.
Now it just inserts a 'next', and the later optimization passes can
figure this out in a very generic way.
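The pruning rule can be sketched on a flat statement list. This is a simplified model under assumptions: kind, stmt and prune_unreachable are hypothetical, and the real optimization works on the parse tree and also emits a warning.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical statement kinds: ordinary expressions plus the four
// unconditional control statements named above.
enum class kind { expr, ret, next, brk, cont };
struct stmt { kind k; std::string text; };

inline bool is_unconditional_jump(kind k) { return k != kind::expr; }

// Drop everything after the first unconditional control statement in a
// block. Returns how many statements were removed.
inline std::size_t prune_unreachable(std::vector<stmt>& block)
{
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (is_unconditional_jump(block[i].k)) {
      std::size_t removed = block.size() - (i + 1);
      block.erase(block.begin() + static_cast<std::ptrdiff_t>(i + 1),
                  block.end());
      return removed;
    }
  }
  return 0;
}
```

Under this model, inserting a 'next' at the right point is enough to make the rest of a block disappear, which is the generic behaviour the commit relies on.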
Josh Stone [Fri, 21 Mar 2014 18:46:03 +0000 (11:46 -0700)]
Read the perf process from -c during pass2
When a bare .process is used, its value is inferred from -c CMD. This
must be checked during pass2, or else a changed -c CMD will not trigger
a hash change, and an incorrect cached module will be used.
This patch also unifies the wordexp argv[0] parsing of -c CMD into a
shared systemtap_session::cmd_file().
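Why the inferred value has to feed the hash can be shown with a toy cache key. Everything here is an assumption made for illustration: fnv1a, resolve_process and cache_key are hypothetical, and FNV-1a merely stands in for whatever hash stap really uses.

```cpp
#include <cstdint>
#include <string>

// 64-bit FNV-1a, used here only as a simple deterministic hash.
inline std::uint64_t fnv1a(const std::string& s,
                           std::uint64_t h = 0xcbf29ce484222325ULL)
{
  for (unsigned char c : s) {
    h ^= c;
    h *= 0x100000001b3ULL;
  }
  return h;
}

// A bare .process has no explicit path, so it falls back to the
// program named by -c CMD.
inline std::string resolve_process(const std::string& explicit_path,
                                   const std::string& cmd_file)
{
  return explicit_path.empty() ? cmd_file : explicit_path;
}

// The key covers the script text and the resolved process path, so a
// different -c CMD yields a different key for a bare .process.
inline std::uint64_t cache_key(const std::string& script,
                               const std::string& explicit_path,
                               const std::string& cmd_file)
{
  std::uint64_t h = fnv1a(script);
  h = fnv1a(std::string(1, '\0'), h);  // separator between the fields
  return fnv1a(resolve_process(explicit_path, cmd_file), h);
}
```

If the path were resolved only after the key had been computed, two runs with different -c CMD values would share a key and the second run would reuse a module built for the wrong process.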
Josh Stone [Fri, 21 Mar 2014 18:39:42 +0000 (11:39 -0700)]
PR14223: Allow perf probes for mere @stapdev mortals
Even though we're using the kernel interface, perf checks CAP_SYS_ADMIN,
which our mere @stapdev user may not have. By running via a workqueue,
we'll be in an events/X kernel thread with sufficient privileges.
Jonathan Lebon [Sat, 22 Mar 2014 04:50:53 +0000 (00:50 -0400)]
semantic_error: also print source of error
When printing semantic_errors at high verbosity (-vv), it can be useful
to also know where the error came from. That information is already made
available through the errsrc member of semantic_error (initially
implemented for dup-error elimination).
We also change the ERRSRC macro to use __FILE__ rather than __FUNCTION__
to be not only more informative, but more foolproof (e.g. two errors
thrown from identically named functions at identical lines in separate
files would previously have been considered equivalent by the dup-error
elimination).
The final result is e.g. something like this:
semantic error: unresolved type : identifier 'ActiveOpens' at :22:8
thrown from: elaborate.cxx:5239
source: global ActiveOpens
^
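The effect on duplicate elimination can be sketched as below, assuming errors are keyed on their message plus their source string. error_dedup is hypothetical and the real comparison in stap may consider other fields.

```cpp
#include <set>
#include <string>
#include <utility>

// Remembers which (message, errsrc) pairs were already reported.
class error_dedup {
  std::set<std::pair<std::string, std::string>> seen_;
public:
  // Returns true the first time a pair is seen, false for repeats.
  bool first_time(const std::string& msg, const std::string& errsrc)
  {
    return seen_.insert(std::make_pair(msg, errsrc)).second;
  }
};
```

With a function-based source string, two errors raised at the same line number in same-named functions of different files would produce the same key and one would be dropped. A file-based string such as "elaborate.cxx:5239" keeps them apart.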
Josh Stone [Sat, 22 Mar 2014 00:32:06 +0000 (17:32 -0700)]
testsuite: big cleanup of sdt_misc types
- Use a central check() function throughout, so tests and messages are
easily uniform. Prefix the PASS/FAIL messages with "sdt_types" so
they are easily distinguishable from dejagnu PASS/FAIL in the log.
- Adjust most constants to test the bounds of their types.
- Add unsigned char and unsigned long long.
- Tighten the expect patterns to exactly one line at a time, and
increment $notok for extra lines (like WARNINGs).
Josh Stone [Fri, 21 Mar 2014 22:09:53 +0000 (15:09 -0700)]
testsuite: big cleanup of sdt_asm
- Use a central check() function throughout, so tests and messages are
easily uniform. Prefix the PASS/FAIL messages with "sdt_asm" so they
are easily distinguishable from dejagnu PASS/FAIL in the log.
- Save and restore SP when changing it in sdt_asm.S, and prepare
the stack and return value at the end for a clean exit.
- Skip mark("*sp") for --runtime=dyninst, because it's not prepared to
deal with that bad stack state.
- Tighten the expect patterns to exactly one line at a time, and
increment $notok for extra lines (like WARNINGs).