runtime rhel6: avoid kernel crash with buildid checks for process(PID#) probes
Crash observed on rhel6 (utrace -> process(NUM).statement(ADDR).absolute probes)
in unprivileged_myproc.exp test case.
* runtime/linux/uprobes-common.c (stap_uprobe_change_plus): Don't try to check
process against _stp_module entry, if we don't even have a process pathname.
* runtime/sym.c (_stp_usermodule_check): WARN_ON incoming null pathname.
Josh Stone [Sat, 26 Apr 2014 00:17:03 +0000 (17:17 -0700)]
tapset: Deal with missing fs in i386/register.stp
Older kernels didn't have fs in pt_regs, so the following _reg_offsets
were incorrect. Make fs conditional on stapconf, and use a counter to
more easily keep track of the differences.
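The counter idea can be sketched in C as follows. This is only an illustration: the register list is a simplified i386 layout, and build_reg_offsets(), reg_offset() and the have_fs parameter (standing in for the stapconf check) are hypothetical names, not taken from the tapset.

```c
#include <stddef.h>
#include <string.h>

#define MAX_REGS 16

struct reg_offset {
    const char *name;
    size_t offset;
};

/* Fill `table` (room for MAX_REGS entries) and return the entry count.
 * `have_fs` is a hypothetical stand-in for the stapconf check: when it
 * is 0 the fs slot is skipped and every later offset shifts down. */
size_t build_reg_offsets(struct reg_offset *table, int have_fs)
{
    /* Simplified, illustrative i386 pt_regs order (assumption). */
    static const char *const names[MAX_REGS] = {
        "bx", "cx", "dx", "si", "di", "bp", "ax", "ds", "es",
        "fs", "orig_ax", "ip", "cs", "flags", "sp", "ss"
    };
    size_t n = 0; /* running slot counter */
    size_t i;

    for (i = 0; i < MAX_REGS; i++) {
        if (!have_fs && strcmp(names[i], "fs") == 0)
            continue; /* older kernels: no fs slot */
        table[n].name = names[i];
        table[n].offset = n * 4; /* one 32-bit word per slot */
        n++;
    }
    return n;
}

/* Return the byte offset of `name`, or -1 if it is not in the table. */
long reg_offset(const struct reg_offset *table, size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0)
            return (long)table[i].offset;
    return -1;
}
```

Because every offset is derived from the counter, dropping one slot needs no hand-edited constants for the registers that follow it.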
Josh Stone [Sat, 26 Apr 2014 00:12:18 +0000 (17:12 -0700)]
runtime: Fix synthesized regs on ix86
The ix86 register names and offsets in the inline asm of
arch_unw_init_frame_info are now fixed, including stapconf to deal with
fs/xfs/gs changes over time.
x86_64 is updated to match the same style of access.
Josh Stone [Fri, 25 Apr 2014 22:25:10 +0000 (15:25 -0700)]
runtime: Initialize _stp_module_self from THIS_MODULE
By iterating the sections of THIS_MODULE directly, rather than from
STP_RELOCATION messages, we can more safely contain the eh_frame length
determination. As an upper bound, the length can be no more than the
distance to the next section in memory. Then when we walk fde records,
we can sanity check that the offsets make sense.
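A rough C sketch of the bounding idea, assuming a flat array of section start addresses. The struct and both function names are hypothetical and do not mirror the runtime's data structures.

```c
#include <stddef.h>
#include <stdint.h>

struct section {
    uintptr_t addr; /* start address of the section in memory */
};

/* Upper bound on the length of section `idx`: the distance to the
 * closest section that starts above it.  Returns 0 when no section
 * follows, so the caller has to fall back on something else. */
size_t length_upper_bound(const struct section *secs, size_t n, size_t idx)
{
    uintptr_t start = secs[idx].addr;
    uintptr_t next = UINTPTR_MAX;
    size_t i;

    for (i = 0; i < n; i++)
        if (secs[i].addr > start && secs[i].addr < next)
            next = secs[i].addr;
    return next == UINTPTR_MAX ? 0 : (size_t)(next - start);
}

/* A record at `offset` with a 4-byte length word and `len` payload
 * bytes is plausible only if it lies entirely within `bound`. */
int fde_in_bounds(size_t offset, uint32_t len, size_t bound)
{
    return offset <= bound
        && bound - offset >= 4
        && len <= bound - offset - 4;
}
```

The bound is deliberately conservative: it may overestimate the real length, but a record that fails fde_in_bounds() is certainly bogus.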
syscalls tapset: adapt to absence of uselib(2) syscall
LKML commit 69369a70 made it possible to configure a kernel without
uselib(2) support. Mark these tapset aliases optional so syscall.*
type probing works on !CONFIG_USELIB kernels.
PR6961 cont'd: add a terminating 0x00000000 word to module .ko .eh_frame
This is necessary because now we parse our own module's .eh_frame
lightly at startup time, and must reliably find the end of the
entries. (We don't get the crtend.o goodies from userspace.)
We also reject 8-byte extended-length cases, should they ever come up.
* session.* (op_create_auxiliary): Make it possible to designate aux files
as trailers (coming in after the main generated .o file) instead of before.
* buildrun.cxx (compile_pass): Add s.auxiliary_output trailers.
* translate.cxx (emit_symbol_data_done): T-800 returns from the future.
* runtime/sym.c (_stp_kmodule_update_address): Reject 8-byte extended-length
fields in .eh_frame.
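Why the terminator matters can be shown with a small C sketch of a length-prefixed walk. This is not the runtime's parser: count_eh_frame_entries() is a hypothetical helper that treats records as opaque, reads lengths in host byte order, and only demonstrates the stop and reject conditions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the length-prefixed records in a .eh_frame-like buffer.
 * Returns the record count, or -1 if the buffer is malformed. */
int count_eh_frame_entries(const unsigned char *buf, size_t size)
{
    size_t pos = 0;
    int count = 0;

    while (size - pos >= 4) {
        uint32_t len;

        memcpy(&len, buf + pos, 4); /* may be unaligned */
        pos += 4;
        if (len == 0)
            return count;           /* terminating zero word */
        if (len == 0xffffffffu)
            return -1;              /* 8-byte extended length: rejected */
        if (len > size - pos)
            return -1;              /* record runs past the buffer */
        pos += len;
        count++;
    }
    return -1;                      /* ran out of data: no terminator */
}
```

Without the zero word, the only way to stop is to run off the end of the buffer, which this sketch reports as an error.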
Jonathan Lebon [Fri, 25 Apr 2014 19:00:07 +0000 (15:00 -0400)]
statement.exp: adapt for RHEL5
In GCC < 4.4, the decl_line is the line holding the opening brace, not the
line with the identifier. Since the decl_line is the basis for relative
line number statement probes, this discrepancy was causing statement.exp
to have some RELATIVE subtest failures.
Add another empty line so that main@statement.c+4 always falls on an
empty line, regardless of which is the decl_line.
Jonathan Lebon [Fri, 25 Apr 2014 16:01:52 +0000 (12:01 -0400)]
testsuite: adapt to RHEL5
- callee.exp: set as untested if GCC or elfutils too old
- dump_functions.exp: RHEL5 exec doesn't support -ignorestderr
- dump_probe_aliases.exp: ditto
Jonathan Lebon [Fri, 25 Apr 2014 15:53:12 +0000 (11:53 -0400)]
lib/systemtap.exp: get better GCC and elfutils version info
Rather than having the global GCC_Version hold the whole
"gcc --version | head -n 1" string, create a new GCC_FullVersion for
that and make GCC_Version only hold the raw version (e.g. "4.1.2").
This will be useful later on for version comparison.
Create new ELF_Version which holds the elfutils library version against
which SystemTap was compiled.
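The testsuite itself is Tcl; purely as an illustration of the split between the full banner and the raw version, here is a C sketch. parse_version() and version_cmp() are hypothetical helpers, and the banner string in the usage note is just a sample.

```c
#include <ctype.h>
#include <stdio.h>

/* Find the first "major.minor[.patch]" token in `s` and store it in v.
 * Returns 1 on success, 0 if no dotted version was found. */
int parse_version(const char *s, int v[3])
{
    for (; *s; s++) {
        if (!isdigit((unsigned char)*s))
            continue;
        v[0] = v[1] = v[2] = 0;
        if (sscanf(s, "%d.%d.%d", &v[0], &v[1], &v[2]) >= 2)
            return 1;
        /* A bare number such as a date stamp: skip the rest of it. */
        while (isdigit((unsigned char)s[1]))
            s++;
    }
    return 0;
}

/* Compare two versions field by field: <0, 0 or >0 like strcmp. */
int version_cmp(const int a[3], const int b[3])
{
    int i;

    for (i = 0; i < 3; i++)
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    return 0;
}
```

Given a banner such as "gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54)", parse_version() yields 4, 1, 2, which version_cmp() can then test against a minimum such as 4.4.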
Jonathan Lebon [Wed, 23 Apr 2014 19:48:37 +0000 (15:48 -0400)]
pr16806.exp: new testcase
- pr16806.exp: new testcase to verify proper tracepoint unregistration
and utrace shutdown.
- loop.c: don't call usleep() if NOSLEEP is defined. This is used by
pr16806.exp, which requires it to make conditions more racy.
- unprivileged_myproc.exp: simplify path to C file.
- unprivileged_probes.exp: ditto.
Josh Stone [Tue, 22 Apr 2014 16:07:42 +0000 (09:07 -0700)]
PR16844: Initial adaptation for kernel 3.15 tracepoints
This hides the tracepoint version adaptations in stp_tracepoint.h. For
older kernel versions, that header is sufficient. With 3.15 we have to
do extra bookkeeping ourselves, so it pulls in stp_tracepoint.c, which
is borrowed almost verbatim from lttng. It seems to work well, but we
can still decide if we want to take a different approach.
David Smith [Tue, 22 Apr 2014 15:32:05 +0000 (10:32 -0500)]
PR16716 partial fix: Better types in 'syscall.{access,faccessat}'.
* tapset/linux/syscalls.stp: Fix 'mode' variable type in
'syscall.{access,faccessat}'.
* tapset/linux/aux_syscalls.stp: Convert to use _stp_lookup_or_str().
* testsuite/systemtap.syscall/access.c (main): Add tests.
David Smith [Tue, 22 Apr 2014 13:42:59 +0000 (08:42 -0500)]
PR16716 partial fix: Better types in 'syscall.{swapon,syslog}'.
* tapset/linux/syscalls2.stp: Fix types in 'syscall.syslog' and
'syscall.swapon'. In 'syscall.swapon', decode the flags in the new
'swapflags' variable.
* tapset/linux/nd_syscalls2.stp: In 'nd_syscall.swapon', decode the flags
in the new 'swapflags' variable.
* tapset/linux/aux_syscalls.stp (_swapon_flags_str): New function.
* tapset/uconversions.stp (user_string_n2_quoted): If we're in a compat
task, when printing the pointer value as a number, don't expand it to
64-bits.
* testsuite/systemtap.syscall/swap.c: Added more tests.
* testsuite/systemtap.syscall/syslog.c: New test case.
* testsuite/buildok/syscalls2-detailed.stp: Added new 'swapflags' variable.
* testsuite/buildok/nd_syscalls2-detailed.stp: Ditto.
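Decoding a flags word into an OR'ed list of names might look roughly like the C sketch below. It is written in the spirit of an "or"-style lookup, not copied from the runtime: decode_flags() is a hypothetical helper, and the two flag values are assumed to match linux/swap.h.

```c
#include <stddef.h>
#include <stdio.h>

struct flag_name {
    unsigned long value;
    const char *name;
};

/* Assumed values, as found in linux/swap.h. */
static const struct flag_name swap_flags[] = {
    { 0x8000UL,  "SWAP_FLAG_PREFER"  },
    { 0x10000UL, "SWAP_FLAG_DISCARD" },
};

/* Write "A|B|0x<leftover>" into buf and return buf.  Bits with no
 * known name are appended in hex so nothing is silently dropped. */
const char *decode_flags(unsigned long flags, char *buf, size_t len)
{
    size_t used = 0;
    size_t i;

    if (len == 0)
        return buf;
    buf[0] = '\0';
    for (i = 0; i < sizeof swap_flags / sizeof swap_flags[0]; i++) {
        int n;

        if (!(flags & swap_flags[i].value))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? "|" : "", swap_flags[i].name);
        if (n < 0 || (size_t)n >= len - used)
            return buf; /* output truncated: stop here */
        used += (size_t)n;
        flags &= ~swap_flags[i].value;
    }
    if (flags || used == 0)
        snprintf(buf + used, len - used, "%s0x%lx",
                 used ? "|" : "", flags);
    return buf;
}
```

For example, 0x18000 decodes to "SWAP_FLAG_PREFER|SWAP_FLAG_DISCARD", while unnamed low bits show up as a trailing hex value.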
Stan Cox [Tue, 22 Apr 2014 13:37:16 +0000 (09:37 -0400)]
Add python support tapset example.
* systemtap.examples/general/tapset/(python2.stp,python3.stp,python.stpm): Python support tapset
* systemtap.examples/general/(pyexample.meta,pyexample.stp,pyexample.py): Use it.
David Smith [Mon, 21 Apr 2014 18:43:33 +0000 (13:43 -0500)]
PR16716 partial fix: Better types in 'syscall.{getpriority,setpriority}'.
* tapset/linux/syscalls.stp: Fix types in 'syscall.getpriority'.
* tapset/linux/syscalls2.stp: Fix types in 'syscall.setpriority'.
* tapset/linux/aux_syscalls.stp (_priority_which_str): Convert to use
_stp_lookup_str().
* testsuite/systemtap.syscall/getpriority.c: New test case.
* testsuite/systemtap.syscall/setpriority.c: Ditto.
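A table-driven lookup for a single enumerated argument could be sketched in C as below. lookup_str() is a hypothetical stand-in for this kind of helper, not the runtime's function; the PRIO_* values are assumed to match sys/resource.h on Linux.

```c
#include <stddef.h>
#include <stdio.h>

struct value_name {
    long value;
    const char *name; /* NULL marks the end of the table */
};

/* Assumed Linux values for the 'which' argument. */
const struct value_name priority_which[] = {
    { 0, "PRIO_PROCESS" },
    { 1, "PRIO_PGRP" },
    { 2, "PRIO_USER" },
    { 0, NULL }
};

/* Return the symbolic name for `v`, or its hex form (written into
 * buf) when the table has no entry for it. */
const char *lookup_str(const struct value_name *tbl, long v,
                       char *buf, size_t len)
{
    for (; tbl->name; tbl++)
        if (tbl->value == v)
            return tbl->name;
    snprintf(buf, len, "0x%lx", (unsigned long)v);
    return buf;
}
```

Keeping the mapping in a table means that supporting a new constant is a one-line change rather than another branch in a conditional chain.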
Josh Stone [Fri, 18 Apr 2014 01:14:06 +0000 (18:14 -0700)]
tapset: mark _sched_policy_str() as pure and :string
This was causing issues in any script that pulled in aux_syscalls, but
didn't use _sched_policy_str(). Since it wasn't pure, it couldn't be
elided, and since it wasn't explicitly declared :string, there was no
type deduction to define its STAP_RETVALUE.
Josh Stone [Fri, 18 Apr 2014 00:54:01 +0000 (17:54 -0700)]
testsuite: don't mix -p2 and -l in labels.exp
Since commit bba368c5c5fea, -p2 and -l are exclusive options, even
though internally they are similar.
Furthermore, "labels exe .statement" and "labels so .statement" had this
failure masked, because they were looking for the absence of the error
"semantic error: no match" rather than the presence of actual matches.
Thus the option-validating error didn't trigger failure before.
Josh Stone [Thu, 17 Apr 2014 21:32:40 +0000 (14:32 -0700)]
Prevent lock-recursion in _stp_ctl_send
In rare cases, we may hit a probe while the transport layer is holding a
spinlock, and that probe may call _stp_ctl_send which tries to grab the
same lock and deadlocks. This is a bit easier to trigger on lockdep-enabled
kernels with the lock_acquired tracepoint.
This patch refactors the context->busy state management into get/put
context, and those areas which grab probe-sensitive locks now wrap
themselves with a context to stay comfortably free of probes.
Tangentially, the dyninst side abandons the busy flag, as it already had
a tls_context pointer to prevent direct recursion and a mutex for
exclusive access across all processes.
DEBUG_REENTRANCY is an unfortunate casualty, because we can't safely
call _stp_warn when the busy context may be from those held locks.
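The get/put pattern can be illustrated with a single-threaded C sketch. Everything here is a simplification under stated assumptions: one global context instead of whatever the runtime really keeps, C11 atomics instead of kernel primitives, and hypothetical function names throughout.

```c
#include <stdatomic.h>
#include <stddef.h>

struct context {
    atomic_int busy;
};

/* One context for the sketch; a real runtime would typically keep
 * one per CPU. */
static struct context the_context;

/* Claim the context, or return NULL if it is already in use. */
struct context *get_context(void)
{
    int expected = 0;

    if (atomic_compare_exchange_strong(&the_context.busy, &expected, 1))
        return &the_context;
    return NULL;
}

void put_context(struct context *c)
{
    atomic_store(&c->busy, 0);
}

/* Stand-in for a probe handler: returns 1 if it ran, 0 if skipped. */
int probe_handler(void)
{
    struct context *c = get_context();

    if (!c)
        return 0; /* context busy: skip instead of recursing */
    /* ... probe body, which might try to send a control message ... */
    put_context(c);
    return 1;
}

/* Stand-in for a transport routine holding a probe-sensitive lock.
 * Returns whether the nested probe ran, or -1 if no context. */
int send_message(void)
{
    struct context *c = get_context();
    int nested_ran;

    if (!c)
        return -1;
    /* the lock would be taken here */
    nested_ran = probe_handler(); /* simulate a probe firing under the lock */
    /* the lock would be released here */
    put_context(c);
    return nested_ran;
}
```

Because the lock holder owns the context, the nested probe is skipped and never reaches the code that would try to take the same lock again.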
Jonathan Lebon [Thu, 17 Apr 2014 21:34:24 +0000 (17:34 -0400)]
stmt_rel.exp: improve coverage
The testcase previously only tested that specific relative linenos in
bio_init() were valid and that there were at least 3 linenos available
for probing.
We now improve this test by checking that probes listed by the wildcard
lineno are all accessible by relative numbering as well. As a sanity
check, we check that bio_init() has at least 3 linenos available, as
before.
Jonathan Lebon [Thu, 17 Apr 2014 16:21:25 +0000 (12:21 -0400)]
add DWARF_LINE* macros to help diagnosis
Similarly to the previous commit, we modify the new safe_dwarf_line*()
functions so that they carry __FILE__ and __LINE__ information into the
error. This new information is filled in when using the new DWARF_LINE*
macros.
Before:
semantic error: libdw failure (dwarf_lineaddr): no error
After:
semantic error: libdw failure (dwarf_lineaddr): no error
thrown from: ../systemtap/dwflpp.cxx:2242
Jonathan Lebon [Thu, 17 Apr 2014 16:08:00 +0000 (12:08 -0400)]
add DWFL_ASSERT and DWARF_ASSERT to help diagnosis
Semantic errors thrown from dwfl_assert() and dwarf_assert() lacked any
positional information to help track down where the assertion failed. We
create two new macros, DWFL_ASSERT and DWARF_ASSERT, which carry down
the __FILE__ and __LINE__ information so that the semantic_error created
contains that information, which can be printed out using -vv.
Before:
semantic error: libdwfl failure (asserting!): no error
After:
semantic error: libdwfl failure (asserting!): no error
thrown from: ../systemtap/tapsets.cxx:7183
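The underlying trick is a macro that captures the call site. The translator does this in C++ with exceptions; the sketch below shows the same idea in plain C, using a hypothetical checked() function and CHECKED() macro that record the location in an error string instead of throwing.

```c
#include <stdio.h>

/* Last error message produced by checked(); illustrative only. */
char last_error[256];

/* Record a failure together with the location passed in by the
 * caller.  Returns 0 when rc signals success, -1 otherwise. */
int checked(const char *what, int rc, const char *file, int line)
{
    if (rc == 0)
        return 0;
    snprintf(last_error, sizeof last_error,
             "failure (%s): rc=%d\n   thrown from: %s:%d",
             what, rc, file, line);
    return -1;
}

/* The macro expands at the call site, so __FILE__ and __LINE__
 * describe the caller rather than checked() itself. */
#define CHECKED(what, rc) checked((what), (rc), __FILE__, __LINE__)
```

A plain function cannot see its caller's location, which is why the wrapper has to be a macro.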
Jonathan Lebon [Wed, 16 Apr 2014 18:02:21 +0000 (14:02 -0400)]
statement.exp: rework and expand
The statement.exp test case previously only tested a few specific cases.
We now introduce a new test program, 'statement.c', on which we can test
for all the things we previously tested, allowing us to remove the other
test programs. Furthermore, we extend coverage to test many other
possible combinations.
Jonathan Lebon [Wed, 16 Apr 2014 14:40:16 +0000 (10:40 -0400)]
dwflpp: implement new iterate_over_srcfile_lines()
We finally implement the new iterate_over_srcfile_lines(). The basic
strategy is to look at each matching DIE, rather than just the line
records matching the linenos so that we properly match, for example,
functions inlined multiple times (which can yield multiple sets of line
records for the same lineno but at the various addresses where inlined).
Jonathan Lebon [Wed, 16 Apr 2014 20:03:07 +0000 (16:03 -0400)]
add dwarf_query::filtered_all
In dwarf_query-related functions, we very often need to carry out the
same operation on both filtered_functions and filtered_inlines. Rather
than duplicating code, create a new dwarf_query function which creates a
temporary vector containing all of them.
Jonathan Lebon [Wed, 16 Apr 2014 18:08:47 +0000 (14:08 -0400)]
dwflpp: add CU line caching
The upcoming patches re-implementing iterate_over_srcfile_lines() will
depend on the use of CU lines in lineno order. Since dwarf_getsrclines()
outputs them in addr order, it greatly helps performance to cache the
sorted version.
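A lazily built, lineno-sorted copy might be sketched in C as follows. The real code is C++ and works on libdw line records; here struct line_rec, struct cu_lines and lines_by_lineno() are hypothetical stand-ins.

```c
#include <stdlib.h>
#include <string.h>

struct line_rec {
    unsigned long addr;
    int lineno;
};

struct cu_lines {
    const struct line_rec *by_addr; /* records in address order */
    size_t n;
    struct line_rec *by_lineno;     /* cache, NULL until first use */
};

/* Order by line number, then by address to keep ties deterministic. */
static int cmp_lineno(const void *a, const void *b)
{
    const struct line_rec *x = a;
    const struct line_rec *y = b;

    if (x->lineno != y->lineno)
        return x->lineno < y->lineno ? -1 : 1;
    if (x->addr != y->addr)
        return x->addr < y->addr ? -1 : 1;
    return 0;
}

/* Return the records sorted by line number, sorting only on the
 * first call.  Returns NULL if the CU is empty or malloc fails. */
const struct line_rec *lines_by_lineno(struct cu_lines *cu)
{
    if (!cu->by_lineno && cu->n > 0) {
        cu->by_lineno = malloc(cu->n * sizeof *cu->by_lineno);
        if (!cu->by_lineno)
            return NULL;
        memcpy(cu->by_lineno, cu->by_addr, cu->n * sizeof *cu->by_lineno);
        qsort(cu->by_lineno, cu->n, sizeof *cu->by_lineno, cmp_lineno);
    }
    return cu->by_lineno;
}
```

The address-ordered input is left untouched, and repeated queries against the same CU pay for the sort only once.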
Jonathan Lebon [Wed, 16 Apr 2014 14:47:30 +0000 (10:47 -0400)]
dwarf_wrappers: remove dwarf_line_t class
In the coming patches, we will make liberal use of Dwarf_Line. Rather
than requiring conversion to dwarf_line_t, which is very often overkill
and too verbose, we introduce new helper functions which are safe
versions of their dwarf equivalent.
Jonathan Lebon [Wed, 16 Apr 2014 14:51:24 +0000 (10:51 -0400)]
gut out dwflpp::iterate_over_srcfile_lines()
To prepare for the new code, we empty out iterate_over_srcfile_lines()
and remove associated functions. This is also where we break the link
between dwflpp and dwarf_query (the original issue mentioned in
PR16615).
Jonathan Lebon [Fri, 4 Apr 2014 17:47:54 +0000 (13:47 -0400)]
dwarf_query: rename line to linenos
Rename the dwarf_query 'line' member to 'linenos', and the enum type
'line_t' to 'lineno_t'. This more accurately reflects line
numbers, as opposed to Dwarf_Line or dwarf_line_t objects, which are
often simply named 'line' in other contexts.
Jonathan Lebon [Fri, 4 Apr 2014 17:25:07 +0000 (13:25 -0400)]
tapsets.cxx: simplify query_srcfile_line
The query_srcfile_line() callback checked if the query had a
statement(str). This could have evaluated to false in the past (when
query_cu() treated both .statement(str) and .statement(num)), but now
query_srcfile_line() is only used for statement/function(func@file:N)
probes, so we can simplify it.
David Smith [Thu, 17 Apr 2014 19:39:39 +0000 (14:39 -0500)]
Fixed PR16806 by improving task_finder/utrace shutdown.
* runtime/stp_utrace.c (utrace_init): Clear out the kmem cache pointers
after destroying the caches.
(utrace_exit): Ditto.
(utrace_shutdown): Updated comments.
(utrace_free): Lock the utrace structure while cleaning up.
* runtime/linux/task_finder2.c (stap_task_finder_post_init): If the
task_finder state isn't 'running', quit early.
(stap_stop_task_finder): Call stp_task_work_exit() to wait on any
remaining task_work items.
(utrace_report_exec): If the utrace state isn't registered, quit.
(utrace_report_syscall_entry): Ditto.
(utrace_report_syscall_exit): Ditto.
(utrace_report_clone): Ditto.
(utrace_report_death): Ditto.
Lukas Berk [Thu, 17 Apr 2014 15:17:32 +0000 (11:17 -0400)]
PR16829 rework, have staprun export verbosity flag
*java/stapbm.in - rename STAPBM_VERBOSE to general SYSTEMTAP_VERBOSE
flag
*staprun/staprun.8 - note new SYSTEMTAP_VERBOSE env variable
*staprun/staprun.c - set SYSTEMTAP_VERBOSE env var from -v's passed to
staprun
*tapset-method.cxx - revert leftbits string to previous assignment
Victor Kamensky [Tue, 8 Apr 2014 05:23:39 +0000 (22:23 -0700)]
runtime: linux 3.14 porting: case when CONFIG_USER_NS not defined
Fix build problem for linux-3.14 case with config where
CONFIG_USER_NS is not defined. With CONFIG_UIDGID_STRICT_TYPE_CHECKS
removed (261000a56b6382f597bcb12000f55c9ff26a1efb) access to
kuid_t and kgid_t should happen through from_k[ug]id_munged calls.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Lukas Berk [Fri, 11 Apr 2014 17:57:04 +0000 (13:57 -0400)]
Add self-unwinding testcase and tweak existing tests
*runtime/sym.c - warning output from the stap module about the module
only having an eh_frame (vs eh_frame_hdr), uses the _stp_module->path
variable as a descriptor. Use the (guaranteed) unique name of our
inserted module vs the placeholder from the initializer.
*testsuite/systemtap.base/backtrace.exp - Due to the existing probes now
having more information (and more output), we need to change the
existing testcase to expect the warnings based on eh_frame and no
eh_frame_hdr being present. Also add a new testcase where there should
be no "(inexact)" marker in the unwind string.
Lukas Berk [Thu, 10 Apr 2014 20:05:55 +0000 (16:05 -0400)]
Explicitly declare _stp_module_self fields
*runtime/sym.c - remove assignments for fields assigned in translate.cxx
*translate.cxx - explicitly initialize more of the fields in the
_stp_module_self module. This way we no longer have to
rely on lucky ordering later on.
Josh Stone [Thu, 10 Apr 2014 00:27:13 +0000 (17:27 -0700)]
PR16719: Fix a couple leaked Dwfl instances
* setupdwfl.cxx (setup_dwfl_kernel): When recursing into another round
after downloading, call dwfl_end on the Dwfl that we already started.
* tapsets.cxx (tracepoint_builder::init_dw): Call dwfl_end on the Dwfl
used to fill in the s.kernel_source_tree.
* testsuite/systemtap.base/pr16719.exp: Add a tracepoint subtest.
David Smith [Wed, 9 Apr 2014 17:00:05 +0000 (12:00 -0500)]
PR16716: Fix types in syscall.sched_{getscheduler,setscheduler,rr_get_interval}
* tapset/linux/syscalls2.stp (syscall.sched_getscheduler): Fixed types.
(syscall.sched_setscheduler): Ditto.
(syscall.sched_rr_get_interval): Fixed nesting and types. Also change
'argstr' to just have a pointer to the 'struct timespec' value, since
that is an output parameter and decoding it on input won't produce
anything of value.
* tapset/linux/nd_syscalls2.stp: Ditto.
* tapset/linux/aux_syscalls.stp (_sched_policy_str): Updated to handle new
values, including the new SCHED_RESET_ON_FORK flag.
* testsuite/systemtap.syscall/test.tcl (run_one_test): Since execname()
only returns the first 15 characters of the test program name, truncate
it.
* testsuite/systemtap.syscall/sched_getscheduler.c: New testcase
* testsuite/systemtap.syscall/sched_rr_get_interval.c: Ditto.
* testsuite/systemtap.syscall/sched_setscheduler.c: Ditto.
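The truncation noted for run_one_test lives in Tcl; as an illustration only, the same adjustment in C looks like the sketch below. expected_execname() is a hypothetical helper, and COMM_LEN is assumed to match the kernel's 16-byte comm buffer (15 characters plus the terminating NUL).

```c
#include <stddef.h>

#define COMM_LEN 16 /* assumed: 15 visible characters plus NUL */

/* Copy at most COMM_LEN - 1 characters of `name` into `out`, which
 * must have room for COMM_LEN bytes, and return `out`. */
const char *expected_execname(const char *name, char *out)
{
    size_t i;

    for (i = 0; i < COMM_LEN - 1 && name[i] != '\0'; i++)
        out[i] = name[i];
    out[i] = '\0';
    return out;
}
```

With this, a test program named "sched_rr_get_interval" is matched as "sched_rr_get_in", while shorter names pass through unchanged.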
Some funky errors occur on some Fedora installations featuring perhaps
only partial publican setup, or some other mysterious causes. Add
some xml tags to hit the default ="common" case, and add a Makefile
conditional to have publican force --pdftool=fop rather than
wkhtmltopdf, which fails in entertaining ways sometimes.
Lukas Berk [Tue, 8 Apr 2014 19:22:41 +0000 (15:22 -0400)]
Create synthetic pt_regs and support for unwinding in module
*buildrun.cxx - filter out -fno-asynchronous-unwind-tables flag
*runtime/stack.c - reorder unwinding to always try dwarf_unwinding first
*runtime/sym.c - add functionality to load information about our own
module when we know it's been loaded
*runtime/sym.h - add various self module structs
*runtime/unwind.c - correct how we calculate finding the next fde
*runtime/unwind/i386.h - add asm for getting current register values on
i386
*runtime/unwind/x86_64.h - same as above but for x86_64
*translate.cxx - make sure self module structs are listed in stap-symbols.h
Josh Stone [Fri, 4 Apr 2014 21:14:41 +0000 (14:14 -0700)]
testsuite: perf counter test improvements
- Use anchored -re patterns for more precise matching.
- Remove unused counter_a/b that caused unmatched warnings.
- Allow "max towers" to be one digit less.
Jonathan Lebon [Wed, 2 Apr 2014 20:24:08 +0000 (16:24 -0400)]
PR16307: testsuite: use new kill proc
Replace 'exec kill' by a call to the new kill proc, which accounts for
double-dashing as necessary. Where it makes sense, the timeout argument
was also used so that a SIGKILL was also sent after a few seconds.
Jonathan Lebon [Wed, 2 Apr 2014 16:31:36 +0000 (12:31 -0400)]
PR16307: proc kill: new proc for safer killing
During setup, check what kind of kill executable we're dealing with to
find out whether we'll need to use double dashes when calling it. We
also create a new kill proc that takes this into account when calling
kill.
David Smith [Thu, 3 Apr 2014 20:08:56 +0000 (15:08 -0500)]
PR16716 partial fix: Better types in 'syscall.shutdown'.
* tapset/linux/syscalls2.stp: Fix types in syscall.shutdown.
* tapset/linux/aux_syscalls.stp: Convert _shutdown_how_str() to use
_stp_lookup_str().
* runtime/linux/compat_net.h: Define SHUT_* for RHEL5.
* testsuite/systemtap.syscall/shutdown.c: New testcase.