Josh Stone [Tue, 16 Oct 2012 23:11:07 +0000 (16:11 -0700)]
stapdyn: Big refactoring
The main additions are the "mutator" class working with the overall
session, and the "mutatee" class working with the individual targets.
This is a bit of code churn, with no functional change, but it should
make it easier to keep track of our current state and add new features.
Frank Ch. Eigler [Tue, 16 Oct 2012 18:43:52 +0000 (14:43 -0400)]
PR14731: fix PR12022 regression for synthetic foreach()
* elaborate.cxx (add_global_var_display): For foreach_loop created
for otherwise-unused global, initialize sort_aggr => sc_none, the
same way the parser would.
David Smith [Mon, 15 Oct 2012 21:28:28 +0000 (16:28 -0500)]
Fixed PR14701 by adding dyninst timer probe support.
* tapset-timers.cxx (hrtimer_derived_probe_group::emit_interval): Removed
function.
(hrtimer_derived_probe_group::emit_module_decls): Pushed some code down
to the runtime and added dyninst support.
(hrtimer_derived_probe_group::emit_module_init): Ditto.
(hrtimer_derived_probe_group::emit_module_exit): Ditto.
(timer_builder::build): Throw semantic errors if 'timer.jiffies' or
'timer.profile' probes are used in dyninst mode.
(register_tapset_timers): Add fake privilege for 'timer.profile' probes
when in dyninst mode (so we get a semantic error, not a privilege error).
* buildrun.cxx (compile_dyninst): Added '-lrt' for timer functions.
* testsuite/systemtap.pass1-4/buildok-dyninst.exp: Move some tests to the
kfail list.
* runtime/timer.c: New file.
* runtime/dyninst/timer.c: Ditto.
* runtime/linux/timer.c: Ditto.
Frank Ch. Eigler [Sat, 13 Oct 2012 16:13:35 +0000 (12:13 -0400)]
PR12022: support foreach sorting by user-selected aggregate operator
The syntax goes:
foreach ([x,y] in array @avg -) { }
inserting the desired sorting aggregator between the array name and
the +/-. The runtime has been ready for this for a long time (see the
runtime/map.c SORT_* macros), finally time for the translator to let
users enjoy it.
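
As a rough illustration of what sorting by an aggregate means, here is a small Python sketch (not SystemTap code). The `Stat` class, the helper name, and the sample data are all made up for the example; it only shows keys being visited in descending order of each entry's average, which is what `@avg -` asks for.

```python
from dataclasses import dataclass, field


@dataclass
class Stat:
    """Hypothetical stand-in for a statistics aggregate."""
    values: list = field(default_factory=list)

    def avg(self):
        return sum(self.values) / len(self.values)


def foreach_sorted_by_avg(array, descending=True):
    """Return the array keys ordered by the average of each entry."""
    return sorted(array, key=lambda k: array[k].avg(), reverse=descending)


# Keys are (x, y) tuples, mirroring foreach ([x,y] in array @avg -).
array = {
    ("a", 1): Stat([10, 20]),   # avg 15
    ("b", 2): Stat([100]),      # avg 100
    ("c", 3): Stat([1, 2, 3]),  # avg 2
}
order = foreach_sorted_by_avg(array)
```

Passing `descending=False` would correspond to the `+` direction.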
Josh Stone [Fri, 12 Oct 2012 21:45:55 +0000 (14:45 -0700)]
PR14172: Fix for kernels without VM_EXECUTABLE
We were using VM_EXECUTABLE in two ways:
1) In task_finder for locating the process executable among all the
vmas. Since around 2.6.26 there is also mm->exe_file, which will serve
this purpose just fine.
2) In uprobes to avoid relocation offset for semaphores in ET_EXEC
files. This is actually incorrect, but harmless, because the callback
path for ET_EXEC targets already sets relocation=offset=0 anyway. So we
can just remove the special case for VM_EXECUTABLE altogether.
* runtime/task_finder_vma.c (stap_find_exe_file): New, locate the
process executable either by VM_EXECUTABLE or mm->exe_file.
* runtime/linux/task_finder.c (__stp_get_mm_path): Use stap_find_exe_file.
* runtime/linux/task_finder2.c (__stp_get_mm_path): Ditto.
* runtime/linux/uprobes-common.c (stap_uprobe_change_plus): Don't
special case for VM_EXECUTABLE (and add a comment why).
* runtime/linux/uprobes-inode.c (stapiu_change_plus): Ditto.
(stapiu_get_task_inode): Use stap_find_exe_file.
Frank Ch. Eigler [Fri, 12 Oct 2012 16:27:42 +0000 (12:27 -0400)]
testsuite: tweak net-sanity.exp test
It shouldn't use stap -vv, as that creates too much noise in the .log file.
It shouldn't use 0xd34db33f as a known-bad address, because sometimes it's good.
Really good. Mmmm, tasty, BBQ goodness good. Honey, time to warm up the burners!
Frank Ch. Eigler [Fri, 12 Oct 2012 03:28:30 +0000 (23:28 -0400)]
PR14245 clean up error messages for staprun -d SOMETHING_AWFUL
jistone reported that the previously moved
"ERROR: no access to debugfs; try "chmod 0755 /sys/kernel/debug" as root"
message was appearing for erroneous staprun -d FOOBAR cases.
* ctl.c (init_ctl_channel): Print a more general error for any .ctl-file
opening failure.
Josh Stone [Thu, 11 Oct 2012 17:42:10 +0000 (10:42 -0700)]
Set staprun's verbosity one less than stap
Prior to commit e520ea8, staprun would get one -v for s.verbosity>1 and
a second -v for s.verbosity>2. That commit unbounded the number of -v
for staprun, but changed the off-by-one, and staprun and stapio are too
chatty for that.
This also now sets stapdyn verbosity the same way.
* buildrun.cxx (make_run_command): Give one less -v flag.
(make_dyninst_run_command): Set stapdyn -v flags the same way.
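
A minimal sketch of the "one less -v" rule, in Python rather than the C++ of buildrun.cxx. The function name is hypothetical, and clamping at zero is an assumption about how a verbosity of 0 is handled.

```python
def run_verbosity_flags(stap_verbosity):
    """Return the -v flags handed to staprun/stapdyn: one fewer than stap's own."""
    # Assumption: verbosity 0 and 1 both result in no -v at all.
    return ["-v"] * max(0, stap_verbosity - 1)


flags_for_vvv = run_verbosity_flags(3)
```

With this rule `stap -vvv` passes two `-v` flags down, and the count is no longer capped at two.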
David Smith [Thu, 11 Oct 2012 15:55:58 +0000 (10:55 -0500)]
Fixed PR14659 so that ptrace can be used on tasks probed with systemtap.
* runtime/stp_utrace.c (utrace_set_events): No longer set
TIF_SYSCALL_TRACE on the target task.
(utrace_reset): No longer clear TIF_SYSCALL_TRACE on the target task.
* testsuite/systemtap.base/ptrace.exp: New testcase.
Frank Ch. Eigler [Wed, 10 Oct 2012 22:10:40 +0000 (18:10 -0400)]
PR14245: support /sys/kernel/debug mounted 0700
This is done by staprun passing a file descriptor for the
/sys/kernel/debug/systemtap/stap_MODULE directory from staprun
(running setuid) to stapio (running unprivileged, previously unable to
traverse to that path itself). This FD passing is done with a new
option -F<fd> for stapio (though by accident staprun also accepts (and
rejects) this option).
Since openat(2) is relatively recent, autoconf macros are used to back
down to graceful failure on older kernels, and to hide the new code.
New staprun always uses -F<fd> to stapio, even if permissions on
/sys/kernel/debug do not require it.
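
The idea of opening files relative to an inherited directory descriptor can be sketched as follows. This is a Python illustration of openat(2) semantics only; the temporary directory and the file name `trace0` are stand-ins, and the setuid/unprivileged split between staprun and stapio is not reproduced.

```python
import os
import tempfile


def open_relative(dir_fd, name, flags=os.O_RDONLY):
    """Open `name` relative to an already-open directory fd, as openat(2) does."""
    return os.open(name, flags, dir_fd=dir_fd)


with tempfile.TemporaryDirectory() as base:
    # Stand-in for a file inside the per-module debugfs directory.
    with open(os.path.join(base, "trace0"), "w") as f:
        f.write("data")

    # The "privileged" side opens the directory once...
    dir_fd = os.open(base, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # ...and the other side needs only the fd, never the full path.
        fd = open_relative(dir_fd, "trace0")
        try:
            content = os.read(fd, 16)
        finally:
            os.close(fd)
    finally:
        os.close(dir_fd)
```

The holder of the descriptor does not need search permission on the parent directories, which is why this works when /sys/kernel/debug is mounted 0700.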
* staprun/common.c (relay_basedir_fd): New variable.
(parse_args): Parse new -F: option.
(usage): Document it.
* staprun/staprun.h: Corresponding changes.
* staprun/ctl.c (init_ctl_channel): Reorganize to try an incoming
relay_basedir_fd first (with a faccessat cross-user check).
Try to compute a relay_basedir_fd if not already set.
* staprun/mainloop.c (read_buffer_info): Note ignoring of this PR facility on
RHEL4-era old_transport.
* staprun/relayfs.c (init_relayfs): Attempt to open relay_fd[] using
relay_basedir_fd if specified.
* staprun/stapio.c: Top secret.
* staprun/staprun.c (main): Don't allow staprun itself to take -F, for it
could be misused by a very bad person (tm). However, arrange to pass
it to stapio, if we have incidentally discovered a good relay_basedir_fd.
* staprun/staprun_funcs.c (mountfs): Drop access_debugfs() check at this
point, as init_ctl_channel() will do the check later.
PR14555: handle 0 _stext relocs from userspace by kallsyms_lookup_name fallback
* runtime/transport/symbols.c (_stp_do_relocation): For an incoming
_stext=0 relocation (such as for /proc/sys/kernel/kptr_restrict = 2),
fall back to kallsyms_lookup_name.
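
The fallback can be pictured with a short Python sketch. It is not the runtime code: `resolve_stext`, the `kernel_lookup` callable, and the fake symbol table are hypothetical stand-ins for the in-kernel kallsyms_lookup_name path.

```python
def resolve_stext(reported_address, kernel_lookup):
    """Pick the address to relocate against.

    reported_address: value sent from userspace (0 when addresses are hidden).
    kernel_lookup: callable standing in for kallsyms_lookup_name().
    """
    if reported_address != 0:
        return reported_address
    # Userspace could not see the real address, so ask the kernel side.
    return kernel_lookup("_stext")


# Hypothetical symbol table used only for this example.
fake_kallsyms = {"_stext": 0xFFFFFFFF81000000}
resolved = resolve_stext(0, fake_kallsyms.get)
```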
William Cohen [Tue, 9 Oct 2012 20:30:48 +0000 (16:30 -0400)]
Make the tapcheck.sh look for all .stp files in the tapset directory
With the reorganization of the tapset directory there were some tapsets
multiple directories down. tapcheck.sh was only checking the top level
of the tapset directory. This meant that it was missing many of the
tapsets that were in subdirectories. This change makes the results more
accurate.
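
The difference between a top-level scan and a recursive one can be sketched in Python (tapcheck.sh itself is a shell script; the directory layout below is invented for the example).

```python
import tempfile
from pathlib import Path


def find_tapsets(tapset_dir):
    """Return every .stp file under tapset_dir, at any depth."""
    return sorted(p for p in Path(tapset_dir).rglob("*.stp") if p.is_file())


with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    # Hypothetical layout: one tapset at the top, one two levels down.
    (root / "linux" / "x86_64").mkdir(parents=True)
    (root / "top.stp").write_text("")
    (root / "linux" / "x86_64" / "deep.stp").write_text("")
    (root / "README").write_text("")

    found = [p.relative_to(root).as_posix() for p in find_tapsets(root)]
    top_level_only = [p.name for p in root.glob("*.stp")]
```

A top-level glob sees only `top.stp`, while the recursive search also picks up the nested file.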
Josh Stone [Tue, 9 Oct 2012 16:43:34 +0000 (09:43 -0700)]
PR14572: Set s.privilege = unprivileged for stapdyn
When running under Dyninst, we are effectively unprivileged by nature,
so setting s.privilege to reflect that helps restrict the available
probe types.
However, we still want to allow guru mode for setting target variables
and using embedded-C, so let systemtap_session::is_usermode() pass.
* session.cxx (systemtap_session::parse_cmdline): For --runtime=dyninst,
set the privilege level too.
(systemtap_session::check_options): Allow -g for is_usermode().
* staptree.cxx (varuse_collecting_visitor::visit_embeddedcode): Allow
embedded-C unrestricted for is_usermode().
(varuse_collecting_visitor::visit_embedded_expr): Ditto.
commit 82523f19 changed the error-exit path of _stp_pmap_agg, but was
confused by the multiple (three!) levels of nested loops in effect at
the point of failure. While the prior "return;" skipped an overall
(newly needed) aggregate-unlock; the current "break;" skipped too
little. Switch to a proper simple goto to almost but not quite
return;.
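
The shape of the problem, an error deep inside nested loops that must still release a lock, can be sketched in Python. The C fix uses a goto to a shared exit label; Python has no goto, so this sketch swaps in try/finally to get the same "single unlock on every exit path" effect. The function name, data layout, and failure condition are all hypothetical.

```python
import threading


def aggregate(per_cpu_maps, lock):
    """Merge per-CPU maps under a lock, releasing it on every exit path."""
    total = {}
    lock.acquire()
    try:
        for cpu_map in per_cpu_maps:          # loop level 1
            for bucket in cpu_map:            # loop level 2
                for key, value in bucket:     # loop level 3
                    if value is None:         # hypothetical failure condition
                        # A bare break here would leave only the innermost loop.
                        return None
                    total[key] = total.get(key, 0) + value
        return total
    finally:
        lock.release()


lock = threading.Lock()
ok = aggregate([[[("a", 1)], [("a", 2)]]], lock)
failed = aggregate([[[("a", None)]]], lock)
```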
Josh Stone [Sat, 6 Oct 2012 00:02:56 +0000 (17:02 -0700)]
stapdyn: Limit functions searches to the stap module
This is slightly more efficient because we already know which object we
expect to have the functions, so we don't need to search the whole app.
* stapdyn/stapdyn.cxx (call_inferior_function): Take the BPatch_object
in which we're searching for functions as a parameter.
(instrument_uprobe_target, instrument_uprobes): Ditto.
(dynamic_library_callback): Pass the stap dso from global.
(main): Save the stap dso into a global, and pass it as needed.
Josh Stone [Fri, 5 Oct 2012 23:27:27 +0000 (16:27 -0700)]
stapdyn: track dlopened objects for probes
* stapdyn/stapdyn.cxx (instrument_uprobe_target): New, factored out the
code to write all the probes from one target to one object.
(instrument_uprobes): Use instrument_uprobe_target for each.
(dynamic_library_callback): New, identify if a new object is a target,
and call instrument_uprobe_target if so.
(find_uprobes): Fill the vector as a parameter, and then return a
success status directly instead of floating the exception.
(main): Register the dynamic_library_callback for dlopens.
David Smith [Fri, 5 Oct 2012 17:52:43 +0000 (12:52 -0500)]
(PR14637 partial fix) Improve stapdyn locking.
* runtime/dyninst/runtime.h: Remove the 'stapdyn_big_dumb_lock' and make
preempt_disable() and preempt_enable_no_resched() do nothing for
dyninst.
* runtime/dyninst/tls_data.c: Change the mutex in tls_data_container_t to
a rwlock.
(_stp_tls_free_per_thread_ptr): Write lock the container before removing
the object from the list (and reduce the amount of time the container is
locked).
(_stp_tls_get_per_thread_ptr): Write lock the container before adding
the object to the list (and reduce the amount of time the container is
locked).
* runtime/map.c (_stp_pmap_agg): Be sure to unlock the map in error
conditions.
* runtime/pmap-gen.c (KEYSYM(_stp_pmap_get)): Be sure to unlock the
container in an error condition.
Josh Stone [Fri, 5 Oct 2012 02:04:14 +0000 (19:04 -0700)]
PR14573 (partial): Pass some registers into stapdyn
There doesn't seem to be a way to create and pass the pt_regs structure
from the Dyninst API, but we can still get most registers. This patch
adds a new enter_dyninst_uprobe_regs() to receive registers and fill
them into a pt_regs from there.
XXX Dyninst is currently limited in how many individual function
arguments it can pass, so for now I'm cutting it down to the first 8.
* runtime/dyninst/stapdyn.h: Declare enter_dyninst_uprobe_regs.
* runtime/dyninst/uprobes.c: Implement it, filling all dwarf registers
into a local struct pt_regs.
* runtime/dyninst/regs.c: Include regs.h to get SET_REG_IP.
* stapdyn/stapdyn.cxx (get_dwarf_registers): Create BPatch_snippets for
as many of the DWARF registers as possible (bug-limited to 8).
(instrument_uprobes): Look for the new entry function and use it.
Josh Stone [Thu, 4 Oct 2012 23:19:10 +0000 (16:19 -0700)]
PR14179: Split up loc2c-runtime.h for linux|dyninst
* runtime/loc2c-runtime.h: Remove deref functions and special register
handling from this shared base, and rename k_dwarf_register_N to
pt_dwarf_register_N to be more neutral.
* runtime/linux/loc2c-runtime.h: Move the deref functions and special
register handling here. Nothing new, just transplanted.
* runtime/dyninst/loc2c-runtime.h: Add deref and register functions.
* runtime/dyninst/copy.c (__copy_from_user, __copy_to_user): Move from
linux_def.h, since these are custom implementations, not kernel copies.
(_stp_strncpy_from_user, _stp_copy_from_user): New implementations.
Josh Stone [Wed, 3 Oct 2012 19:59:15 +0000 (12:59 -0700)]
stapdyn: nullify the pagefault machinations for derefs
We don't need to care about pagefault safety in userspace, but the
definitions making those into preempt_disable led to recursing on
stapdyn_big_dumb_lock (going away in PR14571). We can just #define
the pagefault_enable/disable away for the dyninst runtime.
David Smith [Tue, 2 Oct 2012 21:27:55 +0000 (16:27 -0500)]
(PR14571 partial fix) Add dyninst pmap stat fixes.
* runtime/pmap-gen.c (KEYSYM(_stp_pmap_tls_object_init)): Initialize the
histogram parameters, in case this pmap contains a histogram.
(KEYSYM(_stp_pmap_new)): For dyninst, override the tls data object init
function.
Dave Brolley [Tue, 2 Oct 2012 19:39:09 +0000 (15:39 -0400)]
Bug 860750 - stapusr user not able to run modules compiled and signed by the server
- When #ifndef HAVE_ELF_GETSHDRSTRNDX is true, then there is insufficient
ELF support to examine a signed systemtap module in order to determine the
privilege credentials required to run it. In this case, staprun should behave
like an older, multi-privilege-level-unaware, staprun and load the module for
stapusr and above. Since we know that it has been correctly signed, the module
is either an old dual-privilege module compiled for stapusr (ok), or it is
a new multi-privilege-enabled module compiled for stapusr or stapsys. In this
case, the module's internal self check will determine whether the user actually has
the required credentials. The module will abort if the user does not have the
required credentials.
- Small bug in translating the user's privilege credential mask to a string.
Josh Stone [Tue, 2 Oct 2012 18:23:48 +0000 (11:23 -0700)]
stapdyn: Fork output from stdout/stderr
We're still using the target's stdio (PR14491), but we're now using
separate FILE handles to do it, so we're not affected by the target
closing its own stdout early.
* runtime/dyninst/io.c (_stp_out, _stp_err): Private FILE handles.
(_stp_clone_file): Clone a FILE handle, also setting FD_CLOEXEC.
(_stp_warn, _stp_error, _stp_softerror, _stp_dbug): Use _stp_err.
* runtime/dyninst/print.c (_stp_print_flush): Use _stp_err and _stp_out.
* runtime/dyninst/runtime.h (stp_dyninst_ctor): Clone stderr and stdout.
(stp_dyninst_dtor): Close _stp_err and _stp_out.
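
Why a cloned handle survives the target closing its own stream can be shown with a Python sketch. `clone_file` here is a hypothetical analogue of the idea, not the runtime's _stp_clone_file, and a pipe stands in for the target's stdout.

```python
import os


def clone_file(stream, mode="w"):
    """Duplicate the descriptor behind `stream` into an independent file object."""
    new_fd = os.dup(stream.fileno())
    # Mark the copy close-on-exec (Python's os.dup already does; kept explicit).
    os.set_inheritable(new_fd, False)
    return os.fdopen(new_fd, mode)


read_fd, write_fd = os.pipe()
original = os.fdopen(write_fd, "w")   # stand-in for the target's stdout
private = clone_file(original)        # our own handle on the same destination

original.close()                      # the target closes its stdout early
private.write("still works\n")        # our handle is unaffected
private.close()

received = os.read(read_fd, 64)
os.close(read_fd)
```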
Josh Stone [Tue, 2 Oct 2012 17:55:26 +0000 (10:55 -0700)]
stapdyn: Don't silence pass-4 gcc errors
* buildrun.cxx (compile_dyninst): Just as compile_pass() always shows
error output from Kbuild, we should always show gcc errors from
compiling dyninst modules too.
Josh Stone [Tue, 2 Oct 2012 17:55:19 +0000 (10:55 -0700)]
stapdyn: Use FD_CLOEXEC on _stp_mem_fd instead of O_CLOEXEC
O_CLOEXEC is only available since Linux 2.6.23, which is fairly old, but
we may still care to run on such systems. Using fcntl FD_CLOEXEC can
accomplish the same thing, and we don't need to worry about the race of
other threads calling exec at the same time as our module load, because
the whole process will be frozen.
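
Setting the flag after the fact looks like this in Python's fcntl wrapper. This is a sketch of the technique only; `set_cloexec` is a hypothetical helper, and /dev/null stands in for the real file behind _stp_mem_fd.

```python
import fcntl
import os


def set_cloexec(fd):
    """Set FD_CLOEXEC on an already-open descriptor, preserving other fd flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


fd = os.open(os.devnull, os.O_RDONLY)
# Python opens descriptors close-on-exec by default; clear the flag first to
# mimic a plain open() without O_CLOEXEC.
os.set_inheritable(fd, True)
before = bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)

set_cloexec(fd)
after = bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
os.close(fd)
```

The gap between the open and the F_SETFD call is the race the message refers to; it matters only if another thread can exec in between.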
* runtime/linux/copy.c: Move out _stp_read_address definition.
(__stp_strncpy_from_user): Simply accept vicarious protection
from caller _stp_strncpy_from_user.
(_stp_copy_from_user): Protect more.
* runtime/stp_string.c (_stp_text_str): Use _stp_read_address
instead of barenaked __stp_get_user.
* runtime/stp_string.h (__stp_get_user): Simplify; now only for
use by ...
(_stp_read_address): Moved here.
David Smith [Tue, 2 Oct 2012 14:56:42 +0000 (09:56 -0500)]
(PR14571 partial fix) Correctly handle maps with limited entries.
* translate.cxx (mapvar::init): Remove hardcoded 'wrap' initialization and
let _stp_map_new() initialize 'wrap'.
* runtime/map.c (_stp_map_init): Set new 'wrap' parameter in map itself.
(_stp_map_new): Pass new 'wrap' parameter down to _stp_map_init().
(_stp_map_tls_object_init): Pass cached 'wrap' field to _stp_map_init().
(_stp_pmap_new): Pass new 'wrap' parameter down to _stp_map_init().
* runtime/map.h: Update function prototypes with new 'wrap' parameter.
* runtime/map-gen.c (KEYSYM(_stp_map_new)): Pass new 'wrap' parameter down
to the correct _stp_map_new* function.
* runtime/pmap-gen.c (KEYSYM(_stp_pmap_new)): Pass new 'wrap' parameter
down to the correct _stp_pmap_new* function.
* runtime/map-stat.c (_stp_map_new_hstat_log): Pass new 'wrap' parameter
down to _stp_map_new().
(_stp_map_new_hstat_linear): Ditto.
(_stp_pmap_new_hstat_linear): Ditto.
(_stp_pmap_new_hstat_log): Ditto.
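
What a 'wrap' flag changes for a size-limited map can be sketched in Python. This is an illustration under assumptions, not the runtime's data structure: `BoundedMap` is hypothetical, a full non-wrapping map simply rejects new keys, and updating an existing key is assumed not to change its age.

```python
from collections import OrderedDict


class BoundedMap:
    """Map with a fixed entry limit; with wrap=True the oldest entry is evicted."""

    def __init__(self, max_entries, wrap=False):
        self.max_entries = max_entries
        self.wrap = wrap
        self.entries = OrderedDict()

    def set(self, key, value):
        """Store a value; return False if the map is full and cannot wrap."""
        if key in self.entries:
            self.entries[key] = value      # update in place, order unchanged
            return True
        if len(self.entries) >= self.max_entries:
            if not self.wrap:
                return False               # full: the insertion is refused
            self.entries.popitem(last=False)  # drop the oldest entry
        self.entries[key] = value
        return True


wrapping = BoundedMap(2, wrap=True)
for k in ("a", "b", "c"):
    wrapping.set(k, 1)

strict = BoundedMap(2, wrap=False)
results = [strict.set(k, 1) for k in ("a", "b", "c")]
```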
PR14555, replace kernel symbol "_stext" by a macro in runtime/k_syms.h
The macro is used by the runtime as well as the compilation
components. It is not guaranteed that this symbol is always called
"_stext" on all architectures. On powerpc64 for example its name is
".__start". Stap will not run on other architectures where this symbol
has a different name because the lookup for "_stext" will fail.
Adjusted by <fche> to leave _stext as the relocation pseudo-section
name as used by relocation basis code, and parametrizing only
symbol names.