Josh Stone [Fri, 12 Oct 2012 21:45:55 +0000 (14:45 -0700)]
PR14172: Fix for kernels without VM_EXECUTABLE
We were using VM_EXECUTABLE in two ways:
1) In task_finder for locating the process executable among all the
vmas. Since around 2.6.26 there is also mm->exe_file, which will serve
this purpose just fine.
2) In uprobes to avoid relocation offset for semaphores in ET_EXEC
files. This is actually incorrect, but harmless, because the callback
path for ET_EXEC targets already sets relocation=offset=0 anyway. So we
can just remove the special case for VM_EXECUTABLE altogether.
* runtime/task_finder_vma.c (stap_find_exe_file): New, locate the
process executable either by VM_EXECUTABLE or mm->exe_file.
* runtime/linux/task_finder.c (__stp_get_mm_path): Use stap_find_exe_file.
* runtime/linux/task_finder2.c (__stp_get_mm_path): Ditto.
* runtime/linux/uprobes-common.c (stap_uprobe_change_plus): Don't
special case for VM_EXECUTABLE (and add a comment why).
* runtime/linux/uprobes-inode.c (stapiu_change_plus): Ditto.
(stapiu_get_task_inode): Use stap_find_exe_file.
Frank Ch. Eigler [Fri, 12 Oct 2012 16:27:42 +0000 (12:27 -0400)]
testsuite: tweak net-sanity.exp test
It shouldn't use stap -vv, as that creates too much noise in the .log file.
It shouldn't use 0xd34db33f as a known-bad address, because sometimes it's good.
Really good. Mmmm, tasty, BBQ goodness good. Honey, time to warm up the burners!
Frank Ch. Eigler [Fri, 12 Oct 2012 03:28:30 +0000 (23:28 -0400)]
PR14245 clean up error messages for staprun -d SOMETHING_AWFUL
jistone reported that the previously moved
"ERROR: no access to debugfs; try "chmod 0755 /sys/kernel/debug" as root"
message was appearing for erroneous staprun -d FOOBAR cases.
* ctl.c (init_ctl_channel): Print a more general error for any .ctl-file
opening failure.
Josh Stone [Thu, 11 Oct 2012 17:42:10 +0000 (10:42 -0700)]
Set staprun's verbosity one less than stap
Prior to commit e520ea8, staprun would get one -v for s.verbosity>1 and
a second -v for s.verbosity>2. That commit unbounded the number of -v
for staprun, but changed the off-by-one, and staprun and stapio are too
chatty for that.
This also now sets stapdyn verbosity the same way.
* buildrun.cxx (make_run_command): Give one less -v flag.
(make_dyninst_run_command): Set stapdyn -v flags the same way.
David Smith [Thu, 11 Oct 2012 15:55:58 +0000 (10:55 -0500)]
Fixed PR14659 so that ptrace can be used on tasks probed with systemtap.
* runtime/stp_utrace.c (utrace_set_events): No longer set
TIF_SYSCALL_TRACE on the target task.
(utrace_reset): No longer clear TIF_SYSCALL_TRACE on the target task.
* testsuite/systemtap.base/ptrace.exp: New testcase.
Frank Ch. Eigler [Wed, 10 Oct 2012 22:10:40 +0000 (18:10 -0400)]
PR14245: support /sys/kernel/debug mounted 0700
This is done by staprun passing a file descriptor for the
/sys/kernel/debug/systemtap/stap_MODULE directory from staprun
(running setuid) to stapio (running unprivileged, previously unable to
traverse to that path itself). This FD passing is done with a new
option -F<fd> for stapio (though by accident staprun also accepts (and
rejects) this option).
Since openat(2) is relatively recent, autoconf macros are used to back
down to graceful failure on older kernels, and to hide the new code.
New staprun always uses -F<fd> to stapio, even if permissions on
/sys/kernel/debug do not require it.
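The directory-descriptor approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the staprun/stapio code: open_basedir() and open_in_dir() are hypothetical names, and the privileged/unprivileged split is only implied by who calls which helper.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper for the privileged side: open the directory once,
 * while we are still allowed to traverse to it. */
static int open_basedir(const char *dir_path)
{
    assert(dir_path != NULL);
    return open(dir_path, O_RDONLY);
}

/* Hypothetical helper for the unprivileged side: open a file by name
 * relative to the inherited directory descriptor, so the full path is
 * never traversed again. */
static int open_in_dir(int dir_fd, const char *name, int flags)
{
    assert(dir_fd >= 0 && name != NULL);
    return openat(dir_fd, name, flags);
}
```

In such a scheme the descriptor number would be handed to the child on its command line, so it must stay open across the exec.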
* staprun/common.c (relay_basedir_fd): New variable.
(parse_args): Parse new -F: option.
(usage): Document it.
* staprun/staprun.h: Corresponding changes.
* staprun/ctl.c (init_ctl_channel): Reorganize to try an incoming
relay_basedir_fd first (with a faccessat cross-user check).
Try to compute a relay_basedir_fd if not already set.
* staprun/mainloop.c (read_buffer_info): Note ignoring of this PR facility on
RHEL4-era old_transport.
* staprun/relayfs.c (init_relayfs): Attempt to open relay_fd[] using
relay_basedir_fd if specified.
* staprun/stapio.c: Top secret.
* staprun/staprun.c (main): Don't allow staprun itself to take -F, for it
could be misused by a very bad person (tm). However, arrange to pass
it to stapio, if we have incidentally discovered a good relay_basedir_fd.
* staprun/staprun_funcs.c (mountfs): Drop access_debugfs() check at this
point, as init_ctl_channel() will do the check later.
PR14555: handle 0 _stext relocs from userspace by kallsyms_lookup_name fallback
* runtime/transport/symbols.c (_stp_do_relocation): For an incoming
_stext=0 relocation (such as for /proc/sys/kernel/kptr_restrict = 2),
fall back to kallsyms_lookup_name.
William Cohen [Tue, 9 Oct 2012 20:30:48 +0000 (16:30 -0400)]
Make the tapcheck.sh look for all .stp files in the tapset directory
With the reorganization of the tapset directory there were some tapsets
multiple directories down. tapcheck.sh was only checking the top level
of the tapset directory. This meant that it was missing many of the
tapsets that were in subdirectories. This change makes the results more
accurate.
Josh Stone [Tue, 9 Oct 2012 16:43:34 +0000 (09:43 -0700)]
PR14572: Set s.privilege = unprivileged for stapdyn
When running under Dyninst, we are effectively unprivileged by nature,
so setting s.privilege to reflect that helps restrict the available
probe types.
However, we still want to allow guru mode for setting target variables
and using embedded-C, so let systemtap_session::is_usermode() pass.
* session.cxx (systemtap_session::parse_cmdline): For --runtime=dyninst,
set the privilege level too.
(systemtap_session::check_options): Allow -g for is_usermode().
* staptree.cxx (varuse_collecting_visitor::visit_embeddedcode): Allow
embedded-C unrestricted for is_usermode().
(varuse_collecting_visitor::visit_embedded_expr): Ditto.
commit 82523f19 changed the error-exit path of _stp_pmap_agg, but was
confused by the multiple (three!) levels of nested loops in effect at
the point of failure. While the prior "return;" skipped an overall
(newly needed) aggregate-unlock; the current "break;" skipped too
little. Switch to a proper simple goto to almost but not quite
return;.
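The control-flow problem described above can be shown in isolation. The sketch below is not _stp_pmap_agg; the toy lock, sum_cube(), and its parameters are all hypothetical. It only illustrates why a single goto to a shared unlock label is simpler than "return" (skips the unlock) or "break" (leaves only the innermost loop) when the failure happens three loops deep.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock that just counts acquisitions, standing in for a real lock. */
struct toy_lock { int held; };
static void toy_acquire(struct toy_lock *l) { l->held++; }
static void toy_release(struct toy_lock *l) { l->held--; }

/* Sum an nx*ny*nz array while holding the lock; a negative cell is
 * treated as an error. Returns 0 on success, -1 on error. */
static int sum_cube(struct toy_lock *l, const int *cells,
                    int nx, int ny, int nz, long *out)
{
    int rc = 0;
    long sum = 0;

    assert(l != NULL && cells != NULL && out != NULL);
    toy_acquire(l);
    for (int x = 0; x < nx; x++)
        for (int y = 0; y < ny; y++)
            for (int z = 0; z < nz; z++) {
                int v = cells[(x * ny + y) * nz + z];
                if (v < 0) {
                    rc = -1;
                    goto out;   /* leaves all three loops at once */
                }
                sum += v;
            }
    *out = sum;
out:
    toy_release(l);             /* runs on both success and error paths */
    return rc;
}
```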
Josh Stone [Sat, 6 Oct 2012 00:02:56 +0000 (17:02 -0700)]
stapdyn: Limit functions searches to the stap module
This is slightly more efficient because we already know which object we
expect to have the functions, so we don't need to search the whole app.
* stapdyn/stapdyn.cxx (call_inferior_function): Take the BPatch_object
in which we're searching for functions as a parameter.
(instrument_uprobe_target, instrument_uprobes): Ditto.
(dynamic_library_callback): Pass the stap dso from global.
(main): Save the stap dso into a global, and pass it as needed.
Josh Stone [Fri, 5 Oct 2012 23:27:27 +0000 (16:27 -0700)]
stapdyn: track dlopened objects for probes
* stapdyn/stapdyn.cxx (instrument_uprobe_target): New, factored out the
code to write all the probes from one target to one object.
(instrument_uprobes): Use instrument_uprobe_target for each.
(dynamic_library_callback): New, identify if a new object is a target,
and call instrument_uprobe_target if so.
(find_uprobes): Fill the vector as a parameter, and then return a
success status directly instead of floating the exception.
(main): Register the dynamic_library_callback for dlopens.
David Smith [Fri, 5 Oct 2012 17:52:43 +0000 (12:52 -0500)]
(PR14637 partial fix) Improve stapdyn locking.
* runtime/dyninst/runtime.h: Remove the 'stapdyn_big_dumb_lock' and make
preempt_disable() and preempt_enable_no_resched() do nothing for
dyninst.
* runtime/dyninst/tls_data.c: Change the mutex in tls_data_container_t to
a rwlock.
(_stp_tls_free_per_thread_ptr): Write lock the container before removing
the object from the list (and reduce the amount of time the container is
locked).
(_stp_tls_get_per_thread_ptr): Write lock the container before adding
the object to the list (and reduce the amount of time the container is
locked).
* runtime/map.c (_stp_pmap_agg): Be sure to unlock the map in error
conditions.
* runtime/pmap-gen.c (KEYSYM(_stp_pmap_get)): Be sure to unlock the
container in an error condition.
Josh Stone [Fri, 5 Oct 2012 02:04:14 +0000 (19:04 -0700)]
PR14573 (partial): Pass some registers into stapdyn
There doesn't seem to be a way to create and pass the pt_regs structure
from the Dyninst API, but we can still get most registers. This patch
adds a new enter_dyninst_uprobe_regs() to receive registers and fill
them into a pt_regs from there.
XXX Dyninst is currently limited in how many individual function
arguments it can pass, so for now I'm cutting it down to the first 8.
* runtime/dyninst/stapdyn.h: Declare enter_dyninst_uprobe_regs.
* runtime/dyninst/uprobes.c: Implement it, filling all dwarf registers
into a local struct pt_regs.
* runtime/dyninst/regs.c: Include regs.h to get SET_REG_IP.
* stapdyn/stapdyn.cxx (get_dwarf_registers): Create BPatch_snippets for
as many of the DWARF registers as possible (bug-limited to 8).
(instrument_uprobes): Look for the new entry function and use it.
Josh Stone [Thu, 4 Oct 2012 23:19:10 +0000 (16:19 -0700)]
PR14179: Split up loc2c-runtime.h for linux|dyninst
* runtime/loc2c-runtime.h: Remove deref functions and special register
handling from this shared base, and rename k_dwarf_register_N to
pt_dwarf_register_N to be more neutral.
* runtime/linux/loc2c-runtime.h: Move the deref functions and special
register handling here. Nothing new, just transplanted.
* runtime/dyninst/loc2c-runtime.h: Add deref and register functions.
* runtime/dyninst/copy.c (__copy_from_user, __copy_to_user): Move from
linux_def.h, since these are custom implementations, not kernel copies.
(_stp_strncpy_from_user, _stp_copy_from_user): New implementations.
Josh Stone [Wed, 3 Oct 2012 19:59:15 +0000 (12:59 -0700)]
stapdyn: nullify the pagefault machinations for derefs
We don't need to care about pagefault safety in userspace, but the
definitions making those into preempt_disable led to recursing on
stapdyn_big_dumb_lock (going away in PR14571). We can just #define
the pagefault_enable/disable away for the dyninst runtime.
David Smith [Tue, 2 Oct 2012 21:27:55 +0000 (16:27 -0500)]
(PR14571 partial fix) Add dyninst pmap stat fixes.
* runtime/pmap-gen.c (KEYSYM(_stp_pmap_tls_object_init)): Initialize the
histogram parameters, in case this pmap contains a histogram.
(KEYSYM(_stp_pmap_new)): For dyninst, override the tls data object init
function.
Dave Brolley [Tue, 2 Oct 2012 19:39:09 +0000 (15:39 -0400)]
Bug 860750 - stapusr user not able to run modules compiled and signed by the server
- When #ifndef HAVE_ELF_GETSHDRSTRNDX is true, then there is insufficient
ELF support to examine a signed systemtap module in order to determine the
privilege credentials required to run it. In this case, staprun should behave
like an older, multi-privilege-level-unaware, staprun and load the module for
stapusr and above. Since we know that it has been correctly signed, the module
is either an old dual-privilege module compiled for stapusr (ok), or it is
a new multi-privilege-enabled module compiled for stapusr or stapsys. In this
case, the module's internal self check will determine whether the user actually has
the required credentials. The module will abort if the user does not have the
required credentials.
- Small bug in translating the user's privilege credential mask to a string.
Josh Stone [Tue, 2 Oct 2012 18:23:48 +0000 (11:23 -0700)]
stapdyn: Fork output from stdout/stderr
We're still using the target's stdio (PR14491), but we're now using
separate FILE handles to do it, so we're not affected by the target
closing its own stdout early.
* runtime/dyninst/io.c (_stp_out, _stp_err): Private FILE handles.
(_stp_clone_file): Clone a FILE handle, also setting FD_CLOEXEC.
(_stp_warn, _stp_error, _stp_softerror, _stp_dbug): Use _stp_err.
* runtime/dyninst/print.c (_stp_print_flush): Use _stp_err and _stp_out.
* runtime/dyninst/runtime.h (stp_dyninst_ctor): Clone stderr and stdout.
(stp_dyninst_dtor): Close _stp_err and _stp_out.
Josh Stone [Tue, 2 Oct 2012 17:55:26 +0000 (10:55 -0700)]
stapdyn: Don't silence pass-4 gcc errors
* buildrun.cxx (compile_dyninst): Just as compile_pass() always shows
error output from Kbuild, we should always show gcc errors from
compiling dyninst modules too.
Josh Stone [Tue, 2 Oct 2012 17:55:19 +0000 (10:55 -0700)]
stapdyn: Use FD_CLOEXEC on _stp_mem_fd instead of O_CLOEXEC
O_CLOEXEC is only available since Linux 2.6.23, which is fairly old, but
we may still care to run on such systems. Using fcntl FD_CLOEXEC can
accomplish the same thing, and we don't need to worry about the race of
other threads calling exec at the same time as our module load, because
the whole process will be frozen.
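A minimal sketch of the two-step alternative to O_CLOEXEC follows. open_cloexec() is a hypothetical helper, not the code that opens _stp_mem_fd. Note that there is a window between open() and fcntl() in which another thread could exec; the message above explains why that is acceptable here.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open a file, then set close-on-exec with fcntl()
 * instead of passing O_CLOEXEC to open(). Returns the fd or -1. */
static int open_cloexec(const char *path, int flags)
{
    assert(path != NULL);

    int fd = open(path, flags);
    if (fd < 0)
        return -1;

    /* Preserve any existing descriptor flags and add FD_CLOEXEC. */
    int fdflags = fcntl(fd, F_GETFD);
    if (fdflags < 0 || fcntl(fd, F_SETFD, fdflags | FD_CLOEXEC) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```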
* runtime/linux/copy.c: Move out _stp_read_address definition.
(__stp_strncpy_from_user): Simply accept vicarious protection
from caller _stp_strncpy_from_user.
(_stp_copy_from_user): Protect more.
* runtime/stp_string.c (_stp_text_str): Use _stp_read_address
instead of barenaked __stp_get_user.
* runtime/stp_string.h (__stp_get_user): Simplify; now only for
use by ...
(_stp_read_address): Moved here.
David Smith [Tue, 2 Oct 2012 14:56:42 +0000 (09:56 -0500)]
(PR14571 partial fix) Correctly handle maps with limited entries.
* translate.cxx (mapvar::init): Remove hardcoded 'wrap' initialization and
let _stp_map_new() initialize 'wrap'.
* runtime/map.c (_stp_map_init): Set new 'wrap' parameter in map itself.
(_stp_map_new): Pass new 'wrap' parameter down to _stp_map_init().
(_stp_map_tls_object_init): Pass cached 'wrap' field to _stp_map_init().
(_stp_pmap_new): Pass new 'wrap' parameter down to _stp_map_init().
* runtime/map.h: Update function prototypes with new 'wrap' parameter.
* runtime/map-gen.c (KEYSYM(_stp_map_new)): Pass new 'wrap' parameter down
to the correct _stp_map_new* function.
* runtime/pmap-gen.c (KEYSYM(_stp_pmap_new)): Pass new 'wrap' parameter
down to the correct _stp_pmap_new* function.
* runtime/map-stat.c (_stp_map_new_hstat_log): Pass new 'wrap' parameter
down to _stp_map_new().
(_stp_map_new_hstat_linear): Ditto.
(_stp_pmap_new_hstat_linear): Ditto.
(_stp_pmap_new_hstat_log): Ditto.
PR14555, replace kernel symbol "_stext" by a macro in runtime/k_syms.h
The macro is used by the runtime as well as the compilation
components. It is not guaranteed that this symbol is always called
"_stext" on all architectures. On powerpc64 for example its name is
".__start". Stap will not run on other architectures where this symbol
has a different name because the lookup for "_stext" will fail.
Adjusted by <fche> to leave _stext as the relocation pseudo-section
name as used by relocation basis code, and parametrizing only
symbol names.
Mark Wielaard [Sun, 30 Sep 2012 21:44:28 +0000 (23:44 +0200)]
testsuite stap_run_batch don't add an extra empty argument.
commit 8c94ef made it possible to add extra arguments to stap_run_batch.
But we must make sure we don't accidentally add an extra empty argument.
Some testcases like parseok/fourteen.stp depend on @# being zero.
Mark Wielaard [Sun, 30 Sep 2012 16:12:19 +0000 (18:12 +0200)]
memory.stp: do_mmap was replaced by vm_mmap.
do_mmap was completely replaced by vm_mmap, so either is a good match
for the vm.mmap probe alias. See kernel commits 6be5ceb and dc98250.
(Note, do_mmap2 is a special case just for powerpc.)
do_munmap was partially replaced by vm_munmap, but vm_munmap calls
through do_munmap, so for the vm.munmap probe alias do_munmap is the
best match. See kernel commits a46ef99 and 17d1587.
The same is true for do_brk and vm_brk, vm_brk calls through do_brk,
so for the vm.brk probe alias do_brk is the function to probe. See
kernel commit e4eb1ff.
This used to resolve partly before because when CONFIG_COMPAT = "y"
there would still be a compat_sys_nfsservctl. But that was slightly
bogus because without CONFIG_NFSD that would just be:
Mark Wielaard [Sat, 29 Sep 2012 17:13:47 +0000 (19:13 +0200)]
parser::parse_global(): break after seeing a termination token.
The code would swallow the terminating token and then inspect whether it
was a ','. Which it obviously wasn't since we had just seen that it was
a ';'.
Mark Wielaard [Fri, 28 Sep 2012 21:16:18 +0000 (23:16 +0200)]
Add get_self_path() as workaround for running under valgrind.
get_base_hash() wants to get some stats of the main binary. But when
running under valgrind a stat call on /proc/self/exe actually gives
the stats of the valgrind process binary. Using readlink before calling
stat works around that (readlink is intercepted by valgrind, stat isn't).
Add get_self_path to util.cxx.
Josh Stone [Sat, 29 Sep 2012 01:29:40 +0000 (18:29 -0700)]
stapdyn: Clean up error/warning/log messages
* stapdyn/stapdyn.cxx (staplog, stapwarn, staperror): New ostream
functions that allow common prefixes, log levels, and suppressed
warnings. All appropriate clog's are updated to these.
* buildrun.cxx (make_dyninst_run_command): Set -v and -w options.
Josh Stone [Fri, 28 Sep 2012 21:28:13 +0000 (14:28 -0700)]
stapdyn: Check and report the child exit status
* stapdyn/dynutil.cxx (check_dyninst_exit): New, check how the given
BPatch_process exited, and report failures.
* stapdyn/stapdyn.cxx (main): Use check_dyninst_exit.
* stapdyn/dynsdt.cxx (main): Use check_dyninst_exit.
PR14364, PR14630: Use set_fs and pagefault_disable/enable around more accesses
It turns out there are a bunch of conceptually overlapping
functions/macros throughout the runtime, each of which attempts to
dereference untrustworthy kernel- or user-space pointers, in slightly
different ways.
When deliberately invoked with bad pointer values, some lockdep
kernels (e.g. 2.6.32-279.9.1.el6.x86_64.debug) would emit errors about
page-fault handling paths being triggered in inappropriate contexts
for some of these lookup functions. It turns out a more robust
control of address space checking and fault suppression is necessary.
* runtime/linux/autoconf-pagefault_disable.c: New test.
* buildrun.cxx (compile_pass): Run it.
* runtime/linux/copy.c (_stp_read_address): Add pagefault_{disable,enable}.
Note duplication with loc2c-runtime.h
(_stp_strncpy_from_user): Add set_fs & pagefault_{disable,enable}.
Note duplication with loc2c-runtime.h
* runtime/stp_string.h (__stp_get_user): Wrap in pagefault_{disable,enable}.
Note duplication with loc2c-runtime.h
* tapset/uconversions.stp (__STP_GET_USER): Instead of __stp_get_user,
zap duplication with loc2c-runtime.h and just call loc2c-runtime.h.
* runtime/loc2c-runtime.h (STAPCONF_PAGEFAULT_DISABLE): Add dummy
macros for pre-rhel5 kernels.
(_stp_deref, _stp_store_deref): Revamped arch-specific macros, setting
segments and disabling pagefaults.
(uderef,ustore_deref,kderef,kstore_deref): Revamped macros to call the
above. These should become the standard throughout the runtime/tapset.