Josh Stone [Sat, 6 Oct 2012 00:02:56 +0000 (17:02 -0700)]
stapdyn: Limit function searches to the stap module
This is slightly more efficient because we already know which object we
expect to have the functions, so we don't need to search the whole app.
* stapdyn/stapdyn.cxx (call_inferior_function): Take the BPatch_object
in which we're searching for functions as a parameter.
(instrument_uprobe_target, instrument_uprobes): Ditto.
(dynamic_library_callback): Pass the stap dso from global.
(main): Save the stap dso into a global, and pass it as needed.
Josh Stone [Fri, 5 Oct 2012 23:27:27 +0000 (16:27 -0700)]
stapdyn: track dlopened objects for probes
* stapdyn/stapdyn.cxx (instrument_uprobe_target): New, factored out the
code to write all the probes from one target to one object.
(instrument_uprobes): Use instrument_uprobe_target for each.
(dynamic_library_callback): New, identify if a new object is a target,
and call instrument_uprobe_target if so.
(find_uprobes): Fill the vector as a parameter, and then return a
success status directly instead of floating the exception.
(main): Register the dynamic_library_callback for dlopens.
David Smith [Fri, 5 Oct 2012 17:52:43 +0000 (12:52 -0500)]
(PR14637 partial fix) Improve stapdyn locking.
* runtime/dyninst/runtime.h: Remove the 'stapdyn_big_dumb_lock' and make
preempt_disable() and preempt_enable_no_resched() do nothing for
dyninst.
* runtime/dyninst/tls_data.c: Change the mutex in tls_data_container_t to
a rwlock.
(_stp_tls_free_per_thread_ptr): Write lock the container before removing
the object from the list (and reduce the amount of time the container is
locked).
(_stp_tls_get_per_thread_ptr): Write lock the container before adding
the object to the list (and reduce the amount of time the container is
locked).
* runtime/map.c (_stp_pmap_agg): Be sure to unlock the map in error
conditions.
* runtime/pmap-gen.c (KEYSYM(_stp_pmap_get)): Be sure to unlock the
container in an error condition.
Josh Stone [Fri, 5 Oct 2012 02:04:14 +0000 (19:04 -0700)]
PR14573 (partial): Pass some registers into stapdyn
There doesn't seem to be a way to create and pass the pt_regs structure
from the Dyninst API, but we can still get most registers. This patch
adds a new enter_dyninst_uprobe_regs() to receive registers and fill
them into a pt_regs from there.
XXX Dyninst is currently limited in how many individual function
arguments it can pass, so for now I'm cutting it down to the first 8.
* runtime/dyninst/stapdyn.h: Declare enter_dyninst_uprobe_regs.
* runtime/dyninst/uprobes.c: Implement it, filling all dwarf registers
into a local struct pt_regs.
* runtime/dyninst/regs.c: Include regs.h to get SET_REG_IP.
* stapdyn/stapdyn.cxx (get_dwarf_registers): Create BPatch_snippets for
as many of the DWARF registers as possible (bug-limited to 8).
(instrument_uprobes): Look for the new entry function and use it.
Josh Stone [Thu, 4 Oct 2012 23:19:10 +0000 (16:19 -0700)]
PR14179: Split up loc2c-runtime.h for linux|dyninst
* runtime/loc2c-runtime.h: Remove deref functions and special register
handling from this shared base, and rename k_dwarf_register_N to
pt_dwarf_register_N to be more neutral.
* runtime/linux/loc2c-runtime.h: Move the deref functions and special
register handling here. Nothing new, just transplanted.
* runtime/dyninst/loc2c-runtime.h: Add deref and register functions.
* runtime/dyninst/copy.c (__copy_from_user, __copy_to_user): Move from
linux_def.h, since these are custom implementations, not kernel copies.
(_stp_strncpy_from_user, _stp_copy_from_user): New implementations.
Josh Stone [Wed, 3 Oct 2012 19:59:15 +0000 (12:59 -0700)]
stapdyn: nullify the pagefault machinations for derefs
We don't need to care about pagefault safety in userspace, but the
definitions making those into preempt_disable led to recursing on
stapdyn_big_dumb_lock (going away in PR14571). We can just #define
the pagefault_enable/disable away for the dyninst runtime.
David Smith [Tue, 2 Oct 2012 21:27:55 +0000 (16:27 -0500)]
(PR14571 partial fix) Add dyninst pmap stat fixes.
* runtime/pmap-gen.c (KEYSYM(_stp_pmap_tls_object_init)): Initialize the
histogram parameters, in case this pmap contains a histogram.
(KEYSYM(_stp_pmap_new)): For dyninst, override the tls data object init
function.
Dave Brolley [Tue, 2 Oct 2012 19:39:09 +0000 (15:39 -0400)]
Bug 860750 - stapusr user not able to run modules compiled and signed by the server
- When HAVE_ELF_GETSHDRSTRNDX is not defined, there is insufficient
ELF support to examine a signed systemtap module in order to determine the
privilege credentials required to run it. In this case, staprun should behave
like an older, multi-privilege-level-unaware, staprun and load the module for
stapusr and above. Since we know that it has been correctly signed, the module
is either an old dual-privilege module compiled for stapusr (ok), or it is
a new multi-privilege-enabled module compiled for stapusr or stapsys. In this
case, the module's internal self check will determine whether the user actually has
the required credentials. The module will abort if the user does not have the
required credentials.
- Fix a small bug in translating the user's privilege credential mask to a string.
Josh Stone [Tue, 2 Oct 2012 18:23:48 +0000 (11:23 -0700)]
stapdyn: Fork output from stdout/stderr
We're still using the target's stdio (PR14491), but we're now using
separate FILE handles to do it, so we're not affected by the target
closing its own stdout early.
* runtime/dyninst/io.c (_stp_out, _stp_err): Private FILE handles.
(_stp_clone_file): Clone a FILE handle, also setting FD_CLOEXEC.
(_stp_warn, _stp_error, _stp_softerror, _stp_dbug): Use _stp_err.
* runtime/dyninst/print.c (_stp_print_flush): Use _stp_err and _stp_out.
* runtime/dyninst/runtime.h (stp_dyninst_ctor): Clone stderr and stdout.
(stp_dyninst_dtor): Close _stp_err and _stp_out.
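
A minimal sketch of the cloning idea described above, not the actual
_stp_clone_file source. The helper name clone_file and its signature are
assumptions made for illustration. It duplicates the descriptor behind an
existing FILE handle, marks the copy close-on-exec, and wraps it in a new
FILE handle, so that the original stream being closed does not affect the
copy.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: return a private FILE handle that writes to the
 * same underlying file as `orig`, or NULL on failure. */
static FILE *clone_file(FILE *orig, const char *mode)
{
    /* A new descriptor that refers to the same open file. */
    int fd = dup(fileno(orig));
    if (fd == -1)
        return NULL;

    /* Mark only the copy close-on-exec; dup() never copies that flag. */
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) {
        close(fd);
        return NULL;
    }

    /* Wrap the copy in its own stdio stream with its own buffer. */
    FILE *copy = fdopen(fd, mode);
    if (copy == NULL)
        close(fd);
    return copy;
}
```

Closing the returned handle with fclose() releases only the duplicate
descriptor; the original stream stays open.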
Josh Stone [Tue, 2 Oct 2012 17:55:26 +0000 (10:55 -0700)]
stapdyn: Don't silence pass-4 gcc errors
* buildrun.cxx (compile_dyninst): Just as compile_pass() always shows
error output from Kbuild, we should always show gcc errors from
compiling dyninst modules too.
Josh Stone [Tue, 2 Oct 2012 17:55:19 +0000 (10:55 -0700)]
stapdyn: Use FD_CLOEXEC on _stp_mem_fd instead of O_CLOEXEC
O_CLOEXEC is only available since Linux 2.6.23, which is fairly old, but
we may still care to run on such systems. Using fcntl FD_CLOEXEC can
accomplish the same thing, and we don't need to worry about the race of
other threads calling exec at the same time as our module load, because
the whole process will be frozen.
* runtime/linux/copy.c: Move out _stp_read_address definition.
(__stp_strncpy_from_user): Simply accept vicarious protection
from caller _stp_strncpy_from_user.
(_stp_copy_from_user): Protect more.
* runtime/stp_string.c (_stp_text_str): Use _stp_read_address
instead of barenaked __stp_get_user.
* runtime/stp_string.h (__stp_get_user): Simplify; now only for
use by ...
(_stp_read_address): Moved here.
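
For the FD_CLOEXEC commit above, the fcntl approach can be sketched as
follows. This is an illustration only: the helper names set_cloexec and
open_cloexec are assumptions, not functions from the runtime. Unlike
O_CLOEXEC, this leaves a short window between open() and fcntl(), which the
commit message argues is harmless when the whole process is frozen during
module load.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Set close-on-exec on an already-open descriptor.
 * Returns 0 on success, -1 on error. */
static int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

/* Hypothetical helper: open `path` without relying on O_CLOEXEC,
 * then set the flag in a second step. Returns the fd or -1. */
static int open_cloexec(const char *path, int oflags)
{
    int fd = open(path, oflags);
    if (fd == -1)
        return -1;
    if (set_cloexec(fd) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```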
David Smith [Tue, 2 Oct 2012 14:56:42 +0000 (09:56 -0500)]
(PR14571 partial fix) Correctly handle maps with limited entries.
* translate.cxx (mapvar::init): Remove hardcoded 'wrap' initialization and
let _stp_map_new() initialize 'wrap'.
* runtime/map.c (_stp_map_init): Set new 'wrap' parameter in map itself.
(_stp_map_new): Pass new 'wrap' parameter down to _stp_map_init().
(_stp_map_tls_object_init): Pass cached 'wrap' field to _stp_map_init().
(_stp_pmap_new): Pass new 'wrap' parameter down to _stp_map_init().
* runtime/map.h: Update function prototypes with new 'wrap' parameter.
* runtime/map-gen.c (KEYSYM(_stp_map_new)): Pass new 'wrap' parameter down
to the correct _stp_map_new* function.
* runtime/pmap-gen.c (KEYSYM(_stp_pmap_new)): Pass new 'wrap' parameter
down to the correct _stp_pmap_new* function
* runtime/map-stat.c (_stp_map_new_hstat_log): Pass new 'wrap' parameter
down to _stp_map_new().
(_stp_map_new_hstat_linear): Ditto.
(_stp_pmap_new_hstat_linear): Ditto.
(_stp_pmap_new_hstat_log): Ditto.
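
To illustrate what a 'wrap' setting on a size-limited container can mean,
here is a small standalone sketch. It assumes that wrapping means "when
full, reuse the oldest entry" and that a non-wrapping container reports an
error instead. The struct and function names are hypothetical and are not
the runtime's map API.

```c
#include <stddef.h>

#define CAPACITY 3

/* Hypothetical bounded container; zero-initialize before use. */
struct bounded_log {
    int wrap;            /* nonzero: overwrite the oldest entry when full */
    size_t count;        /* number of live entries */
    size_t oldest;       /* index of the oldest live entry */
    int slot[CAPACITY];
};

/* Add a value. Returns 0 on success, -1 if full and wrapping is off. */
static int bounded_add(struct bounded_log *b, int value)
{
    if (b->count < CAPACITY) {
        b->slot[(b->oldest + b->count) % CAPACITY] = value;
        b->count++;
        return 0;
    }
    if (!b->wrap)
        return -1;
    /* Full and wrapping: the oldest slot becomes the newest entry. */
    b->slot[b->oldest] = value;
    b->oldest = (b->oldest + 1) % CAPACITY;
    return 0;
}

/* Return the i-th oldest live entry (caller ensures i < count). */
static int bounded_get(const struct bounded_log *b, size_t i)
{
    return b->slot[(b->oldest + i) % CAPACITY];
}
```

Storing the flag in the container itself, as the commit does with the map,
means any code that later re-initializes the container can reuse the cached
value instead of relying on a hardcoded default.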
PR14555, replace kernel symbol "_stext" by a macro in runtime/k_syms.h
The macro is used by the runtime as well as the compilation
components. It is not guaranteed that this symbol is always called
"_stext" on all architectures. On powerpc64 for example its name is
".__start". Stap will not run on other architectures where this symbol
has a different name because the lookup for "_stext" will fail.
Adjusted by <fche> to leave _stext as the relocation pseudo-section
name as used by relocation basis code, and parametrizing only
symbol names.
Mark Wielaard [Sun, 30 Sep 2012 21:44:28 +0000 (23:44 +0200)]
testsuite stap_run_batch don't add an extra empty argument.
commit 8c94ef made it possible to add extra arguments to stap_run_batch.
But we must make sure we don't accidentally add an extra empty argument.
Some testcases like parseok/fourteen.stp depend on @# being zero.
Mark Wielaard [Sun, 30 Sep 2012 16:12:19 +0000 (18:12 +0200)]
memory.stp: do_mmap was replaced by vm_mmap.
do_mmap was completely replaced by vm_mmap, so either is a good match
for the vm.mmap probe alias. See kernel commits 6be5ceb and dc98250.
(Note, do_mmap2 is a special case just for powerpc.)
do_munmap was partially replaced by vm_munmap, but vm_munmap calls
through do_munmap, so for the vm.munmap probe alias do_munmap is the
best match. See kernel commits a46ef99 and 17d1587.
The same is true for do_brk and vm_brk, vm_brk calls through do_brk,
so for the vm.brk probe alias do_brk is the function to probe. See
kernel commit e4eb1ff.
This used to resolve partly before because when CONFIG_COMPAT = "y"
there would still be a compat_sys_nfsservctl. But that was slightly
bogus because without CONFIG_NFSD that would just be:
Mark Wielaard [Sat, 29 Sep 2012 17:13:47 +0000 (19:13 +0200)]
parser::parse_global(): break after seeing a termination token.
The code would swallow the terminating token and then inspect whether it
was a ','. Which it obviously wasn't since we had just seen that it was
a ';'.
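
The control-flow fix can be sketched in isolation. This is not the parser's
code: the token stream here is a hypothetical NULL-terminated array of
strings, and parse_name_list is an assumed name. The point is that once the
terminator has been consumed the loop must stop, rather than going on to
test the already-consumed token for ','.

```c
#include <stddef.h>
#include <string.h>

/* Parse "name , name , ... ;" starting at toks[*pos].
 * Returns the number of names, or -1 on a syntax error.
 * On success *pos is left just past the terminating ";". */
static int parse_name_list(const char *const *toks, size_t *pos)
{
    size_t i = *pos;
    int count = 0;

    for (;;) {
        /* Expect a name here, not a separator or end of input. */
        if (toks[i] == NULL || strcmp(toks[i], ",") == 0
            || strcmp(toks[i], ";") == 0)
            return -1;
        count++;
        i++;

        if (toks[i] == NULL)
            return -1;              /* missing terminator */
        if (strcmp(toks[i], ";") == 0) {
            i++;                    /* consume the terminator ... */
            break;                  /* ... and stop looking for ',' */
        }
        if (strcmp(toks[i], ",") != 0)
            return -1;              /* neither ',' nor ';' */
        i++;                        /* consume ',' and continue */
    }

    *pos = i;
    return count;
}
```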
Mark Wielaard [Fri, 28 Sep 2012 21:16:18 +0000 (23:16 +0200)]
Add get_self_path() as workaround for running under valgrind.
get_base_hash() wants to get some stats of the main binary. But when
running under valgrind a stat call on /proc/self/exe actually gives
the stats of the valgrind process binary. Using readlink before calling
stat works around that (readlink is intercepted by valgrind, stat isn't).
Add get_self_path to util.cxx.
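
A minimal C sketch of the readlink-then-stat workaround, assuming a Linux
/proc filesystem. The real get_self_path lives in util.cxx and is C++; the
helper name stat_self and its signature below are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Resolve /proc/self/exe to a real path first, then stat that path.
 * On success returns 0 and leaves the NUL-terminated path in `path`.
 * Note that readlink() truncates silently if `size` is too small. */
static int stat_self(struct stat *st, char *path, size_t size)
{
    if (size == 0)
        return -1;

    ssize_t len = readlink("/proc/self/exe", path, size - 1);
    if (len < 0)
        return -1;
    path[len] = '\0';       /* readlink() does not add a terminator */

    return stat(path, st);
}
```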
Josh Stone [Sat, 29 Sep 2012 01:29:40 +0000 (18:29 -0700)]
stapdyn: Clean up error/warning/log messages
* stapdyn/stapdyn.cxx (staplog, stapwarn, staperror): New ostream
functions that allow common prefixes, log levels, and suppressed
warnings. All appropriate clog's are updated to these.
* buildrun.cxx (make_dyninst_run_command): Set -v and -w options.
Josh Stone [Fri, 28 Sep 2012 21:28:13 +0000 (14:28 -0700)]
stapdyn: Check and report the child exit status
* stapdyn/dynutil.cxx (check_dyninst_exit): New, check how the given
BPatch_process exited, and report failures.
* stapdyn/stapdyn.cxx (main): Use check_dyninst_exit.
* stapdyn/dynsdt.cxx (main): Use check_dyninst_exit.
PR14364, PR14630: Use set_fs and pagefault_disable/enable around more accesses
It turns out there are a bunch of conceptually overlapping
functions/macros throughout the runtime, each of which attempts to
dereference untrustworthy kernel- or user-space pointers, in slightly
different ways.
When deliberately invoked with bad pointer values, some lockdep
kernels (e.g. 2.6.32-279.9.1.el6.x86_64.debug) would emit errors about
page-fault handling paths being triggered in inappropriate contexts
for some of these lookup functions. It turns out a more robust
control of address space checking and fault suppression is necessary.
* runtime/linux/autoconf-pagefault_disable.c: New test.
* buildrun.cxx (compile_pass): Run it.
* runtime/linux/copy.c (_stp_read_address): Add pagefault_{disable,enable}.
Note duplication with loc2c-runtime.h
(_stp_strncpy_from_user): Add set_fs & pagefault_{disable,enable}.
Note duplication with loc2c-runtime.h
* runtime/stp_string.h (__stp_get_user): Wrap in pagefault_{disable,enable}.
Note duplication with loc2c-runtime.h
* tapset/uconversions.stp (__STP_GET_USER): Instead of __stp_get_user,
zap duplication with loc2c-runtime.h and just call loc2c-runtime.h.
* runtime/loc2c-runtime.h (STAPCONF_PAGEFAULT_DISABLE): Add dummy
macros for pre-rhel5 kernels.
(_stp_deref, _stp_store_deref): Revamped arch-specific macros, setting
segments and disabling pagefaults.
(uderef,ustore_deref,kderef,kstore_deref): Revamped macros to call the
above. These should become the standard throughout the runtime/tapset.
runtime: add noinline to *printf fns to limit frame-size errors
During the debugging work for PR14630, it turned out to trigger these
warning->errors. Some noinline's, and one or two static[]'s,
<south park>.... and it's gone!</>
Josh Stone [Fri, 28 Sep 2012 00:55:05 +0000 (17:55 -0700)]
PR14489: Revamp stapdyn probe metadata
Rather than having a fixed data structure for stapdyn to read from the
module, now stapdyn queries the module dynamically for its data. Thus,
we dlopen the module within stapdyn, then dlsym a few query functions
and use those to enumerate all of the probe data.
* runtime/dyninst/stapdyn.h: Add functions for metadata.
* runtime/dyninst/uprobes.h: New, define module internal datastructures.
* runtime/dyninst/uprobes.c: Implement the metadata functions.
* tapsets.cxx: Generate probe metadata in the new datastructures.
* stapdyn/stapdyn.cxx: Query probes using the new functions.
* stapdyn/Makefile.am: No longer need -lelf for stapdyn.
* stapdyn/Makefile.in: Regenerate.
Josh Stone [Thu, 27 Sep 2012 20:25:30 +0000 (13:25 -0700)]
PR14574: Let stapdyn run without a target command
For easier testing, let begin/end/error probes run directly in stapdyn
when there's no -c option given.
* runtime/dyninst/stapdyn.h: New, declare functions defined in the
module and used by stapdyn.
* runtime/dyninst/runtime.h: Include stapdyn.h so definitions match.
* stapdyn/stapdyn.cxx: When no command is given, dlsym the init/exit
functions and run them directly.
* (autoreconf...)
Josh Stone [Thu, 27 Sep 2012 18:42:03 +0000 (11:42 -0700)]
hash: Include several subdirectories of the runtime
One advantage of having the runtime path in the hash is its mtime also
tells us when any files were modified (at least with most editors that
write a temp file then atomically rename). For stap hackers like
myself, it would be nice to get this benefit for subdirectories of the
runtime too, so let's also add those to the hash.
* hash.cxx (get_base_hash): Add /transport, /unwind, and either /dyninst
or /linux depending on the current runtime mode.
(find_uprobes_hash): Update the uprobes paths since PR14179's move.
Josh Stone [Thu, 27 Sep 2012 16:19:39 +0000 (09:19 -0700)]
PR13486: Always output a frame_base if needed
Previously, loc2c only emitted code for frame_base if it was on the
first loc of the chain. But in some cases, the first piece may not
reference the frame_base while later parts do.
In the cases found in the bug, the first piece was GNU_implicit_pointer,
which doesn't even emit code of its own. But its target DIE did need
the frame_base to compute its value. Since loc2c didn't realize that,
we ended up emitting code using frame_base without ever declaring it.
* loc2c.c (c_emit_location): Loop over the whole loc chain to determine
if a frame_base is needed, and output the first one found.
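
The shape of the fix, scanning the whole chain instead of only its head, can
be sketched as below. The struct is a hypothetical stand-in for a location
piece and does not reproduce loc2c's own data structures.

```c
#include <stddef.h>

/* Hypothetical stand-in for one piece of a location chain. */
struct loc_piece {
    const char *frame_base;     /* non-NULL if this piece needs one */
    struct loc_piece *next;
};

/* Return the first frame_base found anywhere in the chain, or NULL if
 * no piece needs one. Checking only chain->frame_base would miss the
 * case where the first piece emits no code but a later one does. */
static const char *find_frame_base(const struct loc_piece *chain)
{
    for (const struct loc_piece *p = chain; p != NULL; p = p->next)
        if (p->frame_base != NULL)
            return p->frame_base;
    return NULL;
}
```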
Josh Stone [Wed, 26 Sep 2012 22:17:10 +0000 (15:17 -0700)]
stapdyn: Resolve the target executable from the PATH
BPatch::processCreate needs a full path for the process argument, so we
need to walk the PATH to figure that out. We already did this in
dynsdt, but using a private function. Now both use find_executable from
util.h, in a new form that doesn't care about sysroots.
* util.cxx (find_executable): Add a name-only version that just wraps
the full sysroot version, for convenience. Also let a few more things
be const in the implementation.
* hash.cxx (get_base_hash): Use the wrapper function for finding gcc,
instead of having a "dummy" sysenv itself.
* stapdyn/stapdyn.cxx (main): Resolve the target with find_executable.
* stapdyn/dynsdt.cxx (resolve_path): Removed.
(main): Use find_executable instead.
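
A rough C sketch of the PATH walk, for illustration only. It is not the
find_executable from util.cxx (which is C++ and also has a sysroot-aware
form); the name find_in_path and its signature are assumptions. It also
treats anything that passes access(X_OK) as a match, so a more careful
version would additionally check that the result is a regular file.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Write the first executable match for `name` into `out`.
 * Returns 0 on success, -1 if nothing was found. */
static int find_in_path(const char *name, char *out, size_t size)
{
    /* A name containing '/' is used as given, without a PATH search. */
    if (strchr(name, '/') != NULL) {
        int n = snprintf(out, size, "%s", name);
        if (n < 0 || (size_t)n >= size)
            return -1;
        return access(out, X_OK) == 0 ? 0 : -1;
    }

    const char *path = getenv("PATH");
    if (path == NULL)
        return -1;

    while (*path != '\0') {
        size_t len = strcspn(path, ":");
        /* An empty PATH element means the current directory. */
        int n = len > 0
            ? snprintf(out, size, "%.*s/%s", (int)len, path, name)
            : snprintf(out, size, "./%s", name);
        if (n > 0 && (size_t)n < size && access(out, X_OK) == 0)
            return 0;
        path += len;
        if (*path == ':')
            path++;
    }
    return -1;
}
```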
For helping diagnose crashes that may occur during a testsuite,
set $SYSTEMTAP_SYNC, which is handled by staprun, just before it
does the module-insertion. This will make tests slower, sorry.
Josh Stone [Mon, 24 Sep 2012 23:15:55 +0000 (16:15 -0700)]
stapdyn: Enable end/error probes
The dyninst exit hook runs too late for us to still call anything in the
mutatee, so the systemtap_module_exit() call which should run all of the
end/error probes wasn't happening.
Now we use a destructor function in the mutatee, so our exit path always
runs after main() returns or after an exit() call. Functions like
_exit() are still problematic though.
This now also makes a distinction between initializing process-local vs.
session resources, so we are more ready for operating with multiple
mutatees at once. See dyninst/runtime.h for design comments.
David Smith [Fri, 21 Sep 2012 20:13:28 +0000 (15:13 -0500)]
(PR14571 partial fix) For dyninst, use TLS for map and stat data.
* runtime/dyninst/tls_data.c: New file.
* runtime/stat.c (struct _Stat): Add a tls_data_container_t structure.
(_stp_stat_tls_object_init): New function.
(_stp_stat_tls_object_free): Ditto.
(_stp_stat_init): Instead of directly allocating percpu data, for
dyninst set up tls data to be created when accessed by calling
_stp_tls_data_container_init().
(_stp_stat_del): For dyninst, call _stp_tls_data_container_cleanup() to
remove all the tls data.
(_stp_stat_add): For dyninst, get the proper tls stat object.
(_stp_stat_get_cpu): Deleted unused function.
(_stp_stat_get): For dyninst, get the proper tls stat objects.
(_stp_stat_clear): For dyninst, clear the stat in each thread's tls data.
* runtime/stat.h (struct stat_data): Add a tls_data_object_t structure.
* runtime/map.c (_stp_map_tls_object_init): New function.
(_stp_map_tls_object_free): Ditto.
(_stp_pmap_new): Instead of directly allocating percpu data, for dyninst
set up tls data to be created when accessed by calling
_stp_tls_data_container_init().
(_stp_pmap_clear): For dyninst, clear the map in each thread's tls data.
(_stp_pmap_del): For dyninst, call _stp_tls_data_container_cleanup() to
remove all the tls data.
(_stp_pmap_agg): Add dyninst support.
* runtime/map.h (struct map_root): Add a tls_data_object_t structure.
(struct pmap): Add a tls_data_container_t structure.
* runtime/map-stat.c (_stp_hstat_tls_object_init): New function.
(_stp_pmap_new_hstat_linear): For dyninst, override the standard tls
data object init function with _stp_hstat_tls_object_init(), which knows
how to handle hstats.
(_stp_pmap_new_hstat_log): Ditto.
* runtime/pmap-gen.c (_stp_pmap_tls_object_init): New function.
(_stp_pmap_new): For dyninst, override the standard tls
data object init function with _stp_pmap_tls_object_init(), which knows
how to handle pmaps.
(_stp_pmap_set): For dyninst, get the proper tls pmap object.
(_stp_pmap_add): Ditto.
(_stp_pmap_get_cpu): Ditto.
(_stp_pmap_get): Ditto.
(_stp_pmap_del): Ditto.
* runtime/dyninst/linux_defs.h: Added container_of(), list_entry(),
list_for_each_entry(), and list_for_each_entry_safe().
Mark Wielaard [Wed, 19 Sep 2012 08:33:29 +0000 (10:33 +0200)]
parse.cxx swallow tokens we are definitely not using.
The tokens produced by expect_* () were immediately dropped on the floor
after inspection. And a lot of places in the parser called next () just
to get past the current token without using it. Those tokens could
immediately be cleaned up saving ~3MB of "lost memory".
valgrind stap -v -k -p4 -e 'probe begin { log("Hello, World!"); exit(); }'
Before:
==12545== definitely lost: 2,470,128 bytes in 51,408 blocks
==12545== indirectly lost: 14,180,805 bytes in 319,624 blocks
After:
==14782== definitely lost: 18,856 bytes in 220 blocks
==14782== indirectly lost: 12,432,436 bytes in 264,176 blocks
Implements a cached unwinder, allowing most backtrace tapset functions
to be implemented in terms of stack() without loss of performance.
stack() calls can be made repeatedly and in any order, and they will
use the results of a single unwind. (Works only with the DWARF
unwinder).
_stp_stack_kernel_print et al. retain their prior behaviour,
including a number of fallbacks not available to the incremental
unwind. These fallbacks only emit backtrace strings, which can be
tokenized on the tapset end as a last resort.
* runtime/unwind/unwind.h -- define struct unwind_cache to store PCs
obtained from unwinder.
* runtime/common_probe_context.h -- include two sets of unwinder
context and cache, one for user side, one for kernel.
* tapsets.cxx -- probe prologue includes a small thing to mark
the unwind caches as being in an uninitialized state.
* runtime/stack.c -- incremental unwinder implementation.
* runtime/stack-dwarf.c -- deleted. Code moved to stack.c since
this is now the preferred unwind method.
* tapset/linux/[u]context-symbols.stp -- change stack(), ustack()
to directly call incremental unwinder.
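
The caching idea, where repeated stack(n) lookups in any order share one
incremental unwind, can be sketched in plain C. Everything here is
hypothetical: the struct layout, the names pc_cache and cached_stack, and
the callback-style unwinder are assumptions for illustration and do not
reproduce runtime/stack.c or struct unwind_cache.

```c
#include <stddef.h>

#define MAX_DEPTH 8

/* Hypothetical unwinder step: returns the next PC, or 0 when done. */
typedef unsigned long (*unwind_step_fn)(void *ctx);

struct pc_cache {
    int finished;                   /* the unwinder has run out of frames */
    size_t depth;                   /* number of PCs cached so far */
    unsigned long pc[MAX_DEPTH];
};

/* Mark the cache uninitialized, as a probe prologue would. */
static void pc_cache_reset(struct pc_cache *c)
{
    c->finished = 0;
    c->depth = 0;
}

/* Return the PC at stack level n (0 = innermost), or 0 if there is no
 * such frame. Unwinds only as far as needed and remembers the results. */
static unsigned long cached_stack(struct pc_cache *c, size_t n,
                                  unwind_step_fn step, void *ctx)
{
    if (n >= MAX_DEPTH)
        return 0;
    while (c->depth <= n && !c->finished) {
        unsigned long pc = step(ctx);
        if (pc == 0)
            c->finished = 1;
        else
            c->pc[c->depth++] = pc;
    }
    return n < c->depth ? c->pc[n] : 0;
}

/* Demo unwinder: walks a 0-terminated array and counts its calls. */
struct demo_unwinder {
    const unsigned long *frames;
    size_t next;
    size_t calls;
};

static unsigned long demo_step(void *ctx)
{
    struct demo_unwinder *u = ctx;
    u->calls++;
    return u->frames[u->next] != 0 ? u->frames[u->next++] : 0;
}
```

With this shape, asking for level 2 and then level 0 costs three unwinder
steps in total, because the second lookup is served from the cache.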