Josh Stone [Tue, 12 Aug 2014 17:29:30 +0000 (10:29 -0700)]
PR17260: Use get_context to guard stp_print_flush's lock
Holding a context ensures that any probes triggered in the interim will
be considered reentrant and skipped, since such a nested probe might
have recursed on that spinlock. We faced a similar situation before
with _stp_ctl_send and all the locks it touches.
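A rough userspace model of the idea, not the actual runtime code: ctx_busy, try_get_context(), put_context(), probe_handler() and guarded_flush() are hypothetical names standing in for the runtime's context handling on one CPU. While the context is held, a nested probe hit is counted as skipped and never reaches the lock.

```c
#include <assert.h>

/* Hypothetical single-CPU model: one "context" slot and one lock. */
static int ctx_busy;   /* nonzero while a context is held */
static int lock_held;  /* stands in for the print-flush spinlock */
static int skipped;    /* nested probe hits that were skipped */

static int try_get_context(void)
{
    if (ctx_busy) {    /* reentrant hit: refuse and count it */
        skipped++;
        return 0;
    }
    ctx_busy = 1;
    return 1;
}

static void put_context(void)
{
    ctx_busy = 0;
}

/* A probe handler: it must obtain the context before doing any work. */
static void probe_handler(void)
{
    if (!try_get_context())
        return;        /* skipped, so it cannot touch the lock */
    put_context();
}

/* Flush guarded by a context. A probe firing inside the critical
 * section (simulated by a direct call) is skipped. */
static int guarded_flush(void)
{
    if (!try_get_context())
        return 0;
    assert(!lock_held); /* recursing here would deadlock a spinlock */
    lock_held = 1;
    probe_handler();    /* simulated probe hit while the lock is held */
    lock_held = 0;
    put_context();
    return 1;
}
```

Calling guarded_flush() once leaves skipped at 1 and both flags cleared.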
Frank Ch. Eigler [Mon, 11 Aug 2014 19:56:15 +0000 (15:56 -0400)]
statement.nearest probes followup: some docs, samples, tweakage
* NEWS: Mention it.
* man/stapprobes.3stap: Document it.
* testsuite/systemtap.examples/*: Use it.
* testsuite/systemtap.*/: Baby test it.
* dwflpp.cxx: Drop debugging statement and make a speech.
Honggyu Kim [Mon, 4 Aug 2014 13:18:40 +0000 (22:18 +0900)]
dwflpp: register statement.nearest suffix
If a line number is given in 'statement', line records in dwarf may not
be found for a given line number.
Until now, alternative line numbers were suggested in this case and stap exited.
With statement.nearest suffix, a kprobe is inserted into the nearest
line number that is available in dwarf line record.
* dwflpp.cxx (dwflpp::insert_alternative_linenos): Add a new method,
Add an arg "has_nearest" in dwflpp::iterate_over_srcfile_lines
* dwflpp.h (dwflpp::insert_alternative_linenos): Ditto.
* tapsets.cxx: Add a new suffix statement.nearest
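The selection of a nearest line can be sketched as below. This is only an illustration, not the dwflpp code: nearest_lineno() is a hypothetical helper, the line records are modelled as a plain int array, and the tie rule (prefer the lower line) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Return the entry of lines[0..n-1] closest to `wanted`.
 * Ties go to the lower line number (an assumed rule). */
static int nearest_lineno(const int *lines, size_t n, int wanted)
{
    assert(lines != NULL && n > 0);
    int best = lines[0];
    for (size_t i = 1; i < n; i++) {
        int d_new = abs(lines[i] - wanted);
        int d_best = abs(best - wanted);
        if (d_new < d_best || (d_new == d_best && lines[i] < best))
            best = lines[i];
    }
    return best;
}
```

With line records at 10, 14 and 21, a request for line 15 resolves to 14.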
Jonathan Lebon [Mon, 11 Aug 2014 19:40:19 +0000 (15:40 -0400)]
Merge branch 'jlebon/onthefly' (PR10995)
This branch adds support for on-the-fly probes as described in PR10995.
It also includes various minor fixes as well as a new file
runtime/linux/kprobes.c which hosts kprobes-related code (rather than
being dynamically emitted from tapsets.cxx).
Jonathan Lebon [Wed, 23 Jul 2014 20:24:45 +0000 (16:24 -0400)]
on-the-fly: don't use background timer if hrtimers missing
On older systems (< 2.6.17), hrtimers are not supported. Guard code
related to the background timer with this check so that we can at least
still compile code on these older platforms.
Jonathan Lebon [Tue, 22 Jul 2014 18:41:42 +0000 (14:41 -0400)]
on-the-fly: only start background timer if needed
Rather than always starting the background timer, only start it when it
is needed. That is, start the background timer when a probe that affects
the conditions of probes supporting on-the-fly operations does not run in
a context that is safe for calling schedule_work() (as determined by
otf_safe_context()).
Jonathan Lebon [Thu, 24 Jul 2014 18:15:22 +0000 (14:15 -0400)]
make schedule_work() call depend on otf_safe_context()
Now that each probe group directly describes whether their context is
safe for workqueue manipulations, we can directly emit in the probe
epilogue a call to schedule_work() if the probe is safe, rather than
doing it on a case-by-case basis.
Jonathan Lebon [Wed, 23 Jul 2014 18:32:15 +0000 (14:32 -0400)]
on-the-fly: make support a property of group rather than probe
Support for on-the-fly operations is more a property of the
derived_probe_group (which does the actual emitting), rather than
derived_probe.
For example, just because a dwarf_derived_probe supports on-the-fly
operations does not mean that a uprobe_derived_probe (which inherits from
dwarf_derived_probe) does. Similarly, a uprobe_derived_probe's support
for on-the-fly operations depends on the actual code emitted by the
group, which will emit different things depending on whether we're using
utrace or inode-uprobes for example.
To do this, we introduce a 'group' attribute which remembers to which
group a derived_probe has been added. This is then used during
translation time to check if the probe group supports on-the-fly
operations.
This patch also introduces otf_safe_context(), which determines whether
the context of the probe type is safe enough for direct workqueue
manipulations. This then allows us to only use the background timer if
the probe doing the toggling does not support workqueue manipulations.
Jonathan Lebon [Tue, 22 Jul 2014 20:24:00 +0000 (16:24 -0400)]
split linux/timer.c into .c and .h
For the background timer, it is useful to have some of the definitions
currently sitting in linux/timer.c. Split it into a header file and
include the header.
Jonathan Lebon [Tue, 22 Jul 2014 18:16:27 +0000 (14:16 -0400)]
on-the-fly: use a background timer to schedule work
Calling schedule_work() is not always safe from some contexts (e.g. when
tracing/probing the internals of workqueues themselves).
We remove the code which previously called schedule_work() in the common
epilogue of all probe types. We will need to vet on a case-by-case basis
which probe types are safe.
Meanwhile, we implement a background timer which simply checks if
schedule_work() needs to be called.
Jonathan Lebon [Thu, 17 Jul 2014 14:50:21 +0000 (10:50 -0400)]
affection.exp: also check probe globals locking
In light of the locking issue mentioned in the previous commit, this
commit now updates affection.exp so that locking is also checked to
ensure probes lock the right vars for the right access.
Jonathan Lebon [Wed, 16 Jul 2014 16:31:15 +0000 (12:31 -0400)]
on-the-fly: read-lock visited globals
If we have the following situation
probe X if (a || b) {...}
probe Y {a = ...}
probe Z {b = ...}
then we will have Y write-locking a and Z write-locking b, but because
these variables affect X's condition, the cond_enabled of X will be
re-evaluated in the out: path of both Y and Z. This means that it could
happen that Y tries to read b at the same time as Z updates it, and
vice-versa.
This patch ensures that Y and Z also read-lock b and a, respectively. It
does this by making the varuse collector also visit the conditions of
the probes that we can affect.
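A small model of the resulting lock sets, assuming globals are represented as bits in a mask. struct probe_model and read_locks() are hypothetical and only mirror the X/Y/Z example above; the translator's varuse collector works on the script's syntax tree, not on bitmasks.

```c
#include <assert.h>
#include <stddef.h>

enum { VAR_A = 1 << 0, VAR_B = 1 << 1 };

struct probe_model {
    unsigned cond_reads;   /* globals read by the probe's condition */
    unsigned body_reads;   /* globals read by the handler body */
    unsigned body_writes;  /* globals written by the handler body */
};

/* Globals that probe `self` must read-lock: its own reads plus every
 * global read by the condition of a probe it can affect, minus the
 * globals it already write-locks. */
static unsigned read_locks(const struct probe_model *probes, size_t n,
                           size_t self)
{
    assert(probes != NULL && self < n);
    unsigned writes = probes[self].body_writes;
    unsigned reads = probes[self].body_reads;
    for (size_t i = 0; i < n; i++)
        if (probes[i].cond_reads & writes)  /* we can affect probe i */
            reads |= probes[i].cond_reads;
    return reads & ~writes;
}
```

For X, Y and Z as above, Y ends up read-locking b and Z read-locking a.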
Jonathan Lebon [Wed, 16 Jul 2014 15:59:56 +0000 (11:59 -0400)]
on-the-fly: use atomic_t for need_module_refresh
The need_module_refresh global can get written to in two different
locations at the same time. To avoid getting a messed up value, use
atomic_t operations.
Concurrency-wise, in the worst case, we get work scheduled twice rather
than once only.
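The pattern can be illustrated in userspace with C11 atomics. This is a sketch under assumptions: the kernel's atomic_t API differs from <stdatomic.h>, the commit does not say which atomic operations are used, and request_refresh() and maybe_schedule_work() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int need_module_refresh; /* set by probe handlers */
static int work_scheduled;             /* counts simulated schedule_work() calls */

/* Writer side: any number of handlers may request a refresh at once. */
static void request_refresh(void)
{
    atomic_store(&need_module_refresh, 1);
}

/* Consumer side: test and clear the flag in one atomic step, so a
 * request is never lost between the test and the clear. */
static void maybe_schedule_work(void)
{
    if (atomic_exchange(&need_module_refresh, 0))
        work_scheduled++;
}
```

Two requests followed by one check schedule the work once; a second check with no new request does nothing.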
Jonathan Lebon [Wed, 16 Jul 2014 14:52:32 +0000 (10:52 -0400)]
kprobes.c: link stap_dwarf_probe to stap_dwarf_kprobe
Prior to PR5673, the stap_dwarf_kprobe struct was embedded in the
stap_dwarf_probe struct. It was then moved out due to issues mentioned
in PR5673.
In this patch we simply add back a pointer member in stap_dwarf_probe to
its own stap_dwarf_kprobe so that they may never be mistakenly shared.
This also greatly simplifies many of the function signatures which
previously took in the stap_dwarf_probe and the stap_dwarf_kprobe as
separate parameters.
Jonathan Lebon [Wed, 16 Jul 2014 15:16:29 +0000 (11:16 -0400)]
kprobes.c: memset also after batch unregistration
We should not only clear the kprobe struct after a single
unregistration, but also when batch unregistration is used. (Batch
unregistration is normally only done when exiting, but better safe
than sorry!)
Jonathan Lebon [Wed, 16 Jul 2014 14:27:37 +0000 (10:27 -0400)]
kprobes.c: remove enabled_p from stap_dwarf_probe
Using the kernel function kprobe_disabled(), we can directly query
whether a kprobe is enabled or not. This makes the enabled_p field
redundant. We replace its use with a stapkp_enabled() function which
simply calls kprobe_disabled().
Jonathan Lebon [Tue, 15 Jul 2014 20:17:31 +0000 (16:17 -0400)]
runtime: remove STP_ON_THE_FLY
This patch removes the use of the STP_ON_THE_FLY macro so that
on-the-fly related code is always emitted/executed. When no probes use
conditions, the overhead is quite small: the cond_enabled field of each
stap_probe is set to 1 at start-up.
In general, blocks that were previously incompatible with dyninst and
were behind an STP_ON_THE_FLY guard are now emitted only if !usermode.
The tests were adjusted to not test STP_ON_THE_FLY_DISABLED, which no
longer exists.
It is necessary to evaluate a given if(FOO) expression with a !!
prefix in order to turn it into a 0/1 boolean for probe.cond_enabled
matching purposes. With that done, a 1-bit field for cond_enabled is
sufficient.
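A minimal illustration of why the !! matters, using a hypothetical struct stap_probe_model and update_cond() helper rather than the emitted code. Without the normalization, a condition value such as 2 would compare unequal to a stored 1, and would be truncated to 0 when assigned to a 1-bit unsigned field.

```c
#include <assert.h>
#include <stddef.h>

struct stap_probe_model {
    unsigned cond_enabled:1;  /* 0 or 1 only */
};

/* Store the truth value of cond_value; return 1 if the flag changed. */
static int update_cond(struct stap_probe_model *p, long cond_value)
{
    assert(p != NULL);
    unsigned now = !!cond_value;  /* any nonzero value becomes 1 */
    if (p->cond_enabled == now)
        return 0;
    p->cond_enabled = now;
    return 1;
}
```

Here update_cond(&p, 2) followed by update_cond(&p, 8) reports a change only for the first call.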
Jonathan Lebon [Fri, 20 Jun 2014 19:10:53 +0000 (15:10 -0400)]
kprobes.c: register as disabled instead of post disabling
The register_kprobe() function supports setting the kprobe struct's
flags member to KPROBE_FLAG_DISABLED to indicate that we want it
registered but disabled (see also Documentation/kprobes.txt).
This patch takes advantage of this by setting the flags member
accordingly during registration, rather than calling disable_kprobe()
after a successful registration.
Jonathan Lebon [Thu, 19 Jun 2014 20:41:17 +0000 (16:41 -0400)]
PR16861: reset kprobe struct and improve refresh
We need to ensure that the stap_dwarf_kprobe struct is completely
zero'ed out after each unregistration so as not to affect future
registrations which will use the same struct.
We also modify the signature of systemtap_module_refresh so that the
name of the module is passed. This allows us to only update the kprobes
related to that module, rather than checking all of them.
Finally, we also set the priority of the module notifier to 0 to
indicate we don't care in which order we are called (i.e. it shouldn't
matter whether we're called before or after the kprobes callback).
Jonathan Lebon [Tue, 17 Jun 2014 21:42:10 +0000 (17:42 -0400)]
kprobes.c: split stapkp_refresh_probe()
We break down stapkp_refresh_probe() into stapkp_enable_probe() and
stapkp_disable_probe(). We also introduce predicate functions
stapkp_should_enable_probe() and stapkp_should_disable_probe() to
improve clarity.
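One plausible shape for such predicates, shown on a userspace model. The fields of struct kp_model and the exact conditions are assumptions; the commit does not spell out the logic of the real stapkp_should_*() functions.

```c
#include <assert.h>
#include <stddef.h>

struct kp_model {
    int registered;    /* the probe is registered with the kernel */
    int enabled;       /* the probe is currently armed */
    int cond_enabled;  /* the handler's condition currently holds */
};

static int should_enable(const struct kp_model *kp)
{
    return kp->registered && !kp->enabled && kp->cond_enabled;
}

static int should_disable(const struct kp_model *kp)
{
    return kp->registered && kp->enabled && !kp->cond_enabled;
}

/* Refresh step: bring the armed state in line with the condition. */
static void refresh(struct kp_model *kp)
{
    assert(kp != NULL);
    if (should_enable(kp))
        kp->enabled = 1;
    else if (should_disable(kp))
        kp->enabled = 0;
}
```

An unregistered probe is left alone by refresh(), whatever its condition says.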
Jonathan Lebon [Tue, 17 Jun 2014 20:54:47 +0000 (16:54 -0400)]
kprobes.c: refactor stapkp_register_probe()
Here, we split the general stapkp_register_probe into a kprobe and
kretprobe variant, and then further split the work into preparing the
kprobe for registration and actually registering it.
Jonathan Lebon [Tue, 17 Jun 2014 19:07:12 +0000 (15:07 -0400)]
kprobes.c: factor out stapkp_unregister_probes()
In this commit, we do a few things:
1. We separate the kprobe nmissed accounting from actual unregistration.
The stapkp_add_missed() function now solely takes care of this.
2. We abstract away support for bulk unregistration behind one main
stapkp_unregister_probes() function. This function uses the bulk
method if supported, falling back on manual one-by-one unregistration
if not.
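The bulk-with-fallback structure might look roughly like this. It is a model only: struct kp_model, the call counters and the run-time have_bulk flag are all hypothetical, and the real code may well select the path at compile time.

```c
#include <assert.h>
#include <stddef.h>

struct kp_model { int registered; };

static int single_calls;  /* one-by-one unregistrations performed */
static int bulk_calls;    /* bulk unregistrations performed */

static void unregister_one(struct kp_model *kp)
{
    kp->registered = 0;
    single_calls++;
}

static void unregister_bulk(struct kp_model **kps, size_t n)
{
    for (size_t i = 0; i < n; i++)
        kps[i]->registered = 0;
    bulk_calls++;
}

/* Single entry point: callers do not care which method is used. */
static void unregister_probes(struct kp_model **kps, size_t n, int have_bulk)
{
    assert(kps != NULL || n == 0);
    if (have_bulk) {
        unregister_bulk(kps, n);
        return;
    }
    for (size_t i = 0; i < n; i++)
        unregister_one(kps[i]);
}
```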
Jonathan Lebon [Tue, 17 Jun 2014 13:50:38 +0000 (09:50 -0400)]
kprobes.c: new file for dwarf kprobes handling
This is the first in a series of patches to move off as much of the
static code previously emitted by dwarf_derived_probe_group into a new
runtime/linux/kprobes.c file. This should greatly help legibility and
maintainability. No refactoring will be performed until everything has
been moved over.
In this first patch, we simply move off as much as possible from
emit_module_decls() into kprobes.c.
The one tricky bit is whether the 'module' and 'section' members of the
stap_dwarf_probe struct are a char* or a char[]. This is determined
dynamically, but we use macros to allow us to still declare the struct
in kprobes.c.
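A sketch of how a macro can hide that choice from the struct declaration. The macro names MODULE_NAME_LEN and NAME_MEMBER and the struct dwarf_probe_model are invented for illustration; the names used in kprobes.c are not given in this log.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical knob: the generated code would define MODULE_NAME_LEN
 * when it wants the names embedded as arrays, and leave it undefined
 * when it wants pointers. */
#ifdef MODULE_NAME_LEN
#define NAME_MEMBER(name) char name[MODULE_NAME_LEN]
#else
#define NAME_MEMBER(name) const char *name
#endif

struct dwarf_probe_model {
    NAME_MEMBER(module);
    NAME_MEMBER(section);
    unsigned long address;
};

/* The same initializer works for either member type, as long as the
 * array variant is large enough for the strings. */
static const struct dwarf_probe_model example = { "kernel", "_stext", 0 };
```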
Jonathan Lebon [Mon, 16 Jun 2014 19:14:39 +0000 (15:14 -0400)]
uprobes-inode: add stapiu_consumer_[un]register()
To ensure the 'registered' member of the stapiu_consumer struct is
always valid, only make assignments to it in stapiu_[un]register() which
does the actual uprobe_[un]register() call.
Jonathan Lebon [Mon, 16 Jun 2014 15:10:55 +0000 (11:10 -0400)]
uprobes-inode: support on-the-fly [dis]arming
We add support for uprobes arming/disarming:
- Upon finding new target processes, we skip uprobe registration if the
associated probe handler's condition is not enabled.
- We add new functions stapiu_refresh() and stapiu_target_refresh(),
which check if any of the uprobes need to be registered or
unregistered according to the probe handler condition.
- We add the infrastructure needed for the stapiu_refresh() call to be
emitted if using inode uprobes.
Jonathan Lebon [Fri, 13 Jun 2014 19:22:06 +0000 (15:22 -0400)]
uprobes-inode: override on_the_fly_supported()
As the first step towards adding on-the-fly support to uprobes, we
override the on_the_fly_supported() method to determine if inode-uprobes
is supported. This requires us to change the signature of the method to
also pass the systemtap_session object.
Jonathan Lebon [Thu, 12 Jun 2014 20:59:22 +0000 (16:59 -0400)]
runtime: print refresh report when -t given
In addition to printing probe timings, when the -t option is given, info
about systemtap_module_refresh is also given if STP_ON_THE_FLY was
enabled. This gives an idea of the number of times refresh was called,
and the min/avg/max time it took to perform the refresh.
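The bookkeeping behind such a report can be sketched as follows. struct refresh_stats and its helpers are hypothetical, and the time unit is left to the caller; the runtime's own statistics code is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct refresh_stats {
    uint64_t count;  /* number of refreshes recorded */
    uint64_t total;  /* sum of elapsed times */
    uint64_t min;    /* shortest refresh seen */
    uint64_t max;    /* longest refresh seen */
};

/* Record one refresh that took `elapsed` time units. */
static void refresh_stats_add(struct refresh_stats *s, uint64_t elapsed)
{
    assert(s != NULL);
    if (s->count == 0 || elapsed < s->min)
        s->min = elapsed;
    if (elapsed > s->max)
        s->max = elapsed;
    s->total += elapsed;
    s->count++;
}

/* Integer average; 0 when nothing has been recorded yet. */
static uint64_t refresh_stats_avg(const struct refresh_stats *s)
{
    return s->count ? s->total / s->count : 0;
}
```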
Jonathan Lebon [Wed, 11 Jun 2014 20:18:08 +0000 (16:18 -0400)]
kprobes: support on-the-fly [dis]arming
Using the newly added on-the-fly infrastructure, we turn on support for
kprobes. This entails the following steps:
1. Override derived_probe::on_the_fly_supported() to return true to
signal that on-the-fly arming/disarming is supported.
2. Add an enabled_p field to the stap_dwarf_probe struct.
3. Inside systemtap_module_init(), after registering the k[ret]probe,
disable it right away if the probe handler condition is not met.
4. Inside systemtap_module_refresh():
- If a new module was inserted, then after registering the
k[ret]probe, disable it right away if the probe handler condition
is not met.
- If the module was removed, then unset the enabled_p field.
- If the kprobe is currently disabled, but the probe handler
condition is now true, enable it.
- If the kprobe is currently enabled, but the probe handler condition
is now false, disable it.
Jonathan Lebon [Mon, 2 Jun 2014 21:22:52 +0000 (17:22 -0400)]
hrtimer probes: support on-the-fly [dis]arming
This patch adds support for on-the-fly arming/disarming of hrtimer
probes. Although it is only supported for non-usermode, there were a few
changes required in dyninst/timer.c so that the same interface is
exposed for both usermode and kernel mode.
runtime/dyninst/timer.c
- Decouple timer creation from timer starting by factoring out
_stp_hrtimer_start() from _stp_hrtimer_create().
- Make _stp_hrtimer_cancel() actually just cancel the timer, and not
completely delete it.
- Add new _stp_hrtimer_delete() to delete the timer.
runtime/linux/timer.c
- Similarly, factor out _stp_hrtimer_start() from
_stp_hrtimer_create().
- Add new _stp_hrtimer_delete(), which also does a cancel.
tapset-timers.cxx
- Declare hrtimer_derived_probe as a probe that supports on-the-fly
operations.
- In emit_module_init(): unconditionally create the timer, but if in
STP_ON_THE_FLY mode, only bother to start it if its condition is
met.
- In emit_module_refresh(): check for which timers to start/cancel.
Jonathan Lebon [Fri, 6 Jun 2014 17:52:39 +0000 (13:52 -0400)]
runtime: lock module_refresh with mutex
If STP_ON_THE_FLY is enabled, then we need to ensure that
systemtap_module_refresh() is never run concurrently. Since we use a
workqueue, no concurrency can occur from arm/disarm refreshes alone.
However, the module notifier could fire at any time.
Jonathan Lebon [Mon, 2 Jun 2014 15:51:38 +0000 (11:51 -0400)]
runtime: use workqueue to schedule systemtap_module_refresh()
In this patch, we check in the probe handler common epilogue whether a
module refresh is needed as determined by the probe handlers. If so, we
schedule_work() so that systemtap_module_refresh() is eventually called
to update the necessary probes.
Jonathan Lebon [Mon, 2 Jun 2014 15:22:17 +0000 (11:22 -0400)]
translate.cxx: evaluate conditions in module_init and handlers
This patch ensures that the stap_probes[] cond_enabled field is updated
as required: once during systemtap_module_init(), and as needed in probe
handlers that may affect the result of condition evaluation (using the
derived_probe::affected_probe set populated in
semantic_pass_conditions()).
If the new evaluated value of the condition is different from its
previous value, then need_module_refresh is set (if this type of probe
supports on-the-fly arming/disarming), which will be acted upon in a
future patch.
Jonathan Lebon [Mon, 2 Jun 2014 19:15:51 +0000 (15:15 -0400)]
derived_probe: add on_the_fly_supported()
We add a new function derived_probe::on_the_fly_supported(), which
defaults to false. As we add on-the-fly arming/disarming to various
probe types, we simply need to override this function to notify the
runtime that the operation is supported (and thus e.g. to refresh the
module when necessary).
In semantic_pass_conditions(), we now only turn on STP_ON_THE_FLY if one
of the probes with a condition supports it.
Jonathan Lebon [Fri, 6 Jun 2014 17:50:39 +0000 (13:50 -0400)]
runtime: split stap_probe struct definition from stap_probes
Because probe handlers will need to refer to the stap_probes[]
cond_enabled field, we need to define the stap_probe struct prior to the
probe handlers.
This patch achieves this by moving probe emitting in between the
stap_probe struct definition and the actual stap_probes[] array
declaration.
Jonathan Lebon [Fri, 6 Jun 2014 17:56:36 +0000 (13:56 -0400)]
stap_probe struct: add cond_enabled field
The new cond_enabled field represents whether the condition of the
associated probe handler currently evaluates to true or false. This
field will be re-evaluated in probe handlers. This is why we also need
to unconst the stap_probes[] array.
In anticipation for on-the-fly probe arming/disarming, we change
semantic_pass_conditions() to not only inline the probe condition into
its body, but also to collect, for each probe, the set of probes whose
conditions may change after the probe's handler is run. That set is
stored in the new derived_probe::probes_with_affected_conditions.
These sets will be used by the translator to emit code that will check
whether affected probes should be armed/disarmed after a handler is run.
We also introduce the STP_ON_THE_FLY define which will gate all
on-the-fly related code and will be emitted only if required. Finally,
we have STP_ON_THE_FLY_DISABLED which can be used to disable all
on-the-fly arming/disarming.
Jonathan Lebon [Wed, 21 May 2014 18:51:34 +0000 (14:51 -0400)]
java probes: remove probe group
Java probes decay to SDT marker probes. There is no Java-specific code
that needs to be emitted during C unparsing and thus no
java_derived_probe_group needed. We remove forward decls to a
non-existent java_derived_probe_group struct and the
java_derived_probes reference in systemtap_session.
Frank Ch. Eigler [Sun, 10 Aug 2014 01:16:32 +0000 (21:16 -0400)]
PR17249: tolerate early module notifier calls with null mod->sect_attrs
In the case of MODULE_STATE_COMING, we may encounter NULL sect_attrs,
and we must not crash. Sadly, that case can mean the loss of ability
to probe module-init functions - i.e., breaking the bz6503 test case.
* runtime/transport/symbols.c (_stp_module_notifier): Don't assume
that mod->sect_attrs is valid. Treat COMING|LIVE notifications
similarly, except LIVE should assume init.* gone gone gone,
she been gone so long, she been gone gone gone so long.
Frank Ch. Eigler [Sun, 10 Aug 2014 01:31:34 +0000 (21:31 -0400)]
PR17232 take #3: mutex the control messages
As per jistone's advice, simplify control message control by imposing
a mutex over the whole receive-side handling of a ctl message. That
precludes concurrent or reentrant messages (independent of /ctl
open-time limits or threading assumptions). It lets the start and
exit handling functions keep track with fewer state variables.
In a way, this elaborates upon a reversion of commit #262f7598.
* runtime/transport/control.c (_stp_ctl_write_cmd): Use a new static
cmd_mutex for ctl message handling. Don't bother with counters and
flags for startedness etc; let the lower level functions handle
that. Handle error exits via goto out instead of return to assure
mutex unlocks.
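The single-exit idiom referred to above, modelled in userspace. The lock is a plain counter so that the sketch stays self-contained, and handle_cmd(), the CMD_* values and the error codes are hypothetical rather than taken from control.c.

```c
#include <assert.h>
#include <stddef.h>

static int cmd_lock_depth;  /* 1 while the model "mutex" is held */

static void cmd_lock(void)
{
    assert(cmd_lock_depth == 0);
    cmd_lock_depth = 1;
}

static void cmd_unlock(void)
{
    assert(cmd_lock_depth == 1);
    cmd_lock_depth = 0;
}

enum { CMD_START = 1, CMD_EXIT = 2 };

/* Every path, including the error paths, leaves through `out`, so the
 * lock is released exactly once. */
static int handle_cmd(int type, size_t len)
{
    int rc = 0;

    cmd_lock();
    if (len == 0) {
        rc = -1;            /* empty message */
        goto out;
    }
    switch (type) {
    case CMD_START:
    case CMD_EXIT:
        break;              /* the real handlers would run here */
    default:
        rc = -2;            /* unknown command */
        goto out;
    }
out:
    cmd_unlock();
    return rc;
}
```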
* runtime/transport/transport.c (_stp_handle_start,
_stp_cleanup_and_exit): Drop the stp_transport_mutex control,
explain why unnecessary. Be more paranoid during module-notifier
cleanup.
PR17232 variant #2: in runtime, let STP_EXIT nest within STP_START
Uncommitted variant #1 consisted of using module refcounts in the
generated systemtap_module_init/exit function pair to ensure that an
uncleaned module cannot be unloaded. That precluded cleanup via
rmmod(8), so a robust but inconvenient solution.
This variant #2 consists of a surgical fix, wherein an STP_EXIT
message that comes in during an STP_START is used to set an atomic flag for
deferred _stp_cleanup_and_exit() handling.
A variant #3 is coming soon, using a protocol-wide command-message
mutex, like we did back before commit #262f7598.
Josh Stone [Fri, 8 Aug 2014 20:46:56 +0000 (13:46 -0700)]
PR17242: Initialize tapset global arrays early on
Some of the tapset arrays were lazy-initialized on their function's
first call, but that requires the caller to always take a write lock,
and also makes those functions impure.
Now there is a "probe init" alias for the earliest possible begin probe,
and all of these arrays use that alias to initialize their contents.
None of these are big enough to expect noticeable overhead from having
to always initialize them.
Stan Cox [Fri, 8 Aug 2014 18:31:23 +0000 (14:31 -0400)]
Keep perf counters in a vector instead of an unordered map.
* session.h (perf_counters): Use a vector instead of unordered map so
iteration is in a predictable order.
* tapsets.cxx (visit_perf_op): Use it.
(dwarf_derived_probe): Likewise
(emit_probe_local_init): Likewise
(emit_module_utrace_decls): Likewise
(emit_module_inode_decls): Likewise
* tapset-perfmon.cxx (perf_builder::build): Likewise
* elaborate.h (perf_counter_refs): Make a set of strings.
* perf.sh: New test.
* perf.exp: Use it.
David Smith [Fri, 1 Aug 2014 15:53:18 +0000 (10:53 -0500)]
Add scripts to test systemtap probes on a set of kernel functions.
* scripts/kprobes_test/stap_probes_test.py: New script. While debugging
PR17140, I needed a script to bisect a list of kernel functions to put
systemtap probes on.
* scripts/kprobes_test/stap_gen_code.py: Ditto.
* scripts/kprobes_test/stap_run_module.py: Ditto.
* scripts/kprobes_test/README: Update with better instructions.
* scripts/kprobes_test/config_opts.py: Update comments.
* scripts/kprobes_test/monitor_system.py: Add beaker system monitoring
instructions.
cmdline: added "E" to the short options
elaborate: adapted to be able to perform the semantic pass on the
multiple inputs from user_files
main: parse and store the inputs in user_files, and give the additional
scripts new source "file" names (<input##>).
parse: parse function now has a parameter for the source "file" name
session: accept and store additional scripts for parsing, separate from
the cmdline_script. An additional script won't set have_script to true.
Added a few more unexplained backtraces (as visible with -Gdebug=1
runs), automated explanation-width calculations, and compressed the
initialization of the explanation/priority lookup tables.