Josh Stone [Wed, 10 Jun 2009 22:50:04 +0000 (15:50 -0700)]
PR10260: Clean up all resources after init errors
When anything in systemtap_module_init fails, and we return non-zero,
then the module load is aborted. The normal module unload path
(systemtap_module_exit) is not even attempted, so we need to make sure
that all partially-allocated resources are returned.
Our timer callbacks for the gettimeofday subsystem are a classic example
of this error. If we don't unregister the timers before aborting init,
they will later be called and cause a kernel fault.
We also were neglecting to free the percpu context. A memory leak is
less harmful, but that's fixed now too.
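The cleanup rule described above can be sketched as follows. This is a minimal illustration, not the actual runtime code: `ModuleState`, the helper functions, and `module_init_sketch` are all hypothetical names, and the "probe registration" failure is simulated with a flag.

```cpp
#include <cassert>

// Hypothetical stand-in for the resources a module init might acquire.
struct ModuleState {
    bool context_allocated = false;
    bool timers_registered = false;
};

// Hypothetical helpers; each returns true on success.
static bool alloc_context(ModuleState &s)     { s.context_allocated = true; return true; }
static void free_context(ModuleState &s)      { s.context_allocated = false; }
static bool register_timers(ModuleState &s)   { s.timers_registered = true; return true; }
static void unregister_timers(ModuleState &s) { s.timers_registered = false; }

// Returns 0 on success. On any failure, every step that already succeeded
// is undone here, because no exit hook runs after a failed init.
int module_init_sketch(ModuleState &s, bool probes_ok) {
    if (!alloc_context(s))
        return -1;
    if (!register_timers(s)) {
        free_context(s);
        return -1;
    }
    if (!probes_ok) {              // simulated probe registration failure
        unregister_timers(s);      // otherwise the callbacks would fire later
        free_context(s);           // otherwise the context would leak
        return -1;
    }
    return 0;
}
```

The cleanup runs in the reverse order of acquisition, so each release only touches resources that are known to exist.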
Josh Stone [Wed, 10 Jun 2009 02:58:15 +0000 (19:58 -0700)]
Fix condition propagation across aliases
When an instance of an alias has a condition, that condition gets
propagated to each of the locations that the alias defines. However,
the copy of the location list was not a deep copy, and so all other
instances of the alias would also incorrectly receive the condition.
This patch makes the location list copy a little deeper, and adds a
test case which demonstrates the issue.
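A rough sketch of the shallow versus deep copy problem, under the assumption that locations are held by pointer. `Location`, `Alias`, and the helper functions are hypothetical names for illustration and do not mirror the translator's real types.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a probe location that may carry a condition.
struct Location {
    std::string name;
    std::string condition;   // empty means "no condition"
};

struct Alias {
    std::vector<std::shared_ptr<Location>> locations;
};

// Shallow copy: only the pointers are copied, so every instance of the
// alias ends up sharing (and mutating) the same Location objects.
std::vector<std::shared_ptr<Location>> shallow_copy(const Alias &a) {
    return a.locations;
}

// Deeper copy: each Location is cloned, so a condition applied to one
// instance of the alias cannot leak into the others.
std::vector<std::shared_ptr<Location>> deep_copy(const Alias &a) {
    std::vector<std::shared_ptr<Location>> out;
    for (const auto &loc : a.locations)
        out.push_back(std::make_shared<Location>(*loc));
    return out;
}

// Propagate one instance's condition to each of its locations.
void apply_condition(std::vector<std::shared_ptr<Location>> &locs,
                     const std::string &cond) {
    for (auto &loc : locs)
        loc->condition = cond;
}
```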
Josh Stone [Tue, 9 Jun 2009 23:37:14 +0000 (16:37 -0700)]
Remove the spurious sdt @cast expansion
The result of sdt's private @cast expansion was not being used, and it's
not really needed anyway. The global cast visitor is registered to run
as a post-processing step on ALL functions and probes, and so it will
pick up and expand sdt's casts too.
Stan Cox [Tue, 9 Jun 2009 21:05:53 +0000 (17:05 -0400)]
* tapsets.cxx (sdt_var_expanding_visitor::process_name): New.
(sdt_var_expanding_visitor::visit_target_symbol): Have @cast use
types from a dtrace built object instead of a dtrace supplied header.
(dwarf_builder::build): Use it.
* tapsets.cxx (dwarf_builder::probe_table::probe_table): gcc 4.4
complains that shdr may be used uninitialized. I added returns to
ensure that it's ok, but gcc still complains. Set the thing to NULL
as well to silence the beast.
Stan Cox [Tue, 9 Jun 2009 01:57:44 +0000 (21:57 -0400)]
* tapsets.cxx (probe_table): Make mark_name and sess refs.
(probe_table::get_next_probe): Dissect using struct probe_table.
(sdt_var_expanding_visitor): Use lex_cast.
(dwarf_builder::build): Copy probe and location for TOK_MARK cases.
Call derive_probes for kprobe and utrace cases.
Josh Stone [Mon, 8 Jun 2009 23:37:00 +0000 (16:37 -0700)]
Remove dwflpp::default_name
It was just a basic NULL check, but creating its string temporaries was
causing a fair slowdown. Removing this function and adjusting the
callers shaves ~5% off the syscall.* elaboration time.
Josh Stone [Mon, 8 Jun 2009 22:36:42 +0000 (15:36 -0700)]
Let query_module abort early for simple matches
query_module was already returning DW_CB_ABORT when a simple match was
found, but dwflpp::iterate_over_modules was ignoring that and instead
forcing the module loop to restart. The only way out of the loop was
with the pending_interrupts flag, which is only for signalled
interrupts.
Now iterate_over_modules will only attempt the dwfl_getmodules loop
once, since that loop will only abort if the CB returns DW_CB_ABORT.
Then query_module is also modified to return ABORT if pending_interrupts
is flagged.
My trusty test, stap -l syscall.*, is nearly 2x faster with this change.
Empirically, I found that the kernel object is always the first "module"
returned, so the syscall probe points always get to short-circuit the
loop right away.
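The control flow described above might look roughly like this. It is a sketch only: `CB_OK` and `CB_ABORT` are hypothetical constants modeled on the DW_CB_* return codes, and the module list is a plain vector of names rather than anything obtained from libdwfl.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical result codes, modeled on DW_CB_OK / DW_CB_ABORT.
enum CallbackResult { CB_OK, CB_ABORT };

// Makes a single pass over the modules and stops as soon as the callback
// asks to abort. Returns how many modules were visited.
std::size_t iterate_over_modules_sketch(
        const std::vector<std::string> &modules,
        const std::function<CallbackResult(const std::string &)> &cb) {
    std::size_t visited = 0;
    for (const std::string &m : modules) {
        ++visited;
        if (cb(m) == CB_ABORT)
            break;               // honor the abort instead of restarting
    }
    return visited;
}

// Hypothetical query state: aborts on a simple match or on an interrupt.
struct QuerySketch {
    std::string wanted;
    bool pending_interrupts = false;
    bool found = false;

    CallbackResult operator()(const std::string &m) {
        if (pending_interrupts)
            return CB_ABORT;     // the interrupt check now lives in the callback
        if (m == wanted) {
            found = true;
            return CB_ABORT;     // simple match: nothing more to look for
        }
        return CB_OK;
    }
};
```

With the matching module first in the list, the loop visits exactly one entry, which is the short-circuit the message describes.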
Josh Stone [Thu, 4 Jun 2009 01:51:30 +0000 (18:51 -0700)]
Fix uninitialized shdr in probe_table
* tapsets.cxx (dwarf_builder::probe_table::probe_table): gcc 4.4
complains that shdr may be used uninitialized. I added returns to
ensure that it's ok, but gcc still complains. Set the thing to NULL
as well to silence the beast.
Josh Stone [Tue, 2 Jun 2009 07:43:49 +0000 (00:43 -0700)]
Cache the last result of dwarf_getscopes
This one function accounted for ~30% of my callgrind profile of
"stap -l 'syscall.*'", even though it was only called ~1200 times. We
call dwarf_getscopes for each $target variable, with the same parameters
within a given probe. Since they're so nicely grouped, it's easy to
just cache the most recent call, and the next few calls will be a hit.
Overall this cuts the number of calls down to about 300, for an easy
speed gain.
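A one-entry cache of this kind can be sketched as below. The class, its key fields, and the fake `compute` step are assumptions for illustration; the real code caches the result of dwarf_getscopes, which is not called here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical one-entry cache in front of an expensive lookup.
class ScopeCache {
public:
    int expensive_calls = 0;   // counts how often the slow path runs

    // Returns the cached result when the arguments match the previous call.
    const std::vector<int> &get_scopes(int cu_id, std::uint64_t pc) {
        if (!valid_ || cu_id != last_cu_ || pc != last_pc_) {
            last_result_ = compute(cu_id, pc);
            last_cu_ = cu_id;
            last_pc_ = pc;
            valid_ = true;
        }
        return last_result_;
    }

private:
    // Stand-in for the expensive scope search.
    std::vector<int> compute(int cu_id, std::uint64_t pc) {
        ++expensive_calls;
        return {cu_id, static_cast<int>(pc & 0xff)};
    }

    bool valid_ = false;
    int last_cu_ = 0;
    std::uint64_t last_pc_ = 0;
    std::vector<int> last_result_;
};
```

Remembering only the most recent call is enough when identical requests arrive back to back; a differently ordered workload would need a real map.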
Josh Stone [Tue, 2 Jun 2009 01:47:30 +0000 (18:47 -0700)]
Move the blacklist functions into dwflpp
For a call like "stap -l 'syscall.*'", I found that ~10% of the time was
spent compiling the blacklist regexps over again for each probe point.
By moving this functionality into the kernel dwflpp instance, we can
reuse the regexps and get an easy speed boost.
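The idea of compiling the patterns once and reusing them can be sketched as follows. This uses std::regex purely for illustration rather than whatever regex API the translator uses, and the class name and pattern are hypothetical, not SystemTap's actual blacklist.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical holder that compiles its pattern once, at construction,
// and is then shared by every probe point that needs a blacklist check.
class BlacklistSketch {
public:
    BlacklistSketch()
        : func_re_("^(do_IRQ|notifier_call_chain)$") {}   // illustrative pattern

    // Cheap per-probe check: matching only, no recompilation.
    bool blacklisted(const std::string &function) const {
        return std::regex_match(function, func_re_);
    }

private:
    std::regex func_re_;
};
```

Keeping one long-lived instance means the compilation cost is paid once per session instead of once per probe point.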
David Smith [Mon, 1 Jun 2009 17:20:08 +0000 (12:20 -0500)]
Avoid holding semaphore while making mmap callbacks.
* runtime/task_finder.c (__stp_call_mmap_callbacks_for_task): Grabs the
'mmap_sem' semaphore. Caches vma information, releases the semaphore,
then makes mmap callbacks.
(__stp_utrace_task_finder_target_quiesce): Calls
__stp_call_mmap_callbacks_for_task() to make mmap callbacks on initial
attach to a task.
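The "cache under the lock, call back after releasing it" pattern can be sketched in user space as below. A std::mutex stands in for 'mmap_sem', and `VmaInfo`, `TaskSketch`, and the sample path used in testing are hypothetical; none of this is the runtime's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical cached description of one mapping.
struct VmaInfo {
    unsigned long start;
    unsigned long length;
    std::string path;
};

class TaskSketch {
public:
    void add_mapping(const VmaInfo &v) {
        std::lock_guard<std::mutex> guard(lock_);
        vmas_.push_back(v);
    }

    // Copies the mapping list while holding the lock, releases the lock,
    // then makes the callbacks. Returns the number of callbacks made.
    std::size_t call_mmap_callbacks(
            const std::function<void(const VmaInfo &)> &cb) {
        std::vector<VmaInfo> snapshot;
        {
            std::lock_guard<std::mutex> guard(lock_);
            snapshot = vmas_;          // cache the information
        }                              // lock released here
        for (const VmaInfo &v : snapshot)
            cb(v);                     // callbacks may take the lock themselves
        return snapshot.size();
    }

private:
    std::mutex lock_;
    std::vector<VmaInfo> vmas_;
};
```

Because the callbacks run on a snapshot, a callback that itself needs the lock no longer deadlocks, at the cost of possibly seeing slightly stale data.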
David Smith [Thu, 28 May 2009 15:58:17 +0000 (10:58 -0500)]
Avoid 1 case of holding a semaphore while mmap callbacks are being made.
* runtime/task_finder.c (__stp_call_mmap_callbacks_with_addr): Renamed
from __stp_call_mmap_callbacks_with_vma(). Also added some code from
__stp_utrace_task_finder_target_syscall_exit() that locks the 'mmap_sem'
semaphore. This avoids holding the semaphore while the mmap callbacks
are made.
(__stp_utrace_task_finder_target_syscall_exit): Just calls
__stp_call_mmap_callbacks_with_addr() in the mmap case.
Mark Wielaard [Thu, 28 May 2009 11:31:31 +0000 (13:31 +0200)]
Add ucontext-symbols and ucontext-unwind tapset functions to the manual.
* tapset/ucontext-unwind.stp (ubacktrace): Remove empty line before function
triggering parse errors for doc scanner.
* doc/SystemTap_Tapset_Reference/tapsets.tmpl (chapter context_stp): Add
tapset/ucontext-symbols.stp and tapset/ucontext-unwind.stp.
William Cohen [Wed, 27 May 2009 15:14:12 +0000 (11:14 -0400)]
Suggest rpms to install using debuginfo-install.
The patch makes use of the RPM libraries to determine which rpm supplied
the executable and from that information suggest a command to install the
appropriate debuginfo rpm.
This is enabled using the "--with-rpm" option for configure. Can be
explicitly disabled with "--without-rpm".
Fix nd_syscalls.stp for architectures using SYSCALL_WRAPPERS.
Add kprobe.function("SyS_*") probe points to nd_syscall.* probe aliases.
Analogue of commit 132c337c with two exceptions:
- remove sufficiency of these new probe points (use '?' instead of '!'),
because translator always considers them resolved,
- make non-SyS probe points optional in probe aliases affected by
syscall wrappers, because otherwise they will fail on such
architectures.
Josh Stone [Sat, 23 May 2009 02:55:50 +0000 (19:55 -0700)]
Fix another kernel/kprobe.function conflict
Both kernel.function and kprobe.function were defining a global array
stap_unreg_kprobes to use in bulk kprobes unregistration. The compiler
allowed the duplicate definition as long as they were the same size, as
it was when exercised in buildok/thirtyone.
kprobe.function now uses a separate stap_unreg_kprobes2, and the
testcase is modified to produce an imbalanced number of probes.
Josh Stone [Sat, 23 May 2009 01:01:18 +0000 (18:01 -0700)]
PR10190: Suppress warnings for optional kprobes
When a kernel.function or kprobe.function fails in registration, we
usually print a WARNING and move on. With this patch, kprobes that have
the optional '?' flag will not print any WARNING.
Josh Stone [Fri, 22 May 2009 22:27:56 +0000 (15:27 -0700)]
Move the "pure" tag into the body of __is_user_regs
The "/* pure */" tag has no effect unless it is within the embedded-C
body of a function. In this instance, it was accidentally moved out
during the syscall cleanups.
Josh Stone [Fri, 22 May 2009 21:57:33 +0000 (14:57 -0700)]
Use embedded-C for empty functions
The functions asmlinkage() and fastcall() are used to help access
syscall parameters on i686. All other archs don't need this, but they
still define empty functions to shield the callers from arch details.
However, stap issues warnings for empty script-level functions. This
patch changes them to "%{ /* pure */ %}" so there's no complaint, and
they will still get optimized away.
Josh Stone [Wed, 20 May 2009 21:46:25 +0000 (14:46 -0700)]
PR10177: init/kill time in sleepy context only
Previously, _stp_init_time and _stp_kill_time were being called from
begin/end/error probes, which will run with preemption disabled. The
BUG reported on RT kernels showed that cpufreq_unregister_notifier can
end up sleeping, which violates our preemption block.
This patch moves the init/kill into systemtap_module_init/exit, where it
is safe to sleep. The code maintains a new predicate with the define
STAP_NEED_GETTIMEOFDAY, so we still don't incur any timer overhead if
it's not used.
Mark Wielaard [Wed, 20 May 2009 21:11:43 +0000 (23:11 +0200)]
Properly read eh_frame and pass is_ehframe correctly.
* runtime/unwind.c (adjustStartLoc): Add extra dbug_unwind.
(_stp_search_unwind_hdr): Always pass true for is_ehframe.
(unwind_frame): Properly pass through is_ehframe to adjustStartLoc().
(unwind): Add extra dbug_unwind.
* translate.cxx (dump_unwindsyms): Output and use correct eh_frame
and eh_len.
Mark Wielaard [Wed, 20 May 2009 14:51:24 +0000 (16:51 +0200)]
Use debug_frame table, then fallback to eh_frame when necessary.
* runtime/unwind.c (unwind): Call new unwind_frame() first with debug_frame
data, then, if that wasn't able to unwind, again with eh_frame data.
(unwind_frame): Adapted version of old unwind() function that takes a
table, table length and whether it is an eh_frame table.
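The fallback order described above can be sketched like this. The tables are modeled as simple pc ranges rather than real CFI data, and `FdeRange`, `unwind_frame_sketch`, and `unwind_sketch` are hypothetical names, so this only illustrates the control flow.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical table entry: covers program counters in [start, end).
struct FdeRange {
    std::uint64_t start;
    std::uint64_t end;
};

// Tries to find an entry for pc in one table. On success, records which
// kind of table was used so the caller can see is_ehframe passed through.
bool unwind_frame_sketch(const std::vector<FdeRange> &table, bool is_ehframe,
                         std::uint64_t pc, bool *used_ehframe) {
    for (const FdeRange &r : table) {
        if (pc >= r.start && pc < r.end) {
            *used_ehframe = is_ehframe;
            return true;
        }
    }
    return false;
}

// Tries debug_frame first and falls back to eh_frame only when needed.
bool unwind_sketch(const std::vector<FdeRange> &debug_frame,
                   const std::vector<FdeRange> &eh_frame,
                   std::uint64_t pc, bool *used_ehframe) {
    if (unwind_frame_sketch(debug_frame, false, pc, used_ehframe))
        return true;
    return unwind_frame_sketch(eh_frame, true, pc, used_ehframe);
}
```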
Mark Wielaard [Wed, 20 May 2009 13:40:29 +0000 (15:40 +0200)]
Pass and use ptrType and is_ehframe to unwind adjustStartLoc.
* runtime/unwind.c (adjustStartLoc): Add ptrType and is_ehframe as arguments.
Use these to adjust location when necessary.
(DEBUG_UNWIND): Move block before adjustStartLoc.
Pass false for is_ehframe throughout.
Mark Wielaard [Wed, 20 May 2009 13:24:02 +0000 (15:24 +0200)]
Fetch and store both debug_frame and eh_frame tables.
* runtime/sym.h (_stp_module): Remove unwind_data, unwind_data_len and
unwind_is_ehframe fields. Add debug_frame, eh_frame, debug_frame_len,
eh_frame_len and eh_frame_addr fields.
* runtime/unwind.c: Use debug_frame and debug_frame_len instead of
unwind_data and unwind_data_len throughout.
(cie_for_fde): Take unwind_data and is_ehframe as direct arguments.
* runtime/unwind/unwind.h (cie_for_fde): New function declaration.
* translate.cxx (get_unwind_data): Fetch and return both debug_frame
and eh_frame tables.
(dump_unwindsyms): Dump both debug_frame and eh_frame tables.
Unify formatting of syscalls.stp and syscalls2.stp.
Rules:
- Specify probe points for aliases starting from the alias declaration
line and with one probe point per line.
- Use K&R indent style -- probe alias/point/function opening brace goes
to the line following the declaration, other opening braces are kept
on the same line as the control statements.
- Indent using tabs.
- Surround operators with spaces.
- Put spaces after commas.
- Avoid trailing whitespace.
Complete the names-to-numbers conversion in nd_syscalls.stp.
Replace in-scope variable references with *_arg functions. Use 'kprobe'
family of probes instead of 'kernel' family for dwarfless probing. Also
fix a few typos and unify formatting.
David Smith [Mon, 18 May 2009 18:05:01 +0000 (13:05 -0500)]
PR10091 fixes.
* runtime/itrace.c (usr_itrace_report_signal): Add a workaround for
ppc-specific problem.
* testsuite/systemtap.base/itrace.exp: Improved tests. Improved test
completeness. Will also no longer give fails for systems that don't
support single or block step (will give xfails instead).
Josh Stone [Fri, 15 May 2009 20:14:52 +0000 (13:14 -0700)]
Merge the dwflpp::query_cu_..._address methods
The method query_cu_containing_global_address was only called by
query_cu_containing_module_address, and the latter was just doing a
simple argument transform. They are now merged into a single method,
query_cu_containing_address. The function module_address_to_global is
also merged here at its only call site.
Mark Wielaard [Fri, 15 May 2009 13:06:33 +0000 (15:06 +0200)]
Tidy/tighten DEBUG_UNWIND ptrType a bit.
* runtime/unwind.c (_stp_enc_hi_name): Include prefix for hi == 0.
(_stp_enc_lo_name): Don't include prefix.
(_stp_eh_enc_name): Always include hi_name.
(unwind): Always include newline in dbug_unwind() calls.