Josh Stone [Mon, 8 Jun 2009 22:36:42 +0000 (15:36 -0700)]
Let query_module abort early for simple matches
query_module was already returning DW_CB_ABORT when a simple match was
found, but dwflpp::iterate_over_modules was ignoring that and instead
forcing the module loop to restart. The only way out of the loop was
with the pending_interrupts flag, which is only for signalled
interrupts.
Now iterate_over_modules will only attempt the dwfl_getmodules loop
once, since that loop will only abort if the CB returns DW_CB_ABORT.
Then query_module is also modified to return ABORT if pending_interrupts
is flagged.
My trusty test, stap -l syscall.*, is nearly 2x faster with this change.
Empirically, I found that the kernel object is always the first "module"
returned, so the syscall probepoints always get to short-circuit the
loop right away.
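The control flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual dwflpp code: the cb_result enum, the query_state struct, and the vector of module names are hypothetical stand-ins for the libdw callback codes, the real query object, and the dwfl_getmodules walk.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the callback return codes that the commit
// message refers to as DW_CB_ABORT and its "keep going" counterpart.
enum cb_result { CB_OK, CB_ABORT };

typedef cb_result (*module_cb)(const std::string& name, void* arg);

// Simplified module walk: one pass only, stopping as soon as the callback
// asks to abort. Returns the number of modules visited.
std::size_t iterate_over_modules(const std::vector<std::string>& modules,
                                 module_cb cb, void* arg)
{
    assert(cb != nullptr);
    std::size_t visited = 0;
    for (const std::string& name : modules) {
        ++visited;
        if (cb(name, arg) == CB_ABORT)
            break;  // honor the abort instead of restarting the loop
    }
    return visited;
}

// Hypothetical query state: a simple (non-wildcard) module name to match,
// plus the interrupt flag that must also end the walk.
struct query_state {
    std::string wanted;
    bool pending_interrupts;
    bool matched;
};

cb_result query_module(const std::string& name, void* arg)
{
    query_state* q = static_cast<query_state*>(arg);
    if (q->pending_interrupts)
        return CB_ABORT;  // interrupted: stop without matching
    if (name == q->wanted) {
        q->matched = true;
        return CB_ABORT;  // simple match found, nothing more to visit
    }
    return CB_OK;
}
```

With the kernel listed first, a query for "kernel" visits a single module, which matches the short-circuit behaviour described for the syscall probe points.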
Josh Stone [Thu, 4 Jun 2009 01:51:30 +0000 (18:51 -0700)]
Fix uninitialized shdr in probe_table
* tapsets.cxx (dwarf_builder::probe_table::probe_table): gcc 4.4
complains that shdr may be used uninitialized. I added returns to
ensure that it's ok, but gcc still complains. Set the thing to NULL
as well to silence the beast.
Josh Stone [Tue, 2 Jun 2009 07:43:49 +0000 (00:43 -0700)]
Cache the last result of dwarf_getscopes
This one function accounted for ~30% of my callgrind profile of
"stap -l 'syscall.*'", even though it was only called ~1200 times. We
call dwarf_getscopes for each $target variable, with the same parameters
within a given probe. Since they're nicely grouped, it's easy to
just cache the most recent call, and the next few calls will be a hit.
Overall this cuts the number of calls down to about 300, for an easy
speed gain.
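A single-entry cache of this kind might look like the sketch below. It is only an illustration under stated assumptions: scope_finder is a hypothetical stand-in for the expensive dwarf_getscopes call, and the (cu, pc) key and the vector of ints are invented placeholders for the real parameters and result.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an expensive lookup such as dwarf_getscopes.
// It counts how often the expensive path really runs.
struct scope_finder {
    int expensive_calls;

    scope_finder() : expensive_calls(0) {}

    std::vector<int> lookup(int cu, long pc)
    {
        ++expensive_calls;
        std::vector<int> scopes;  // dummy "scopes", for illustration only
        scopes.push_back(cu);
        scopes.push_back(static_cast<int>(pc));
        return scopes;
    }
};

// Remembers only the most recent key and its result. Repeated calls with
// the same key are hits; any different key replaces the cached entry.
class last_result_cache {
public:
    explicit last_result_cache(scope_finder& f)
        : finder(f), valid(false), last_cu(0), last_pc(0) {}

    const std::vector<int>& get(int cu, long pc)
    {
        if (!valid || cu != last_cu || pc != last_pc) {
            last_scopes = finder.lookup(cu, pc);  // miss: do the real work
            last_cu = cu;
            last_pc = pc;
            valid = true;
        }
        assert(valid);
        return last_scopes;
    }

private:
    scope_finder& finder;
    bool valid;
    int last_cu;
    long last_pc;
    std::vector<int> last_scopes;
};
```

Keeping just one entry works here because calls with identical parameters arrive back to back; an access pattern that interleaves keys would need a larger cache.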
Josh Stone [Tue, 2 Jun 2009 01:47:30 +0000 (18:47 -0700)]
Move the blacklist functions into dwflpp
For a call like "stap -l 'syscall.*'", I found that ~10% of the time was
spent compiling the blacklist regexps over again for each probe point.
By moving this functionality into the kernel dwflpp instance, we can
reuse the regexps and get an easy speed boost.
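The idea of compiling the patterns once and reusing them can be sketched as below. This is an assumption-laden illustration, not the dwflpp code: it uses std::regex where the original may use a different regex API, and the blacklist class, its two-pattern shape, and the sample patterns in the usage note are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical blacklist holder: both patterns are compiled once in the
// constructor and then reused for every probe point that is checked.
class blacklist {
public:
    blacklist(const std::string& func_pattern,
              const std::string& file_pattern)
        : func_re(func_pattern, std::regex::extended),
          file_re(file_pattern, std::regex::extended) {}

    // True if either the function name or the source file is blacklisted.
    // regex_match requires the whole string to match the pattern.
    bool blocked(const std::string& function, const std::string& file) const
    {
        return std::regex_match(function, func_re)
            || std::regex_match(file, file_re);
    }

private:
    std::regex func_re;
    std::regex file_re;
};
```

A caller would build one blacklist object, for example with the illustrative patterns "do_IRQ|notify_die" and "kernel/kprobes\\.c", and call blocked() for each candidate probe point, so the compile cost is paid once rather than once per probe point.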
David Smith [Mon, 1 Jun 2009 17:20:08 +0000 (12:20 -0500)]
Avoid holding semaphore while making mmap callbacks.
* runtime/task_finder.c (__stp_call_mmap_callbacks_for_task): Grabs the
'mmap_sem' semaphore. Caches vma information, releases the semaphore,
then makes mmap callbacks.
(__stp_utrace_task_finder_target_quiesce): Calls
__stp_call_mmap_callbacks_for_task() to make mmap callbacks on initial
attach to a task.
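The "cache under the lock, release, then call back" pattern can be sketched in user space as follows. This is not the task_finder code: std::mutex stands in for the kernel's 'mmap_sem', and vma_info, address_space, and call_mmap_callbacks are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical, simplified view of one memory mapping.
struct vma_info {
    std::string path;
    unsigned long start;
    unsigned long length;
};

// Hypothetical address space guarded by a lock (standing in for mmap_sem).
struct address_space {
    std::mutex lock;
    std::vector<vma_info> vmas;
};

// Copy the mapping data while holding the lock, drop the lock, and only
// then run the callbacks, so a callback is free to take the lock itself.
// Returns the number of callbacks made.
std::size_t call_mmap_callbacks(
    address_space& mm,
    const std::function<void(const vma_info&)>& cb)
{
    std::vector<vma_info> snapshot;
    {
        std::lock_guard<std::mutex> guard(mm.lock);
        snapshot = mm.vmas;  // cache the vma information while locked
    }                        // lock released here
    for (const vma_info& v : snapshot)
        cb(v);
    return snapshot.size();
}
```

The trade-off is that callbacks see a snapshot that may already be stale, in exchange for never running arbitrary callback code with the lock held.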
David Smith [Thu, 28 May 2009 15:58:17 +0000 (10:58 -0500)]
Avoid 1 case of holding a semaphore while mmap callbacks are being made.
* runtime/task_finder.c (__stp_call_mmap_callbacks_with_addr): Renamed
from __stp_call_mmap_callbacks_with_vma(). Also added some code from
__stp_utrace_task_finder_target_syscall_exit() that locks the 'mmap_sem'
semaphore. This avoids holding the semaphore while the mmap callbacks
are made.
(__stp_utrace_task_finder_target_syscall_exit): Just calls
__stp_call_mmap_callbacks_with_addr() in the mmap case.
Mark Wielaard [Thu, 28 May 2009 11:31:31 +0000 (13:31 +0200)]
Add ucontext-symbols and ucontext-unwind tapset functions to the manual.
* tapset/ucontext-unwind.stp (ubacktrace): Remove empty line before function
triggering parse errors for doc scanner.
* doc/SystemTap_Tapset_Reference/tapsets.tmpl (chapter context_stp): Add
tapset/ucontext-symbols.stp and tapset/ucontext-unwind.stp.
William Cohen [Wed, 27 May 2009 15:14:12 +0000 (11:14 -0400)]
Suggest rpms to install using debuginfo-install.
The patch makes use of the RPM libraries to determine which rpm supplied
the executable and from that information suggest a command to install the
appropriate debuginfo rpm.
This is enabled using the "--with-rpm" option for configure. Can be
explicitly disabled with "--without-rpm".
Fix nd_syscalls.stp for architectures using SYSCALL_WRAPPERS.
Add kprobe.function("SyS_*") probe points to nd_syscall.* probe aliases.
Analogue of commit 132c337c with two exceptions:
- remove sufficiency of these new probe points (use '?' instead of '!'),
because the translator always considers them resolved,
- make non-SyS probe points optional in probe aliases affected by
syscall wrappers, because otherwise they will fail on such
architectures.
Josh Stone [Sat, 23 May 2009 02:55:50 +0000 (19:55 -0700)]
Fix another kernel/kprobe.function conflict
Both kernel.function and kprobe.function were defining a global array
stap_unreg_kprobes to use in bulk kprobes unregistration. The compiler
allowed the duplicate definition as long as they were the same size, as
it was when exercised in buildok/thirtyone.
kprobe.function now uses a separate stap_unreg_kprobes2, and the
testcase is modified to produce an imbalanced number of probes.
Josh Stone [Sat, 23 May 2009 01:01:18 +0000 (18:01 -0700)]
PR10190: Suppress warnings for optional kprobes
When a kernel.function or kprobe.function fails in registration, we
usually print a WARNING and move on. With this patch, kprobes that have
the optional '?' flag will not print any WARNING.
Josh Stone [Fri, 22 May 2009 22:27:56 +0000 (15:27 -0700)]
Move the "pure" tag into the body of __is_user_regs
The "/* pure */" tag has no effect unless it is within the embedded-C
body of a function. In this instance, they were accidentally moved out
during the syscall cleanups.
Josh Stone [Fri, 22 May 2009 21:57:33 +0000 (14:57 -0700)]
Use embedded-C for empty functions
The functions asmlinkage() and fastcall() are used to help access
syscall parameters on i686. All other archs don't need this, but they
still define empty functions to shield the callers from arch details.
However, stap issues warnings for empty script-level functions. This
patch changes them to "%{ /* pure */ %}" so there's no complaint, and
they will still get optimized away.
Josh Stone [Wed, 20 May 2009 21:46:25 +0000 (14:46 -0700)]
PR10177: init/kill time in sleepy context only
Previously, _stp_init_time and _stp_kill_time were being called from
begin/end/error probes, which will run with preemption disabled. The
BUG reported on RT kernels showed that cpufreq_unregister_notifier can
end up sleeping, which violates our preemption block.
This patch moves the init/kill into systemtap_module_init/exit, where it
is safe to sleep. The code maintains a new predicate with the define
STAP_NEED_GETTIMEOFDAY, so we still don't incur any timer overhead if
it's not used.
Mark Wielaard [Wed, 20 May 2009 21:11:43 +0000 (23:11 +0200)]
Properly read eh_frame and pass is_ehframe correctly.
* runtime/unwind.c (adjustStartLoc): Add extra dbug_unwind.
(_stp_search_unwind_hdr): Always pass true for is_ehframe.
(unwind_frame): Properly pass through is_ehframe to adjustStartLoc().
(unwind): Add extra dbug_unwind.
* translate.cxx (dump_unwindsyms): Output and use correct eh_frame
and eh_len.
Mark Wielaard [Wed, 20 May 2009 14:51:24 +0000 (16:51 +0200)]
Use debug_frame table, then fallback to eh_frame when necessary.
* runtime/unwind.c (unwind): Call new unwind_frame() first with debug_frame
data, then, if that wasn't able to unwind, again with eh_frame data.
(unwind_frame): Adapted version of old unwind() function that takes a
table, table length and whether it is an eh_frame table.
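The fallback order can be sketched as below. This is a minimal illustration under assumptions, not the runtime/unwind.c code: unwind_table, the unwind_frame_fn signature, the 0-on-success convention, and the toy unwinder are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical descriptor for one unwind table; len == 0 means "absent".
struct unwind_table {
    const unsigned char* data;
    std::size_t len;
};

// Hypothetical per-table unwinder signature: returns 0 on success.
typedef int (*unwind_frame_fn)(const unwind_table& table, bool is_ehframe);

// Try the debug_frame table first; fall back to eh_frame only on failure.
int unwind_with_fallback(const unwind_table& debug_frame,
                         const unwind_table& eh_frame,
                         unwind_frame_fn unwind_frame)
{
    assert(unwind_frame != nullptr);
    int rc = unwind_frame(debug_frame, /* is_ehframe */ false);
    if (rc != 0)
        rc = unwind_frame(eh_frame, /* is_ehframe */ true);
    return rc;
}

// Toy unwinder for illustration: succeeds whenever the table is non-empty.
int toy_unwind_frame(const unwind_table& table, bool /* is_ehframe */)
{
    return (table.data != nullptr && table.len > 0) ? 0 : -1;
}
```

Passing the table and the is_ehframe flag together lets one routine serve both formats, with the caller deciding only the order in which they are tried.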
Mark Wielaard [Wed, 20 May 2009 13:40:29 +0000 (15:40 +0200)]
Pass and use ptrType and is_ehframe to unwind adjustStartLoc.
* runtime/unwind.c (adjustStartLoc): Add ptrType and is_ehframe as arguments.
Use these to adjust location when necessary.
(DEBUG_UNWIND): Move block before adjustStartLoc.
Pass false for is_ehframe throughout.
Mark Wielaard [Wed, 20 May 2009 13:24:02 +0000 (15:24 +0200)]
Fetch and store both debug_frame and eh_frame tables.
* runtime/sym.h (_stp_module): Remove unwind_data, unwind_data_len and
unwind_is_ehframe fields. Add debug_frame, eh_frame, debug_frame_len,
eh_frame_len and eh_frame_addr fields.
* runtime/unwind.c: Use debug_frame and debug_frame_len instead of
unwind_data and unwind_data_len throughout.
(cie_for_fde): Take unwind_data and is_ehframe as direct arguments.
* runtime/unwind/unwind.h (cie_for_fde): New function declaration.
* translate.cxx (get_unwind_data): Fetch and return both debug_frame
and eh_frame tables.
(dump_unwindsyms): Dump both debug_frame and eh_frame tables.
Unify formatting of syscalls.stp and syscalls2.stp.
Rules:
- Specify probe points for aliases starting from the alias declaration
line and with one probe point per line.
- Use K&R indent style -- probe alias/point/function opening brace goes
to the line following the declaration, other opening braces are kept
on the same line as the control statements.
- Indent using tabs.
- Surround operators with spaces.
- Put spaces after commas.
- Avoid trailing whitespace.
Complete the names-to-numbers conversion in nd_syscalls.stp.
Replace in-scope variable references with *_arg functions. Use 'kprobe'
family of probes instead of 'kernel' family for dwarfless probing. Also
fix a few typos and unify formatting.
David Smith [Mon, 18 May 2009 18:05:01 +0000 (13:05 -0500)]
PR10091 fixes.
* runtime/itrace.c (usr_itrace_report_signal): Add a workaround for
ppc-specific problem.
* testsuite/systemtap.base/itrace.exp: Improved tests. Improved test
completeness. Will also no longer give fails for systems that don't
support single or block step (will give xfails instead).
Josh Stone [Fri, 15 May 2009 20:14:52 +0000 (13:14 -0700)]
Merge the dwflpp::query_cu_..._address methods
The method query_cu_containing_global_address was only called by
query_cu_containing_module_address, and the latter was just doing a
simple argument transform. They are now merged into a single method,
query_cu_containing_address. The function module_address_to_global is
also merged here at its only call site.
Mark Wielaard [Fri, 15 May 2009 13:06:33 +0000 (15:06 +0200)]
Tidy/tighten DEBUG_UNWIND ptrType a bit.
* runtime/unwind.c (_stp_enc_hi_name): Include prefix for hi == 0.
(_stp_enc_lo_name): Don't include prefix.
(_stp_eh_enc_name): Always include hi_name.
(unwind): Always include newline in dbug_unwind() calls.
Josh Stone [Fri, 15 May 2009 01:47:33 +0000 (18:47 -0700)]
[tracepoints] Print pointer arguments with %p
We know the full type of every tracepoint argument, so for those that
are pointers, print $$vars/$$parms using "%p". The integer-type
arguments continue to use the generic "%#x".
Mark Wielaard [Thu, 14 May 2009 17:07:10 +0000 (19:07 +0200)]
PR10139 Mark .probes section SHF_ALLOC.
* includes/sys/sdt.h (STAP_PROBE_DATA_): Mark .probes section SHF_ALLOC.
* tapsets.cxx (dwarf_builder::build): Search in either dwarf or main elf
file for .probes section.
Keiichi KII [Wed, 13 May 2009 20:55:11 +0000 (16:55 -0400)]
PR 6930: Add additional testcases for flight recorder mode
* testsuite/parseko/cmdline17.stp:
command line check - bad combination with -D and -L
* testsuite/parseko/cmdline18.stp:
command line check - bad combination with -D and -d
* testsuite/parseko/cmdline19.stp:
command line check - bad combination with -D and -c
* testsuite/parseko/cmdline20.stp:
command line check - need output file with -D
* testsuite/parseko/cmdline21.stp:
command line check - need output file with -S
* testsuite/systemtap.base/flightrec3.exp:
New test case for file switching with bulk mode
* testsuite/systemtap.base/flightrec3.stp:
Test script for file switching per cpu
Mark Wielaard [Sun, 10 May 2009 18:24:40 +0000 (20:24 +0200)]
Get .probes section through dwarf debuginfo file if necessary.
* tapsets.cxx (dwarf_builder::build): Add some comments, verbose log
messages and get Elf through dwarf_getelf if it exists before searching
for .probes section.
Josh Stone [Sat, 9 May 2009 02:30:42 +0000 (19:30 -0700)]
Allow @cast failures to get optimized away
We have the saved_conversion_error field, but I wasn't using it. Now
@cast errors are saved in that field, so they're only seen if the
optimizer doesn't remove the @cast.