Sagar Patel [Thu, 29 Aug 2019 15:40:21 +0000 (11:40 -0400)]
PR23285 (2): enable prometheus-exporter type scripts on stapbpf
The eBPF backend now supports prometheus-exporter scripts. This implementation
introduces character escaping macros enabling the array dump macros on stapbpf.
However, there are some issues with foreach loops which are used in the macros
(PR24953).
1) Developed helper functions for the character escaping macros.
2) Added prometheus probe tapset for stapbpf.
3) Added stapbpf procfs file path to stap-exporter.
4) Introduced a sample stapbpf prometheus-exporter script in EXAMPLES.
5) Updated NEWS.
Sagar Patel [Thu, 29 Aug 2019 15:32:17 +0000 (11:32 -0400)]
PR24926: correct printing of utf-8 characters on stapbpf
There were two bugs corrupting the string bytes and instructions. The
first bug involved the implicit sign extension of negative char values.
The second bug involved a faulty optimization (fixup_operands) which
used the incorrect instruction opcode.
1) Cast char to unsigned char before casting to uint32_t.
2) Changed opcode of optimized instruction to (BPF_STX | BPF_MEM | BPF_W).
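The first bug above (implicit sign extension) can be illustrated with a small Python model. This is only a sketch and not the stapbpf C++ code: the helper names are hypothetical, and ctypes is used solely to mimic how C converts a signed char or an unsigned char to uint32_t.

```python
import ctypes

def widen_signed(byte_value):
    """Mimic `(uint32_t) c` for a signed char c: the value is sign-extended."""
    as_char = ctypes.c_int8(byte_value).value    # reinterpret the byte as signed char
    return ctypes.c_uint32(as_char).value        # then convert to uint32_t

def widen_unsigned(byte_value):
    """Mimic `(uint32_t) (unsigned char) c`: the value is zero-extended."""
    as_uchar = ctypes.c_uint8(byte_value).value  # reinterpret the byte as unsigned char
    return ctypes.c_uint32(as_uchar).value

# Every byte of a multi-byte utf-8 sequence has its high bit set,
# so as a signed char it is negative.
lead_byte = "é".encode("utf-8")[0]
print(hex(lead_byte), hex(widen_signed(lead_byte)), hex(widen_unsigned(lead_byte)))
```

Only the zero-extended form keeps the upper 24 bits clear, which is why the cast through unsigned char preserves the original byte.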
Frank Ch. Eigler [Fri, 23 Aug 2019 23:23:13 +0000 (19:23 -0400)]
stapbpf: correct 32-bit handling of EXIT bpf-map value type
The bpf bytecode declares a 4byte->8byte map for the shared globals;
the userspace must use a 64-bit type to receive those values, lest
it suffer a most unfortunate stack smash.
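A rough Python model of the stack smash described above, under stated assumptions: the "stack" is a bytearray, the canary bytes and the function name are hypothetical, and little-endian packing is assumed. It only shows why an 8-byte map value needs an 8-byte receiving slot.

```python
import struct

CANARY = b"CANARY!!"  # hypothetical neighbouring stack data

def receive_map_value(slot_size, value):
    """Copy an 8-byte map value into a slot of slot_size bytes.

    Returns the bytes that sit right after the slot once the copy is done.
    """
    stack = bytearray(slot_size) + bytearray(CANARY)
    stack[0:8] = struct.pack("<q", value)   # the map always hands back 8 bytes
    return bytes(stack[slot_size:slot_size + len(CANARY)])

print(receive_map_value(8, 1) == CANARY)  # 64-bit slot: neighbour untouched
print(receive_map_value(4, 1) == CANARY)  # 32-bit slot: neighbour clobbered
```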
Frank Ch. Eigler [Fri, 23 Aug 2019 15:26:06 +0000 (11:26 -0400)]
testsuite procfs_bpf.exp: tolerate stapbpf early exit
It was observed on some platforms that the stap/bpf test case could
abort so quickly that the dejagnu (tclsh) code to open the bpf/procfs
pipes could itself hang - and indefinitely! Replaced these tclsh
level ops with a /bin/timeout-wrapped echo or cat operation.
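The idea of wrapping the pipe access in a time-limited child process can be sketched in Python. Note the swap: the testsuite uses /bin/timeout from tclsh, while this sketch uses the timeout parameter of subprocess.run; the function name is hypothetical.

```python
import os
import subprocess
import tempfile

def read_with_timeout(path, seconds):
    """Run `cat path` in a child process; return its output, or None on timeout."""
    try:
        done = subprocess.run(["cat", path], capture_output=True, timeout=seconds)
    except subprocess.TimeoutExpired:
        return None  # the child was killed; the caller never blocked in open()
    return done.stdout

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        fifo = os.path.join(scratch, "fifo")
        os.mkfifo(fifo)  # no writer ever opens this FIFO
        print(read_with_timeout(fifo, 0.5))
```

Because the blocking open() happens in the child, a peer that exits early can only cost the caller the timeout interval.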
Frank Ch. Eigler [Wed, 21 Aug 2019 23:29:45 +0000 (19:29 -0400)]
PR23879, PR24875: fix task-finder-vma on f29+
It was reported & rediscovered that some vma-dependent runtime
facilities have been broken: @vma() and *ubacktrace(). It turns out
that modern gcc/ld.so links/loads binaries in slightly different ways
than older toolchains. Specifically, the first page of ELF files is
now loaded only r--p instead of r-xp protection flags. The
_stp_vma_mmap_cb() routine now accepts the r--p case too. It now
ignores the flags entirely.
Frank Ch. Eigler [Wed, 21 Aug 2019 01:20:40 +0000 (21:20 -0400)]
PR24904: support linux 5.2's stacktrace.c changes
The following kernel commit disabled the older struct stack_trace APIs
on architectures that support the newer stackwalk APIs. Provide an
adaptation layer to stack_trace_save_regs().
Serhei Makarov [Thu, 15 Aug 2019 15:09:06 +0000 (11:09 -0400)]
stapbpf PR23858 :: support sorting by value in foreach loop
Previously stapbpf's behaviour on this was wrong.
This slightly nightmarish bit of code handles more cases.
XXX Also need to take into account s->sort_aggr for statistics aggregates
-- currently unsupported.
To support nested foreach loops, map_get_next_key was changed to take & return
addresses-of string addresses rather than string addresses. Previously
we'd need to allocate and copy multiple string keys onto one tiny stack,
which is not feasible.
* bpf-internal.h (SORT_FLAGS): New define packs sort_column, sort_direction in one word.
(GET_SORT_COLUMN): New define unpacks sort_column from SORT_FLAGS.
(GET_SORT_DIRECTION): New define unpacks sort_direction from SORT_FLAGS.
* bpf-translate.cxx (bpf_unparser::visit_foreach_loop): String key/value iteration,
hardcode keysize on stack to 8bytes (either an int or an address of a string),
check for sort_aggr (UNSUPPORTED),
pack sort_column and sort_direction into sort_flags,
properly handle returned address of a string.
* stapbpf/bpfinterp.cxx (map_int_keys, map_str_keys): Remove typedefs.
(struct map_keys): Nightmare begins. Support 4 cases of int/str keys/values.
(convert_int_key, convert_str_key, convert_int_kp, convert_str_kp): More nightmare,
handle different cases of packing/unpacking BPF key/val into C++ key/val.
(convert_key, convert_kp): More nightmare. 'Overload' pack/unpack for ANY key/val.
(compute_key_size): More nightmare. Calculate memory required for a key.
(map_sort): New function handles initial sorting of map.
(map_next): New function handles retrieval from sorted map.
(map_get_next_key): Handle sorted int/str keys/values on different columns,
now takes strings to give a place to allocate returned str key/vals
that is not on the BPF stack.
(bpf_interpret): Pass strings to map_get_next_key.
* testsuite/systemtap.bpf/bpf_tests/foreach_string.stp: Enable full testcase.
* testsuite/systemtap.bpf/bpf_tests/foreach_pr23858.stp: New testcase.
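Packing sort_column and sort_direction into one word, as the SORT_FLAGS entry describes, might look like the following Python sketch. The bit layout here is an assumption made for illustration (direction in the low two bits, column above it); the actual layout in bpf-internal.h is not shown in this log.

```python
# Hypothetical layout: direction (-1, 0, +1) biased by 1 in the low 2 bits,
# sort column in the remaining upper bits.
DIRECTION_BITS = 2
DIRECTION_MASK = (1 << DIRECTION_BITS) - 1

def sort_flags(sort_column, sort_direction):
    """Pack a column number and a direction into a single integer."""
    if sort_direction not in (-1, 0, 1):
        raise ValueError("sort_direction must be -1, 0 or 1")
    if sort_column < 0:
        raise ValueError("sort_column must be non-negative")
    return (sort_column << DIRECTION_BITS) | (sort_direction + 1)

def get_sort_column(flags):
    """Unpack the column number."""
    return flags >> DIRECTION_BITS

def get_sort_direction(flags):
    """Unpack the direction, undoing the bias."""
    return (flags & DIRECTION_MASK) - 1

flags = sort_flags(3, -1)
print(get_sort_column(flags), get_sort_direction(flags))
```

A single packed word is convenient when the value has to travel through one register argument of a helper call.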
Frank Ch. Eigler [Tue, 13 Aug 2019 19:08:41 +0000 (15:08 -0400)]
PR23285: stapbpf/procfs: use effective-userid consistently
The euid is easy and consistent in the construction of the
stapbpf/procfs path name. Use the umask() field at least for
the fifo nodes (it should probably also apply to the parent directory).
For the stapbpf -v case, print some fifo-related diagnostics.
Split out the .stp test case into a separate file, for easier
hand-execution.
Sagar Patel [Thu, 8 Aug 2019 19:40:10 +0000 (15:40 -0400)]
PR23285 (1): enable procfs probes for stapbpf
The eBPF backend now supports procfs probes. This implementation
uses FIFO special files instead of proc filesystem files. The file
path format used is /var/tmp/systemtap-USER/MODNAME. One limitation
is that both read and write probes cannot exist for the same file.
1) Added procfs probe data structures to hold probe information.
2) Created an interface between target variables and eBPF interpreter.
3) Dedicated a single thread for each file which monitors for I/O.
4) Developed a cleaning routine and error handling mechanisms.
5) Updated NEWS and man pages.
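A minimal Python sketch of the path format and of creating a FIFO special file follows. The path format /var/tmp/systemtap-USER/MODNAME is taken from the commit text; the function names, the user and module names, and the FIFO name "example" under the module path are assumptions for illustration only.

```python
import os
import stat
import tempfile

def module_path(user, module_name, base="/var/tmp"):
    """Build BASE/systemtap-USER/MODNAME, the format named in the commit."""
    return os.path.join(base, "systemtap-" + user, module_name)

def make_fifo(path, mode=0o600):
    """Create a FIFO special file at path, creating parent directories first."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    os.mkfifo(path, mode)
    return stat.S_ISFIFO(os.stat(path).st_mode)

if __name__ == "__main__":
    # Work in a scratch directory rather than touching /var/tmp.
    with tempfile.TemporaryDirectory() as scratch:
        module_dir = module_path("alice", "stap_1234", base=scratch)
        print(module_dir, make_fifo(os.path.join(module_dir, "example")))
```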
stapbpf pr23875 bugfix :: allocate actual keysize in foreach to avoid stack clobber
* bpf-translate.cxx (bpf_unparser::visit_foreach_loop): allocate actual keysize.
* testsuite/systemtap.bpf/bpf_tests/foreach_string.stp: new partial PR23858 testcase,
added only the parts necessary to trigger a segfault when bugfix not applied.
William Cohen [Tue, 23 Jul 2019 18:24:14 +0000 (14:24 -0400)]
Fix aarch64 to properly access arguments for wrapped syscalls
Linux 4.18 added wrappers for aarch64 syscalls that pass a pointer to
a struct pt_regs holding the values for the actual arguments. The
syscall tapsets initialize CONTEXT->sregs to point at this data
structure. However, the aarch64 specific register access code was
using the CONTEXT->kregs and just getting the processor register state
when the kprobe triggered rather than the expected arguments in the
data structure being passed into the syscall. The aarch64 specific
register code now gets the syscall arguments from the correct pt_regs
structure.
runtime: adapt to inconsistent export of task_work_add and task_work_cancel
Ubuntu user <walac> reports their kernel 5.0.0-17-generic #18-Ubuntu
SMP encounters pass-5 errors with an unresolved task_work_cancel()
symbol. It turns out this is due to a ubuntu-specific
(android-related) kernel commit that exports only task_work_add().
Updated buildrun.cxx and relevant runtime files to separately check
for exportedness of task_work_cancel().
William Cohen [Fri, 19 Jul 2019 14:46:59 +0000 (10:46 -0400)]
Eliminate ambiguous python shebangs
Fedora rawhide (31) now throws errors for ambiguous python shebangs.
The systemtap.spec has been modified as outlined in the Fedora feature
wiki page to avoid those ambiguities:
William Cohen [Thu, 18 Jul 2019 15:37:37 +0000 (11:37 -0400)]
PR23866: Make the bpf backend use BPF raw tracepoints for kernel.trace("*")
The BPF raw tracepoints provide arguments that better match the
Systemtap lkm kernel tracepoint probes than regular BPF tracepoints.
The BPF backend will use the BPF raw tracepoints unless the user
specifies the old behavior with a --compatible=4.1 option on the
command line.
The new BPF raw tracepoints and their argument are discovered in the
same way as the old BPF tracepoints. A number of small
machine-generated C files containing macros are compiled to query the
available tracepoints in the kernel. Debug information describing the
data structures passed into the BPF tracepoints is examined to
determine the type and location of the tracepoint arguments.
The BPF code for each raw tracepoint probe handler is put into a
raw_trace section, because BPF raw tracepoints are registered
differently than regular BPF tracepoints.
The bpf tests have been revised to include the --compatible=4.1 option
for the tests where it makes a difference.
Add some minor tweaks to backtracing add-ons and mention in NEWS
* NEWS: mention new backtracing add-ons
* runtime/regs.h: fix bug with REG_LINK macro for __aarch64__
* tapset/linux/ucontext-unwind.stp: tweak the register-taking backtraces to be more general and work
with other archs
stapbpf/bpfinterp.cxx pr24758 :: add error info to abort() calls
This is still not a proper error reporting scheme but it lets me debug
things in the meantime without having to fire up a debugger just to
understand which place in the code called abort().
* bpfinterp.cxx (stapbpf_abort): New macro, prints reason + aborts.
(stapbpf_just_abort): New macro, aborts for unspecified reason,
but still prints the location of the abort() call.
(stapbpf_stat_get): use stapbpf_abort().
(bpf_handle_transport_msg): ditto.
(bpf_interpret): ditto.
Jafeer Uddin [Tue, 25 Jun 2019 20:23:41 +0000 (16:23 -0400)]
Add more capabilities to systemtap backtracing
* runtime/sym.c: simplified logic in _stp_snprint_addr because it doesn't need to be that verbose,
also added ability to have file names and line numbers print in backtraces
* runtime/sym.h: added new symbol flag to generate a "fuller" backtrace (includes file names
and line numbers)
* tapset/linux/context-unwind.stp: added new backtrace tapset function that also prints file names
* tapset/linux/ucontext-unwind.stp: added new backtrace tapset function that also prints file names
and functions to allow backtraces using a user provided pc & sp.
Jafeer Uddin [Tue, 25 Jun 2019 19:14:26 +0000 (15:14 -0400)]
Make use of dwarf_aggregate_size to find size attr of types
When trying to probe Go structure pointers, we were getting
"cannot find byte_size attribute" errors. This patch fixes
that issue by using the dwarf_aggregate_size function in
elfutils.
Frank Ch. Eigler [Mon, 24 Jun 2019 21:05:53 +0000 (17:05 -0400)]
runtime: stp_tracepoint_module_notifier rc change
Some LKML traffic indicates NOTIFY_OK may be better than NOTIFY_DONE (0)
to return in a routine success case. Following suit for this site.
OTOH leaving another instance in runtime/linux/symbols.c alone for now.
Sagar Patel [Fri, 7 Jun 2019 15:36:07 +0000 (11:36 -0400)]
PR11353: enable probe elision optimization
Previously, the compiler only issued warnings for probes with empty
handlers. With the new implementation, the compiler additionally
elides probes with empty handlers. The joining of probes to their
groups has been delayed until after this optimization is performed.
Many tests in the testsuite were built under the assumption that
probes with empty handlers are not elided. Consequently, many of
them stopped working and were fixed.
1) Removed probes with empty handlers from session.
2) Issued warnings whenever a probe is elided.
3) Added tests which check for correct probe elision.
4) Fixed tests affected by probe elision.
5) Updated NEWS.
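The elision step listed above can be modelled in a few lines of Python. This is a sketch of the idea only, not the translator's C++ code: the Probe class, its fields, and the warning text are hypothetical.

```python
import warnings
from dataclasses import dataclass, field

@dataclass
class Probe:
    point: str                                  # e.g. "begin" or "timer.s(1)"
    body: list = field(default_factory=list)    # statements of the handler

def elide_empty_probes(probes):
    """Drop probes whose handler is empty, warning about each one dropped."""
    kept = []
    for probe in probes:
        if probe.body:
            kept.append(probe)
        else:
            warnings.warn("probe '%s' elided: empty handler" % probe.point)
    return kept

session = [Probe("begin", ["println(1)"]), Probe("timer.s(1)"), Probe("end", ["exit()"])]
print([probe.point for probe in elide_empty_probes(session)])
```

Running this pass before probes are joined to their groups means no group ever holds an elided probe, matching the ordering the commit describes.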
Serhei Makarov [Mon, 24 Jun 2019 18:19:16 +0000 (14:19 -0400)]
PR24543 just-in-case fix :: stapbpf breaks when cpu0 is disabled
Certain perf_events constructs created by stapbpf must be pinned to
one cpu, it doesn't matter which. Previously cpu0 was used.
*Very rarely* it's possible for cpu0 to be disabled.
(It's not recommended,
In that case stapbpf will fail with a super confusing error.
This patch improves readability and allows stapbpf to use a fallback cpu.
cpu0 is still preferred whenever available.
* stapbpf/stapbpf.cxx (default_cpu): New global.
(mark_active_cpus): Set default_cpu, prefer cpu0 whenever possible.
(create_group_fds): Use default_cpu, not cpu0.
(register_uprobes): Ditto.
(register_kprobes): Ditto.
(register_tracepoints): Ditto.
(register_timers): Ditto.
(register_perf): Ditto.
(load_bpf_file): Clarifying comment on when active cpu info is checked,
use default_cpu for uctx bpf_transport_context.
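The fallback rule described above (prefer cpu0, otherwise use some other active cpu) can be sketched in Python as follows. The function name and the list-of-booleans representation of active cpus are assumptions; the real logic lives in mark_active_cpus in stapbpf.cxx.

```python
def pick_default_cpu(cpus_active):
    """Return cpu0 when it is active, else the lowest-numbered active cpu."""
    if cpus_active and cpus_active[0]:
        return 0
    for cpu, active in enumerate(cpus_active):
        if active:
            return cpu
    raise RuntimeError("no active cpus found")

print(pick_default_cpu([True, True, True, True]))    # cpu0 available
print(pick_default_cpu([False, True, False, True]))  # cpu0 disabled
```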
Some housekeeping work to allow generating different code for the
less-restrictive userspace BPF interpreter vs. the more-restrictive
in-kernel BPF JIT interpreter.
* bpf-internal.h (enum bpf_target): New enum to track target BPF version.
(struct program::target): New field,
tracks the BPF version this program is intended for.
(struct program::program): Require target to be specified on creation.
* bpf-base.cxx (struct program::program): Initialize target field.
* bpf-translate.cxx (translate_bpf_pass): Specify targets of each program,
for now begin/end probes are being emitted for userspace BPF,
all other probes are being emitted for in-kernel BPF.
Down the line we may have a userspace timer probe (PR23477)..
Sagar Patel [Wed, 5 Jun 2019 17:10:05 +0000 (13:10 -0400)]
PR12025: use the decimal or hex format specifier respectively in
the automatic printing of integers and pointers
Previously, the detailed types structures were not propagated to
global variables. Consequently, integers and pointers were both
printed using the hex format specifier. With the new implementation,
pointer and integer types can be differentiated.
1) Propagated detailed type structures to global variables.
2) Applied warning for potential type mismatch.
3) Added tests which check for hex and decimal format specifiers.
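The effect of the change can be sketched with a tiny Python model: once the type of a global is known, the automatic printer can choose decimal for integers and hex for pointers. The function name and the exact format strings are assumptions, not the specifiers stap emits.

```python
def auto_format(value, is_pointer):
    """Format pointers in hex and plain integers in decimal."""
    return ("0x%x" if is_pointer else "%d") % value

print(auto_format(4096, is_pointer=False))
print(auto_format(4096, is_pointer=True))
```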
William Cohen [Fri, 7 Jun 2019 19:15:27 +0000 (15:15 -0400)]
@count() and @sum() should provide 0 for empty entries
When compiled with the bpf backend mmfilepage.stp would have a
segmentation fault when printing out recorded information in the probe
end. This was caused by a foreach using an index from one global array
for other global arrays that did not have entries for some of those
indices. The code computing the @count() and @sum() values did not
handle the situation where there were no statistics for that
particular index in the global array. The proper default is that
@count() and @sum() in those cases should be 0.
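The default-to-zero rule can be modelled in Python as below. This is a sketch under assumptions: each statistics array is represented as a dict mapping an index to a list of samples, and the function names are hypothetical.

```python
def stat_count(stats, index):
    """Number of samples recorded at index, or 0 when there is no entry."""
    samples = stats.get(index)
    return len(samples) if samples else 0

def stat_sum(stats, index):
    """Sum of samples recorded at index, or 0 when there is no entry."""
    samples = stats.get(index)
    return sum(samples) if samples else 0

reads = {"a": [4, 6], "b": [1]}
writes = {"a": [10]}            # nothing recorded for "b"

# Iterate over one array's indices while querying another array.
for name in reads:
    print(name, stat_count(writes, name), stat_sum(writes, name))
```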
Sagar Patel [Wed, 22 May 2019 20:09:19 +0000 (16:09 -0400)]
PR24343: fix uninitialized variable warning and empty output of stap -L for return probes
The uninitialized variable warning was caused by exit syscalls as they don't have
conventional return strings. The incorrect output of stap -L was caused by synthetic
probes created from @entry variables which resulted in miscalculation of the set
intersection check.
1) Set 'never' filters for exit syscalls.
2) Applied synthetic probe flag to filter out synthetic probes from the set intersection.
3) Added tests which check for correct flagging of synthetic probes.
Stan Cox [Wed, 29 May 2019 02:53:14 +0000 (22:53 -0400)]
Add aarch64 stapdyn support.
* runtime/dyninst/regs.c (_stp_print_regs): New for aarch64
* runtime/dyninst/stapdyn.h (struct pt_regs): New for aarch64 since
asm/ptrace.h does not define it.
* runtime/dyninst/uprobes-regs.c (enter_dyninst_uprobe_regs): Add aarch64.
* stapdyn/mutatee.cxx (get_dwarf_registers): Add aarch64.
William Cohen [Thu, 23 May 2019 18:46:34 +0000 (14:46 -0400)]
Provide gettimeofday_* functions for bpf backend
BPF has a helper function ktime_get_ns() that provides a nanosecond
time from the time that the machine was booted. When working across
machines, one really wants a timestamp based on gettimeofday. This
new BPF backend tapset computes an offset and scaling to convert the
ktime_get_ns() values into gettimeofday_*(). This will allow easier
comparison of timestamped traces between machines.
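The offset-and-scaling idea can be sketched in Python. Assumptions: time.monotonic_ns() stands in for the BPF ktime_get_ns() helper, the offset is sampled once, and the function names are hypothetical; the tapset's actual arithmetic is not shown in this log.

```python
import time

def compute_offset_ns():
    """Wall-clock time minus monotonic time, sampled once at startup."""
    return time.time_ns() - time.monotonic_ns()

def gettimeofday_ns(ktime_ns, offset_ns):
    """Shift a boot-relative timestamp onto the wall clock."""
    return ktime_ns + offset_ns

def gettimeofday_us(ktime_ns, offset_ns):
    return gettimeofday_ns(ktime_ns, offset_ns) // 1_000

def gettimeofday_ms(ktime_ns, offset_ns):
    return gettimeofday_ns(ktime_ns, offset_ns) // 1_000_000

def gettimeofday_s(ktime_ns, offset_ns):
    return gettimeofday_ns(ktime_ns, offset_ns) // 1_000_000_000

offset = compute_offset_ns()
print(gettimeofday_s(time.monotonic_ns(), offset))
```

Since the offset is computed once, later adjustments to the wall clock are not reflected; whether the tapset resamples is not stated here.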
Frank Ch. Eigler [Fri, 17 May 2019 15:32:15 +0000 (11:32 -0400)]
configury: make python*-config work on rhel6 again
commit 5dabffcf0e77d7479ad stopped searching for all hypothetical
variants of "python{2,}-config". RHEL6 sports python2 binaries
but only a python-config, which was missed by that change. Let's
return to searching for all these aliases.
Serhei Makarov [Wed, 15 May 2019 20:48:34 +0000 (16:48 -0400)]
stapbpf/bpfinterp.cxx (map_get_next_key) :: try to pass all warnings
turns out RHEL7 gcc did not understand __attribute__ ((nonstring)).
This code has extra paranoia in adding a NUL beyond the area
overwritten by strncpy. Switch from strncpy to memcpy since bpf
syscall is treating everything as opaque memory.
PR gcc/80115 and gdb/24541 had a fight about how values stored in
registers with aliases (e.g. al, ax, rax) should be named. We all
lost. :-O
This patch tweaks the i386 controls to bias gcc toward wider aliases,
and adds a comment block explaining the situation. Unfortunately,
cases still exist where a sys/sdt.h note consumer has to use
arch-specific heuristics to decode gcc's intent.
Serhei Makarov [Thu, 9 May 2019 20:47:22 +0000 (16:47 -0400)]
stapbpf/stapbpf.cxx :: fix perf_fds packing code for non-contiguous CPUs
Spotted an error, not the same error that Coverity thinks is happening.
Validity of perf_fds[cpu] (not perf_fds[i]) is indicated by cpus_active[cpu].
There are some additional issues I missed in new code since I first fixed the
code to work with noncontiguous CPUs.
* stapbpf/stapbpf.cxx (perf_event_loop): assign perf_fds[cpu], not perf_fds[i] !!,
maintain i -> cpu mapping to retrieve correct perf_header and transport_context.
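The packing fix can be illustrated with a short Python sketch. It assumes perf_fds and cpus_active are both indexed by cpu number, as the commit text states; the function name and the returned index-to-cpu list are illustrative, not the code in perf_event_loop.

```python
def pack_perf_fds(perf_fds, cpus_active):
    """Pack the fds of active cpus densely and remember which cpu each came from."""
    packed = []
    index_to_cpu = []
    for cpu, active in enumerate(cpus_active):
        if active:
            packed.append(perf_fds[cpu])   # index by cpu, not by packed position
            index_to_cpu.append(cpu)
    return packed, index_to_cpu

# cpu1 is offline, so the active cpus are not contiguous.
fds, mapping = pack_perf_fds([10, -1, 12, 13], [True, False, True, True])
print(fds, mapping)
```

The index_to_cpu list is what lets a poll result at packed position i be traced back to the per-cpu perf_header and transport_context.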
Serhei Makarov [Thu, 9 May 2019 20:45:11 +0000 (16:45 -0400)]
stapbpf/stapbpf.cxx :: placate the gods of Coverity
* stapbpf/stapbpf.cxx (instantiate_maps): UNUSED_VALUE #if0 out unused code;
NEGATIVE_RETURNS handle error return from sysconf(_SC_NPROCESSORS_CONF),
previous sysconf fix was valid but it might make sense to print a warning.
(register_tracepoints): RESOURCE_LEAK close fd on read failure.
Serhei Makarov [Thu, 9 May 2019 20:43:36 +0000 (16:43 -0400)]
stapbpf/bpfinterp.cxx :: placate the gods of Coverity
* bpf-internal.h (BPF_MAXSTRINGLEN_PLUS): new define, BPF_MAXSTRINGLEN+1.
* stapbpf/bpfinterp.cxx (map_get_next_key): BUFFER_SIZE_WARNING use bigger buffer.
(bpf_interpret): UNUSED_VALUE memset regs to 0x0,
OVERFLOW_BEFORE_WIDEN indicate in 32-bit LSH operation that widening is inappropriate.
debugfs: split the beginning and the end of __create_file() off
appears to have encoded duplicate $debugfs/systemtap directory
existence into an -EEXIST error code, which we didn't handle.
We now treat it as though it were NULL. Tested on 5.1-rc7.