Jafeer Uddin [Tue, 25 Jun 2019 19:14:26 +0000 (15:14 -0400)]
Make use of dwarf_aggregate_size to find size attr of types
Probing Go structure pointers failed with
"cannot find byte_size attribute" errors. This patch fixes
that issue by using the dwarf_aggregate_size function in
elfutils.
Frank Ch. Eigler [Mon, 24 Jun 2019 21:05:53 +0000 (17:05 -0400)]
runtime: stp_tracepoint_module_notifier rc change
Some LKML traffic indicates NOTIFY_OK may be better than NOTIFY_DONE (0)
to return in a routine success case. Following suit for this site.
OTOH leaving another instance in runtime/linux/symbols.c alone for now.
Sagar Patel [Fri, 7 Jun 2019 15:36:07 +0000 (11:36 -0400)]
PR11353: enable probe elision optimization
Previously, the compiler only issued warnings for probes with empty
handlers. With the new implementation, the compiler additionally
elides probes with empty handlers. The joining of probes to their
groups has been delayed until after this optimization is performed.
Many tests in the testsuite were built under the assumption that
probes with empty handlers are not elided. Consequently, many of
them stopped working and were fixed.
1) Removed probes with empty handlers from session.
2) Issued warnings whenever a probe is elided.
3) Added tests which check for correct probe elision.
4) Fixed tests affected by probe elision.
5) Updated NEWS.
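As a rough illustration of steps 1) and 2) above, the sketch below drops
probes whose handler is empty and warns for each one. It is not the
translator's code: the Probe class, the elide_empty_probes name, and the
warning text are all hypothetical.

```python
import warnings
from dataclasses import dataclass, field


@dataclass
class Probe:
    # Hypothetical stand-in for a derived probe: a name plus handler statements.
    name: str
    handler: list = field(default_factory=list)


def elide_empty_probes(probes):
    """Return only probes with a non-empty handler, warning on each elision."""
    kept = []
    for probe in probes:
        if not probe.handler:
            # Assumed wording; the real warning text is not given in the log.
            warnings.warn("probe '%s' elided: empty handler" % probe.name)
            continue
        kept.append(probe)
    return kept
```

Grouping would then run over the returned list only, which matches the
note that joining probes to their groups happens after the optimization.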
Serhei Makarov [Mon, 24 Jun 2019 18:19:16 +0000 (14:19 -0400)]
PR24543 just-in-case fix :: stapbpf breaks when cpu0 is disabled
Certain perf_events constructs created by stapbpf must be pinned to
one cpu, it doesn't matter which. Previously cpu0 was used.
*Very rarely* it's possible for cpu0 to be disabled.
(It's not recommended,
In that case stapbpf will fail with a super confusing error.
This patch improves readability and allows stapbpf to use a fallback cpu.
cpu0 is still preferred whenever available.
* stapbpf/stapbpf.cxx (default_cpu): New global.
(mark_active_cpus): Set default_cpu, prefer cpu0 whenever possible.
(create_group_fds): Use default_cpu, not cpu0.
(register_uprobes): Ditto.
(register_kprobes): Ditto.
(register_tracepoints): Ditto.
(register_timers): Ditto.
(register_perf): Ditto.
(load_bpf_file): Clarifying comment on when active cpu info is checked,
use default_cpu for uctx bpf_transport_context.
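The cpu selection described above can be sketched as follows. This is an
illustration only: pick_default_cpu and the cpus_active list are
hypothetical names, and falling back to the lowest-numbered active cpu is
an assumption (the log only says that a fallback cpu is used).

```python
def pick_default_cpu(cpus_active):
    """Prefer cpu0 when it is active, otherwise take the first active cpu.

    cpus_active is a hypothetical list of booleans indexed by cpu number.
    """
    if cpus_active and cpus_active[0]:
        return 0
    for cpu, active in enumerate(cpus_active):
        if active:
            # Assumed fallback policy: lowest-numbered active cpu.
            return cpu
    raise RuntimeError("no active cpu found")
```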
Some housekeeping work to allow generating different code for the
less-restrictive userspace BPF interpreter vs. the more-restrictive
in-kernel BPF JIT interpreter.
* bpf-internal.h (enum bpf_target): New enum to track target BPF version.
(struct program::target): New field,
tracks the BPF version this program is intended for.
(struct program::program): Require target to be specified on creation.
* bpf-base.cxx (struct program::program): Initialize target field.
* bpf-translate.cxx (translate_bpf_pass): Specify targets of each program,
for now begin/end probes are being emitted for userspace BPF,
all other probes are being emitted for in-kernel BPF.
Down the line we may have a userspace timer probe (PR23477).
Sagar Patel [Wed, 5 Jun 2019 17:10:05 +0000 (13:10 -0400)]
PR12025: use the decimal or hex format specifier respectively in
the automatic printing of integers and pointers
Previously, the detailed types structures were not propagated to
global variables. Consequently, integers and pointers were both
printed using the hex format specifier. With the new implementation,
pointer and integer types can be differentiated.
1) Propagated detailed type structures to global variables.
2) Applied warning for potential type mismatch.
3) Added tests which check for hex and decimal format specifiers.
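A small sketch of the resulting behaviour, assuming the type information
is reduced to a single is_pointer flag. The format_global helper and the
name=value layout are hypothetical; only the choice between a decimal and
a hex specifier comes from the description above.

```python
def format_global(name, value, is_pointer):
    """Render a global with hex for pointers and decimal for integers."""
    spec = "%#x" if is_pointer else "%d"
    # The "name=value" layout is an assumption made for this example.
    return ("%s=" + spec) % (name, value)
```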
William Cohen [Fri, 7 Jun 2019 19:15:27 +0000 (15:15 -0400)]
@count() and @sum() should provide 0 for empty entries
When compiled with the bpf backend mmfilepage.stp would have a
segmentation fault when printing out recorded information in the probe
end. This was caused by a foreach using an index from one global array
for other global arrays that did not have entries for some of those
indices. The code computing the @count() and @sum() values did not
handle the situation where there were no statistics for that
particular index in the global array. The proper default is that
@count() and @sum() in those cases should be 0.
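The failure mode can be pictured with plain dictionaries standing in for
the global statistics arrays. This is a sketch, not the bpf backend code:
stat_count, stat_sum, and the sample data are hypothetical.

```python
def stat_count(stats, key):
    """@count() analogue: number of samples, 0 when the key has no entry."""
    return len(stats.get(key, ()))


def stat_sum(stats, key):
    """@sum() analogue: total of the samples, 0 when the key has no entry."""
    return sum(stats.get(key, ()))


# Two arrays indexed by the same kind of key; "faults" lacks one index.
reads = {"a.so": [4096, 4096], "b.so": [512]}
faults = {"a.so": [1]}

# Iterating over one array's keys while reading the other must not fail.
for name in reads:
    print(name, stat_count(reads, name), stat_sum(reads, name),
          stat_count(faults, name), stat_sum(faults, name))
```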
Sagar Patel [Wed, 22 May 2019 20:09:19 +0000 (16:09 -0400)]
PR24343: fix uninitialized variable warning and empty output of stap -L for return probes
The uninitialized variable warning was caused by exit syscalls as they don't have
conventional return strings. The incorrect output of stap -L was caused by synthetic
probes created from @entry variables which resulted in miscalculation of the set
intersection check.
1) Set 'never' filters for exit syscalls.
2) Applied synthetic probe flag to filter out synthetic probes from the set intersection.
3) Added tests which check for correct flagging of synthetic probes.
Stan Cox [Wed, 29 May 2019 02:53:14 +0000 (22:53 -0400)]
Add aarch64 stapdyn support.
* runtime/dyninst/regs.c (_stp_print_regs): New for aarch64
* runtime/dyninst/stapdyn.h (struct pt_regs): New for aarch64 since
asm/ptrace.h does not define it.
* runtime/dyninst/uprobes-regs.c (enter_dyninst_uprobe_regs): Add aarch64.
* stapdyn/mutatee.cxx (get_dwarf_registers): Add aarch64.
William Cohen [Thu, 23 May 2019 18:46:34 +0000 (14:46 -0400)]
Provide gettimeofday_* functions for bpf backend
BPF has a helper function ktime_get_ns() that provides a nanosecond
time from the time that the machine was booted. When working across
machines, we really want a timestamp based on gettimeofday. This
new BPF backend tapset computes an offset and scaling to convert the
ktime_get_ns() values into gettimeofday_*(). This will allow easier
comparison of timestamped traces between machines.
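The offset-and-scaling idea can be sketched in userspace terms. Here
time.monotonic_ns() is only a stand-in for ktime_get_ns(), and the helper
names are hypothetical; the actual tapset is not reproduced here.

```python
import time


def compute_offset_ns():
    """Sample both clocks back to back; the difference is the offset."""
    return time.time_ns() - time.monotonic_ns()


# Computed once, then reused for every later conversion.
OFFSET_NS = compute_offset_ns()


def gettimeofday_ns(mono_ns):
    """Convert a monotonic timestamp to wall-clock nanoseconds."""
    return mono_ns + OFFSET_NS


def gettimeofday_us(mono_ns):
    return gettimeofday_ns(mono_ns) // 1_000


def gettimeofday_ms(mono_ns):
    return gettimeofday_ns(mono_ns) // 1_000_000


def gettimeofday_s(mono_ns):
    return gettimeofday_ns(mono_ns) // 1_000_000_000
```

A fixed offset does not follow later wall-clock adjustments, which is a
limitation of this sketch rather than a statement about the tapset.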
Frank Ch. Eigler [Fri, 17 May 2019 15:32:15 +0000 (11:32 -0400)]
configury: make python*-config work on rhel6 again
commit 5dabffcf0e77d7479ad stopped searching for all hypothetical
variants of "python{2,}-config". RHEL6 sports python2 binaries
but only a python-config, which was missed by that change. Let's
return to searching for all these aliases.
Serhei Makarov [Wed, 15 May 2019 20:48:34 +0000 (16:48 -0400)]
stapbpf/bpfinterp.cxx (map_get_next_key) :: try to pass all warnings
turns out RHEL7 gcc did not understand __attribute__ ((nonstring)).
This code has extra paranoia in adding a NUL beyond the area
overwritten by strncpy. Switch from strncpy to memcpy since bpf
syscall is treating everything as opaque memory.
PR gcc/80115 and gdb/24541 had a fight about how values stored in
registers with aliases (e.g. al, ax, rax) should be named. We all
lost. :-O
This patch tweaks the i386 controls to bias gcc toward wider aliases,
and adds a comment block explaining the situation. Unfortunately,
cases still exist where a sys/sdt.h note consumer has to use
arch-specific heuristics to decode gcc's intent.
Serhei Makarov [Thu, 9 May 2019 20:47:22 +0000 (16:47 -0400)]
stapbpf/stapbpf.cxx :: fix perf_fds packing code for non-contiguous CPUs
Spotted an error, not the same error that Coverity thinks is happening.
Validity of perf_fds[cpu] (not perf_fds[i]) is indicated by cpus_active[cpu].
There are some additional issues I missed in new code since I first fixed the
code to work with noncontiguous CPUs.
* stapbpf/stapbpf.cxx (perf_event_loop): assign perf_fds[cpu], not perf_fds[i] !!,
maintain i -> cpu mapping to retrieve correct perf_header and transport_context.
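The indexing issue can be illustrated with a small sketch. The names
perf_fds and cpus_active follow the message above, but pack_perf_fds and
the returned index_to_cpu list are hypothetical and do not mirror the
actual perf_event_loop code.

```python
def pack_perf_fds(perf_fds, cpus_active):
    """Pack fds of active cpus densely and remember which cpu each came from.

    perf_fds and cpus_active are both indexed by cpu number, so a packed
    position i must not be used to index either of them.
    """
    packed = []
    index_to_cpu = []
    for cpu, active in enumerate(cpus_active):
        if not active:
            continue
        packed.append(perf_fds[cpu])   # index by cpu, not by packed position
        index_to_cpu.append(cpu)       # i -> cpu mapping for later lookups
    return packed, index_to_cpu
```

With cpu1 offline, packed position 1 refers to cpu2, so any per-cpu data
has to be looked up through index_to_cpu rather than through i.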
Serhei Makarov [Thu, 9 May 2019 20:45:11 +0000 (16:45 -0400)]
stapbpf/stapbpf.cxx :: placate the gods of Coverity
* stapbpf/stapbpf.cxx (instantiate_maps): UNUSED_VALUE #if0 out unused code;
NEGATIVE_RETURNS handle error return from sysconf(_SC_NPROCESSORS_CONF),
previous sysconf fix was valid but it might make sense to print a warning.
(register_tracepoints): RESOURCE_LEAK close fd on read failure.
Serhei Makarov [Thu, 9 May 2019 20:43:36 +0000 (16:43 -0400)]
stapbpf/bpfinterp.cxx :: placate the gods of Coverity
* bpf-internal.h (BPF_MAXSTRINGLEN_PLUS): new define, BPF_MAXSTRINGLEN+1.
* stapbpf/bpfinterp.cxx (map_get_next_key): BUFFER_SIZE_WARNING use bigger buffer.
(bpf_interpret): UNUSED_VALUE memset regs to 0x0,
OVERFLOW_BEFORE_WIDEN indicate in 32-bit LSH operation that widening is inappropriate.
debugfs: split the beginning and the end of __create_file() off
appears to have encoded duplicate $debugfs/systemtap directory
existence into an -EEXIST error code, which we didn't handle.
We now treat it as though it were NULL. Tested on 5.1-rc7.
This is a preview version of the statistics aggregate feature
for inclusion in the stap 4.1 release. Only @sum, @count, @avg
operations are supported for the time being.
* testsuite/systemtap.bpf/bpf_tests/stat1.stp: Basic test (<<<, @count, @sum, @avg).
* testsuite/systemtap.bpf/bpf_tests/stat2.stp: Test stat arrays and foreach.
* testsuite/systemtap.bpf/bpf_tests/stat3.stp: Test delete and array-in.
stapbpf PR23476 (2/7) :: generate scalar+array stat aggregate declarations
* bpf-translate.cxx (translate_globals): Add a case for scalar pe_stats (init scalar stat maps),
add a case for array pe_stats (init a set of array stat maps),
pull out some duplicate code between stat and non-stat cases.
(output_maps): Add a separate case to give snazzy names to stat maps.
stapbpf PR23476 (1/7) :: redesign struct globals for stat aggregates
Modify struct globals to allow stats map slots (map_id = -1) and track a
set of percpu maps for each stat aggregate -- one map per field from struct stat_data.
* bpf-internal.h (struct globals): General redesign.
(globals::map_idx): New typedef to identify index of a map.
(globals::map_slot): New type (replacing pair<short,short>) with support for
marking a value's 'slot' as non-scalar (no index) and/or stats (no single map).
(globals::globals_map): Use map_slot type.
(globals::stat_field): New typedef to identify a stats field name.
(globals::stat_fields): The set of supported stats fields.
(globals::stat_iter_field): Stat field to use to obtain keys for foreach, in, &c.
(globals::stats_map): New typedef to identify one map per stats field.
(globals::scalar_stats): One map per stats field to represent scalar stats.
(globals::array_stats): One map per stats field per array to represent stats arrays.
(globals::internal_map_idx): Change type to globals::map_idx.
(globals::perf_event_map_idx): Change type to globals::map_idx.
Also take the first step to supporting stats map slots in bpf-translate.cxx:
* bpf-base.cxx (program::load_map): Guard against unimplemented uses of stats map slots.
* bpf-translate.cxx (bpf_unparser::emit_store): Use new map_slot type from globals_map.
(bpf_unparser::visit_foreach_loop): Ditto.
(bpf_unparser::visit_delete_statement): Ditto.
(bpf_unparser::visit_assignment): For now, explicitly mark <<< as unimplemented.
(bpf_unparser::visit_symbol): Use new map_slot type from globals_map.
(bpf_unparser::visit_arrayindex): Ditto.
(bpf_unparser::visit_array_in): Ditto.
(globals::stat_fields): Initialize globals::stat_fields with count, sum for now.
(globals::stat_iter_field): Initialize globals::stat_iter_field.
(output_maps): Use new map_slot type from globals_map, note requirement to support stats map slots.
NOTE: All of the above bpf-translate.cxx uses of map_slot will change
to also handle stats arrays in the next patches.
William Cohen [Thu, 2 May 2019 14:41:59 +0000 (10:41 -0400)]
Force correct order of evaluation of macro arguments in check_*register macros
Noted that a number of tests were failing on x86 machines with errors
like the following:
ERROR: register access fault [man error::fault] near identifier 'module_name' at
/usr/share/systemtap/tapset/linux/context.stp:392:10
The problem was traced to the maxregno argument for the macro having a
?: operator which has lower precedence than || or >. This caused the
conditional tests in check_fetch_register and check_store_register for
error reporting to incorrectly trigger. Used ()'s in the conditionals
to force the correct order of evaluation.
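The precedence problem can be reproduced outside C. Python has no
preprocessor, so the sketch below emulates textual macro substitution with
string templates and eval(). Python's conditional expression, like C's ?:,
binds more loosely than "or" and ">". The templates, the register counts
16 and 8, and all names are hypothetical and are not the loc2c macros.

```python
# Emulated macro bodies: arguments are pasted in as text, as cpp would do.
UNSAFE = "{regno} < 0 or {regno} > {maxregno}"
SAFE = "({regno}) < 0 or ({regno}) > ({maxregno})"


def expand(template, regno, maxregno):
    """Substitute argument text into the template without adding parentheses."""
    return template.format(regno=regno, maxregno=maxregno)


def register_fault(template, regno, is_64bit):
    """Evaluate the expanded range check; True means a fault is reported."""
    # The maxregno argument is itself a conditional expression.
    expr = expand(template, "regno", "16 if is_64bit else 8")
    return bool(eval(expr, {}, {"regno": regno, "is_64bit": is_64bit}))


# Unparenthesized, the check parses as
#   (regno < 0 or regno > 16) if is_64bit else 8
# so a valid register number is reported as a fault when is_64bit is false.
print(register_fault(UNSAFE, 3, False))
print(register_fault(SAFE, 3, False))
```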
William Cohen [Tue, 23 Apr 2019 19:08:08 +0000 (15:08 -0400)]
Adjust syscall_get_arguments to match kernel's implementation
The syscall_get_arguments function arguments changed due to
Linux git commit 32d9258662. Remove the unused arguments
to match the expected arguments for syscall_get_arguments
when needed.
William Cohen [Wed, 10 Apr 2019 18:55:05 +0000 (14:55 -0400)]
Disable kprobe optimization again
On x86 processors running a Linux 5.0 kernel, the uprobes_onthefly.exp
test would trigger an RCU hang (PR24416). Disable the kprobes
optimization until these problems reported in RHBZ1697531 get fixed in
the kernel.
David Ward [Mon, 11 Feb 2019 17:25:38 +0000 (12:25 -0500)]
overload.py: Fix python version 2/3 compatibility
The modified XML tree is output either as a bytearray with UTF-8
encoding in python version 3, or as a string in python version 2.
Handle this by writing the bytearray directly to sys.stdout.buffer,
or the string directly to sys.stdout, respectively.
Remove what appears to be "troubleshooting code" that was added in
commit 616ec7a0b, which dumps a large amount of unnecessary output
to stderr.
Call this script using the configured program name for python.
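A minimal sketch of the output handling described above, assuming the
serialized tree arrives as bytes on python 3 and as a plain string on
python 2. The write_tree helper is hypothetical and is not the code in
overload.py.

```python
import sys


def write_tree(data, stream=None):
    """Write serialized XML that may be bytes or text to the given stream."""
    if stream is None:
        stream = sys.stdout
    if isinstance(data, (bytes, bytearray)) and hasattr(stream, "buffer"):
        # Text streams on python 3 expose the underlying binary buffer.
        stream.buffer.write(data)
    else:
        # Strings, or streams without a binary buffer, are written directly.
        stream.write(data)
```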
David Ward [Mon, 11 Feb 2019 17:25:37 +0000 (12:25 -0500)]
configure: Fix handling of python versions 2 and 3
When python version 2 is not found, AM_PROG_PYTHON sets the output
variable PYTHON to ":" (which is intentional; see "man 1P colon").
Fix incorrect tests that compared PYTHON to an empty string.
Use the same behavior for python version 3: when it is not found,
set the output variable PYTHON3 to ":" and test that accordingly.
Pass the variables "python3" and "py3execdir" to the subconfigure
unconditionally, just like the variables "python" and "pyexecdir".
When a program named "python" exists, fix a conditional that tests
if it is python version 3.
Do not guess the name of the python-config script. Simply append
"-config" to the program name for the python interpreter.
wcohen discovered that the (guru-mode) @*register operator doesn't
sufficiently check the context it is run in, possibly dereferencing
null context->*regs pointers, or going out-of-bounds with register
numbering. This code adds checking via a generic runtime/**/loc2c*
check_register_{fetch,store} macro. It is used as a wrapper for
all architectures for both kernel and user register
fetch/store ops.
William Cohen [Mon, 1 Apr 2019 15:25:06 +0000 (11:25 -0400)]
Add needed arch_syscall0_prefix define for arm64
On x86_64, a function that implements a syscall with no arguments
is used for both the 32-bit and 64-bit versions of the system call and
there are aliases for the same function. To avoid having handlers run
twice the arch_syscall0_prefix only instruments the 64-bit versions.
However, this prefix was not being set for arm64 and on the arm64 the
syscalls with no arguments would fall back to the tracepoint versions.
Added the arch_syscall0_prefix define to have the syscall tapsets use
the non-dwarf function probes for those syscalls with no arguments on
arm64.
In order to ensure a more welcoming environment for vegans and cows,
all instances of 'deadbeef' in the stapbpf interpreter's memory space
have been replaced by an exhortation to 'ea7bee75' ('eat beets').
William Cohen [Sun, 31 Mar 2019 19:47:36 +0000 (15:47 -0400)]
Update _stp_sockopt_optname_list[] to match current socket.h defines
There have been a number of updates and additions to the Linux
kernel's include/uapi/asm-generic/socket.h defines since the code in
aux_syscalls.stp for _stp_sockopt_optname_list[] was initially
created. Defines such as SO_RCVTIMEO, SO_SNDTIMEO, and SO_TIMESTAMP
may be replaced by SO_RCVTIMEO_NEW, SO_SNDTIMEO_NEW, and
SO_TIMESTAMP_NEW. Before this patch systemtap scripts would fail to
build with very new 5.1.0-rc kernels due to the missing defines.
Frank Ch. Eigler [Tue, 26 Mar 2019 20:31:24 +0000 (16:31 -0400)]
PR24239 redux: testsuite / dump fallout on incremental resolution
Earlier PR24239 work made global / function resolution incremental
(transitive, starting from references in end-user scripts) rather than
tapset-wide (selecting entire tapsets en masse). This also affected
--dump-functions mode (which should be unselective), and
global-printing mode (the ordering of the output variables changed).
Updated the test suite to tolerate some different orderings, and
updated the translator to fix dumping & pragma/c variable arity.
Serhei Makarov [Tue, 26 Mar 2019 17:05:26 +0000 (13:05 -0400)]
PR23875 :: support string map keys in foreach iteration
* bpf-translate.cxx (bpf_unparser::visit_foreach_loop): Add code to handle
and correctly allocate space for string map keys.
* bpf-interp.cxx (as_ptr): New overload yielding void * from char *,
to return cached string map keys into bpf registers.
(typedef map_int_keys): Renamed from map_keys, only handles caching integer keys.
(typedef map_str_keys): New typedef, handles caching string keys.
(struct map_keys): New struct (XXX pseudo-union) for either int or str key iteration.
(map_get_next_key): Rewrite to support both int and str key iteration,
take bpf_transport_context.
(bpf_interpret): Pass bpf_transport_context to map_get_next_key.
Serhei Makarov [Fri, 22 Mar 2019 19:28:48 +0000 (15:28 -0400)]
stapbpf PR24329,PR23816 :: Properly allocate space for map value lookup.
* stapbpf/bpfinterp.cxx (bpf_interpret): new vector map_values for map value storage,
get rid of lookup_tmp, replace return with branch to cleanup code for map_values,
properly allocate a correctly-sized buffer for each bpf_lookup_elem() operation,
cleanup map_values() on exit.
* stapbpf/bpfinterp.h (bpf_transport_context::map_attrs): new field used to pass
map size information to bpf_interpret.
(bpf_transport_context::bpf_transport_context): take map_attrs argument.
* stapbpf/stapbpf.cxx (init_perf_transport): pass map_attrs to bpf_transport_context.
(main): pass map_attrs to bpf_transport_context.
William Cohen [Fri, 22 Mar 2019 17:51:37 +0000 (13:51 -0400)]
Fix the speculate.stp test
The speculate.stp test used target variables for syscall.*.return
probes. The changes to the syscall tapsets to use non-dwarf
probes in most cases broke this example. Added appropriate probes
on syscall entries to record the needed information.