Julian Seward [Fri, 12 Nov 2021 11:13:45 +0000 (12:13 +0100)]
Bug 444399 - disInstr(arm64): unhandled instruction 0xC87F2D89 (LD{,A}XP and ST{,L}XP).
This is unfortunately a big and complex patch, to implement LD{,A}XP and
ST{,L}XP. These were omitted from the original AArch64 v8.0 implementation
for unknown reasons.
(Background) the patch is made significantly more complex because for AArch64
we actually have two implementations of the underlying
Load-Linked/Store-Conditional (LL/SC) machinery: a "primary" implementation,
which translates LL/SC more or less directly into IR and re-emits them at the
back end, and a "fallback" implementation that implements LL/SC "manually", by
taking advantage of the fact that V serialises thread execution, so we can
"implement" LL/SC by simulating a reservation using fields LLSC_* in the guest
state, and invalidating the reservation at every thread switch.
(Background) the fallback scheme is needed because the primary scheme is in
violation of the ARMv8 semantics in that it can (easily) introduce extra
memory references between the LL and SC, hence on some hardware causing the
reservation to always fail and so the simulated program to wind up looping
forever.
For these instructions, big picture:
* for the primary implementation, we take advantage of the fact that
IRStmt_LLSC allows I128 bit transactions to be represented. Hence we bundle
up the two 64-bit data elements into an I128 (or vice versa) and present a
single I128-typed IRStmt_LLSC in the IR. In the backend, those are
re-emitted as LDXP/STXP respectively. For LL/SC on 32-bit register pairs,
that bundling produces a single 64-bit item, and so the existing LL/SC
backend machinery handles it. The effect is that a doubleword 32-bit LL/SC
in the front end translates into a single 64-bit LL/SC in the back end.
Overall, though, the implementation is straightforward.
* for the fallback implementation, it is necessary to extend the guest state
field `guest_LLSC_DATA` to represent a 128-bit transaction, by splitting it
into _DATA_LO64 and _DATA_HI64. Then, the implementation is an exact
analogue of the fallback implementation for single-word LL/SC. It takes
advantage of the fact that the backend already supports 128-bit CAS, as
fixed in bug 445354. As with the primary implementation, doubleword 32-bit
LL/SC is bundled into a single 64-bit transaction.
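As a rough illustration of the fallback scheme described above, here is a
minimal C sketch of a simulated 128-bit reservation. It is not the IR that the
front end generates: the struct, its field names, and the sim_* functions are
hypothetical stand-ins for the LLSC_* guest state fields, and the plain
compare-then-store stands in for the 128-bit CAS, which is only adequate here
because the sketch is single-threaded. Clearing the reservation on every store
attempt is likewise an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the LLSC_* guest state fields. */
typedef struct {
    uint64_t llsc_size;      /* 0 = no reservation, else size in bytes */
    uint64_t llsc_addr;      /* address the reservation covers */
    uint64_t llsc_data_lo64; /* low half of the value seen by the load */
    uint64_t llsc_data_hi64; /* high half of the value seen by the load */
} GuestLLSC;

/* Simulated LDXP: load a pair and record a 16-byte reservation. */
static void sim_ldxp(GuestLLSC *gs, const uint64_t mem[2],
                     uint64_t *lo, uint64_t *hi)
{
    *lo = mem[0];
    *hi = mem[1];
    gs->llsc_size = 16;
    gs->llsc_addr = (uint64_t)(uintptr_t)mem;
    gs->llsc_data_lo64 = *lo;
    gs->llsc_data_hi64 = *hi;
}

/* Simulated STXP: returns 0 on success and 1 on failure. */
static uint32_t sim_stxp(GuestLLSC *gs, uint64_t mem[2],
                         uint64_t lo, uint64_t hi)
{
    /* The two memory comparisons stand in for the 128-bit CAS. */
    bool ok = gs->llsc_size == 16
              && gs->llsc_addr == (uint64_t)(uintptr_t)mem
              && mem[0] == gs->llsc_data_lo64
              && mem[1] == gs->llsc_data_hi64;
    gs->llsc_size = 0; /* sketch assumption: the reservation is always consumed */
    if (!ok)
        return 1;
    mem[0] = lo;
    mem[1] = hi;
    return 0;
}

/* Invalidate the reservation, as would happen at every thread switch. */
static void sim_thread_switch(GuestLLSC *gs)
{
    gs->llsc_size = 0;
}
```

In this sketch a store succeeds only if it follows a matching load with no
intervening thread switch and the memory still holds the loaded pair.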
Detailed changes:
* new arm64 guest state fields LLSC_DATA_LO64/LLSC_DATA_HI64 to replace
guest_LLSC_DATA
* (ridealong fix) arm64 front end: a fix to a minor and harmless decoding bug
for the single-word LDX/STX case.
* arm64 front end: IR generation for LD{,A}XP/ST{,L}XP: tedious and
longwinded, but per comments above, an exact(ish) analogue of the singleword
case
* arm64 backend: new insns ARM64Instr_LdrEXP / ARM64Instr_StrEXP to wrap up 2
x 64 exclusive loads/stores. Per comments above, there's no need to handle
the 2 x 32 case.
* arm64 isel: translate I128-typed IRStmt_LLSC into the above two insns
* arm64 isel: some auxiliary bits and pieces needed to handle I128 values;
this is standard doubleword isel stuff
* arm64 isel: (ridealong fix): Ist_CAS: check for endianness of the CAS!
* arm64 isel: (ridealong) a couple of formatting fixes
* IR infrastructure: add support for I128 constants, done the same as V128
constants
* memcheck: handle shadow loads and stores for I128 values
* testcase: memcheck/tests/atomic_incs.c: on arm64, also test 128-bit atomic
addition, to check we really have atomicity right
* testcase: new test none/tests/arm64/ldxp_stxp.c, tests operation but not
atomicity. (Smoke test).
The sequence of instructions emitted by the arm64 backend for doubleword
compare-and-swap is incorrect. This could lead to incorrect simulation of the
AArch64 v8.1 atomic instructions (CASP, at least). It also causes failures in the
upcoming fix for v8.0 support for LD{,A}XP/ST{,L}XP in bug 444399, at least
when running with the fallback LL/SC implementation
(`--sim-hints=fallback-llsc`, or as autoselected at startup). In the worst
case it can cause segfaulting in the generated code, because it could jump
backwards unexpectedly far.
The problem is the sequence emitted for ARM64in_CASP:
* the jump offsets are incorrect, both for `bne out` (x 2) and `cbnz w1, loop`.
* using w1 to hold the success indication of the stxp instruction trashes the
previous value in x1. But the value in x1 is an output of ARM64in_CASP,
hence one of the two output registers is corrupted. That confuses any code
downstream that wants to inspect those values to find out whether or not the
transaction succeeded.
The fixes are to
* fix the branch offsets
* use a different register to hold the stxp success indication. w3 is a
convenient choice.
Mark Wielaard [Thu, 11 Nov 2021 17:02:09 +0000 (18:02 +0100)]
Add demangle-rust to check_PROGRAMS
The demangle-rust.vgtest would fail because the demangle-rust binary
wasn't built by default. Add it to check_PROGRAMS and define
demangle_rust_SOURCES to make sure it is always built.
Paul Floyd [Tue, 9 Nov 2021 22:11:15 +0000 (23:11 +0100)]
Bug 445032 valgrind/memcheck crash with SIGSEGV when SIGVTALRM timer used and libthr.so associated
The problem was that 'struct sigframe' has both a uContext struct
member and a puContext pointer to that struct. And puContext wasn't
being initialized to point to uContext.
It seems that the pthread sigreturn code uses puContext on i386.
amd64, with register arguments, didn't have this problem.
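The missing initialisation can be sketched as follows. The struct layouts and
the names ucontext_sketch, sigframe_sketch and init_sigframe are hypothetical
and heavily reduced; only the uContext/puContext relationship is taken from
the description above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the real user context. */
typedef struct {
    int placeholder_regs[4];
} ucontext_sketch;

/* Hypothetical, reduced stand-in for 'struct sigframe'. */
struct sigframe_sketch {
    ucontext_sketch *puContext; /* pointer consumed by the sigreturn path */
    ucontext_sketch  uContext;  /* the context stored in the frame itself */
};

static void init_sigframe(struct sigframe_sketch *frame)
{
    assert(frame != NULL);
    memset(frame, 0, sizeof *frame);
    /* The step that was missing: point puContext at the embedded struct. */
    frame->puContext = &frame->uContext;
}
```

Without the last assignment, any code that reads the context through
puContext would dereference an unset pointer.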
Mark Wielaard [Mon, 8 Nov 2021 16:12:12 +0000 (17:12 +0100)]
vbit-test F16 Iops are tested on the wrong architectures
Because of what looks like some copy/paste issues the new F16 Iops
seem to be tested on the wrong architectures. They are only implemented
on arm64. So this patch only enables them for arm64.
Carl Love [Mon, 1 Nov 2021 16:18:32 +0000 (11:18 -0500)]
Valgrind Add powerpc R=1 tests
Contributed by Will Schmidt <will_schmidt@vnet.ibm.com>
This includes updates and adjustments as suggested by Carl.
Add tests that exercise PCRelative instructions.
These instructions are encoded with R==1, which indicates that
the memory accessed by the instruction is at a location
relative to the currently executing instruction.
These tests are built using -Wl,-text and -Wl,-bss
options to ensure the location of the target array is at a
location with a specific offset from the currently
executing instruction.
The write instructions are aimed at a large buffer in
the bss section, which is checked for updates at the
completion of each test.
In order to ensure consistent output across assorted
systems, the tests have been padded with ori, nop instructions
and align directives.
Detailed changes:
* Makefile.am: Add test_isa_3_1_R1_RT and test_isa_3_1_R1_XT tests.
* isa_3_1_helpers.h: Add identify_instruction_by_func_name() helper function
to indicate if the test is for R==1.
Add helpers to initialize and print changes to the pcrelative_write_target
array.
Add #define to help pad code with a series of eyecatcher ORI instructions.
* test_isa_3_1_R1_RT.c: New test.
* test_isa_3_1_R1_XT.c: New test.
* test_isa_3_1_R1_RT.stdout.exp: New expected output.
* test_isa_3_1_R1_XT.stdout.exp: New expected output.
* test_isa_3_1_R1_RT.stderr.exp: New expected output.
* test_isa_3_1_R1_XT.stderr.exp: New expected output.
* test_isa_3_1_R1_RT.vgtest: New test handler.
* test_isa_3_1_R1_XT.vgtest: New test handler.
* test_isa_3_1_common.c: Add indicators (updates_byte, updates_halfword,
updates_word) to control the output from the R==1 tests.
Add helper check for "_R1" to indicate if instruction is coded with R==1.
Add init and print helpers for the pcrelative_write_target array.
Carl Love [Wed, 20 Oct 2021 20:40:13 +0000 (20:40 +0000)]
Fix for the prefixed stq instruction in PC relative mode.
The pstq instruction for R=1, was not using the correct effective address.
The EA_hi and EA_lo should have been based on the value of EA as calculated
by the function calculate_prefix_EA. Unfortunately, the EA_hi and EA_lo
addresses were still using the previous code (not PC relative) to calculate
the address from the contents of RA plus the offset.
Mark Wielaard [Tue, 2 Nov 2021 13:27:45 +0000 (14:27 +0100)]
gdbserver_tests: Filter out glibc hwcaps libc.so
On some systems the gdbserver_tests would fail because the filter
for the optimized hwcaps subdir didn't match because the file is
called slightly differently, with the version number before .so
instead of after. For example: /lib64/glibc-hwcaps/power9/libc-2.28.so
Carl Love [Fri, 29 Oct 2021 21:30:33 +0000 (16:30 -0500)]
Bug 444571 - PPC, fix the lxsibzx and lxsihzx so they only load their respective sized data.
The lxsibzx was doing a 64-bit load. The result was initializing
additional bytes in the register that should not have been initialized.
The memcheck/tests/linux/dlclose_leak test detected the issue. The
code generation uses lxsibzx and stxsibx with -mcpu=power9. Previously
the lbz and stb instructions were generated.
The same issue was noted and fixed with the lxsihzx instruction. The
memcheck/tests/linux/badrw test now passes as well.
Andreas Arnez [Fri, 22 Oct 2021 17:55:12 +0000 (19:55 +0200)]
Bug 444242 - s390x: Sign-extend "relative long" offset in EXRL
In s390_irgen_EXRL, the offset is zero-extended instead of sign-extended,
typically causing Valgrind to crash when a negative offset occurs.
Fix this with a new helper function that calculates a "relative long"
address from a 32-bit offset. Replace other calculations of "relative
long" addresses by invocations of this function as well. And for
consistency, do the same with "relative" (short) addresses.
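The difference between the two extensions can be seen in a small C sketch. The
function names are hypothetical and are not the helper added by the patch. The
sketch assumes that the 32-bit "relative long" offset counts halfwords, as the
s390x instruction formats define it, and that narrowing to int32_t behaves as
two's complement.

```c
#include <assert.h>
#include <stdint.h>

/* Correct form: sign-extend the 32-bit halfword offset before scaling. */
static uint64_t rel_long_addr(uint64_t insn_addr, uint32_t raw_offset)
{
    int64_t halfwords = (int64_t)(int32_t)raw_offset;
    /* Unsigned arithmetic wraps, so a negative offset moves backwards. */
    return insn_addr + (uint64_t)halfwords * 2;
}

/* Buggy form, for comparison: the offset is zero-extended. */
static uint64_t rel_long_addr_zext(uint64_t insn_addr, uint32_t raw_offset)
{
    return insn_addr + (uint64_t)raw_offset * 2;
}
```

For an offset of -4 halfwords (raw value 0xFFFFFFFC), the first function yields
an address 8 bytes before the instruction, while the second lands roughly 8 GiB
after it, which is consistent with the crash described above.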
Mark Wielaard [Sun, 17 Oct 2021 20:13:25 +0000 (22:13 +0200)]
Set version once in configure.ac, use in valgrind.h and vg-entities.xml
Currently the version is updated in 3 places, configure.ac,
include/valgrind.h and docs/xml/vg-entities.xml. This goes wrong from
time to time. So only define the version (and release date) once in
configure.ac and update both other places at configure time.
Mark Wielaard [Wed, 13 Oct 2021 15:05:29 +0000 (17:05 +0200)]
coregrind: Vg_FnNameKind recognize __libc_start_call_main as below main
Depending on architecture glibc has various functions that set things
up to call "main". glibc 2.34 added __libc_start_call_main (at least
on ppc64le and s390x). Other variants recognized are __libc_start_main,
generic_start_main and variants of those names.
This fixes the massif/tests/deep-D and massif/tests/mmapunmap on ppc64le.
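The kind of name check involved can be sketched in C as below. The function
name is_below_main_name is hypothetical, and the prefix match used to cover
"variants of those names" is a simplifying assumption of the sketch, not
necessarily how coregrind matches them.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if fn names one of the glibc functions that call main. */
static bool is_below_main_name(const char *fn)
{
    static const char *const names[] = {
        "__libc_start_main",
        "generic_start_main",
        "__libc_start_call_main", /* added in glibc 2.34 */
    };
    assert(fn != NULL);
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        /* Sketch assumption: a prefix match also accepts suffixed variants. */
        if (strncmp(fn, names[i], strlen(names[i])) == 0)
            return true;
    }
    return false;
}
```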
Mark Wielaard [Wed, 13 Oct 2021 11:49:15 +0000 (13:49 +0200)]
NEWS: add various core changes and arm64 additions
Add demangler update, __libc_freeres not being called on fatal signal,
DWARF reader improvements, glibc 2.34 support and various new arm64
v8.2 updates.
Remove Tool Changes section, since there were no user visible
changes to the tools in 3.18.0.
Mark Wielaard [Tue, 12 Oct 2021 21:15:41 +0000 (23:15 +0200)]
Implement BPF_MAP_LOOKUP_AND_DELETE_ELEM and BPF_MAP_FREEZE
Implement BPF_MAP_LOOKUP_AND_DELETE_ELEM (command 21) and
BPF_MAP_FREEZE (command 22) and produce a WARNING instead of a fatal
error for unrecognized BPF commands.
Lubomir Rintel [Mon, 4 Oct 2021 13:40:29 +0000 (15:40 +0200)]
Add close_range(2) support
This is a system call introduced in Linux 5.9.
It's typically used to bulk-close file descriptors that a process inherited
without wanting them and, for security reasons, doesn't want to pass on to
its offspring. Because of that, a sensible upper limit tends to be unknown,
and users prefer to stay on the safe side by setting it high.
This is a bit peculiar because, if unfiltered, the syscall could end up
closing descriptors Valgrind uses for its purposes, ending in no end of
mayhem and suffering.
This patch clamps the upper bound to a safe value and then skips over
the descriptors Valgrind uses, potentially calling the real system call
several times with sub-ranges that are safe to close.
The call can fail on negative ranges and bad flags -- we're dealing with
the first condition ourselves while letting the real call fail on bad
flags.
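The range handling can be sketched in C roughly as follows. This is not the
actual wrapper: the names FdRange and safe_close_ranges are hypothetical, and
the sketch assumes the descriptors to protect form one contiguous range
[res_lo, res_hi] and that max_fd is the "safe" upper bound.

```c
#include <assert.h>

typedef struct {
    unsigned first;
    unsigned last;
} FdRange;

/* Splits [first, last] into at most two sub-ranges that avoid the reserved
   range [res_lo, res_hi]. Returns the number of sub-ranges written to out,
   or -1 for a "negative" range (first > last). */
static int safe_close_ranges(unsigned first, unsigned last,
                             unsigned res_lo, unsigned res_hi,
                             unsigned max_fd, FdRange out[2])
{
    int n = 0;
    assert(res_lo <= res_hi);
    if (first > last)
        return -1;              /* reject the bad range ourselves */
    if (last > max_fd)
        last = max_fd;          /* clamp an overly generous upper bound */
    if (first > last)
        return 0;               /* nothing left at or below max_fd */
    if (last < res_lo || first > res_hi) {
        out[n++] = (FdRange){first, last};  /* no overlap: pass through */
        return n;
    }
    if (first < res_lo)
        out[n++] = (FdRange){first, res_lo - 1};
    if (last > res_hi)
        out[n++] = (FdRange){res_hi + 1, last};
    return n;
}
```

Each returned sub-range would then be handed to the real system call, which is
also left to reject bad flags.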
Mark Wielaard [Tue, 12 Oct 2021 20:47:57 +0000 (22:47 +0200)]
coregrind: Don't call final_tidyup (__libc_freeres) on FatalSignal
When a program gets a fatal signal (one it doesn't handle) valgrind
terminates the program. Before termination it will try to call
final_tidyup which tries to run __libc_freeres and
__gnu_cxx::__freeres to get rid of some memory glibc or libstdc++
don't normally release.
But when the program got the fatal signal in a critical section inside
glibc it might leave the datastructures in a bad state and cause
__libc_freeres to crash. This makes valgrind itself crash just before
producing its own error summary, making the valgrind run unusable.
A reproducer can be found at
https://bugzilla.redhat.com/show_bug.cgi?id=1952836 and
https://bugzilla.redhat.com/show_bug.cgi?id=1225994#c7
This reproducer is really a worst-case scenario with multiple threads
racing to get into the critical section that when interrupted will
make __libc_freeres unable to cleanup. But it seems a good policy in
general. If a program is terminated by a fatal signal instead of
normal termination, it seems not having some of the glibc/libstdc++
resource cleaned up is an expected thing.
Mark Wielaard [Tue, 12 Oct 2021 18:01:45 +0000 (20:01 +0200)]
filter_xml: Filter out '@*' from <fn> symbol names
With glibc 2.34 we would start seeing some function names ending in
'@*'. This was already filtered out using drd/tests/filter_stderr.in,
but not when using the drd xml tests. This would make
drd/tests/thread_name_xml and drd/tests/bar_bad_xml fail.
Filter this out in the memcheck/tests/filter_xml script, which is
also used by the drd test filters.
Tested against glibc 2.34, 2.33 and 2.17 on x86_64.
Mark Wielaard [Tue, 12 Oct 2021 16:51:23 +0000 (18:51 +0200)]
drd/tests: Extract start_thread which can come from libpthread or libc
The drd/tests/tc21_pthonce and drd/tests/annotate_barrier tests
would fail if start_thread came from libc (as it does in glibc 2.34)
instead of from libpthread. Extract start_thread in filter_stderr.in
and update the backtraces in annotate_barrier.stderr.exp and in
tc21_pthonce.stderr.exp
Tested against glibc 2.34, 2.33 and 2.17 on x86_64.
Paul Floyd [Sun, 10 Oct 2021 19:56:49 +0000 (21:56 +0200)]
Fix the remaining easily fixable warnings with clang
There's one remaining
memalign2.c:29:9: warning: unused variable 'piece' [-Wunused-variable]
because of a block of #if FreeBSD for memalign that looks unnecessary
Otherwise all that is left is a few like
warning: unknown warning option '-Wno-alloc-size-larger-than'; did you mean '-Wno-frame-larger-than='? [-Wunknown-warning-option]
because there is no standard for compiler arguments.
Mark Wielaard [Sun, 10 Oct 2021 15:13:43 +0000 (17:13 +0200)]
Remove more warnings from tests
GCC12 catches various issues in tests at compile time that we want to
catch at runtime. Also glibc 2.34 deprecated various mallinfo related
functions. Add the relevant -Wno-foobar flags to those tests. In one
case, unit_oset.c, the warning was correct and the uninitialized
variable was explicitly set.
Mark Wielaard [Sun, 10 Oct 2021 14:35:37 +0000 (16:35 +0200)]
Fix printf warning in libmpiwrap.c
libmpiwrap.c:1379:45: warning: format '%d' expects argument of type 'int',
but argument 5 has type 'MPI_Request' {aka 'struct ompi_request_t *'}
Unfortunately MPI_Request is an opaque type (we don't really know what
is in struct ompi_request_t) so we cannot simply print it as int. In
other places we print an MPI_Request as 0x%lx by casting it to an
unsigned long. Do the same here.
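The pattern looks roughly like this in C. The types opaque_request and Request
and the function format_request are hypothetical stand-ins, used so the sketch
does not depend on an MPI installation; they mirror the pointer-typed
MPI_Request named in the warning.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for an opaque handle such as MPI_Request. */
struct opaque_request;
typedef struct opaque_request *Request;

/* Formats the handle as hex after casting it to unsigned long, instead of
   passing it to a %d conversion. */
static int format_request(char *buf, size_t len, Request req)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "0x%lx", (unsigned long)req);
}
```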
Mark Wielaard [Sun, 10 Oct 2021 13:56:50 +0000 (15:56 +0200)]
Remove some warnings from tests
Various tests do things which we want to detect at runtime, like
ignoring the result of malloc or doing a deliberate impossibly large
allocation or operations that would result in overflowing or
truncated strings, that generate a warning from gcc.
In one case, mq_setattr called with new and old attrs overlapping,
this was explicitly fixed; in the others, -Wno-foobar was added to silence
the warning. This is safe even for older gcc, since a compiler will
ignore any -Wno-foobar it doesn't know about: a compiler that doesn't
know foobar won't warn for it either.
Mark Wielaard [Thu, 7 Oct 2021 11:43:19 +0000 (13:43 +0200)]
Fix make distcheck by removing references to uncommitted files
Some files for the freebsd port have not yet been committed, but were
already referenced in the Makefiles. Remove those references for
now to make distcheck happy.
Paul Floyd [Thu, 7 Oct 2021 05:53:33 +0000 (07:53 +0200)]
FreeBSD support, patch 2
Files in the root directory
Several Makefile.am files that have dependencies on FreeBSD autoconf
variables. Included a few new filter files to act as placeholders
to create new freebsd subdirectories.
Updated NEWS with the FreeBSD bugzilla items plus a couple of other
items fixed indirectly.
Andreas Arnez [Fri, 1 Oct 2021 18:10:54 +0000 (20:10 +0200)]
s390x: Add missing "cc" clobbers in test case inline asms
Some inline assemblies in various s390x test cases omit the
condition code "cc" in the clobber list. Although this has not actually
been seen to cause wrong code generation, it certainly might, so fix this.
Andreas Arnez [Fri, 24 Sep 2021 18:06:39 +0000 (20:06 +0200)]
s390x: Fix compile warnings in test cases
Some GCC versions emit the following warnings for some s390x-specific test
cases:
warning: listing the stack pointer register '15' in a clobber list is
deprecated
warning: this 'else' clause does not
guard... [-Wmisleading-indentation] ...this statement, but...
Fix these.
Most of the inline assemblies declaring r15 as clobbered do not actually
change its value. Only in stmg_wrap() does it become necessary to save and
restore r15.
Mark Wielaard [Sat, 2 Oct 2021 10:03:46 +0000 (12:03 +0200)]
Adjust filter_gdb for arm64 with eglibc 2.19 and gdb 7.7.1
Older ubuntu arm64 setups used eglibc 2.19 and gdb 7.7.1. In that
case select.c could be under linux/generic and the select argument
list could be split up differently over several lines. Adjust
filter_gdb to catch those differences.
Also checked against a Debian arm64 with glibc 2.31 and gdb 10.1.
Mark Wielaard [Fri, 1 Oct 2021 20:25:40 +0000 (22:25 +0200)]
Add none/tests/scripts/shell.stderr.exp-dash4 for dash 0.5.11
dash 0.5.11 produces slightly different error messages.
The new exp file is similar to shell.stderr.exp-dash3 but
with the extra (second) "shell: " output removed.
Carl Love [Tue, 28 Sep 2021 15:49:10 +0000 (15:49 +0000)]
Fix tests for mfspr
Split out the mfspr tests into a separate test using command line option
"-M". The value in the LR and CTR registers changed. It appears the
changes are due to changes in the test program jm-insns.c. Splitting
these instructions out will help to minimize the size of future updates
when the test program changes.
Carl Love [Thu, 9 Sep 2021 23:10:07 +0000 (23:10 +0000)]
fix sraw, srawi, srad, sradi instructions
For ISA 3.0 and beyond, the instructions also write the XER register.
Split the instructions out to a new command line option so we can create
an ISA 2.07 expect file, ISA 3.0 LE and ISA 3.0 BE expect file. The new
command line option is "-s" to run just these four instructions.
Carl Love [Thu, 9 Sep 2021 19:06:00 +0000 (19:06 +0000)]
Add support for the mcrxrx instruction.
The mcrxrx instruction was introduced in ISA 3.0. It was missed when the
ISA 3.0 support was added to Valgrind.
The mcrxr instruction is not supported on ISA 3.0 and beyond. Both
instructions move to the condition register; however, mcrxrx moves
[OV|OV32|CA|CA32], whereas mcrxr moves XER[32:35] (the SO, OV, and CA bits)
to the CR.
Carl Love [Wed, 8 Sep 2021 22:01:05 +0000 (22:01 +0000)]
Fix dfp tests.
Due to changes between the compiler and linker, we need to add .machine
arguments to configure file to properly detect the availability of the
dfp instructions.
Add a print statement for the case where HAS_DFP is not enabled, to make
that case easier to spot.
powerpc: Add .machine directives for scv, copy, paste, cpabort instructions
GCC is no longer passing the "-many" flag to the assembler. So, the
inline assembly statements need to use the .machine directives
for the specific platform.
Andreas Arnez [Thu, 30 Sep 2021 12:10:29 +0000 (14:10 +0200)]
configure.ac: Avoid the use of "which"
The "which" command is not always installed, but configure.ac uses it in
the function AC_HWCAP_CONTAINS_FLAG to force invocation of the executable
"true" rather than the shell builtin with the same name. (The point here
is to get LD_SHOW_AUXV=1 evaluated by the dynamic loader.)
Another option might be to hard-wire the location /bin/true, because the
filesystem hierarchy standard requires it to be there. However, the FHS
doesn't apply to BSDs and at least some FreeBSD versions do not stick to
that specific rule.
On the other hand, the "env" command seems to be available on all relevant
platforms, so use that instead.
- prevent null dereferencing on dlang_type
- prevent buffer overflow when decoding user input
- Add support for demangling local D template declarations
- Add support for demangling D function literals as template
value parameters
- Add support for D `typeof(*null)' types
- Fix -Wundef warnings in ansidecl.h
- Fix endian bug in rust demangler
- Adjust mangling of __alignof__
- Avoid -Wstringop-truncation
Update libiberty demangler to support Rust v0 name mangling
Update the libiberty demangler using the auxprogs/update-demangler
script to the gcc git 01d92cfd79872e4cffc78bf233bb9b767336beb8.
Updates rust demangling to support the new v0 mangling scheme.
This includes the following changes:
- Update the update-demangler script to use gcc git instead of svn.
- The result of running the updated script to get an updated
demangler and resolving the merge conflicts.
- A change to long_namespace_xml.stderr.exp because two overly long
symbols aren't demangled anymore, but just returned as is.
- an update to the m_demangle/demangle.c source to deal with Rust
demangling in cp_demangle, which now directly demangles old and
new style rust symbols.
Mark Wielaard [Sun, 19 Sep 2021 12:30:19 +0000 (14:30 +0200)]
readdwarf3: Introduce abbv_state to read .debug_abbrev more lazily
With the inline parser often a lot of DIEs are skipped, so reading
all abbrevs up front wastes time and memory. A lot of time and memory
can be saved by reading the abbrevs on demand. Do this by introducing
an abbv_state that is used to keep track of the abbrevs already read.
This does technically make the CUConst struct not const.
Mark Wielaard [Sat, 18 Sep 2021 20:16:33 +0000 (22:16 +0200)]
readdwarf3: Reuse abbrev if possible between units
Instead of destroying the ht_abbrvs after processing a CU, save it
and the offset so it can be reused for the next CU if that happens
to have the same abbrev offset. dwz compressed DWARF often reuses
the same abbrev for multiple CUs.
Mark Wielaard [Fri, 17 Sep 2021 22:24:38 +0000 (00:24 +0200)]
readdwarf3: Reuse fndn_ix_Table as much as possible
Both the var parser and the inl parser kept a fndn_ix_Table.
Initialize only one per debuginfo read pass and reuse if the stmt offset
is the same as last time (CUs can share the same line table and alt
files do share one for all units).