Paul Floyd [Tue, 12 Apr 2022 21:34:41 +0000 (23:34 +0200)]
Bug 452274 memcheck crashes with Assertion 'sci->status.what == SsIdle' failed
FreeBSD (and Darwin) use the carry flag for syscall status.
That means that in the assembler for do_syscall_for_client_WRK
they have a call to LibVEX_GuestAMD64_put_rflag_c (amd64) or
LibVEX_GuestX86_put_eflag_c (x86). These also call WRK functions.
The problem is that do_syscall_for_client_WRK has carefully crafted
labels corresponding to IP addresses. If a signal interrupts
proceedings, IP can be compared to these addresses so that
VG_(fixup_guest_state_after_syscall_interrupted) can work
out how to resume the syscall. But if IP is in the save
carry flag functions, the address is not recognized and
VG_(fixup_guest_state_after_syscall_interrupted) fails.
The crash in the title happens because the interrupted
syscall does not reset its status, and on the next syscall
it is expected that the status be idle.
To fix this I added global variables that get set to 1
just before calling the save carry flag functions, and cleared
just after. VG_(fixup_guest_state_after_syscall_interrupted)
can then check this and work out which section we are in
and resume the syscall correctly.
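
A minimal sketch of the idea, not the actual Valgrind code. The label
struct, the section names and the flag name are hypothetical, and it
assumes the carry flag is saved after the syscall completes and before
the result is committed:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sections of the syscall stub, for illustration only. */
typedef enum { SEC_SETUP, SEC_RESTART, SEC_COMPLETE, SEC_COMMITTED, SEC_UNKNOWN } Section;

/* Hypothetical label addresses exported by the assembler stub. */
typedef struct {
    uintptr_t setup, restart, complete, committed, finished;
} StubLabels;

/* Set to 1 just before calling the save-carry-flag function, cleared just after. */
volatile int saving_carry_flag = 0;

/* Work out which section an interrupted IP belongs to. */
Section classify_interrupt(uintptr_t ip, const StubLabels *l)
{
    if (ip >= l->setup && ip < l->restart)      return SEC_SETUP;
    if (ip == l->restart)                       return SEC_RESTART;
    if (ip >= l->complete && ip < l->committed) return SEC_COMPLETE;
    if (ip >= l->committed && ip < l->finished) return SEC_COMMITTED;
    /* IP is outside the stub: it may be inside the carry-flag helper.
       Assumption: that helper runs in the complete-to-committed window. */
    if (saving_carry_flag)                      return SEC_COMPLETE;
    return SEC_UNKNOWN;
}
```

Without the flag, an IP inside the helper would fall through to the
unknown case, which corresponds to the unrecognized address described above.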
Also:
Start a new NEWS section for 3.20
Add a regtest for this and also a similar one for Bug 445032
(x86-freebsd only, new subdir).
I saw that this problem also probably exists with macOS, so I made
the same changes there (not yet tested)
Mark Wielaard [Mon, 11 Apr 2022 12:45:49 +0000 (14:45 +0200)]
Extend helgrind suppression for _IO_*xsputn* FILE* state manipulation
commit 7b5867b1f "helgrind reports false races for printfs using
mempcpy on FILE* state" extended the helgrind-glibc-io-xsputn
suppression by also covering mempcpy (instead of __GI_mempcpy).
The test added in that commit exposed a couple of other variants
of this suppression where _IO_*xsputn* called memcpy (instead of
mempcpy) and/or had an extra indirection/function in between.
Replace the two suppressions with one that covers all cases
where _IO_*xsputn* calls a *mem*cpy variant, possibly with another ...
function in between.
Mark Wielaard [Fri, 8 Apr 2022 12:58:38 +0000 (14:58 +0200)]
helgrind reports false races for printfs using mempcpy on FILE* state
We already have a suppression for helgrind which is for when glibc
uses __GI_mempcpy to manipulate internal FILE state (this was bug
352130). But since glibc-2.26 mempcpy is used instead of __GI_mempcpy,
making the suppression from the original bug obsolete.
This patch adds a new suppression using mempcpy but doesn't replace
the original suppression for older systems.
Patch adding suppression + testcase by Jesus Checa <jcheca@redhat.com>
Randy MacLeod [Wed, 17 Oct 2018 01:01:04 +0000 (21:01 -0400)]
Fix out of tree builds.
The paths to these files need to be fully specified in
the out of tree build case. glibc-2.X.supp is a generated file so the
full path is deliberately not specified in that case.
Also adjust the mpi include dir location as valgrind.h is
generated as well and needs to be taken out of build dir.
Also adjust the location of generated xml file. And the search paths
for the xmllint, xsltproc and xmlto programs.
Signed-off-by: Alexander Kanavin <alex.kanavin@gmail.com>
Aaron Merey [Wed, 26 Jan 2022 01:24:18 +0000 (20:24 -0500)]
Bug 445011: SIGCHLD is sent when valgrind uses debuginfod-find
Valgrind fork+execs debuginfod-find in order to perform debuginfod
queries. Any SIGCHLD debuginfod-find sends upon termination can
mistakenly be delivered to the client running under valgrind.
To prevent this, record in a hash table the PID of each process
valgrind forks for internal use. Do not send SIGCHLD to the client
if it is from a PID in this hash table.
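
A minimal sketch of the filtering logic, assuming a small fixed-size
open-addressing table. The function names and the capacity are
hypothetical; Valgrind's own hash table implementation is not shown here:

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

#define N_SLOTS 64               /* illustrative capacity, power of two */

pid_t internal_pids[N_SLOTS];    /* 0 means empty; recorded PIDs are > 0 */

/* Record a child forked for internal use. Returns false if the table is full. */
bool record_internal_child(pid_t pid)
{
    if (pid <= 0) return false;
    for (unsigned i = 0; i < N_SLOTS; i++) {
        unsigned s = ((unsigned)pid + i) & (N_SLOTS - 1);
        if (internal_pids[s] == 0 || internal_pids[s] == pid) {
            internal_pids[s] = pid;
            return true;
        }
    }
    return false;
}

/* Linear probing lookup; entries are never removed in this sketch. */
bool is_internal_child(pid_t pid)
{
    if (pid <= 0) return false;
    for (unsigned i = 0; i < N_SLOTS; i++) {
        unsigned s = ((unsigned)pid + i) & (N_SLOTS - 1);
        if (internal_pids[s] == pid) return true;
        if (internal_pids[s] == 0)   return false;
    }
    return false;
}

/* SIGCHLD from an internal helper process must not reach the client. */
bool should_deliver_sigchld(pid_t sender)
{
    return !is_internal_child(sender);
}
```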
Mark Wielaard [Thu, 7 Apr 2022 20:02:12 +0000 (22:02 +0200)]
Update mc_main Copyright message to include 2022
We haven't run auxprogs/change-copyright-year since we switched to git.
This means most Copyright year ranges still say 2017. The script also
doesn't work for years >= 2020. Instead of trying to figure out how to
correctly update the per file Copyright year ranges just update the
main copyright notice that the program outputs on startup.
Since memfd_secret was introduced in kernel 5.14, valgrind should rename
the "memfd" test to "memfd_create" to avoid ambiguity, so that
users will not confuse it with the "memfd_secret" test.
After this change, syscall memfd_create will be tested by:
Mark Wielaard [Wed, 16 Feb 2022 21:56:31 +0000 (22:56 +0100)]
Warn for execve syscall with argv or argv[0] being NULL.
For execve valgrind would silently fail when argv was NULL or
unaddressable. Make sure that this produces a warning under memcheck.
The linux kernel accepts argv[0] being NULL, but most other kernels
don't since posix says it should be non-NULL and it causes argc to
be zero which is unexpected and might cause security issues.
This adjusts some testcases so they don't rely on execve succeeding
when argv is NULL and expect warnings about argv or argv[0] being
NULL or unaddressable.
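
The two NULL cases can be sketched as below. This is an illustration
with hypothetical names, not the wrapper code itself, and it does not
model the addressability check that memcheck performs:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for the argv sanity check. */
typedef enum { ARGV_OK, ARGV_IS_NULL, ARGV0_IS_NULL } ArgvCheck;

/* Classify an execve argv vector: NULL vector, NULL first entry, or fine. */
ArgvCheck check_execve_argv(char *const argv[])
{
    if (argv == NULL)    return ARGV_IS_NULL;   /* would make argc zero */
    if (argv[0] == NULL) return ARGV0_IS_NULL;  /* accepted by Linux, not by most kernels */
    return ARGV_OK;
}
```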
Carl Love [Tue, 5 Apr 2022 01:31:33 +0000 (21:31 -0400)]
Powerpc 32bit, fix the vbpermq support
Passing the two 128-bit vA and vB arguments doesn't work in 32-bit mode.
The clean helper was changed to compute the result for 8 indexes. The
helper is then called twice to get the result for the upper 64-bits of the
vB register and the lower 64-bits of the vB register.
The patch is an additional fix for bugzilla 451827.
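
A sketch of the split, assuming the usual vbpermq semantics (each index
byte of vB selects one bit of vA, or 0 if the index is 128 or more).
How the real clean helper receives its operands is not stated here, so
the signatures below are assumptions:

```c
#include <assert.h>
#include <stdint.h>

/* Bit `index` of the 128-bit value vA_hi:vA_lo, PowerPC numbering (bit 0 = MSB). */
unsigned bit_of_vA(uint64_t vA_hi, uint64_t vA_lo, unsigned index)
{
    return index < 64 ? (unsigned)((vA_hi >> (63 - index)) & 1)
                      : (unsigned)((vA_lo >> (127 - index)) & 1);
}

/* Hypothetical helper: result bits for the 8 index bytes in one half of vB. */
unsigned vbpermq_half(uint64_t vA_hi, uint64_t vA_lo, uint64_t vB_half)
{
    unsigned perm = 0;
    for (int i = 0; i < 8; i++) {
        /* Byte i of the half, counting from the most significant byte. */
        unsigned index = (unsigned)((vB_half >> (56 - 8 * i)) & 0xFF);
        unsigned bit = index < 128 ? bit_of_vA(vA_hi, vA_lo, index) : 0;
        perm = (perm << 1) | bit;
    }
    return perm;
}

/* Call the helper twice: upper half of vB gives the high 8 result bits. */
unsigned vbpermq(uint64_t vA_hi, uint64_t vA_lo, uint64_t vB_hi, uint64_t vB_lo)
{
    return (vbpermq_half(vA_hi, vA_lo, vB_hi) << 8)
         |  vbpermq_half(vA_hi, vA_lo, vB_lo);
}
```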
Paul Floyd [Sun, 3 Apr 2022 20:00:09 +0000 (22:00 +0200)]
Fixes for memcheck/tests/freebsd/realpathat
The syscall to realpathat was missing the buffer size argument.
By luck, no problem on amd64 but this failed on x86.
This adds the argument and a filter for the errors (size_t can be 4 or 8 bytes)
Mark Wielaard [Fri, 1 Apr 2022 15:28:24 +0000 (17:28 +0200)]
configure.ac: AC_HEADER_TIME is deprecated just check for sys/time.h
AC_HEADER_TIME is deprecated and checks for various things, like
whether you can include both time.h and sys/time.h together. Which
is fine on all systems these days. Just check whether sys/time.h
is available. HAVE_SYS_TIME_H is used once in the code base in the
timerfd-syscall.c testcase. So even this limited check might be
overkill.
Carl Love [Wed, 23 Mar 2022 18:41:16 +0000 (13:41 -0500)]
Powerpc, re-implement the vbpermq instruction support
The instruction support generates too many Iops when multiple vbpermq
instructions occur together in the binary. This patch changes the
implementation to use a clean helper and thus avoid overflowing the
internal Valgrind buffer.
Mark Wielaard [Sat, 19 Mar 2022 00:06:40 +0000 (01:06 +0100)]
bpf attr->raw_tracepoint.name may be NULL for BPF_RAW_TRACEPOINT_OPEN.
For BPF_RAW_TRACEPOINT_OPEN attr->raw_tracepoint.name may be NULL.
Otherwise it should point to a valid (max 128 char) string. Only
raw_tracepoint.prog_fd needs to be set.
Carl Love [Fri, 11 Feb 2022 20:07:20 +0000 (14:07 -0600)]
Powerpc: Fix checking for scv support, add check to scv instruction parsing.
The check for the scv instruction in coregrind/m_machine.c issues an scv
instruction and uses sigill to determine if the instruction is supported.
Issuing scv on systems that don't support scv, i.e. scv support is not in
HWCAPS2, generates a message in dmesg "Facility 'SCV' unavailable (12),
exception".
This patch removes the sigill based scv instruction test from
coregrind/m_machine.c. The scv support is now determined by reading the
HWCAPS2 in setup_client_stack(). VG_(machine_ppc64_set_scv_support) is
called to set the flag ppc_scv_supported in struct VexArchInfo.
The allow_scv flag is added in disInstr_PPC_WRK. The allow_scv flag is
used to ensure the host has support for scv before generating the iops for
the scv instruction.
On s390x Linux platforms the sys_ipc semtimedop call has four instead of
five parameters, where the timeout is passed in the third instead of the
fifth.
Reflect this difference in the handling of VKI_SEMTIMEDOP.
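
A sketch of the difference, using a hypothetical struct for the sys_ipc
arguments; the field and function names are illustrative only:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical view of the sys_ipc arguments that follow the call number. */
typedef struct {
    unsigned long first, second, third, ptr, fifth;
} IpcArgs;

/* Pick the argument slot that carries the semtimedop timeout pointer. */
unsigned long semtimedop_timeout(const IpcArgs *a, bool is_s390x)
{
    return is_s390x ? a->third : a->fifth;
}
```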
Mark Wielaard [Fri, 11 Feb 2022 16:50:47 +0000 (17:50 +0100)]
arm64: Mismatch detected between RDMA and atomics features
check_hwcaps contains code that tries to enforce Arm architecture's
rules for the support of features (FEAT_) on v8.1. Specifically for
v8.1 FEAT_RDM and FEAT_LSE (named FEAT_ATOMICS in Valgrind) are
mandatory.
But a v8.x implementation can implement any of the v8.{x+1}
features, or not, as it chooses. Also under QEMU, which tends
to implement features on an "as-demanded" basis, you sometimes
end up with an odd combination of features, which does not
strictly comply with the architecture.
So ignore the "v8.x" architecture levels, and look only at
"is feature X present or not". Unless the features are really not
independent.
Carl Love [Tue, 8 Feb 2022 23:52:33 +0000 (17:52 -0600)]
Powerpc: Update ACC support to reflect being mapped over vsr registers
The ISA 3.1 implementation provides the effect of ACC and VSRs
logically containing the same data. Future versions of the
hardware may define new state or redefine the backing state
of the registers.
This reworks the code to support the ACC as implemented as a logical
mapping over the VSR registers, and lays groundwork for a future
implementation utilizing a separate register file. There
is a single boolean variable, ACC_mapped_on_VSR, that can be set in
disInstr_PPC_WRK(), based on the ISA being used, to select which
implementation model to use.
Mark Wielaard [Wed, 9 Feb 2022 22:37:53 +0000 (23:37 +0100)]
Do not try to record fd name for io_uring_setup
In POST(sys_io_uring_setup) we tried to use record_fd_open_with_given_name
with ARG1 as name. But ARG1 isn't a char pointer. So this might crash with
--track-fds=yes. Since no (file) name is associated with the fd returned by
io_uring_setup use record_fd_open_nameless instead.
Andreas Arnez [Mon, 3 Jan 2022 17:15:05 +0000 (18:15 +0100)]
s390: Fix VFLRX and WFLRX instructions
Due to a typo in s390_irgen_VFLR, the VFLR instruction behaves incorrectly
when its m3 field contains 4, meaning extended format. In that case VFLR
is also written as VFLRX (or WFLRX) and supposed to round down from the
extended 128-bit format to the long 64-bit format. However, the typo
checks for m3 == 2 instead, so the value of 4 is unhandled, causing
Valgrind to throw a specification exception.
Mark Wielaard [Tue, 8 Feb 2022 15:36:08 +0000 (16:36 +0100)]
ppc64 --track-origins=yes failure because of bad cmov addHRegUse
For Pin_CMov getRegUsage_PPCInstr called addHRegUse for the dst
register with HRmWrite, but since this is a conditional move the
register could be both read and written (read + write = modify).
This matches the dst of Pin_FpCMov and Pin_AvCMov.
In a very rare case, and only with --track-origins=yes, this
could cause bad code generation.
This is slightly amazing: this code is from 2005 and as far as
I know we have never seen an issue with --track-origins=yes on power
before. And I have been unable to come up with a simple reproducer.
Carl Love [Tue, 1 Feb 2022 21:29:30 +0000 (21:29 +0000)]
Fix setting condition code for Vector Compare quad word instructions.
The vcmpgtsq., vcmpgtuq., vcmpequq. instructions set the condition code field
6 to 0b1000 for true, 0b0010 for false. The condition code was being set
according to the typical condition code values for equal and greater than
which is incorrect for these instructions. The patch fixes the setting of the
condition code as specified in the instructions.
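
As an illustration of the values above (function names are hypothetical
and this is plain C, not the IR the front end generates):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CR field 6 value for a quad word compare: 0b1000 if true, 0b0010 if false. */
unsigned cr6_for_quad_compare(bool is_true)
{
    return is_true ? 0x8u : 0x2u;
}

/* Example: CR6 after an equality compare of two 128-bit values (hi:lo pairs). */
unsigned cr6_after_vcmpequq(uint64_t a_hi, uint64_t a_lo, uint64_t b_hi, uint64_t b_lo)
{
    return cr6_for_quad_compare(a_hi == b_hi && a_lo == b_lo);
}
```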
Carl Love [Fri, 14 Jan 2022 23:04:44 +0000 (23:04 +0000)]
Assorted changes to protect from side effects from the feature checking code.
Patch contributed by Will Schmidt <will_schmidt@vnet.ibm.com>
This problem was initially reported by Tulio, he assisted me in
identifying the underlying issue here.
This was discovered on a Power10, and occurs since the ISA 3.1 support
check uses the brh instruction via a hardcoded ".long 0x7f1401b6" asm stanza.
That encoding writes to r20, and since the stanza does not contain a clobber
the compiler did not know to save or restore that register upon entry or exit.
The junk value remaining in r20 subsequently caused a segfault.
This patch adds clobber masks to the instruction stanzas, as well as
updates the associated comments to clarify which registers are being
used.
As part of this change I've also
- updated the .long for the cnttzw instruction to write to r20, and
zeroed the reserved bits from that instruction so it is properly
decoded by the disassembler.
- updated the .long for the dadd instruction to write to f0.
I've inspected the current codegen with these changes in place, and
confirm that r20 is now saved and restored on entry and exit from the
machine_get_hwcaps() function.
bugzilla 447995 Valgrind segfault on power10 due to hwcap checking code
Paul Floyd [Sat, 11 Dec 2021 11:32:08 +0000 (12:32 +0100)]
Bug 446823 FreeBSD - missing syscalls when using libzm4
Adds syscall wrappers for __specialfd and __realpathat.
Also remove kernel dependency on COMPAT_FREEBSD10.
This change also reorganizes somewhat the scalar test
and adds configure time checks for the FreeBSD version,
allowing regression tests to be compiled depending on the
FreeBSD release.
From now on, scalar.c will contain syscalls for FreeBSD 11 and 12
and subsequent releases will get their own scalar, starting with
scalar_13_plus.c.
Paul Floyd [Thu, 9 Dec 2021 21:54:23 +0000 (22:54 +0100)]
FreeBSD sigreturn arg names again
Also make drd/tests/shared_timed_mutex more robust
Already not great using time delays, but the test seems
to fail intermittently due to spurious wakeups. So instead
of failing straight away, make it "three strikes and you're out".
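
The retry policy can be sketched as follows. The test itself is C++;
this is a C illustration with hypothetical names, where `flaky_attempt`
stands in for a timing check that may fail on a spurious wakeup:

```c
#include <assert.h>
#include <stdbool.h>

typedef bool (*attempt_fn)(void *ctx);

/* Run the attempt up to three times; report failure only if all three fail. */
bool three_strikes(attempt_fn attempt, void *ctx)
{
    for (int strike = 0; strike < 3; strike++) {
        if (attempt(ctx))
            return true;
    }
    return false;
}

/* Demo attempt: fails until the counter pointed to by ctx reaches zero. */
bool flaky_attempt(void *ctx)
{
    int *failures_left = ctx;
    if (*failures_left > 0) {
        (*failures_left)--;
        return false;
    }
    return true;
}
```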
Andreas Arnez [Thu, 9 Dec 2021 14:27:41 +0000 (15:27 +0100)]
Bug 444481 - Don't unmap the vDSO on s390x
Newer Linux kernels on s390x may use the vDSO as a "trampoline" for
syscall restart. This means that the vDSO is no longer optional, and
unmapping it may lead to a segmentation fault when a system call restart
is performed.
So far Valgrind has been unmapping the vDSO on s390x. Just don't do this
anymore.
Julian Seward [Wed, 8 Dec 2021 06:52:09 +0000 (07:52 +0100)]
Bug 446103 - Memcheck: `--track-origins=yes` causes extreme slowdowns for large mmap/munmap.
This patch rewrites the Level 2 origin-tracking cache (ocacheL2) so that
set-address-range-permissions (SARP) operations on it, for large ranges, are
at least a factor of 2.5 x faster. This is primarily targeted at SARPs in the
range of hundreds to thousands of megabytes. The Level 1 origin-tracking
cache covers 64MB address space, so SARPs that fit within it are mostly
unaffected. There are extensive comments in-line. Changes are:
* Change the Level 2 cache from a single AVL tree (OSet) into 4096 such trees,
selected by middle bits of the tag, hence "taking out" 12 significant bits
of search in any given tree.
* For the OCacheLine type, use a union so as to overlay the w32 and descr
arrays with an array of 64-bit values. This is used to speed up cases where
those fields are to be set to zero, or checked against zero.
* Due to the various fast-paths added by this patch, OC_BITS_PER_LINE has
pretty much been frozen at the current value, 5.
* ocache_sarp_Set_Origins, ocache_sarp_Clear_Origins: deal with large ranges
in 32-byte steps instead of 4-byte steps.
* MC_(helperc_b_store32), MC_(helperc_b_store16): rewrite these to be (much)
more efficient.
* fast-return cases for VG_(OSetGen_Lookup) and VG_(OSetGen_Remove) when the
tree is empty
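
Two of the ideas above, sketched under assumptions. Which "middle bits"
of the tag select the tree is not stated here, so the shift below is a
guess, and the union layout and all names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define OC_BITS_PER_LINE 5        /* 32-byte lines, as stated above */
#define L2_N_TREES       4096     /* 12 bits of the tag select the tree */
#define W32S_PER_LINE    8        /* 32 bytes / 4 bytes per word */

/* Assumption: use the 12 bits just above the line-offset bits. */
unsigned l2_tree_index(uintptr_t tag)
{
    return (unsigned)((tag >> OC_BITS_PER_LINE) & (L2_N_TREES - 1));
}

/* Overlay the descr and w32 arrays with 64-bit words for fast zero tests. */
typedef union {
    struct {
        uint8_t  descr[W32S_PER_LINE];
        uint32_t w32[W32S_PER_LINE];
    } f;
    uint64_t w64[5];
} LinePayload;

_Static_assert(sizeof(LinePayload) == 5 * sizeof(uint64_t),
               "overlay must cover the fields exactly");

/* Five 64-bit loads instead of sixteen narrower ones. */
bool payload_is_zero(const LinePayload *p)
{
    return (p->w64[0] | p->w64[1] | p->w64[2] | p->w64[3] | p->w64[4]) == 0;
}
```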
When running `cg_annotate` on files produced with `cg_diff`, it's common
to get multiple occurrences of this pair of errors:
```
Use of uninitialized value $pairs[0] in numeric lt (<) at
/home/njn/grind/ws1/cachegrind/cg_annotate line 848.
Use of uninitialized value $high in numeric lt (<) at
/home/njn/grind/ws1/cachegrind/cg_annotate line 859.
```
This is because `cg_annotate` wasn't properly handling the case where no
source code lines have annotations, which never happens in the normal
case but does happen in `cg_diff` output.
Happily, it turns out that the warnings were harmless, the fix is
trivial, and it doesn't change the output at all.
Mark Wielaard [Thu, 2 Dec 2021 13:41:44 +0000 (14:41 +0100)]
valgrind-di-server.c: Fix minor file descriptor leak on error
In handle_transaction when a file descriptor is opened for a file,
but then cannot be stat or the file turns out to be zero size we
leak the file descriptor. Call close (fd) before reporting error.
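
The pattern of the fix, as a standalone sketch. `open_nonempty` is a
hypothetical function, not the code in handle_transaction:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open a file for reading; return its fd, or -1 if it cannot be
   opened, cannot be stat'ed, or has zero size. */
int open_nonempty(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);      /* without this the descriptor leaks on the error path */
        return -1;
    }
    return fd;
}
```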
Rust v0 symbols can have `#` chars in them, things like this:
```
core::panic::unwind_safe::AssertUnwindSafe<<proc_macro::bridge::server::Dispatcher<proc_macro::bridge::server::MarkedTypes<rustc_expand::proc_macro_server::Rustc>> as proc_macro::bridge::server::DispatcherTrait>::dispatch::{closure#14}>, ()>
```
`cg_diff` currently messes these up in two ways.
- It treats anything after a `#` in the input file as a comment. In
comparison, `cg_annotate` only treats a `#` as starting a comment at
the start of a line.
- It uses `#` to temporarily join file names and function names while
processing.
This commit adjusts the parsing to fix the first problem, and changes
the joiner sequence to `###` to fix the second problem.
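
`cg_diff` is a Perl script; the two rules are sketched here in C purely
as an illustration, with hypothetical function names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A '#' starts a comment only when it is the first character of the line. */
bool is_comment_line(const char *line)
{
    return line[0] == '#';
}

/* Split "file###function" at the first "###". Returns a pointer to the
   function name and stores the file name length, or NULL if no joiner. */
const char *split_joined(const char *joined, size_t *file_len)
{
    const char *sep = strstr(joined, "###");
    if (sep == NULL)
        return NULL;
    *file_len = (size_t)(sep - joined);
    return sep + 3;
}
```

A single `#` inside a name such as `{closure#14}` no longer collides
with the joiner, since the joiner is three characters long.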
Paul Floyd [Mon, 29 Nov 2021 21:44:17 +0000 (22:44 +0100)]
Bug 446251 TARGET_SIGNAL_THR added to enum target_signal
gdb considers FreeBSD SIGTHR to be the equivalent of SIGLWP,
not a signal in its own right. Remove the extra enum entry
(which fixes errors in converting signals from number to
string) and map TARGET_SIGNAL_LWP to SIGTHR.
Paul Floyd [Tue, 23 Nov 2021 22:37:02 +0000 (23:37 +0100)]
Anticipate testcase problems with GCC 12
There will be a lot more to come.
On amd64 Linux
In faultstatus GCC was seeing the division by zero and emitting a ud2 opcode.
In wrap3 a pair of mutually recursive functions were being inlined.
When forced not to be inlined GCC merged them into a single function.
It cannot see that the client requests have different behaviour.
Paul Floyd [Mon, 22 Nov 2021 03:12:16 +0000 (04:12 +0100)]
Add missing syscall wrapper on Solaris
I tried to test drd/tests/pth_mutex_signal on Solaris
(you never know) but encountered a missing syscall
wrapper. So this adds a very basic wrapper for lwp_mutex_unlock.
Also update a Solaris expected that I missed amongst the FreeBSD changes.
Mark Wielaard [Mon, 22 Nov 2021 12:07:59 +0000 (13:07 +0100)]
readdwarf3.c (parse_inl_DIE) inlined_subroutine can appear in namespaces
This was broken by commit 75e3ef0f3 "readdwarf3: Skip units without
addresses when looking for inlined functions". Specifically by this
part: "Also use skip_DIE instead of read_DIE when not parsing
(skipping) children"
rustc puts concrete function instances in namespaces (which is
allowed in DWARF since there is no strict separation between type
declarations and program scope entries in a DIE tree), the inline
parser didn't expect this and so skipped any DIE under a namespace
entry. This wasn't an issue before because "skipping" a DIE tree was
done by reading it, so it wasn't actually skipped. But now that we
really skip the DIE (sub)tree (which is faster than actually parsing
it) some entries were missed in the rustc case.
Mark Wielaard [Fri, 19 Nov 2021 14:00:27 +0000 (15:00 +0100)]
memcheck/tests/libstdc++.supp: rename suppression
The name malloc-leaks-cxx-stl-string-classes-debug was confusing
since the suppression wasn't a leak, not part of stl, string,
classes or debug. Rename it to libstdcxx-emergency-eh-alloc-pool
to indicate it is part of the emergency exception handling memory
pool.
Note that suppression is only needed for some test cases, normally
the pool is cleaned up as part of cxx_freeres.
Julian Seward [Sat, 13 Nov 2021 18:59:07 +0000 (19:59 +0100)]
amd64 front end: add more spec rules:
S after SHRQ
Z after SHLQ
NZ after SHLQ
Z after SHLL
S after SHLL
The lack of at least one of these was observed to cause occasional false
positives in Memcheck.
Plus add commented-out cases so as to complete the set of 12 rules
{Z,NZ,S,NS} after {SHRQ,SHLQ,SHLL}. The commented-out ones are commented
out because I so far didn't find any use cases for them.
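
What the five active rules compute, written as plain C on the shift
result. This assumes the condition can be derived from the result
value alone; it is a sketch, not the IR built by the spec helper:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 64-bit shifts: conditions on the full 64-bit result. */
bool s_after_shrq(uint64_t result)  { return (result >> 63) != 0; }
bool z_after_shlq(uint64_t result)  { return result == 0; }
bool nz_after_shlq(uint64_t result) { return result != 0; }

/* 32-bit shifts: only the low 32 bits of the result matter. */
bool z_after_shll(uint64_t result)  { return (uint32_t)result == 0; }
bool s_after_shll(uint64_t result)  { return ((result >> 31) & 1) != 0; }
```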
Paul Floyd [Sat, 13 Nov 2021 11:31:41 +0000 (12:31 +0100)]
Bugs 435732 and 403802 again
This time with debuginfo removed.
Also update the vgtest files for a couple of massif tests
(and also the expected because of the command line change).
Not yet tested these two with debuginfo installed.
Julian Seward [Sat, 13 Nov 2021 08:27:01 +0000 (09:27 +0100)]
Bug 445415 - arm64 front end: alignment checks missing for atomic instructions.
For the arm64 front end, none of the atomic instructions have address
alignment checks included in their IR. They all should. The effect of
missing alignment checks in the IR is that, since this IR will in most cases
be translated back to atomic instructions in the back end, we will get
alignment traps (SIGBUS) on the host side and not on the guest side, which is
(very) incorrect behaviour of the simulation.
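
The check itself is simple; a sketch with a hypothetical function name,
where the required alignment for each instruction is left to the caller:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of size; size must be a power of two. */
bool is_aligned(uint64_t addr, unsigned size)
{
    return (addr & (uint64_t)(size - 1)) == 0;
}
```

Doing this test in the IR means a misaligned access is reported against
the guest before any atomic instruction is re-emitted on the host.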
Paul Floyd [Fri, 12 Nov 2021 23:00:38 +0000 (00:00 +0100)]
Bugs 435732 and 403802
The problem is that the testcase specific suppression has stacks
that are too specific. This causes breakage with different versions
of GCC and libstdc++. The suppression only needs to mask the memory
pool used for standard io.
There are several suppression stanzas so future tweaks may still be
necessary.
Paul Floyd [Fri, 12 Nov 2021 22:44:54 +0000 (23:44 +0100)]
Make memcheck tests demangle and demangle-rust clang-friendly.
Clang uses CMOV for ternary operators, which does not immediately
trigger an error. Using a double free and a new/free mismatch instead
poses no problem with clang and still exercises the demangling.
Julian Seward [Fri, 12 Nov 2021 11:13:45 +0000 (12:13 +0100)]
Bug 444399 - disInstr(arm64): unhandled instruction 0xC87F2D89 (LD{,A}XP and ST{,L}XP).
This is unfortunately a big and complex patch, to implement LD{,A}XP and
ST{,L}XP. These were omitted from the original AArch64 v8.0 implementation
for unknown reasons.
(Background) the patch is made significantly more complex because for AArch64
we actually have two implementations of the underlying
Load-Linked/Store-Conditional (LL/SC) machinery: a "primary" implementation,
which translates LL/SC more or less directly into IR and re-emits them at the
back end, and a "fallback" implementation that implements LL/SC "manually", by
taking advantage of the fact that V serialises thread execution, so we can
"implement" LL/SC by simulating a reservation using fields LLSC_* in the guest
state, and invalidating the reservation at every thread switch.
(Background) the fallback scheme is needed because the primary scheme is in
violation of the ARMv8 semantics in that it can (easily) introduce extra
memory references between the LL and SC, hence on some hardware causing the
reservation to always fail and so the simulated program to wind up looping
forever.
For these instructions, big picture:
* for the primary implementation, we take advantage of the fact that
IRStmt_LLSC allows I128 bit transactions to be represented. Hence we bundle
up the two 64-bit data elements into an I128 (or vice versa) and present a
single I128-typed IRStmt_LLSC in the IR. In the backend, those are
re-emitted as LDXP/STXP respectively. For LL/SC on 32-bit register pairs,
that bundling produces a single 64-bit item, and so the existing LL/SC
backend machinery handles it. The effect is that a doubleword 32-bit LL/SC
in the front end translates into a single 64-bit LL/SC in the back end.
Overall, though, the implementation is straightforward.
* for the fallback implementation, it is necessary to extend the guest state
field `guest_LLSC_DATA` to represent a 128-bit transaction, by splitting it
into _DATA_LO64 and _DATA_HI64. Then, the implementation is an exact
analogue of the fallback implementation for single-word LL/SC. It takes
advantage of the fact that the backend already supports 128-bit CAS, as
fixed in bug 445354. As with the primary implementation, doubleword 32-bit
LL/SC is bundled into a single 64-bit transaction.
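
A sketch of the fallback scheme for the 128-bit case. The struct and
function names are hypothetical, and the compare-then-store below
stands in for the 128-bit CAS, which is only safe here because thread
execution is serialised:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical guest-state fields modelling the simulated reservation. */
typedef struct {
    uint64_t llsc_size;        /* 0 = no reservation, 16 = 128-bit */
    uint64_t llsc_addr;
    uint64_t llsc_data_lo64;
    uint64_t llsc_data_hi64;
} GuestLLSC;

/* LDXP: load both halves and remember address and data. */
void fallback_ldxp(GuestLLSC *g, const uint64_t *mem, uint64_t *lo, uint64_t *hi)
{
    *lo = mem[0];
    *hi = mem[1];
    g->llsc_size = 16;
    g->llsc_addr = (uint64_t)(uintptr_t)mem;
    g->llsc_data_lo64 = *lo;
    g->llsc_data_hi64 = *hi;
}

/* STXP: succeed (return 0) only if the reservation is intact and memory
   still holds the remembered data; always drop the reservation. */
unsigned fallback_stxp(GuestLLSC *g, uint64_t *mem, uint64_t lo, uint64_t hi)
{
    unsigned fail = 1;
    if (g->llsc_size == 16 && g->llsc_addr == (uint64_t)(uintptr_t)mem
        && mem[0] == g->llsc_data_lo64 && mem[1] == g->llsc_data_hi64) {
        mem[0] = lo;
        mem[1] = hi;
        fail = 0;
    }
    g->llsc_size = 0;
    return fail;
}

/* Every thread switch invalidates the reservation. */
void on_thread_switch(GuestLLSC *g)
{
    g->llsc_size = 0;
}
```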
Detailed changes:
* new arm64 guest state fields LLSC_DATA_LO64/LLSC_DATA_HI64 to replace
guest_LLSC_DATA
* (ridealong fix) arm64 front end: a fix to a minor and harmless decoding bug
for the single-word LDX/STX case.
* arm64 front end: IR generation for LD{,A}XP/ST{,L}XP: tedious and
longwinded, but per comments above, an exact(ish) analogue of the singleword
case
* arm64 backend: new insns ARM64Instr_LdrEXP / ARM64Instr_StrEXP to wrap up 2
x 64 exclusive loads/stores. Per comments above, there's no need to handle
the 2 x 32 case.
* arm64 isel: translate I128-typed IRStmt_LLSC into the above two insns
* arm64 isel: some auxiliary bits and pieces needed to handle I128 values;
this is standard doubleword isel stuff
* arm64 isel: (ridealong fix): Ist_CAS: check for endianness of the CAS!
* arm64 isel: (ridealong) a couple of formatting fixes
* IR infrastructure: add support for I128 constants, done the same as V128
constants
* memcheck: handle shadow loads and stores for I128 values
* testcase: memcheck/tests/atomic_incs.c: on arm64, also test 128-bit atomic
addition, to check we really have atomicity right
* testcase: new test none/tests/arm64/ldxp_stxp.c, tests operation but not
atomicity. (Smoke test).
The sequence of instructions emitted by the arm64 backend for doubleword
compare-and-swap is incorrect. This could lead to incorrect simulation of the
ARMv8.1 atomic instructions (CASP, at least). It also causes failures in the
upcoming fix for v8.0 support for LD{,A}XP/ST{,L}XP in bug 444399, at least
when running with the fallback LL/SC implementation
(`--sim-hints=fallback-llsc`, or as autoselected at startup). In the worst
case it can cause segfaulting in the generated code, because it could jump
backwards unexpectedly far.
The problem is the sequence emitted for ARM64in_CASP:
* the jump offsets are incorrect, both for `bne out` (x 2) and `cbnz w1, loop`.
* using w1 to hold the success indication of the stxp instruction trashes the
previous value in x1. But the value in x1 is an output of ARM64in_CASP,
hence one of the two output registers is corrupted. That confuses any code
downstream that wants to inspect those values to find out whether or not the
transaction succeeded.
The fixes are to
* fix the branch offsets
* use a different register to hold the stxp success indication. w3 is a
convenient choice.
Mark Wielaard [Thu, 11 Nov 2021 17:02:09 +0000 (18:02 +0100)]
Add demangle-rust to check_PROGRAMS
The demangle-rust.vgtest would fail because the demangle-rust binary
wasn't built by default. Add it to check_PROGRAMS and define
demangle_rust_SOURCES to make sure it is always built.