Bart Van Assche [Tue, 23 Mar 2021 02:12:20 +0000 (19:12 -0700)]
configure, drd: Only build the swapcontext test if swapcontext() is available
Add a configure test for swapcontext() since MUSL does not provide a
swapcontext() implementation. See also
https://bugs.kde.org/show_bug.cgi?id=434775 .
Julian Seward [Wed, 17 Mar 2021 07:10:49 +0000 (08:10 +0100)]
Bug 401416 - Compile failure with openmpi 4.0.
In short, use the missing symbol names only when compiling against OpenMPI
version 3 or below, or when compiling against a non-OpenMPI implementation.
Modified version of a patch originally from Mark Wielaard.
Julian Seward [Sat, 13 Mar 2021 18:20:50 +0000 (19:20 +0100)]
amd64 front end: try to avoid a Memcheck false positive related to CPUID. n-i-bz.
In the amd64 front end, CPUID is implemented by calling a dirty helper. The way
the side-effects for this call are declared can lead to false positives from
Memcheck. This is a somewhat inelegant "fix", but it's the least-worst that
can be done without changing parameter-passing for the helper functions
involved. A big in-line comment explains the problem and fix.
Andreas Arnez [Fri, 5 Mar 2021 19:16:46 +0000 (20:16 +0100)]
s390x: Improve isel for Iop_V128to64 and friends
The existing instruction selector for Iop_V128to64, Iop_V128HIto64, and
Iop_V128to32 stores the vector register on the stack and then reads the
requested integer value back from the stack into the target GPR. This is
fairly inefficient.
Load the requested value directly from the vector register into the target
GPR instead, using S390_VEC_GET_ELEM.
Mark Wielaard [Tue, 9 Mar 2021 17:51:57 +0000 (18:51 +0100)]
vgdb might crash if valgrind is killed
This is an odd corner case, but happens specifically with the gdb
testcase make check TESTS=gdb.base/valgrind-infcall-2.exp. At the
end valgrind gets killed with SIGKILL (-9) which cannot be blocked.
But vgdb at the time is inside waitstopped. It sees the process wasn't
exited (WIFEXITED(status) is false) and so assumes the process was
stopped by a signal. Which it asserts:
    assert (WIFSTOPPED(status));
    signal_received = WSTOPSIG(status);
    if (signal_received == signal_expected)
       break;
But the assert fails and vgdb dumps core. The gdb testcase doesn't care,
because it already finished its test and just makes sure all processes
are gone. But it slowly fills your disk with core files (if you have
enabled them) when running the testsuite.
The fix is to simply check first whether the program has terminated
normally or by getting a fatal signal.
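A minimal sketch of that ordering, not the actual vgdb code: the waitpid()
status is classified for normal exit and for death by signal first, and only
afterwards treated as a stop. The enum and the function name are hypothetical
and exist only for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical result type; vgdb itself does not define this enum. */
typedef enum { PROC_EXITED, PROC_KILLED, PROC_STOPPED, PROC_UNKNOWN } proc_state;

/* Classify a status value returned by waitpid().  Termination is checked
   before WIFSTOPPED, so a process killed with SIGKILL is reported as
   PROC_KILLED instead of tripping an assert (WIFSTOPPED(status)). */
static proc_state classify_wait_status(int status, int *sig)
{
   *sig = 0;
   if (WIFEXITED(status))
      return PROC_EXITED;
   if (WIFSIGNALED(status)) {
      *sig = WTERMSIG(status);   /* the fatal signal, e.g. SIGKILL */
      return PROC_KILLED;
   }
   if (WIFSTOPPED(status)) {
      *sig = WSTOPSIG(status);   /* the signal that stopped the process */
      return PROC_STOPPED;
   }
   return PROC_UNKNOWN;
}
```

A caller would leave its wait loop on PROC_EXITED or PROC_KILLED and compare
the reported signal with the expected one only for PROC_STOPPED.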
Fix nlcontrolc.vgtest hanging on newer glibc and/or arm64
This test verifies that GDB can interrupt a process with all threads
blocked in a long select syscall.
The test used to terminate by having GDB modifying the select argument.
However, modifying the select argument works only for specific arch
and/or specific versions of glibc.
The test then blocks on other architectures/glibc versions.
The previous version of the test:
* first launched sleepers so as to have all threads blocked in a long select
* interrupted these threads
* changed the select time arg so that the threads burn cpu
* and then changed variables to have the program exit.
The new version:
* first launches sleepers so that all threads are burning cpu
* interrupts these threads
* changes the local variables of sleepers so that the threads will
block in a long select syscall
* interrupts these threads
* kills the program.
With this new version, we still check the behaviour of gdb+vgdbserver
for both burning and sleep threads, but without having the termination
depending on modifying select syscall argument.
Tested on debian amd64 and on ubuntu arm64 (to check the test does not hang
on an arm64 platform).
Carl Love [Sat, 2 May 2020 04:49:33 +0000 (23:49 -0500)]
ISA 3.1 VSX Mask Manipulation Operations
Add support for:
mtvsrbm    Move to VSR Byte Mask
mtvsrbmi   Move to VSR Byte Mask Immediate
mtvsrdm    Move to VSR Doubleword Mask
mtvsrhm    Move to VSR Halfword Mask
mtvsrqm    Move to VSR Quadword Mask
mtvsrwm    Move to VSR Word Mask
vcntmbb    Vector Count Mask Bits Byte
vcntmbd    Vector Count Mask Bits Doubleword
vcntmbh    Vector Count Mask Bits Halfword
vcntmbw    Vector Count Mask Bits Word
vexpandbm  Vector Expand Byte Mask
vexpanddm  Vector Expand Doubleword Mask
vexpandhm  Vector Expand Halfword Mask
vexpandqm  Vector Expand Quadword Mask
vexpandwm  Vector Expand Word Mask
vextractbm Vector Extract Byte Mask
vextractdm Vector Extract Doubleword Mask
vextracthm Vector Extract Halfword Mask
vextractqm Vector Extract Quadword Mask
vextractwm Vector Extract Word Mask
Re-implemented the copy_MSB_bit_fields() function. It can be done similarly to
the implementation of the vgnb instruction leveraging the clean helpers
used for the vgnb instruction.
Reimplemented the vexpandXm instructions eliminating
the call to copy_MSB_bit_fields() and the need for the
for(i = 0; i< max; i++) loop.
Reimplemented the mtvsrXm instructions to remove the
need for the for(i = 0; i< max; i++) loop.
The computations for vexpandXm and mtvsrXm instructions
can be done much more efficiently.
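To illustrate why the per-element loop is not needed, here is a plain C
sketch of a byte-mask expansion over one 64-bit half of a vector. It is only
an illustration of the idea: Valgrind expresses these instructions in VEX IR,
and the function name below is hypothetical.

```c
#include <stdint.h>

/* Set every byte of the result to 0xFF if the most significant bit of the
   corresponding input byte is 1, and to 0x00 otherwise, without looping
   over the elements. */
static uint64_t expand_byte_mask64(uint64_t x)
{
   /* Move the MSB of each byte down to bit 0 of its lane. */
   uint64_t lsb = (x >> 7) & UINT64_C(0x0101010101010101);
   /* 0x01 * 0xFF == 0xFF, so each lane is filled without carrying into
      the neighbouring lane. */
   return lsb * UINT64_C(0xFF);
}
```

The same shift, mask and multiply pattern works for wider elements with a
different shift amount, lane mask and multiplier.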
Mark Wielaard [Thu, 4 Mar 2021 18:24:06 +0000 (19:24 +0100)]
arm64: Handle sp, lr, fp as DwReg in CfiExpr
When copy_convert_CfiExpr_tree sees a DwReg on arm64 we simply call
I_die_here. This causes an issue in the case we really do have to handle
that case (see https://bugzilla.redhat.com/show_bug.cgi?id=1923493).
Handle the stack pointer (sp), link register (x30) and frame pointer (x29),
which we already keep in D3UnwindRegs, like we do for other architectures
in evalCfiExpr and copy_convert_CfiExpr_tree.
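A rough sketch of such a lookup, assuming the AArch64 DWARF register
numbering (x0..x30 are 0..30, sp is 31). The struct is a hypothetical
stand-in for the arm64 unwind register set and the function name is made up
for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the arm64 unwind registers. */
typedef struct { uint64_t pc, sp, x30, x29; } UnwindRegsArm64;

/* DWARF register numbers, per the AArch64 DWARF ABI. */
enum { DW_ARM64_X29 = 29, DW_ARM64_X30 = 30, DW_ARM64_SP = 31 };

/* Fetch the value of a DWARF register if it is one of the tracked ones.
   Returns false for any other register so the caller can give up on the
   expression instead of aborting. */
static bool eval_dwreg_arm64(const UnwindRegsArm64 *regs, int dwreg,
                             uint64_t *out)
{
   switch (dwreg) {
      case DW_ARM64_SP:  *out = regs->sp;  return true;
      case DW_ARM64_X30: *out = regs->x30; return true;  /* link register */
      case DW_ARM64_X29: *out = regs->x29; return true;  /* frame pointer */
      default:           return false;
   }
}
```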
Paul Floyd [Wed, 3 Mar 2021 07:53:51 +0000 (08:53 +0100)]
Keep on churning.
Without #define _XOPEN_SOURCE macports clang 9.0.1 on OSX 10.7.5 was
giving me
In file included from swapcontext.c:12:
/usr/include/ucontext.h:43:2: error: The deprecated ucontext routines require
_XOPEN_SOURCE to be defined
^
swapcontext.
So I added #define _XOPEN_SOURCE
But that gives, on Solaris 11.3
In file included from /usr/include/limits.h:12:0,
from /usr/gcc/4.8/lib/gcc/i386-pc-solaris2.11/4.8.2/include-fixed/limits.h:168,
from /usr/gcc/4.8/lib/gcc/i386-pc-solaris2.11/4.8.2/include-fixed/syslimits.h:7,
from /usr/gcc/4.8/lib/gcc/i386-pc-solaris2.11/4.8.2/include-fixed/limits.h:34,
from swapcontext.c:7:
/usr/include/sys/feature_tests.h:354:2: error: #error "Compiler or options invalid for pre-UNIX 03 X/Open applications and pre-2001 POSIX applications"
#error "Compiler or options invalid for pre-UNIX 03 X/Open applications \
^
So make the #define _XOPEN_SOURCE conditional on darwin.
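A sketch of what such a conditional define can look like. It assumes the
compiler-provided __APPLE__ macro as the darwin check, which may not be the
macro the test itself uses.

```c
/* Must come before any system header is included. */
#if defined(__APPLE__)
#define _XOPEN_SOURCE   /* needed for the deprecated ucontext routines */
#endif

#include <ucontext.h>
```

On Solaris the macro stays undefined, so feature_tests.h no longer rejects
the compiler options.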
Paul Floyd [Tue, 2 Mar 2021 16:48:14 +0000 (17:48 +0100)]
Modify cxx17_aligned_new testcase to accommodate clang.
Explicitly use ordinary scalar delete and update the expecteds.
Otherwise g++ uses sized scalar delete whilst clang uses
ordinary scalar delete, which causes a diff.
Mark Wielaard [Sun, 28 Feb 2021 23:39:31 +0000 (00:39 +0100)]
Remove deep-D.post.exp-ppc64 from EXTRA_DIST.
massif/tests/deep-D.post.exp-ppc64 was removed in commit 24a94df73
"VG_(get_fnname_kind): Recognize gcc "optimized" below main functions."
but was still listed in massif/tests/Makefile.am (EXTRA_DIST), causing
make dist to fail.
Mark Wielaard [Sun, 28 Feb 2021 23:26:00 +0000 (00:26 +0100)]
VG_(get_fnname_kind): Recognize gcc "optimized" below main functions.
The VG_(get_fnname_kind) function detects some special "below main"
function names. Specifically __libc_start_main and generic_start_main
both of which are used to call the actual main () function from the
application. We already recognized one variant, generic_start_main.isra.0,
but only for powerpc. Recognize all possible specialized optimized variants
gcc can produce by simply checking for the function name followed by a dot
as prefix. This fixes the memcheck/tests/supp_unknown.vgtest and
massif/tests/deep-D.vgtest with gcc 11.
We can now also get rid of the special cases in
massif/tests/deep-D.post.exp-ppc64 and memcheck/tests/supp_unknown.supp.
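The check can be sketched roughly as follows. The helper names are
hypothetical and this is not the VG_(get_fnname_kind) code itself; it only
shows the "name, optionally followed by a dot suffix" comparison.

```c
#include <stdbool.h>
#include <string.h>

/* True if name is exactly base, or base followed by a '.' suffix such as
   ".isra.0" or ".constprop.0". */
static bool matches_with_dot_suffix(const char *name, const char *base)
{
   size_t n = strlen(base);
   if (strncmp(name, base, n) != 0)
      return false;
   return name[n] == '\0' || name[n] == '.';
}

/* True for the "below main" functions named in the commit message and any
   dot-suffixed clone of them. */
static bool is_below_main(const char *name)
{
   return matches_with_dot_suffix(name, "__libc_start_main")
       || matches_with_dot_suffix(name, "generic_start_main");
}
```

Requiring the dot keeps unrelated functions that merely share the prefix,
such as a hypothetical generic_start_main2, from being matched.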
Mike Hommey [Fri, 26 Feb 2021 08:09:52 +0000 (17:09 +0900)]
sys_newfstatat: don't complain if |file_name| is NULL.
This is a followup to 2a7d3ae76, in the case rust code runs against a
glibc that supports statx but a kernel that doesn't, in which case glibc
falls back to fstatat.
Mark Wielaard [Fri, 19 Feb 2021 22:49:10 +0000 (23:49 +0100)]
Use pkglibexec as vglibdir.
vglibdir is the directory from where valgrind loads its internal tool
executables and vgpreloads. Currently vglibdir is pkglibdir, so those
internal tools are intermingled with normal executables and libraries
that the user might use directly.
Make vglibdir equal to pkglibexecdir so the internal tools get installed
and loaded from libexec and don't get stored under lib.
This leaves just the static archives and the mpiwrapper libraries that
the user would link/load themselves under pkglibdir.
This seems more in line with the FHS lib/libexec standard and makes it
slightly easier to combine the tools from a multilib target (say the
memcheck-amd64-linux and memcheck-x86-linux tools) because they would
be installed under the same directory, while the pkglibdir can differ
depending on arch/target (lib/lib64).
Mark Wielaard [Sat, 27 Feb 2021 16:44:30 +0000 (17:44 +0100)]
gdbserver_tests: filter out Download failed: messages.
gdb can also use debuginfod and is excessively chatty when downloads
fail (even when DEBUGINFOD_URLS isn't set). Filter those messages out
of the gdb output.
Mark Wielaard [Fri, 26 Feb 2021 01:34:32 +0000 (02:34 +0100)]
Make the dwarf3 reader more robust and less chatty when things go wrong
Skip some stuff when seeing an unknown language, be less chatty about
parser issues.
All the issues seem to come from the multi-file, that is the shared
(supplementary or alt) file containing debuginfo shared by all the
gcc/runtime libraries.
There are a couple of issues that this patch works around:
- The multifile contains entries for the 'D' language, which has some
constructs we don't expect.
- We don't read partial units correctly, which means we often don't know
the language we are looking at.
- The parser is very chatty about issues it didn't expect (even if they
are ignored, it will still output something)
It only shows up with --read-var-info=yes which some tests enable, but
which is disabled by default.
Also increase the timeout of drd/tests/pth_cleanup_handler.c because
DWARF reading is so slow.
Aaron Merey [Fri, 19 Feb 2021 03:58:25 +0000 (22:58 -0500)]
PR432215 Add debuginfod functionality
debuginfod is an HTTP server for distributing ELF/DWARF debugging
information. When a debuginfo file cannot be found locally, Valgrind
is able to query debuginfod servers for the file using its build-id.
readelf.c: Add debuginfod_find_debug_file(). Spawns a child process to
exec `debuginfod-find` in order to query servers for the debuginfo
file. Also add helper debuginfod_find_path().
pub_core_pathscan.h: Moved from priv_initimg_pathscan.h in order to use
VG_(find_executable)() in readelf.c.
docs: Add information regarding debuginfod to valgrind.1
memcheck/tests/linux: Add new test debuginfod-check.
tests/vg_regtest.in: Clear $DEBUGINFOD_URLS before running any tests.
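The child-process approach can be sketched in plain POSIX C as below. This
is an illustration only and does not mirror the readelf.c code: the function
names are hypothetical, and it assumes that `debuginfod-find debuginfo
BUILDID` prints the path of the fetched file on stdout.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given arguments and copy the first line of its
   stdout into out (without the trailing newline).  Output that does not
   fit into out is dropped.  Returns 0 if the child exited with status 0,
   -1 otherwise. */
static int run_and_read_line(char *const argv[], char *out, size_t outsz)
{
   int fds[2];
   if (outsz == 0 || pipe(fds) != 0)
      return -1;
   out[0] = '\0';

   pid_t pid = fork();
   if (pid < 0) {
      close(fds[0]);
      close(fds[1]);
      return -1;
   }
   if (pid == 0) {
      /* Child: send stdout into the pipe, then exec the helper. */
      close(fds[0]);
      dup2(fds[1], STDOUT_FILENO);
      close(fds[1]);
      execvp(argv[0], argv);
      _exit(127);   /* only reached if exec failed */
   }

   /* Parent: read what the child printed, then reap it. */
   close(fds[1]);
   size_t used = 0;
   ssize_t n;
   while (used + 1 < outsz
          && (n = read(fds[0], out + used, outsz - 1 - used)) > 0)
      used += (size_t)n;
   out[used] = '\0';
   close(fds[0]);

   int status;
   if (waitpid(pid, &status, 0) != pid)
      return -1;
   out[strcspn(out, "\n")] = '\0';
   return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Ask debuginfod-find for the debuginfo file matching a build-id. */
static int find_debug_file(const char *buildid, char *path, size_t pathsz)
{
   char *const argv[] = { "debuginfod-find", "debuginfo",
                          (char *)buildid, NULL };
   return run_and_read_line(argv, path, pathsz);
}
```

A caller would pass the hex build-id of the object and, on success, open the
returned path as the separate debuginfo file.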
vmodsq Vector Modulo Signed Quadword
vmoduq Vector Modulo Unsigned Quadword
vmulesd Vector Multiply Even Signed Doubleword
vmuleud Vector Multiply Even Unsigned Doubleword
vmulosd Vector Multiply Odd Signed Doubleword
vmuloud Vector Multiply Odd Unsigned Doubleword
vmsumcud Vector Multiply-Sum & write Carry-out Unsigned Doubleword
xscvqpsqz VSX Scalar Convert with round to zero Quad-Precision to Signed
          Quadword
xscvqpuqz VSX Scalar Convert with round to zero Quad-Precision to Unsigned
          Quadword
xscvsqqp VSX Scalar Convert Signed Quadword to Quad-Precision
xscvuqqp VSX Scalar Convert Unsigned Quadword to Quad-Precision
Bart Van Assche [Tue, 23 Feb 2021 19:49:14 +0000 (11:49 -0800)]
drd/tests/swapcontext: Improve the portability of this test further
- Remove the VALGRIND_STACK_REGISTER() invocation for the initial thread
stack since it is superfluous. Remove the pthread_attr_getstack() call
that became superfluous by this change.
- Change SIGINT into SIGALRM for FreeBSD since pthread_kill(..., SIGINT)
causes the application to return a SIGINT status.
- Reduce the stack size of the threads created by this test.
Mark Wielaard [Tue, 23 Feb 2021 15:19:26 +0000 (16:19 +0100)]
Filter out unsupported instructions from HWCAP2 on powerpc.
Valgrind currently doesn't support the DARN random number instruction
and the SCV syscall instruction. Filter them out of HWCAP2 so glibc
and applications don't try to use them when running under valgrind.
Also suppress printing a log message for scv instructions in the
instruction stream.
Reported by: Florian Weimer <fweimer@redhat.com>
DARN bug: https://bugs.kde.org/show_bug.cgi?id=411189
SCV bug: https://bugs.kde.org/show_bug.cgi?id=431157
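The filtering amounts to clearing two bits in the auxv value. A minimal
sketch, with the bit values copied from what I believe are the Linux powerpc
cputable.h definitions (repeated here so the example builds on any host);
the function name is hypothetical.

```c
#include <stdint.h>

/* Assumed to match the Linux powerpc <asm/cputable.h> values. */
#define PPC_FEATURE2_DARN 0x00200000u   /* darn random number insn */
#define PPC_FEATURE2_SCV  0x00100000u   /* scv syscall insn */

/* Hide the features Valgrind cannot emulate from the client's AT_HWCAP2. */
static uint64_t filter_hwcap2(uint64_t hwcap2)
{
   return hwcap2 & ~(uint64_t)(PPC_FEATURE2_DARN | PPC_FEATURE2_SCV);
}
```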
Mark Wielaard [Tue, 23 Feb 2021 10:50:13 +0000 (11:50 +0100)]
gdbserver_tests/hgtls.vgtest: Make sure gdb is installed before running
The other gdbserver_tests that need to run gdb make sure it is actually
available before trying to run it, otherwise the test is skipped. Do the
same to hgtls.vgtest by adding test -e gdb to the prereq.
Mark Wielaard [Sun, 21 Feb 2021 21:45:51 +0000 (22:45 +0100)]
Fix typo in DWARF 5 line table readers
This typo meant the directory entry was most often zero, which
happened to be sometimes correct anyway (since zero is the compdir).
So for simple testcases it looked correct. But it would be wrong for
compilation units not in the current compdir, such as files compiled with
a relative or absolute path (and then combined into the same compilation
unit with LTO).
The same typo was in both readdwarf.c (read_dwarf2_lineblock) and
readdwarf3.c (read_filename_table). read_dwarf2_lineblock also had
an extra "dwarf" string in the --debug-dump=line output.
Mark Wielaard [Sun, 21 Feb 2021 14:18:54 +0000 (15:18 +0100)]
swapcontext.vgtest fails with glibc-debuginfo installed
With debuginfo installed the backtrace contains the swapcontext.S
source file. Filter that out, like the clone.S source file is in
drd/tests/filter_stderr.
Mark Wielaard [Sat, 20 Feb 2021 19:05:31 +0000 (20:05 +0100)]
Fix valgrind.h include in drd/tests/swapcontext.c
In-tree tests should include "valgrind.h", not <valgrind/valgrind.h>.
The latter might pick up the system-installed valgrind.h and doesn't
work when srcdir != builddir.
Bart Van Assche [Mon, 15 Feb 2021 04:08:52 +0000 (20:08 -0800)]
core: Pass stack change user requests on to tools
Since DRD tracks the lowest and highest stack address that has been used,
it needs to know about stack registration events. Hence pass on stack
registration events to tools.
Mark Wielaard [Sat, 20 Feb 2021 15:56:33 +0000 (16:56 +0100)]
Update NEWS with some core and platform (s390) changes and bug fixes.
Mention the new DWARF version 5 support needed with GCC 11.
s390 now supports z14 vector instructions.
Add missing bugs fixed and sort them by bug number (n-i-bz last).
Pull in 3.16.1 release data.
Mark Wielaard [Fri, 12 Feb 2021 22:29:34 +0000 (23:29 +0100)]
PR217695 malloc/calloc/realloc/memalign failure doesn't set errno to ENOMEM
When one of the allocation functions in vg_replace_malloc fails, it
returns NULL but doesn't set errno. This is slightly tricky since
errno is implementation defined and might be a macro. In the case of
glibc errno is defined as a macro that expands to a call of the
__errno_location () function: (*__errno_location ()).
We can use the same trick as we use for __libc_freeres in
coregrind/vg_preloaded.c. Define the function as "weak". This means
it will only be defined if another library (glibc in this case)
actually provides a definition. Otherwise it will be NULL.
So we will only call it if it is defined and one of the allocation
functions failed, returned NULL.
Include a new linux only memcheck testcase, enomem.vgtest.
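A sketch of the weak-symbol trick, assuming glibc's __errno_location() as
the function behind errno. The wrapper function is hypothetical and only
shows where the call would sit; it is not the vg_replace_malloc code.

```c
#include <errno.h>
#include <stddef.h>

/* Weak reference: the address is NULL if no loaded library defines the
   function, so it is safe to test before calling. */
extern int *__errno_location(void) __attribute__((weak));

static void set_errno_enomem(void)
{
   if (__errno_location)
      *__errno_location() = ENOMEM;
}

/* Hypothetical tail of an allocation replacement: p is whatever the
   underlying allocator returned. */
static void *report_alloc_result(void *p)
{
   if (p == NULL)
      set_errno_enomem();
   return p;
}
```

With a C library that does not provide __errno_location the test is false
and errno is simply left alone.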
Mark Wielaard [Fri, 12 Feb 2021 19:42:00 +0000 (20:42 +0100)]
PR432809 VEX should support REX.W + POPF
It seems a REX.W prefix simply explicitly sets the operand size to 8,
and so can/must be ignored as redundant. This is what we already do
for PUSH, POP and PUSHF. All instructions are described as "When in
64-bit mode, instruction defaults to 64-bit operand size and cannot
encode 32-bit operand size." in the instruction manual.
Original patch and analysis by Mike Dalessio <mike.dalessio@gmail.com>
Mark Wielaard [Thu, 11 Feb 2021 17:29:52 +0000 (18:29 +0100)]
vg_regtest: test-specific environment variables not reset between tests
Test-specific environment variables set in .vgtest files are not reset
between tests. This can result in tests running with environment variables
intended for a previously run test. This can be easily fixed by clearing
the @env and @envB arrays in tests/vg_regtest:read_vgtest_file()
Mark Wielaard [Sun, 7 Feb 2021 23:25:52 +0000 (00:25 +0100)]
PR140939 --track-fds reports leakage of stdout/in/err and doesn't respect -q
Make --track-fds=yes not report on file descriptors 0, 1, and 2 (stdin,
stdout, and stderr) by default. Add a new option --track-fds=all that does
report on the std file descriptors still being open. Update testsuite and
documentation.
Original patch by Peter Kelly <pmk@cs.adelaide.edu.au>
Updated by Daniel Fahlgren <daniel@fahlgren.se>
Mark Wielaard [Sat, 6 Feb 2021 21:02:56 +0000 (22:02 +0100)]
PR140178 Support opening /proc/self/exe
Some programs open /proc/self/exe to read some data. Currently valgrind
supports following the /proc/self/exe link (to the original binary, so you
could then open that), but directly opening /proc/self/exe will open the
valgrind tool, not the executable file itself.
Add ML_(handle_self_exe_open) which dups VG_(cl_exec_fd) if the file
to open is /proc/self/exe or /proc/<pid>/exe. And do the same for openat.
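In outline, the handler compares the path and duplicates an already open
descriptor. The sketch below uses ordinary libc calls and hypothetical
names; exec_fd plays the role of VG_(cl_exec_fd) and the real code uses
Valgrind's own library functions instead.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* True if path names the executable of process pid through /proc. */
static bool is_self_exe_path(const char *path, pid_t pid)
{
   char pidpath[64];
   if (strcmp(path, "/proc/self/exe") == 0)
      return true;
   snprintf(pidpath, sizeof pidpath, "/proc/%ld/exe", (long)pid);
   return strcmp(path, pidpath) == 0;
}

/* If path refers to our own executable, duplicate exec_fd into *result_fd
   (-1 if dup fails) and return true.  Otherwise return false so the caller
   performs a normal open. */
static bool handle_self_exe_open(const char *path, int exec_fd,
                                 int *result_fd)
{
   if (!is_self_exe_path(path, getpid()))
      return false;
   *result_fd = dup(exec_fd);
   return true;
}
```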
Mark Wielaard [Wed, 3 Feb 2021 15:56:14 +0000 (16:56 +0100)]
syswrap-linux.c: Pass implicit VKI_IPC_64 for shmctl also on arm64.
The shmctl syscall on amd64, arm64 and riscv (but we don't have a port
for that last one) always uses IPC_64. Explicitly pass it to the generic
PRE/POST handlers so they select the correct (64bit) data structures on
those architectures.
On Linux, there are two variants of the direct shmctl syscall:
- sys_shmctl: always uses shmid64_ds, does not accept IPC_64
- sys_old_shmctl: uses shmid_ds or shmid64_ds depending on IPC_64
The following Linux ABIs have the sys_old_shmctl variant:
alpha, arm, microblaze, mips n32/n64, xtensa
Other ABIs (and future ABIs) have the sys_shmctl variant, including ABIs
that only got sys_shmctl in Linux 5.1 (such as x86, mips o32, ppc,
s390x).
We incorrectly assume the sys_old_shmctl variant on nanomips and x86,
causing shmat() calls under valgrind to fail with EINVAL.
On x86, the issue was previously masked by the non-existence of
__NR_shmctl until a9fc7bceeb0b0 ("Update Linux x86 system call number
definitions") in 2019.
On mips o32, ppc, and s390x this issue is not visible as our headers do
not have __NR_shmctl for those ABIs (396 since Linux 5.1).
Fix the issue by correcting the preprocessor check in get_shm_size() to
only assume the old Linux sys_old_shmctl behavior on the specific
affected platforms.
Also, exclude the use of direct shmctl entirely on Linux x86, ppc,
mips o32, s390x in order to keep compatibility with pre-5.1 kernel
versions that did not yet have direct shmctl for those ABIs.
This currently only has actual effect on x86 as only it has __NR_shmctl
in our headers.
Mark Wielaard [Mon, 1 Feb 2021 21:46:43 +0000 (22:46 +0100)]
Handle Iop_NegF16, Iop_AbsF16 and Iop_SqrtF16 as non-trapping.
Add Iop_NegF16, Iop_AbsF16 and Iop_SqrtF16 to VEX/priv/ir_defs.c
primopMightTrap. Also rewrite case statement slightly so GCC will warn
if an enumeration value is missed.
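The shape of such a switch, shown on a tiny hypothetical enum (the real
IROp enumeration is far larger): every enumerator gets a case label and
there is no default, so GCC's -Wswitch (part of -Wall) reports any
enumerator that is added later without a case.

```c
#include <stdbool.h>

/* Hypothetical subset of operations, for illustration only. */
typedef enum { OP_ADD32, OP_DIVS32, OP_NEGF16, OP_ABSF16, OP_SQRTF16 } Op;

static bool op_might_trap(Op op)
{
   switch (op) {
      /* Operations that never trap. */
      case OP_ADD32:
      case OP_NEGF16:
      case OP_ABSF16:
      case OP_SQRTF16:
         return false;
      /* Operations that may trap. */
      case OP_DIVS32:
         return true;
      /* Deliberately no default label, see above. */
   }
   /* Only reached for values outside the enum; be conservative. */
   return true;
}
```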
Mark Wielaard [Mon, 25 Jan 2021 14:33:34 +0000 (15:33 +0100)]
Add support for DWARF5 as produced by GCC11
Implement DWARF5 in readdwarf.c and readdwarf3.c
Since gcc11 will default to DWARF5 it is time for
valgrind to support it. The patch handles everything gcc11 produces
(except for the new DWARF expressions).
There is some duplication in the patch since we actually have two DWARF
readers which use slightly different abstractions (Slices vs Cursors).
It would be nice if we could merge these somehow. The reader in
readdwarf3.c is only used when --read-var-info=yes is used (which
drd uses to provide the allocation context).
The handling of DW_FORM_implicit_const is tricky with the current design.
An abbrev which contains an attribute encoded with DW_FORM_implicit_const
has its value also in the abbrev. The code in readdwarf3.c assumed it
always could simply get the data from the .debug_info/current Cursor.
For now I added a value field to the name_form field that holds the
associated value. This is slightly wasteful since the extra field is
not necessary for other forms.
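A simplified sketch of that design. The struct and function names are
hypothetical and only two forms are handled; it is meant to show that an
implicit constant is taken from the abbrev entry and consumes nothing from
the .debug_info cursor.

```c
#include <stddef.h>
#include <stdint.h>

#define DW_FORM_data1          0x0b
#define DW_FORM_implicit_const 0x21

/* Hypothetical abbrev attribute entry.  The value field is only meaningful
   for DW_FORM_implicit_const and is wasted space for every other form. */
typedef struct { uint64_t name; uint64_t form; int64_t value; } AttrSpec;

/* Hypothetical cursor over the .debug_info bytes. */
typedef struct { const uint8_t *data; size_t pos, len; } Cursor;

/* Read one attribute value.  Returns 0 on success, -1 on error. */
static int read_attr_value(const AttrSpec *spec, Cursor *c, int64_t *out)
{
   switch (spec->form) {
      case DW_FORM_implicit_const:
         *out = spec->value;   /* stored in the abbrev, reads no input */
         return 0;
      case DW_FORM_data1:
         if (c->pos >= c->len)
            return -1;
         *out = c->data[c->pos++];
         return 0;
      default:
         return -1;            /* other forms omitted in this sketch */
   }
}
```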
Tested against GCC10 (defaulting to DWARF4) and GCC11 (defaulting to
DWARF5) on x86_64. No regressions in the regtests.
Mark Wielaard [Sat, 23 Jan 2021 20:54:07 +0000 (21:54 +0100)]
Define AT as UChar in VEX/priv/guest_ppc_toIR.c (dis_vsx_accumulator_prefix)
GCC notices that AT is passed around as char, specifically as %u argument
to DIP. But ifieldAT returns an UChar and vsx_matrix_ger takes AT as UChar.
This causes lots of format string warnings when building with GCC11.
Mark Wielaard [Sat, 23 Jan 2021 19:22:28 +0000 (20:22 +0100)]
Fix indentation in coregrind/m_debuginfo/readpdb.c (DEBUG_SnarfLinetab)
GCC warns:
readpdb.c:1631:16: warning: this 'if' clause does not guard...
[-Wmisleading-indentation]
1631 | if (debug)
| ^~
In file included from ./pub_core_basics.h:38,
from m_debuginfo/readpdb.c:38:
../include/pub_tool_basics.h:69:30: note: ...this statement, but the latter
is misleadingly indented as if it were guarded by the 'if'
69 | #define ML_(str) VGAPPEND(vgModuleLocal_, str)
| ^~~~~~~~~~~~~~
../include/pub_tool_basics.h:66:29: note: in definition of macro 'VGAPPEND'
66 | #define VGAPPEND(str1,str2) str1##str2
| ^~~~
m_debuginfo/readpdb.c:1636:19: note: in expansion of macro 'ML_'
1636 | ML_(addLineInfo)(
| ^~~
The warning message is slightly hard to read because of the macro expansion.
But GCC is right that the indentation is misleading. Fixed by reindenting.
Carl Love [Mon, 11 Jan 2021 17:39:23 +0000 (11:39 -0600)]
PPC64: Fix load store instructions
This patch fixes numerous errors in the ISA support.
The word and prefix versions of the instructions do not use the same mask
to extract the immediate values. The prefix instructions should all use
the DFOM_IMMASK.
The parsing of prefix instructions has been fixed to ensure the ISA 3.1
instructions all have the ISA_3_1_PREFIX_CHECK check.
Fixed the commenting to improve the comments for the instruction parsing.
Carl Love [Mon, 11 Jan 2021 16:41:47 +0000 (10:41 -0600)]
PPC64: Fix EA calculation for prefixed instructions
The effective address (EA) calculation for the prefixed instructions
concatenates an 18-bit immediate value from the prefix word and a 16-bit
immediate value from the instruction word. This results in a 34-bit value.
The concatenated value must be stored into a long long int not a 32-bit
integer.
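A small sketch of the concatenation, with a hypothetical function name. It
only shows the width issue; any sign extension of the 34-bit result is left
out.

```c
#include <stdint.h>

/* Concatenate the 18-bit immediate from the prefix word with the 16-bit
   immediate from the instruction word.  The result needs 34 bits, so it
   is built and returned in a 64-bit type. */
static uint64_t concat_prefixed_imm(uint32_t imm18, uint32_t imm16)
{
   return ((uint64_t)(imm18 & 0x3FFFFu) << 16) | (imm16 & 0xFFFFu);
}
```

Storing the result in a 32-bit integer would silently drop the top two bits
of the prefix immediate.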
Carl Love [Sun, 10 Jan 2021 02:15:46 +0000 (20:15 -0600)]
PPC64: Fix for VG_MAX_INSTR_SZB, max instruction size is now 8bytes for prefix inst
The ISA 3.1 support has both word instructions of length 4 bytes and prefixed
instructions of length 8 bytes. The following fix is needed when Valgrind
is compiled using an ISA 3.1 compiler.
Julian Seward [Mon, 4 Jan 2021 12:33:24 +0000 (13:33 +0100)]
arm64 insn selector: improved handling of Or1/And1 trees.
This is the exact analog of cadd90993504678607a4f95dfe5d1df5207c1eb0, to the
point of almost being a copy-n-paste. That commit split (amd64) iselCondCode
into two functions, iselCondCode_C (existing) and iselCondCode_R (new). The
latter computes an I1-typed expression into a register rather than a condition
code. The two functions cooperate so as to minimise conversions between
a condition-code value and a value in a register.
Julian Seward [Sat, 2 Jan 2021 16:18:53 +0000 (17:18 +0100)]
More arm64 isel tuning: create {and,orr,eor,add,sub} reg,reg,reg-shifted-by-imm
Thus far the arm64 isel can't generate instructions of the form
{and,or,xor,add,sub} reg,reg,reg-shifted-by-imm
and hence sometimes winds up generating pairs like
lsl x2, x1, #13 ; orr x4, x3, x2
when instead it could just have generated
orr x4, x3, x1, lsl #13
This commit fixes that, although only for the 64-bit case, not the 32-bit
case. Specifically, it can transform the IR forms
{Add,Sub,And,Or,Xor}(E1, {Shl,Shr,Sar}(E2, immediate)) and
{Add,And,Or,Xor}({Shl,Shr,Sar}(E1, immediate), E2)
into a single arm64 instruction. Note that `Sub` is not included in the
second line, because shifting the first operand requires inverting the arg
order in the arm64 instruction, which isn't allowable with `Sub`, since it's
not commutative and arm64 doesn't offer us a reverse-subtract instruction to
use instead.
This gives a 1.1% reduction in generated code size when running
/usr/bin/date on Memcheck.
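The matching rule can be sketched on a toy expression tree. The types and
names below are hypothetical and much simpler than VEX IR; the point is only
the asymmetry between the two forms: a shift in the second operand always
folds, a shift in the first operand folds only for commutative operations.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression tree: leaves are registers or immediates. */
typedef enum { K_REG, K_IMM, K_ADD, K_SUB, K_AND, K_OR, K_XOR,
               K_SHL, K_SHR, K_SAR } Kind;
typedef struct Expr { Kind kind; const struct Expr *a, *b; } Expr;

static bool is_shift_by_imm(const Expr *e)
{
   return (e->kind == K_SHL || e->kind == K_SHR || e->kind == K_SAR)
          && e->b != NULL && e->b->kind == K_IMM;
}

static bool is_arith_or_logic(Kind k)
{
   return k == K_ADD || k == K_SUB || k == K_AND || k == K_OR || k == K_XOR;
}

/* True if e can become one "op reg, reg, reg-shifted-by-imm" instruction.
   *swapped is set when the shifted operand is the first one, so the
   operands have to be exchanged; that is not allowed for Sub. */
static bool can_fold_shift(const Expr *e, bool *swapped)
{
   *swapped = false;
   if (!is_arith_or_logic(e->kind))
      return false;
   if (is_shift_by_imm(e->b))
      return true;
   if (e->kind != K_SUB && is_shift_by_imm(e->a)) {
      *swapped = true;
      return true;
   }
   return false;
}
```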
Julian Seward [Sat, 2 Jan 2021 15:15:03 +0000 (16:15 +0100)]
A bit of tuning of the arm64 isel: do PUT(..) = 0x0:I64 in a single insn.
When running Memcheck, most blocks will do one and often two of `PUT(..) =
0x0:I64`, as a result of the way the front end models arm64 condition codes.
The arm64 isel would generate `mov xN, #0 ; str xN, [xBaseblock, #imm]`,
which is pretty stupid. This patch changes it to a single insn:
`str xzr, [xBaseblock, #imm]`.
This is a special-case for `PUT(..) = 0x0:I64`. General-case integer stores
of 0x0:I64 are unchanged.
This gives a 1.9% reduction in generated code size when running
/usr/bin/date on Memcheck.
Paul Floyd [Wed, 30 Dec 2020 12:57:39 +0000 (13:57 +0100)]
Add an extra suppression.
On Fedora 33 with gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9)
it looks like fun:__static_initialization_and_destruction_0 is
now inlined which causes the existing suppression for the
same reachable to no longer match.
For the cases of sbfm that are actually just sign-extensions to a wider width,
emit that directly and do disassembly-printing accordingly. No functional
change.