pkelsey [Mon, 26 Feb 2018 02:53:22 +0000 (02:53 +0000)]
This is an implementation of the client side of TCP Fast Open (TFO)
[RFC7413]. It also includes a pre-shared key mode of operation in which
the server requires the client to be in possession of a shared secret in
order to successfully open TFO connections with that server.
The names of some existing fastopen sysctls have changed (e.g.,
net.inet.tcp.fastopen.enabled -> net.inet.tcp.fastopen.server_enable).
ae@FreeBSD.org [Fri, 15 Dec 2017 12:37:32 +0000 (12:37 +0000)]
Follow the RFC6980 and silently ignore following IPv6 NDP messages
that had the IPv6 fragmentation header:
o Neighbor Solicitation
o Neighbor Advertisement
o Router Solicitation
o Router Advertisement
o Redirect
Introduce M_FRAGMENTED mbuf flag, and set it after IPv6 fragment reassembly
is completed. Then check the presence of this flag in correspondig ND6
handling routines.
pfg [Mon, 27 Nov 2017 15:01:59 +0000 (15:01 +0000)]
sys/sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
pfg [Mon, 20 Nov 2017 19:45:28 +0000 (19:45 +0000)]
include: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
pfg [Mon, 20 Nov 2017 19:43:44 +0000 (19:43 +0000)]
sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
kib [Tue, 7 Nov 2017 09:46:26 +0000 (09:46 +0000)]
Use hardware timestamps to report packet timestamps
for SO_TIMESTAMP and other similar socket options.
Provide new control message SCM_TIME_INFO to supply information about
timestamp. Currently it indicates that the timestamp was
hardware-assisted and high-precision, for software timestamps the
message is not returned. Reserved fields are added to ABI to report
additional info about it, it is expected that raw hardware clock value
might be useful for some applications.
kib [Tue, 7 Nov 2017 09:29:14 +0000 (09:29 +0000)]
Add a place for a driver to report rx timestamps
in nanoseconds from boot for the received packets.
The rcv_tstmp field overlaps the place of Ln header length indicators,
not used by received packets. The basic pkthdr rearrangement change
in sys/mbuf.h was provided by gallatin.
There are two accompanying M_ flags: M_TSTMP means that there is the
timestamp (and it was generated by hardware).
Another flag M_TSTMP_HPREC indicates that the timestamp is
high-precision. Practically M_TSTMP_HPREC means that hardware
provided additional precision comparing with the stamps when the flag
is not set. E.g., for ConnectX all packets are stamped by hardware
when PCIe transaction to write out the completion descriptor is
performed, but PTP packet are stamped on port. For Intel cards, when
PTP assist is enabled, only PTP packets are stamped in the limited
number of registers, so if Intel cards ever start support this
mechanism, they would always set M_TSTMP | M_TSTMP_HPREC if hardware
timestamp is present for the given packet.
Add IFCAP_HWRXTSTMP interface capability to indicate the support for
hardware rx timestamping, and ifconfig(8) command to toggle it.
Based on the patch by: gallatin
Reviewed by: gallatin (previous version), hselasky
Sponsored by: Mellanox Technologies
MFC after: 2 weeks (? mbuf KBI issue)
X-Differential revision: https://reviews.freebsd.org/D12638
if: Add ioctls to get RSS key and hash type/function.
It will be needed by hn(4) to configure its RSS key and hash
type/function in the transparent VF mode in order to match VF's
RSS settings. The description of the transparent VF mode and
the RSS hash value issue are here:
https://svnweb.freebsd.org/base?view=revision&revision=322299
https://svnweb.freebsd.org/base?view=revision&revision=322485
These are generic enough to promise two independent IOCs instead
of abusing SIOCGDRVSPEC.
Setting RSS key and hash type/function is a different story,
which probably requires more discussion.
Comment about UDP_{IPV4,IPV6,IPV6_EX} were only in the patch
in the review request; these hash types are standardized now.
kib [Sat, 24 Jun 2017 17:01:11 +0000 (17:01 +0000)]
Implement address space guards.
Guard, requested by the MAP_GUARD mmap(2) flag, prevents the reuse of
the allocated address space, but does not allow instantiation of the
pages in the range. It is useful for more explicit support for usual
two-stage reserve then commit allocators, since it prevents accidental
instantiation of the mapping, e.g. by mprotect(2).
Use guards to reimplement stack grow code. Explicitely track stack
grow area with the guard, including the stack guard page. On stack
grow, trivial shift of the guard map entry and stack map entry limits
makes the stack expansion. Move the code to detect stack grow and
call vm_map_growstack(), from vm_fault() into vm_map_lookup().
As result, it is impossible to get random mapping to occur in the
stack grow area, or to overlap the stack guard page.
glebius [Thu, 8 Jun 2017 21:30:34 +0000 (21:30 +0000)]
Listening sockets improvements.
o Separate fields of struct socket that belong to listening from
fields that belong to normal dataflow, and unionize them. This
shrinks the structure a bit.
- Take out selinfo's from the socket buffers into the socket. The
first reason is to support braindamaged scenario when a socket is
added to kevent(2) and then listen(2) is cast on it. The second
reason is that there is future plan to make socket buffers pluggable,
so that for a dataflow socket a socket buffer can be changed, and
in this case we also want to keep same selinfos through the lifetime
of a socket.
- Remove struct struct so_accf. Since now listening stuff no longer
affects struct socket size, just move its fields into listening part
of the union.
- Provide sol_upcall field and enforce that so_upcall_set() may be called
only on a dataflow socket, which has buffers, and for listening sockets
provide solisten_upcall_set().
o Remove ACCEPT_LOCK() global.
- Add a mutex to socket, to be used instead of socket buffer lock to lock
fields of struct socket that don't belong to a socket buffer.
- Allow to acquire two socket locks, but the first one must belong to a
listening socket.
- Make soref()/sorele() to use atomic(9). This allows in some situations
to do soref() without owning socket lock. There is place for improvement
here, it is possible to make sorele() also to lock optionally.
- Most protocols aren't touched by this change, except UNIX local sockets.
See below for more information.
o Reduce copy-and-paste in kernel modules that accept connections from
listening sockets: provide function solisten_dequeue(), and use it in
the following modules: ctl(4), iscsi(4), ng_btsocket(4), ng_ksocket(4),
infiniband, rpc.
o UNIX local sockets.
- Removal of ACCEPT_LOCK() global uncovered several races in the UNIX
local sockets. Most races exist around spawning a new socket, when we
are connecting to a local listening socket. To cover them, we need to
hold locks on both PCBs when spawning a third one. This means holding
them across sonewconn(). This creates a LOR between pcb locks and
unp_list_lock.
- To fix the new LOR, abandon the global unp_list_lock in favor of global
unp_link_lock. Indeed, separating these two locks didn't provide us any
extra parralelism in the UNIX sockets.
- Now call into uipc_attach() may happen with unp_link_lock hold if, we
are accepting, or without unp_link_lock in case if we are just creating
a socket.
- Another problem in UNIX sockets is that uipc_close() basicly did nothing
for a listening socket. The vnode remained opened for connections. This
is fixed by removing vnode in uipc_close(). Maybe the right way would be
to do it for all sockets (not only listening), simply move the vnode
teardown from uipc_detach() to uipc_close()?
imp [Tue, 28 Feb 2017 23:42:47 +0000 (23:42 +0000)]
Renumber copyright clause 4
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
ed@FreeBSD.org [Wed, 3 Aug 2016 06:33:04 +0000 (06:33 +0000)]
mprotect(): Change prototype to comply to POSIX.
Our mprotect() function seems to take a "const void *" address to the
pages whose permissions need to be adjusted. POSIX uses "void *". Simply
stick to the POSIX one to prevent us from writing unportable code.
kib [Sun, 28 Feb 2016 17:52:33 +0000 (17:52 +0000)]
Implement process-shared locks support
for libthr.so.3, without breaking the ABI. Special value is stored in
the lock pointer to indicate shared lock, and offline page in the shared
memory is allocated to store the actual lock.
Reviewed by: vangyzen (previous version)
Discussed with: deischen, emaste, jhb, rwatson,
Martin Simmons <martin@lispworks.com>
Tested by: pho
Sponsored by: The FreeBSD Foundation
jhb [Thu, 4 Jun 2015 19:41:15 +0000 (19:41 +0000)]
Add a new file operations hook for mmap
operations. File type-specific logic is now placed in the mmap hook
implementation rather than requiring it to be placed in
sys/vm/vm_mmap.c. This hook allows new file types to support mmap() as
well as potentially allowing mmap() for existing file types that do not
currently support any mapping.
The vm_mmap() function is now split up into two functions. A new
vm_mmap_object() function handles the "back half" of vm_mmap() and accepts
a referenced VM object to map rather than a (handle, handle_type) tuple.
vm_mmap() is now reduced to converting a (handle, handle_type) tuple to a
a VM object and then calling vm_mmap_object() to handle the actual mapping.
The vm_mmap() function remains for use by other parts of the kernel
(e.g. device drivers and exec) but now only supports mapping vnodes,
character devices, and anonymous memory.
The mmap() system call invokes vm_mmap_object() directly with a NULL object
for anonymous mappings. For mappings using a file descriptor, the
descriptors fo_mmap() hook is invoked instead. The fo_mmap() hook is
responsible for performing type-specific checks and adjustments to
arguments as well as possibly modifying mapping parameters such as flags
or the object offset. The fo_mmap() hook routines then call
vm_mmap_object() to handle the actual mapping.
The fo_mmap() hook is optional. If it is not set, then fo_mmap() will
fail with ENODEV. A fo_mmap() hook is implemented for regular files,
character devices, and shared memory objects (created via shm_open()).
While here, consistently use the VM_PROT_* constants for the vm_prot_t
type for the 'prot' variable passed to vm_mmap() and vm_mmap_object()
as well as the vm_mmap_vnode() and vm_mmap_cdev() helper routines.
Previously some places were using the mmap()-specific PROT_* constants
instead. While this happens to work because PROT_xx == VM_PROT_xx,
using VM_PROT_* is more correct.
to add type-specific information to struct kinfo_file. - Move the
various fill_*_info() methods out of kern_descrip.c and into the various
file type implementations. - Rework the support for kinfo_ofile to
generate a suitable kinfo_file object for each file and then convert
that to a kinfo_ofile structure rather than keeping a second, different
set of code that directly manipulates type-specific file information. -
Remove the shm_path() and ksem_info() layering violations.
kib [Thu, 19 Jun 2014 05:00:39 +0000 (05:00 +0000)]
Add MAP_EXCL flag for mmap(2).
It should be combined with MAP_FIXED, and prevents the request from
deleting existing mappings in the region, failing instead.
Reviewed by: alc
Discussed with: jhb
Tested by: markj, pho (previous version, as part of the bigger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
to request that a mapping use an address in the first 2GB of the
process's address space. This flag should have the same semantics as the
same flag on Linux.
To facilitate this, add a new parameter to vm_map_find() that specifies an
optional maximum virtual address. While here, fix several callers of
vm_map_find() to use a VMFS_* constant for the findspace argument instead of
TRUE and FALSE.
kib [Wed, 21 Aug 2013 17:45:00 +0000 (17:45 +0000)]
Implement read(2)/write(2) and neccessary lseek(2)
for posix shmfd. Add MAC framework entries for posix shm read and write.
Do not allow implicit extension of the underlying memory segment past
the limit set by ftruncate(2) by either of the syscalls. Read and
write returns short i/o, lseek(2) fails with EINVAL when resulting
offset does not fit into the limit.
Discussed with: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
Sebastian Huber [Thu, 23 Aug 2018 10:13:02 +0000 (12:13 +0200)]
Add __nl_item to <sys/_types.h> and use it
Add __nl_item to <sys/_types.h> for FreeBSD compatibility. Use it in
<langinfo.h> and the Cygwin <nl_types.h>. Make the enum __nl_item in
<langinfo.h> anonymous.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Sebastian Huber [Mon, 20 Aug 2018 11:20:06 +0000 (13:20 +0200)]
RTEMS: Add __tls_get_addr() to crt0
Add __tls_get_addr() for all targets to crt0. This is not only used on
ARM. In particular, it is used on RISC-V. This helps to adequately
support the GCC libgomp.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de
Replacing page_size() with allocation_granularity() was incorrect.
The values returned by get_mem_values() are # of pages of size
page_size(). Multiplying with allocation_granularity() here
results in values 16 times too big.
Masamichi Hosoda [Thu, 16 Aug 2018 00:18:50 +0000 (09:18 +0900)]
Fix strtod ("nan") and strtold ("nan") returns wrong negative NaN
The definition of qNaN for x86_64 and i386 was wrong.
strto{d|ld} ("nan") returned wrong negative NaN
instead of correct positive NaN
since it used the wrong definition.
On the other hand, strtof ("nan") returns correct positive NaN
since it uses nanf ("") instead of the wrong definition.
This commit makes strto{d|ld} ("nan") uses {nan|nanl} ("")
like strtof ("nan") using.
So strto{d|ld} ("nan") returns positive NaN.
Wilco Dijkstra [Thu, 16 Aug 2018 10:39:35 +0000 (10:39 +0000)]
Improve sincosf comments
Improve comments in sincosf implementation to make the code easier
to understand. Rename the constant pi64 to pi63 since it's actually
PI * 2^-63. Add comments for fields of sincos_t structure. Add comments
describing implementation details to reduce_fast.
Keep the denormal-operand exception masked; modify FE_ALL_EXCEPT accordingly.
By excluding the denormal-operand exception from FE_ALL_EXCEPT, it will not
be possible anymore to UNmask this exception by means of the API defined by
/usr/include/fenv.h
Note: terminology has changed since IEEE Std 854-1987; denormalized numbers
are called subnormal numbers nowadays.
This modification has basically been motivated by the fact that it is also
not possible on Linux to manipulate the denormal-operand exception by means
of the interface as defined by /usr/include/fenv.h. This has been the state
of affairs on Linux since 2001 (Andreas Jaeger).
The exceptions required by the standard (IEEE Std 754), in case they can be
supported by the implementation, are:
FE_INEXACT, FE_UNDERFLOW, FE_OVERFLOW, FE_DIVBYZERO and FE_INVALID.
Although it is allowed to define additional exceptions, there is no reason
to support the "denormal-operand exception" in this case (fenv.h), because
the subnormal numbers can be handled almost as fast the normalized numbers
by the hardware of the x86/x86_64 architecture. Said differently, a reason
to trap on the input of subnormal numbers does not exist. At least that is
what William Kahan and others at Intel asserted around 2000.
(that is William Kahan of the K-C-S draft, the precursor to the standard)
This commit modifies winsup/cygwin/include/fenv.h as follows:
- redefines FE_ALL_EXCEPT from 0x3f to 0x3d
- removes the definition for FE_DENORMAL
- introduces __FE_DENORM (0x2) (enum in Linux also uses __FE_DENORM)
- introduces FE_ALL_EXCEPT_X86 (0x3f), i.e. ALL x86/x86_64 FP exceptions
wordexp uses fprintf in a dangerous way. It uses an unchecked
input string as format string, rather than as parameter to a %s.
Replace fprintf with fputs.
fnstenv MUST be followed by fldenv in fegetenv(), as the former disables all
exceptions in the x87 FPU, which is not appropriate here (fegetenv() ).
fldenv after fnstenv should reload the x87 FPU w/ the configuration that was
saved by fnstenv, i.e. a configuration that might have exceptions enabled.
Note: x86_64 uses SSE for floating-point, not the x87 FPU. However, because
feraiseexcept() attempts to provoke an exception using the x87 FPU, the bug
in fegetenv() will make this attempt futile here (x86_64).
Note: WoW uses the x87 FPU for floating-point, not SSE. Here anything that
would normally result in triggering an exception, not only feraiseexcept(),
will not be able to, as result of the bug in fegetenv().
pfg [Sun, 21 Jan 2018 20:27:47 +0000 (20:27 +0000)]
Define a new __alloc_size2 attribute to complement the exiting support.
At least on GCC7 calling __alloc_size(x) twice is not equivalent to
calling using the attribute once with two arguments. The later is the
documented use in GCC documentation so add a new alloc_size(n, x)
alternative to cover for the few places where it is used: basically:
calloc(3), reallocarray(3) and mallocarray(9).
Submitted by: Mark Millard
MFC after: 3 days
Reference:
http://docs.freebsd.org/cgi/mid.cgi?F227842D-6BE2-4680-82E7-07906AF61CD7
pfg [Mon, 20 Nov 2017 19:43:44 +0000 (19:43 +0000)]
sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
ed@FreeBSD.org [Mon, 28 Aug 2017 09:35:17 +0000 (09:35 +0000)]
Make _Static_assert() work with GCC in older C++ standards.
GCC only activates C11 keywords in C mode, not C++ mode. This means
that when targeting an older C++ standard, we cannot fall back to using
_Static_assert(). In this case, do define _Static_assert() as a macro
that uses a typedef'ed array.
Sebastian Huber [Fri, 27 Jul 2018 07:16:53 +0000 (09:16 +0200)]
ctype: Fix integer type for caseconv_entry::delta
The commit 46ba1675c457324b0eeef4670a09101ef3f34c50 accidently changed a
bit-field from signed to unsigned. The caseconv_entry::delta must be a
signed integer, see also "newlib/libc/ctype/caseconv.t".
Unfortunately, a standard GCC/Newlib build is done without
-Wsign-conversion. Using this warning option would have helped to avoid
this bug:
caseconv.t:2:22: warning: unsigned conversion from 'int' to 'unsigned int:17' changes value from '-32' to '131040' [-Wsign-conversion]
{0x0061, 25, TOUP, -32},
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Mark Geisert [Tue, 24 Jul 2018 05:31:59 +0000 (22:31 -0700)]
POSIX Asynchronous I/O support: other files
Updates to misc files to integrate AIO into the Cygwin source tree.
Much of it has to be done when adding any new syscalls. There are
some updates to limits.h for AIO-specific limits. And some doc mods.
Mark Geisert [Tue, 24 Jul 2018 05:31:58 +0000 (22:31 -0700)]
POSIX Asynchronous I/O support: fhandler files
This code is where the AIO implementation is wired into existing Cygwin
mechanisms for file and device I/O: the fhandler* functions. It makes
use of an existing internal routine prw_open to supply a "shadow fd"
that permits asynchronous operations on a file the user app accesses
via its own fd. This allows AIO to read or write at arbitrary locations
within a file without disturbing the app's file pointer. (This was
already the case with normal pread|pwrite; we're just adding "async"
to the mix.)
Mark Geisert [Tue, 24 Jul 2018 05:31:57 +0000 (22:31 -0700)]
POSIX Asynchronous I/O support: aio files
This is the core of the AIO implementation: aio.cc and aio.h. The
latter is used within the Cygwin DLL by aio.cc and the fhandler* modules,
as well as by user programs wanting the AIO functionality.
Ken Brown [Mon, 23 Jul 2018 14:10:03 +0000 (10:10 -0400)]
getfacl and setfacl: Align with Linux
Make getfacl print two colons instead of one after "other" and "mask".
Change the help text for setfacl to indicate that there can be either
one colon or two.
strcmp.S: Improve performance for misaligned strings
Replace the simple byte-wise compare in the misaligned case with a
dword compare with page boundary checks in place. For simplicity I've
chosen a 4K page boundary so that we don't have to query the actual
page size on the system.
This results in up to 3x improvement in performance in the unaligned
case on falkor and about 2.5x improvement on mustang as measured using
bench-strcmp in glibc.
This improved memcmp provides a fast path for compares up to 16 bytes
and then compares 16 bytes at a time, thus optimizing loads from both
sources. The glibc memcmp microbenchmark retains performance (with an
error of ~1ns) for smaller compare sizes and reduces up to 31% of
execution time for compares up to 4K on the APM Mustang. On Qualcomm
Falkor this improves to almost 48%, i.e. it is almost 2x improvement
for sizes of 2K and above.
The mutually misaligned inputs on aarch64 are compared with a simple
byte copy, which is not very efficient. Enhance the comparison
similar to strcmp by loading a double-word at a time. The peak
performance improvement (i.e. 4k maxlen comparisons) due to this on
the strncmp microbenchmark in glibc is as follows:
falkor: 3.5x (up to 72% time reduction)
cortex-a73: 3.5x (up to 71% time reduction)
cortex-a53: 3.5x (up to 71% time reduction)
All mutually misaligned inputs from 16 bytes maxlen onwards show
upwards of 15% improvement and there is no measurable effect on the
performance of aligned/mutually aligned inputs.
Tamar Christina [Wed, 11 Jul 2018 12:26:16 +0000 (13:26 +0100)]
Fix AArch32 semihosting SYS_EXIT call on semihosting v1.
The current SYS_EXIT has a bug that when making the call it always uses
the v2 calling convention. This is undefined behavior according to the
semihosting specification:
https://developer.arm.com/docs/100863/latest/semihosting-operations/sys_exit-0x18
This patch fixes it by making sure v1 passes the argument directly in the register instead
of in a block. And for v2 it does the same if the v2 extension isn't supported.
Szabolcs Nagy [Tue, 10 Jul 2018 16:14:32 +0000 (17:14 +0100)]
Remove float compare option from sincosf
PREFER_FLOAT_COMPARISON setting was not correct as it could raise
spurious exceptions. Fixing it is easy: just use ISLESS(x, y) instead
of abstop12(x) < abstop12(y) with appropriate non-signaling definition
for ISLESS. However it seems this setting is not very useful (there is
only minor performance difference on various architectures), so remove
this option for now.
Guard the entire operation with the FastPebLock critical section
used by RtlSetCurrentDirectory_U as well, thus eliminating the
race between concurrent chdir/fchdir/SetCurrentDirectory calls.
Streamline comment explaining the fallback method.
* fhandler_socket_local.cc (get_inet_addr_local): Change type from
'static int' to 'int' to be callable from syslog.cc.
* syslog.cc (connect_syslogd): Use get_inet_addr_local() instead of
getsockname() to retrieve name information of the syslogd socket.
Szabolcs Nagy [Tue, 3 Jul 2018 12:05:31 +0000 (13:05 +0100)]
Fix large ulp error in pow without fma very near 1.0
The !HAVE_FAST_FMA code path split r = z/c - 1 into r = rhi + rlo such
that when z = 1-tiny and c = 1 then rlo and rhi could have much larger
magnitude than r which later caused large rounding errors.
So do a nearest rounding instead of truncation at the split.
In newlib with default settings this was observable on some arm targets
that enable the new math code but has no fma.