sourceware.org Git - glibc.git/commit

author	Joe Ramsay <Joe.Ramsay@arm.com>
	Mon, 23 Sep 2024 14:32:14 +0000 (15:32 +0100)
committer	Wilco Dijkstra <wilco.dijkstra@arm.com>
	Mon, 23 Sep 2024 14:44:07 +0000 (15:44 +0100)
commit	5bc100bd4b7e00db3009ae93d25d303341545d23
tree	1aa1f7486b762b861a9292457a95f6cf2db23d6f
parent	a15b1394b5eba98ffe28a02a392b587e4fe13c0d

AArch64: Improve codegen in users of AdvSIMD log1pf helper

The log1pf helper is quite register-intensive.  Use fewer registers for
the polynomial, and make various changes to shorten dependency chains in
the parent routines.  There is now no spilling with GCC 14.  Accuracy
moves around a little; the comments are adjusted accordingly, but the
change does not require regen-ulps.
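
As a rough illustration of what "shortening a dependency chain" means for
polynomial evaluation, the scalar sketch below compares Horner's scheme
with a pairwise (Estrin-style) scheme.  It is not taken from this patch:
the function names, the degree, and the coefficients (the first Taylor
terms of log1p(x)/x) are placeholders chosen for the example, and the real
helper operates on AdvSIMD vectors with its own coefficients.

```c
#include <assert.h>

/* Hypothetical coefficients: Taylor terms of log1p(x)/x, NOT the
   coefficients used by glibc's log1pf helper.  */
static const float sketch_coeffs[8] = {
  1.0f, -1.0f / 2.0f, 1.0f / 3.0f, -1.0f / 4.0f,
  1.0f / 5.0f, -1.0f / 6.0f, 1.0f / 7.0f, -1.0f / 8.0f
};

/* Horner: seven multiply-adds, each depending on the previous one,
   so the critical path is seven operations long.  */
static float
horner8 (const float c[8], float x)
{
  float r = c[7];
  for (int i = 6; i >= 0; i--)
    r = c[i] + r * x;
  return r;
}

/* Pairwise (Estrin-style): the four pair terms are independent of each
   other and of the powers of x, so the critical path is about three
   multiply-adds, at the cost of extra multiplies for x^2 and x^4.  */
static float
estrin8 (const float c[8], float x)
{
  float x2 = x * x;
  float x4 = x2 * x2;
  float p01 = c[0] + c[1] * x;
  float p23 = c[2] + c[3] * x;
  float p45 = c[4] + c[5] * x;
  float p67 = c[6] + c[7] * x;
  float p03 = p01 + p23 * x2;
  float p47 = p45 + p67 * x2;
  return p03 + p47 * x4;
}

/* Both schemes compute the same polynomial; results may differ only by
   rounding, which is why accuracy can "move around a little".  */
static void
check_schemes_agree (float x)
{
  float d = horner8 (sketch_coeffs, x) - estrin8 (sketch_coeffs, x);
  if (d < 0)
    d = -d;
  assert (d < 1e-5f);
}
```

Calling check_schemes_agree (0.25f) from a test program exercises both
paths.  Which scheme is faster in practice depends on register pressure as
well as latency, which is the trade-off this commit is tuning.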

Use the helper in log1pf as well, instead of having separate
implementations.  The more accurate polynomial means special-casing can
be simplified, and the shorter dependency chain avoids the usual dance
around v0, which is otherwise difficult.

Vectors containing 1.0f (bit pattern 0x3f800000) are duplicated in a few
places, because GCC is not currently able to efficiently handle values
which fit in FMOV but not MOVI and are reinterpreted as integers.  There
may be potential for more optimisation if this is fixed.
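
The equivalence between 1.0f and 0x3f800000 mentioned above can be checked
with a small scalar sketch.  The helper names are hypothetical and this is
not code from the patch; in the vector routines the corresponding
operation would be a reinterpret between float and integer vector types.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the IEEE 754 binary32 bit pattern of x.  memcpy is used
   instead of a pointer cast to avoid strict-aliasing problems.  */
static uint32_t
float_bits (float x)
{
  uint32_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Inverse of float_bits: build a float from a 32-bit pattern.  */
static float
bits_float (uint32_t u)
{
  float x;
  memcpy (&x, &u, sizeof x);
  return x;
}

/* 1.0f has sign 0, biased exponent 127 (0x7f) and zero mantissa,
   which gives 0x7f << 23 == 0x3f800000.  */
static void
check_one_bits (void)
{
  assert (float_bits (1.0f) == UINT32_C (0x3f800000));
  assert (bits_float (UINT32_C (0x3f800000)) == 1.0f);
  assert ((UINT32_C (0x7f) << 23) == UINT32_C (0x3f800000));
}
```

The constant is therefore the same register contents whether it is viewed
as a float or as an integer; the duplication described in the commit
message arises from how the compiler materialises it, not from any
difference in value.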

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

sysdeps/aarch64/fpu/acoshf_advsimd.c
sysdeps/aarch64/fpu/asinhf_advsimd.c
sysdeps/aarch64/fpu/atanhf_advsimd.c
sysdeps/aarch64/fpu/log1pf_advsimd.c
sysdeps/aarch64/fpu/v_log1pf_inline.h