sourceware.org Git - glibc.git/log

Avoid cancellable I/O primitives in ld.so.

Neither the <dlfcn.h> entry points, nor lazy symbol resolution, nor
initial shared library load-up, are cancellation points, so ld.so
should exclusively use I/O primitives that are not cancellable.  We
currently achieve this by having the cancellation hooks compile as
no-ops when IS_IN(rtld); this patch changes to using exclusively
_nocancel primitives in the source code instead, which makes the
intent clearer and significantly reduces the amount of code compiled
under IS_IN(rtld) as well as IS_IN(libc) -- in particular,
elf/Makefile no longer thinks we require a copy of unwind.c in
rtld-libc.a.  (The older mechanism is preserved as a backstop.)
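
For readers unfamiliar with the distinction, here is a minimal sketch
of what a non-cancellable primitive looks like from the outside.  It
is not the code this patch adds: demo_read_nocancel and
demo_pipe_roundtrip are hypothetical names, and the sketch assumes
Linux, where (as far as I know) the generic syscall() wrapper is not a
cancellation point while read() is.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for an internal _nocancel primitive: issue the
   read system call directly instead of going through read(), which is
   a cancellation point.  */
ssize_t
demo_read_nocancel (int fd, void *buf, size_t len)
{
  assert (buf != NULL);
  return syscall (SYS_read, fd, buf, len);
}

/* Round-trip a few bytes through a pipe; returns 0 on success and -1
   on any failure.  */
int
demo_pipe_roundtrip (void)
{
  int fds[2];
  char buf[8] = { 0 };

  if (pipe (fds) != 0)
    return -1;

  ssize_t written = write (fds[1], "ld.so", 5);
  ssize_t n = written == 5
	      ? demo_read_nocancel (fds[0], buf, sizeof buf) : -1;

  close (fds[0]);
  close (fds[1]);
  return (n == 5 && memcmp (buf, "ld.so", 5) == 0) ? 0 : -1;
}
```

The real _nocancel functions are internal to glibc and are not
declared in any installed header, which is why the sketch goes through
syscall() instead.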

The bulk of the change is splitting up the files that define the
_nocancel I/O functions, so they don't also define the variants that
*are* cancellation points; after which, the existing logic for picking
out the bits of libc that need to be recompiled as part of ld.so Just
Works.  I did this for all of the _nocancel functions, not just the
ones used by ld.so, for consistency.

fcntl was a little tricky because it's only a cancellation point for
certain opcodes (F_SETLKW(64), which can block), and the existing
__fcntl_nocancel wasn't applying the FCNTL_ADJUST_CMD hook, which
strikes me as asking for trouble, especially as the only nontrivial
definition of FCNTL_ADJUST_CMD (for powerpc64) changes F_*LK* opcodes.
To fix this, fcntl_common moves to fcntl_nocancel.c along with
__fcntl_nocancel, and changes its name to the extern (but hidden)
symbol __fcntl_nocancel_adjusted, so that regular fcntl can continue
calling it.  __fcntl_nocancel now applies FCNTL_ADJUST_CMD; so that
both fcntl.c and fcntl_nocancel.c can see it, the only nontrivial
definition moves from sysdeps/u/s/l/powerpc/powerpc64/fcntl.c to
.../powerpc64/sysdep.h and becomes entirely a macro, instead of a macro
that calls an inline function.
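
A rough sketch of the shape this takes, under loud assumptions: the
opcode numbers and every DEMO_ or demo_ name below are invented for
illustration, and the real FCNTL_ADJUST_CMD may differ in detail.  The
point is only that the adjustment is a self-contained macro both files
can share, and that it runs before the cancellation decision is made.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcode numbers, chosen only for this sketch; the real
   values come from the kernel headers and differ by architecture.  */
enum
{
  DEMO_F_GETLK = 5, DEMO_F_SETLK = 6, DEMO_F_SETLKW = 7,
  DEMO_F_GETLK64 = 12, DEMO_F_SETLK64 = 13, DEMO_F_SETLKW64 = 14
};

/* The single-offset mapping below only works if the plain and *64
   opcodes are laid out the same distance apart.  */
static_assert (DEMO_F_SETLKW64 - DEMO_F_SETLKW
	       == DEMO_F_GETLK64 - DEMO_F_GETLK,
	       "lock opcodes must be uniformly spaced");

/* Self-contained adjustment macro in the spirit of FCNTL_ADJUST_CMD:
   fold the *64 lock opcodes onto their plain counterparts and leave
   every other opcode alone.  It evaluates its argument more than
   once, so callers must pass an expression without side effects.  */
#define DEMO_ADJUST_CMD(cmd) \
  (((cmd) >= DEMO_F_GETLK64 && (cmd) <= DEMO_F_SETLKW64) \
   ? (cmd) - (DEMO_F_GETLK64 - DEMO_F_GETLK) \
   : (cmd))

/* Only the blocking lock request needs the cancellable path; the
   adjustment is applied first so both spellings are recognized.  */
bool
demo_cmd_is_cancellation_point (int cmd)
{
  return DEMO_ADJUST_CMD (cmd) == DEMO_F_SETLKW;
}
```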

The nptl version of libpthread also changes a little, because its
"compat-routines" formerly included files that defined all the
_nocancel functions it uses; instead of continuing to duplicate them,
I exported the relevant ones from libc.so as GLIBC_PRIVATE.  Since the
Linux fcntl.c calls a function defined by fcntl_nocancel.c, it can no
longer be used from libpthread.so; instead, introduce a custom
forwarder, pt-fcntl.c, and export __libc_fcntl from libc.so as
GLIBC_PRIVATE.  The nios2-linux ABI doesn't include a copy of vfork()
in libpthread, and it was handling that by manipulating
libpthread-routines in .../linux/nios2/Makefile; it is cleaner to do
what other such ports do, and have a pt-vfork.S that defines no symbols.

Right now, it appears that Hurd does not implement _nocancel I/O, so
sysdeps/generic/not-cancel.h will forward everything back to the
regular functions.  This changed the names of some of the functions
that sysdeps/mach/hurd/dl-sysdep.c needs to interpose.

* elf/dl-load.c, elf/dl-misc.c, elf/dl-profile.c, elf/rtld.c
* sysdeps/unix/sysv/linux/dl-sysdep.c
Include not-cancel.h.  Use __close_nocancel instead of __close,
__open64_nocancel instead of __open, __read_nocancel instead of
__libc_read, and __write_nocancel instead of __libc_write.

* csu/check_fds.c (check_one_fd)
* sysdeps/posix/fdopendir.c (__fdopendir)
* sysdeps/posix/opendir.c (__alloc_dir): Use __fcntl_nocancel
instead of __fcntl and/or __libc_fcntl.

* sysdeps/unix/sysv/linux/pthread_setname.c (pthread_setname_np)
* sysdeps/unix/sysv/linux/pthread_getname.c (pthread_getname_np)
* sysdeps/unix/sysv/linux/i386/smp.h (is_smp_system):
Use __open64_nocancel instead of __open_nocancel.

* sysdeps/unix/sysv/linux/not-cancel.h: Move all of the
hidden_proto declarations to the end and issue them if either
IS_IN(libc) or IS_IN(rtld).
* sysdeps/unix/sysv/linux/Makefile [subdir=io] (sysdep_routines):
Add close_nocancel, fcntl_nocancel, nanosleep_nocancel,
open_nocancel, open64_nocancel, openat_nocancel, pause_nocancel,
read_nocancel, waitpid_nocancel, write_nocancel.

* io/Versions [GLIBC_PRIVATE]: Add __libc_fcntl,
__fcntl_nocancel, __open64_nocancel, __write_nocancel.
* posix/Versions: Add __nanosleep_nocancel, __pause_nocancel.

* nptl/pt-fcntl.c: New file.
* nptl/Makefile (pthread-compat-wrappers): Remove fcntl.
(libpthread-routines): Add pt-fcntl.
* include/fcntl.h (__fcntl_nocancel_adjusted): New function.
(__libc_fcntl): Remove attribute_hidden.
* sysdeps/unix/sysv/linux/fcntl.c (__libc_fcntl): Call
__fcntl_nocancel_adjusted, not fcntl_common.
(__fcntl_nocancel): Move to new file fcntl_nocancel.c.
(fcntl_common): Rename to __fcntl_nocancel_adjusted; also move
to fcntl_nocancel.c.
* sysdeps/unix/sysv/linux/fcntl_nocancel.c: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/fcntl.c: Remove file.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h:
Define FCNTL_ADJUST_CMD here, as a self-contained macro.

* sysdeps/unix/sysv/linux/close.c: Move __close_nocancel to...
* sysdeps/unix/sysv/linux/close_nocancel.c: ...this new file.
* sysdeps/unix/sysv/linux/nanosleep.c: Move __nanosleep_nocancel to...
* sysdeps/unix/sysv/linux/nanosleep_nocancel.c: ...this new file.
* sysdeps/unix/sysv/linux/open.c: Move __open_nocancel to...
* sysdeps/unix/sysv/linux/open_nocancel.c: ...this new file.
* sysdeps/unix/sysv/linux/open64.c: Move __open64_nocancel to...
* sysdeps/unix/sysv/linux/open64_nocancel.c: ...this new file.
* sysdeps/unix/sysv/linux/openat.c: Move __openat_nocancel to...
* sysdeps/unix/sysv/linux/openat_nocancel.c: ...this new file.
* sysdeps/unix/sysv/linux/openat64.c: Move __openat64_nocancel to...
* sysdeps/unix/sysv/linux/openat64_nocancel.c: ...this new file.
* sysdeps/unix/sysv/linux/pause.c: Move __pause_nocancel to...
* sysdeps/unix/sysv/linux/pause_nocancel.c: ...this new file.
* sysdeps/unix/sysv/linux/read.c: Move __read_nocancel to...
* sysdeps/unix/sysv/linux/read_nocancel.c: ...this new file.
* sysdeps/unix/sysv/linux/waitpid.c: Move __waitpid_nocancel to...
* sysdeps/unix/sysv/linux/waitpid_nocancel.c: ...this new file.
* sysdeps/unix/sysv/linux/write.c: Move __write_nocancel to...
* sysdeps/unix/sysv/linux/write_nocancel.c: ...this new file.

* sysdeps/unix/sysv/linux/nios2/Makefile: Don't override
libpthread-routines.
* sysdeps/unix/sysv/linux/nios2/pt-vfork.S: New file which
defines nothing.

* sysdeps/mach/hurd/dl-sysdep.c: Define __read instead of
__libc_read, and __write instead of __libc_write.  Define
__open64 in addition to __open.

(cherry picked from commit 329ea513b451ae8322aa7a24ed84da13992af2dd)

Add narrowing divide functions.

This patch adds the narrowing divide functions from TS 18661-1 to
glibc's libm: fdiv, fdivl, ddivl, f32divf64, f32divf32x, f32xdivf64
for all configurations; f32divf64x, f32divf128, f64divf64x,
f64divf128, f32xdivf64x, f32xdivf128, f64xdivf128 for configurations
with _Float64x and _Float128; __nldbl_ddivl for ldbl-opt.

The changes are mostly the same as for the other narrowing
functions, so the description of those generally applies to this patch
as well.

Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.

* math/Makefile (libm-narrow-fns): Add div.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing divide functions.
* math/bits/mathcalls-narrow.h (div): Use __MATHCALL_NARROW.
* math/gen-auto-libm-tests.c (test_functions): Add div.
* math/math-narrow.h (CHECK_NARROW_DIV): New macro.
(NARROW_DIV_ROUND_TO_ODD): Likewise.
(NARROW_DIV_TRIVIAL): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__fdivl): New
macro.
(__ddivl): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fdiv and
ddiv.
(CFLAGS-nldbl-ddiv.c): New variable.
(CFLAGS-nldbl-fdiv.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_ddivl.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_ddivl): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fdiv, fdivl,
ddivl, fMdivfN, fMdivfNx, fMxdivfN and fMxdivfNx.
* math/auto-libm-test-in: Add tests of div.
* math/auto-libm-test-out-narrow-div: New generated file.
* math/libm-test-narrow-div.inc: New file.
* sysdeps/i386/fpu/s_f32xdivf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xdivf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fdiv.c: Likewise.
* sysdeps/ieee754/float128/s_f32divf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64divf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xdivf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_ddivl.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xdivf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_fdivl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_ddivl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fdivl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_ddivl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_fdivl.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-ddiv.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fdiv.c: Likewise.
* sysdeps/ieee754/soft-fp/s_ddivl.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fdiv.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fdivl.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.

(cherry picked from commit 632a6cbe44cdd41dba7242887992cdca7b42922a)

Add narrowing multiply functions.

This patch adds the narrowing multiply functions from TS 18661-1 to
glibc's libm: fmul, fmull, dmull, f32mulf64, f32mulf32x, f32xmulf64
for all configurations; f32mulf64x, f32mulf128, f64mulf64x,
f64mulf128, f32xmulf64x, f32xmulf128, f64xmulf128 for configurations
with _Float64x and _Float128; __nldbl_dmull for ldbl-opt.

The changes are mostly the same as for the narrowing add
functions, so the description of those generally applies to this patch
as well. f32xmulf64 for i386 cannot use precision control as used for
add and subtract, because that would result in double rounding for
subnormal results, so that uses round-to-odd with long double
intermediate result instead. The soft-fp support involves adding a
new FP_TRUNC_COOKED since soft-fp multiplication uses cooked inputs
and outputs.

Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.

* math/Makefile (libm-narrow-fns): Add mul.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing multiply functions.
* math/bits/mathcalls-narrow.h (mul): Use __MATHCALL_NARROW.
* math/gen-auto-libm-tests.c (test_functions): Add mul.
* math/math-narrow.h (CHECK_NARROW_MUL): New macro.
(NARROW_MUL_ROUND_TO_ODD): Likewise.
(NARROW_MUL_TRIVIAL): Likewise.
* soft-fp/op-common.h (FP_TRUNC_COOKED): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__fmull): New
macro.
(__dmull): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fmul and
dmul.
(CFLAGS-nldbl-dmul.c): New variable.
(CFLAGS-nldbl-fmul.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_dmull.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_dmull): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fmul, fmull,
dmull, fMmulfN, fMmulfNx, fMxmulfN and fMxmulfNx.
* math/auto-libm-test-in: Add tests of mul.
* math/auto-libm-test-out-narrow-mul: New generated file.
* math/libm-test-narrow-mul.inc: New file.
* sysdeps/i386/fpu/s_f32xmulf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xmulf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fmul.c: Likewise.
* sysdeps/ieee754/float128/s_f32mulf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64mulf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xmulf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_dmull.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xmulf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_fmull.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_dmull.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fmull.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_dmull.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_fmull.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-dmul.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fmul.c: Likewise.
* sysdeps/ieee754/soft-fp/s_dmull.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fmul.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fmull.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.

(cherry picked from commit 69a01461ee1417578d2ba20aac935828b50f1118)

Add narrowing subtract functions.

This patch adds the narrowing subtract functions from TS 18661-1 to
glibc's libm: fsub, fsubl, dsubl, f32subf64, f32subf32x, f32xsubf64
for all configurations; f32subf64x, f32subf128, f64subf64x,
f64subf128, f32xsubf64x, f32xsubf128, f64xsubf128 for configurations
with _Float64x and _Float128; __nldbl_dsubl for ldbl-opt.

The changes are essentially the same as for the narrowing add
functions, so the description of those generally applies to this patch
as well.

Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.

* math/Makefile (libm-narrow-fns): Add sub.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing subtract functions.
* math/bits/mathcalls-narrow.h (sub): Use __MATHCALL_NARROW.
* math/gen-auto-libm-tests.c (test_functions): Add sub.
* math/math-narrow.h (CHECK_NARROW_SUB): New macro.
(NARROW_SUB_ROUND_TO_ODD): Likewise.
(NARROW_SUB_TRIVIAL): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__fsubl): New
macro.
(__dsubl): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fsub and
dsub.
(CFLAGS-nldbl-dsub.c): New variable.
(CFLAGS-nldbl-fsub.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_dsubl.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_dsubl): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fsub, fsubl,
dsubl, fMsubfN, fMsubfNx, fMxsubfN and fMxsubfNx.
* math/auto-libm-test-in: Add tests of sub.
* math/auto-libm-test-out-narrow-sub: New generated file.
* math/libm-test-narrow-sub.inc: New file.
* sysdeps/i386/fpu/s_f32xsubf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xsubf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fsub.c: Likewise.
* sysdeps/ieee754/float128/s_f32subf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64subf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xsubf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_dsubl.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xsubf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_fsubl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_dsubl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fsubl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_dsubl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_fsubl.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-dsub.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fsub.c: Likewise.
* sysdeps/ieee754/soft-fp/s_dsubl.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fsub.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fsubl.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.

(cherry picked from commit 8d3f9e85cfa14e5f82a0e9e934b9fe1e4cb342bf)

Add narrowing add functions.

This patch adds the narrowing add functions from TS 18661-1 to glibc's
libm: fadd, faddl, daddl, f32addf64, f32addf32x, f32xaddf64 for all
configurations; f32addf64x, f32addf128, f64addf64x, f64addf128,
f32xaddf64x, f32xaddf128, f64xaddf128 for configurations with
_Float64x and _Float128; __nldbl_daddl for ldbl-opt.  As discussed for
the build infrastructure patch, tgmath.h support is deliberately
deferred, and FP_FAST_* macros are not applicable without optimized
function implementations.

Function implementations are added for all relevant pairs of formats
(including certain cases of a format and itself where more than one
type has that format).  The main implementations use round-to-odd, or
a trivial computation in the case where both formats are the same or
where the wider format is IBM long double (in which case we don't
attempt to be correctly rounding).  The sysdeps/ieee754/soft-fp
implementations use soft-fp, and are used automatically for
configurations without exceptions and rounding modes by virtue of
existing Implies files.  As previously discussed, optimized versions
for particular architectures are possible, but not included.
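
To make the round-to-odd idea concrete, here is a minimal sketch for
the double-to-float case.  It deliberately swaps in a different
technique from the one described above: rather than switching the
rounding mode and testing "inexact", it recovers the rounding error
with Knuth's TwoSum and uses it to pick the odd neighbour.  The names
demo_add_round_to_odd and demo_fadd are hypothetical, and the sketch
assumes IEEE binary64/binary32, the default round-to-nearest mode,
finite inputs, no overflow, and no excess precision (so not x87
arithmetic).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return x + y rounded to odd in double: if the exact sum is not
   representable, the result is the neighbouring double whose low
   significand bit is 1.  Assumes finite inputs and no overflow.  */
double
demo_add_round_to_odd (double x, double y)
{
  /* TwoSum: afterwards s + e == x + y exactly.  volatile keeps the
     compiler from simplifying the error terms away.  */
  volatile double s = x + y;
  volatile double y_part = s - x;
  volatile double x_part = s - y_part;
  volatile double e = (x - x_part) + (y - y_part);

  double sum = s;
  if (e == 0.0)
    return sum;			/* The addition was exact.  */

  uint64_t bits;
  memcpy (&bits, &sum, sizeof bits);
  if ((bits & 1) == 0)
    {
      /* sum is the even neighbour; step one ulp toward the exact
	 value.  Incrementing the bit pattern moves away from zero.  */
      if ((e > 0.0) == (sum > 0.0))
	bits++;
      else
	bits--;
      memcpy (&sum, &bits, sizeof sum);
    }
  return sum;
}

/* Narrowing add: one effective rounding from the exact sum to float.  */
float
demo_fadd (double x, double y)
{
  /* Finite inputs only; infinities and NaNs are outside this sketch.  */
  assert (x - x == 0.0 && y - y == 0.0);
  return (float) demo_add_round_to_odd (x, y);
}
```

With x = 0x1.000001p0 and y = 0x1p-60 the exact sum lies just above the
midpoint between two adjacent floats.  The naive (float) (x + y) first
rounds the sum to double, landing exactly on that midpoint, and then
rounds the tie to even, giving 1.0f; demo_fadd returns 0x1.000002p0f.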

i386 gets a special version of f32xaddf64 to avoid problems with
double rounding (similar to the existing fdim version), since this
function must round just once without an intermediate rounding to long
double.  (No such special version is needed for any other function,
because the nontrivial functions use round-to-odd, which does the
intermediate computation with the rounding mode set to round-to-zero,
and double rounding is OK except in round-to-nearest mode, so is OK
for that intermediate round-to-zero computation.)  mul and div will
need slightly different special versions for i386 (using round-to-odd
on long double instead of precision control) because of the
possibility of inexact intermediate results in the subnormal range for
double.

To reduce duplication among the different function implementations,
math-narrow.h gets macros CHECK_NARROW_ADD, NARROW_ADD_ROUND_TO_ODD
and NARROW_ADD_TRIVIAL.

In the trivial cases and for any architecture-specific optimized
implementations, the overhead of the errno setting might be
significant, but I think that's best handled through compiler built-in
functions rather than providing separate no-errno versions in glibc
(and likewise there are no __*_finite entry points provided for these
functions, __*_finite effectively being no-errno versions at present in
most cases).

Tested for x86_64 and x86, with both GCC 6 and GCC 7.  Tested for
mips64 (all three ABIs, both hard and soft float) and powerpc with GCC
7.  Tested with build-many-glibcs.py with both GCC 6 and GCC 7.

* math/Makefile (libm-narrow-fns): Add add.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing add functions.
* math/bits/mathcalls-narrow.h (add): Use __MATHCALL_NARROW.
* math/gen-auto-libm-tests.c (test_functions): Add add.
* math/math-narrow.h (CHECK_NARROW_ADD): New macro.
(NARROW_ADD_ROUND_TO_ODD): Likewise.
(NARROW_ADD_TRIVIAL): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__faddl): New
macro.
(__daddl): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fadd and
dadd.
(CFLAGS-nldbl-dadd.c): New variable.
(CFLAGS-nldbl-fadd.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_daddl.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_daddl): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fadd, faddl,
daddl, fMaddfN, fMaddfNx, fMxaddfN and fMxaddfNx.
* math/auto-libm-test-in: Add tests of add.
* math/auto-libm-test-out-narrow-add: New generated file.
* math/libm-test-narrow-add.inc: New file.
* sysdeps/i386/fpu/s_f32xaddf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xaddf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fadd.c: Likewise.
* sysdeps/ieee754/float128/s_f32addf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64addf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xaddf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_daddl.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xaddf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_faddl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_daddl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_faddl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_daddl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_faddl.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-dadd.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fadd.c: Likewise.
* sysdeps/ieee754/soft-fp/s_daddl.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fadd.c: Likewise.
* sysdeps/ieee754/soft-fp/s_faddl.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.

(cherry picked from commit d8742dd82f6a00601155c69bad3012e905591e1f)

Add build infrastructure for narrowing libm functions.

TS 18661-1 defines libm functions that carry out an operation (+ - * /
sqrt fma) on their arguments and return a result rounded to a
(usually) narrower type, as if the original result were computed to
infinite precision and then rounded directly to the result type
without any intermediate rounding to the argument type.  For example,
fadd, faddl and daddl for addition.  These are the last TS 18661-1
functions remaining to be added to glibc.  TS 18661-3 extends this
to corresponding functions for _FloatN and _FloatNx types.

As functions parametrized by two rather than one varying
floating-point types, these functions require infrastructure in glibc
that was not required for previous libm functions.  This patch
provides such infrastructure - excluding test support, and actual
function implementations, which will be in subsequent patches.

Declaring the functions uses a header bits/mathcalls-narrow.h, which
is included many times, for each relevant pair of types.  This will
end up containing macro calls of the form

__MATHCALL_NARROW (__MATHCALL_NAME (add), __MATHCALL_REDIR_NAME (add), 2);

for each family of narrowing functions.  (The structure of this macro
call, with the calls to __MATHCALL_NAME and __MATHCALL_REDIR_NAME
there rather than in the definition of __MATHCALL_NARROW, arises from
the names such as "add" *not* themselves being reserved identifiers -
meaning it's necessary to avoid any indirection that would result in a
user-defined "add" macro being expanded.)  Whereas for existing
functions declaring long double functions is disabled if _LIBC in the
case where they alias double functions, to facilitate defining the
long double functions as aliases of the double ones, there is no such
logic for the narrowing functions in this patch.  Rather, the files
defining such functions are expected to use #define to hide the
original declarations of the alias names, to avoid errors about
defining aliases with incompatible types.
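
The macro-expansion hazard mentioned in the parenthesis above can be
reproduced in isolation.  The sketch below uses hypothetical DEMO_
macros, not the real __MATHCALL_* ones: an operand of ## is pasted
without being macro-expanded, whereas passing the same name through
one extra macro layer expands it first.

```c
#include <assert.h>
#include <string.h>

#define DEMO_STR(x) #x
#define DEMO_XSTR(x) DEMO_STR (x)

/* Pastes immediately: an operand of ## is not macro-expanded first.  */
#define DEMO_NAME(prefix, op) prefix ## op
/* Extra layer: the arguments are fully expanded before DEMO_NAME
   ever sees them.  */
#define DEMO_NAME_INDIRECT(prefix, op) DEMO_NAME (prefix, op)

/* Stands in for a macro that user code is entitled to define, since
   "add" is not a reserved identifier.  */
#define add user_thing

const char *
demo_direct (void)
{
  return DEMO_XSTR (DEMO_NAME (f, add));
}

const char *
demo_indirect (void)
{
  return DEMO_XSTR (DEMO_NAME_INDIRECT (f, add));
}

#undef add

/* Usage: the direct form keeps the intended name, the indirect form
   picks up the user's macro.  */
void
demo_check (void)
{
  assert (strcmp (demo_direct (), "fadd") == 0);
  assert (strcmp (demo_indirect (), "fuser_thing") == 0);
}
```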

math/Makefile support is added for building the functions (listed in
libm-narrow-fns, currently empty) for all relevant pairs of types.  An
internal header math-narrow.h is added for macros shared between
multiple function implementations - currently a ROUND_TO_ODD macro to
facilitate writing functions using the round-to-odd implementation
approach, and alias macros to create all the required function
aliases.  libc_feholdexcept_setroundf128 and libc_feupdateenv_testf128
are added for use when required (only for x86_64).  float128_private.h
support is added for ldbl-128 narrowing functions to be used for
_Float128.

Certain things are specifically omitted from this patch and the
immediate followups.  tgmath.h support is deferred; there remain
unresolved questions about how the type-generic macros for these
functions are supposed to work, especially in the case of arguments of
integer type.  The math.h / bits/mathcalls-narrow.h logic, and the
logic for determining what functions / aliases to define, will need
some adjustments to support the sqrt and fma functions, where
e.g. f32xsqrtf64 can just be an alias for sqrt rather than a separate
function.  TS 18661-1 defines FP_FAST_* macros but no support is
included for defining them (they won't in general be true without
architecture-specific optimized function versions).

For each of the function groups (add sub mul div sqrt fma) there are
always six functions present (e.g. fadd, faddl, daddl, f32addf64,
f32addf32x, f32xaddf64).  When _Float64x and _Float128 are supported,
there are seven more (e.g. f32addf64x, f32addf128, f64addf64x,
f64addf128, f32xaddf64x, f32xaddf128, f64xaddf128).  In addition, in
the ldbl-opt case there are function names such as __nldbl_daddl (an
alias for f32xaddf64, which is not a reserved name in TS 18661-1, only
in TS 18661-3), for calls to daddl to be mapped to in the
-mlong-double-64 case.  (Calls to faddl just get mapped to fadd, and
for sqrt and fma there won't be __nldbl_* functions because dsqrtl and
dfmal can just be mapped to sqrt and fma with -mlong-double-64.)

While there are six or thirteen functions present in each group (plus
__nldbl_* names only as an ABI, not an API), not all are distinct;
they fall in various groups of aliases.  There are two distinct
versions built if long double has the same format as double; four if
they have distinct formats but there is no _Float64x or _Float128
support; five if long double has binary128 format; seven when
_Float128 is distinct from long double.

Architecture-specific optimized versions are possible, but not
included in my patches.  For example, IA64 generally supports
narrowing the result of most floating-point instructions; Power ISA
2.07 (POWER8) supports double values as arguments to float
instructions, with the results narrowed as expected; Power ISA 3
(POWER9) supports round-to-odd for float128 instructions, meaning
that approach can be used without needing to set and restore the
rounding mode and test "inexact".  I intend to leave any such
optimized versions to the architecture maintainers.  Generally in such
cases it would also make sense for calls to these functions to be
expanded inline (given -fno-math-errno); I put a suggestion for TS
18661-1 built-in functions at <https://gcc.gnu.org/wiki/SummerOfCode>.

Tested for x86_64 (this patch in isolation, as well as testing for
various configurations in conjunction with further patches).

* math/bits/mathcalls-narrow.h: New file.
* include/bits/mathcalls-narrow.h: Likewise.
* math/math-narrow.h: Likewise.
* math/math.h (__MATHCALL_NARROW_ARGS_1): New macro.
(__MATHCALL_NARROW_ARGS_2): Likewise.
(__MATHCALL_NARROW_ARGS_3): Likewise.
(__MATHCALL_NARROW_NORMAL): Likewise.
(__MATHCALL_NARROW_REDIR): Likewise.
(__MATHCALL_NARROW): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)]: Repeatedly include
<bits/mathcalls-narrow.h> with _Mret_, _Marg_ and __MATHCALL_NAME
defined.
[__GLIBC_USE (IEC_60559_TYPES_EXT)]: Likewise.
* math/Makefile (headers): Add bits/mathcalls-narrow.h.
(libm-narrow-fns): New variable.
(libm-narrow-types-basic): Likewise.
(libm-narrow-types-ldouble-yes): Likewise.
(libm-narrow-types-float128-yes): Likewise.
(libm-narrow-types-float128-alias-yes): Likewise.
(libm-narrow-types): Likewise.
(libm-routines): Add narrowing functions.
* sysdeps/i386/fpu/fenv_private.h [__x86_64__]
(libc_feholdexcept_setroundf128): New macro.
[__x86_64__] (libc_feupdateenv_testf128): Likewise.
* sysdeps/ieee754/float128/float128_private.h: Include
<math/math-narrow.h>.
[libc_feholdexcept_setroundf128] (libc_feholdexcept_setroundl):
Undefine and redefine.
[libc_feupdateenv_testf128] (libc_feupdateenv_testl): Likewise.
(libm_alias_float_ldouble): Undefine and redefine.
(libm_alias_double_ldouble): Likewise.

Signed-off-by: Pranav Kant <prka@google.com>

getaddrinfo: Fix leak with AI_ALL [BZ #28852]

Use realloc in convert_hostent_to_gaih_addrtuple and fix up pointers in
the result list so that a single block is maintained for
hostbyname3_r/hostbyname2_r and freed in gaih_inet. This result is
never merged with any other results, since the hosts database does not
permit merging.
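
The pointer fix-up after realloc can be sketched as follows.  This is
a minimal illustration under stated assumptions, not the code in
convert_hostent_to_gaih_addrtuple: struct tuple_sketch and
grow_tuples_sketch are hypothetical names, and the real tuples carry
address data rather than an int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a result tuple; not glibc's gaih_addrtuple.  */
struct tuple_sketch
{
  struct tuple_sketch *next;
  int value;
};

/* Grow BLOCK from OLD_COUNT to OLD_COUNT + ADD_COUNT entries and relink
   every entry, because realloc may move the block and leave the old
   next pointers dangling.  Returns the new base, or NULL on failure
   (in which case BLOCK is still valid and unchanged).  */
static struct tuple_sketch *
grow_tuples_sketch (struct tuple_sketch *block, size_t old_count,
                    size_t add_count)
{
  assert (add_count > 0);
  size_t total = old_count + add_count;
  struct tuple_sketch *p = realloc (block, total * sizeof *p);
  if (p == NULL)
    return NULL;
  for (size_t i = 0; i < total; i++)
    {
      /* Recompute each link relative to the (possibly moved) base.  */
      p[i].next = i + 1 < total ? &p[i + 1] : NULL;
      if (i >= old_count)
        p[i].value = 0;   /* New entries start out zeroed.  */
    }
  return p;
}
```

Because the whole list lives in one allocation, the owner can release
it with a single free of the base pointer.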

Resolves BZ #28852.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: DJ Delorie <dj@redhat.com>

Optimize pthread_cond_timedwait to avoid unnecessary call to clock_gettime for CLOCK_MONOTONIC

getcwd: Set errno to ERANGE for size == 1 (CVE-2021-3999)

Cherry-picked from 23e0e8f5f1fb5ed150253d986ecccdc90c2dcd5e in main branch.
Test included with this commit is not cherry-picked because it requires more
changes.

No valid path returned by getcwd would fit into 1 byte, so reject the
size early and return NULL with errno set to ERANGE.  This change is
prompted by CVE-2021-3999, which describes a single byte buffer
underflow and overflow when all of the following conditions are met:

- The buffer size (i.e. the second argument of getcwd) is 1 byte
- The current working directory is too long
- '/' is also mounted on the current working directory

Sequence of events:

- In sysdeps/unix/sysv/linux/getcwd.c, the syscall returns ENAMETOOLONG
  because the linux kernel checks for name length before it checks
  buffer size

- The code falls back to the generic getcwd in sysdeps/posix

- In the generic func, the buf[0] is set to '\0' on line 250

- this while loop on line 262 is bypassed:

    while (!(thisdev == rootdev && thisino == rootino))

  since the rootfs (/) is bind mounted onto the directory and the flow
  goes on to line 449, where it puts a '/' in the byte before the
  buffer.

- Finally on line 458, it moves 2 bytes (the underflowed byte and the
  '\0') to the buf[0] and buf[1], resulting in a 1 byte buffer overflow.

- buf is returned on line 469 and errno is not set.
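
The early rejection can be sketched as a wrapper around getcwd.  This
is an illustration only: getcwd_checked_sketch is a hypothetical name,
the real check lives inside the glibc getcwd implementation, and the
sketch assumes a caller-supplied buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical wrapper illustrating the early size check.  */
static char *
getcwd_checked_sketch (char *buf, size_t size)
{
  /* This sketch does not cover the buf == NULL allocation extension.  */
  assert (buf != NULL);

  /* Even "/" needs two bytes ('/' plus the terminating NUL), so a
     one-byte buffer can never hold a valid result.  */
  if (size == 1)
    {
      errno = ERANGE;
      return NULL;
    }
  return getcwd (buf, size);
}
```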

This resolves BZ #28769.

Reviewed-by: Andreas Schwab <schwab@linux-m68k.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Qualys Security Advisory <qsa@qualys.com>
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

CVE-2018-19591: if_nametoindex: Fix descriptor for overlong name [BZ #23927]

CVE-2016-10739: getaddrinfo: Fully parse IPv4 address strings [BZ #20018]

Some tests in original commit are not included because they depend on headers that
are not present in GRTEv5 branch.

The IPv4 address parser in the getaddrinfo function is changed so that
it does not ignore trailing whitespace and all characters after it.
For backwards compatibility, the getaddrinfo function still recognizes
legacy name syntax, such as 192.000.002.010 interpreted as 192.0.2.8
(octal).
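
The difference between exact and lenient parsing can be sketched as
below.  This is not __inet_aton_exact: parse_ipv4_exact_sketch is a
hypothetical helper, and it accepts only the four-component form,
which is a simplification made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse exactly four dot-separated components, each in C numeric
   syntax (so a leading 0 means octal), and reject anything left over
   after the fourth component.  Returns 1 on success, 0 on failure.  */
static int
parse_ipv4_exact_sketch (const char *s, uint8_t out[4])
{
  assert (s != NULL);
  for (int i = 0; i < 4; i++)
    {
      /* strtoul would skip leading whitespace and accept a sign;
         refuse both by requiring a digit first.  */
      if (*s < '0' || *s > '9')
        return 0;
      char *end;
      errno = 0;
      unsigned long v = strtoul (s, &end, 0);
      if (errno != 0 || end == s || v > 255)
        return 0;
      out[i] = (uint8_t) v;
      if (i < 3)
        {
          if (*end != '.')
            return 0;
          s = end + 1;
        }
      else if (*end != '\0')
        return 0;   /* Trailing whitespace or other characters.  */
    }
  return 1;
}
```

With this sketch, "192.000.002.010" parses as 192.0.2.8, while
"192.0.2.1 " and "192.0.2.1 example" are rejected.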

This commit does not change the behavior of inet_addr and inet_aton.
gethostbyname already had additional sanity checks (but is switched
over to the new __inet_aton_exact function for completeness as well).

To avoid sending the problematic query names over DNS, commit
6ca53a2453598804a2559a548a08424fca96434a ("resolv: Do not send queries
for non-host-names in nss_dns [BZ #24112]") is needed.

Typo in configure.ac

Fallback from 0778e25fe1f34789794689f99e25b0c5ff001795

Replace math-barriers with math_private

That's where the definition for math_force_eval was before refactoring

Expose __isinff128 for clang

x86_64: Add SSE sfp-exceptions

The exported x86_64 fenv.h functions operate on both i387 and SSE (since
they should work on float, double, and long double) while the
internal libc_fe* set either SSE (float, double, and float128) or
i387 (long double).

The libgcc __sfp_handle_exceptions (used in the float128 implementation),
however, will set either an SSE or an i387 exception depending on the
exception to raise. This broke the internal assumption of float128
where only SSE operations will be used.

This patch reimplements the libgcc __sfp_handle_exceptions to use only
SSE operations and sets libgcc to use it instead of its own
implementation.

And I think we should fix libgcc in a similar manner, since checking on
config/i386/64/sfp-machine.h it already only supports SSE rounding mode
and x86_64 ABI also expects float128 to use SSE registers [1]
(although it is not clear on how future implementation might implement
it).

Checked on x86_64-linux-gnu.

[1] https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI

Sync configure.ac with configure script

Fallback from bade6276d16523f81a1dedf22e591730592f15d6

Get rid of WANT_FLOAT128 usage in floatn.h

This header is installed system-wide. It's not correct to introduce a new
macro WANT_FLOAT128 in this because then we are either forcing the compiler to
make it an inbuilt macro to make glibc expose all float128 functionality, or asking
our clients to -DWANT_FLOAT128 to get float128 functionality in glibc.

Given we are primarily going to have float128 enabled GRTE now, we don't need to have
guards for non-float128 cases.

-DWANT_FLOAT usage and enable float128 tests

x86: Respect --disable-float128 flag to disable FLOAT128 functionality

x86: Define __HAVE_FLOAT128 for Clang and use __builtin_*f128 code path

Clang supports __builtin_fabsf128 (despite not supporting _Float128) but
not __builtin_fabsq.
By falling back to `typedef __float128 _Float128;`, the float128 code
will be buildable with Clang.

configure: Use same pattern to find headers for clang

math: x86: Use prefix for FP_INIT_ROUNDMODE

Not all compilers support the inline asm prefix '%v' to emit the AVX
instruction if AVX is enabled. Use a prefix instead.

Checked on x86_64-linux-gnu and i686-linux-gnu.

Avoid error due to -Wimplicit-function-declaration

math: x86: Avoid the use of __libgcc_cmp_return__ for __gcc_CMPtype

Add -Wl,--undefined-version when using newer lld

to work around errors like

version script assignment of 'GLIBC_2.4' to symbol '__stack_chk_guard' failed: symbol not defined

Apply upstream __builtin_FILE commit to GRTEv5.

https://sourceware.org/git/?p=glibc.git;a=commit;h=e42ec822190056895e55e5140ce2304e67e34445

nptl: Make mmap and munmap in thread stack allocation interposable

b/238021577: __mmap and __munmap are not interposable. Call
interposable mmap and munmap instead so that we can capture thread stack
allocations.

Fix build of nptl/tst-thread_local1.cc with GCC 12

The test nptl/tst-thread_local1.cc fails to build with GCC mainline
because of changes to what libstdc++ headers implicitly include what
other headers:

tst-thread_local1.cc: In function 'int do_test()':
tst-thread_local1.cc:177:5: error: variable 'std::array<std::pair<const char*, std::function<void(void* (*)(void*))> >, 2> do_thread_X' has initializer but incomplete type
177 | do_thread_X
| ^~~~~~~~~~~

Fix this by adding an explicit include of <array>.

Tested with build-many-glibcs.py for aarch64-linux-gnu.

(cherry picked from commit 2ee9b24f47db8d0a8d0ccadb999335a1d4cfc364)

<string.h>: Define __CORRECT_ISO_CPP_STRING_H_PROTO for Clang [BZ #25232]

Without the asm redirects, strchr et al. are not const-correct.

libc++ has a wrapper header that works with and without
__CORRECT_ISO_CPP_STRING_H_PROTO (using a Clang extension). But when
Clang is used with libstdc++ or just C headers, the overloaded functions
with the correct types are not declared.

This change does not impact current GCC (with libstdc++ or libc++).

(cherry picked from commit 953ceff17a4a15b10cfdd5edc3c8cae4884c8ec3)

Makeconfig: Update clang_rt.crtbegin.o filename

Remove x86_64 specific lowlevellock/cancellation

The x86_64 specific implementation has CFI directives like
`.cfi_adjust_cfa_offset 128` which are incorrect when RBP is used as the
canonical frame address.

This follows the spirit of the following two commits by removing the
x86_64 specific implementation. The generic implementation will be used.

* eb76e5b465a4b7b569cde4b4f57d1fcb4695c1c6 ("nptl: Reinstate pthread_timedjoin_np as a cancellation point (BZ#24215)")
* c50e1c263ec15e98da3235e663049156fd1afcfa ("x86: Remove arch-specific low level lock implementation")

elf: Support DT_RELR relative relocation format

Adapted from
https://sourceware.org/pipermail/libc-alpha/2022-April/138085.html
([PATCH v11 0/7] Support DT_RELR relative relocation format),
which is expected to be included in glibc 2.36.

glibc 2.35 has a fair amount of rtld changes to avoid nested functions
(https://sourceware.org/PR27220). This patch is carefully crafted to
make the minimal changes.

Notably, this commit

* works around b/208156916 by not bumping DT_NUM. DT_RELR and DT_RELRSZ
  take the l_info slots at DT_VERSYM+1 and DT_VERSYM+2.
* avoids changes to include/link.h
* removes the time travel compatibility check (error if DT_RELR is used
  without GLIBC_ABI_DT_RELR version need). This needs link.h change and
  the detected case cannot happen if we correctly use
  -Wl,-z,pack-relative-relocs.

configure: Don't check LD -v --help for LIBC_LINKER_FEATURE

When LIBC_LINKER_FEATURE is used to check a linker option with the equal
sign, it will likely fail because the LD -v --help output may look like
`-z lam-report=[none|warning|error]` while the needle is something like
`-z lam-report=warning`.

The LD -v --help filter doesn't save much time, so just remove it.

(cherry picked from commit 8438135d3481853e300e1043cfee3946dadb28b3)

Use libc_hidden_* for atoi (bug 15105).

Continuing the fixes for localplt test failures with -Os arising from
functions not being inlined in that case, this patch fixes such
failures for atoi by using libc_hidden_proto and libc_hidden_def.
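
The effect of a hidden alias can be sketched with plain GCC
attributes.  This is a simplified illustration for ELF targets, not
the expansion of libc_hidden_proto and libc_hidden_def (which rename
the symbol through asm labels so internal callers keep using the
public name); every identifier below is hypothetical.

```c
#include <assert.h>

/* Public entry point; a made-up stand-in for a function such as atoi.  */
int
digit_value_sketch (int c)
{
  assert (c >= '0' && c <= '9');
  return c - '0';
}

/* Hidden alias for internal callers.  With hidden visibility the
   reference is bound at link time, so no PLT entry is needed.  */
extern __typeof (digit_value_sketch) digit_value_sketch_internal
  __attribute__ ((alias ("digit_value_sketch"), visibility ("hidden")));

/* An internal caller goes through the hidden name.  */
int
two_digit_value_sketch (int tens, int ones)
{
  return 10 * digit_value_sketch_internal (tens)
         + digit_value_sketch_internal (ones);
}
```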

Tested for x86_64 (both that it removes this particular localplt
failure for -Os, and that the testsuite continues to pass without
-Os).

[BZ #15105]
* stdlib/atoi.c (atoi): Use libc_hidden_def.
* include/stdlib.h [!_ISOMAC] (atoi): Use libc_hidden_proto.

(cherry picked from commit 20602c72fa54bc0923314820ec8148186096bf3b)

Use libc_hidden_* for tolower, toupper (bug 15105).

Continuing the fixes for localplt test failures with -Os arising from
functions not being inlined in that case, this patch fixes such
failures for tolower and toupper by using libc_hidden_proto and
libc_hidden_def.

Tested for x86_64 (both that it removes this particular localplt
failure for -Os, and that the testsuite continues to pass without
-Os).

2018-02-22 Joseph Myers <joseph@codesourcery.com>

[BZ #15105]
* ctype/ctype.c (tolower): Use libc_hidden_def.
(toupper): Likewise.
* include/ctype.h [!_ISOMAC] (tolower): Use libc_hidden_proto.
[!_ISOMAC] (toupper): Likewise.

(cherry picked from commit 54412d20618b7b93f136a168e788573575f8a7a6)

Use libc_hidden_* for argz_next, __argz_next (bug 15105).

Among other localplt test failures when building with -Os, there are
libc.so PLT references for argz_next and __argz_next. This is a
simple case of functions that are inlined for -O2 but not for -Os;
this patch adds libc_hidden_proto / libc_hidden_def for them to avoid
localplt failures even when not inlined.

Tested for x86_64 (both that it removes these particular localplt
failures for -Os - but other such failures remain so the bug can't yet
be closed - and that the testsuite continues to pass without -Os).

[BZ #15105]
* include/argz.h (argz_next): Use libc_hidden_proto.
(__argz_next): Likewise.
* string/argz-next.c (__argz_next): Use libc_hidden_def.
(argz_next): Use libc_hidden_weak.

(cherry picked from commit 055ac2a7eeb14755e946440af3d2cdfe95f18f8e)

Use libc_hidden_* for __cmsg_nxthdr (bug 15105).

Among other localplt test failures when building with -Os, there are
libc.so PLT references for __cmsg_nxthdr. This is a simple case of a
function that is inlined for -O2 but not for -Os; this patch adds
libc_hidden_proto / libc_hidden_def for it to avoid a localplt failure
even when it is not inlined.

Tested for x86_64 (both that it removes this particular localplt
failure for -Os - but other such failures remain so the bug can't yet
be closed - and that the testsuite continues to pass without -Os).

[BZ #15105]
* include/sys/socket.h [!_ISOMAC] (__cmsg_nxthdr): Use
libc_hidden_proto.
* sysdeps/unix/sysv/linux/cmsg_nxthdr.c (__cmsg_nxthdr): Use
libc_hidden_def.

(cherry picked from commit e4452a2d19279d4c90bcafe09ec3cbfd3efe9b6a)

Use libc_hidden_* for fputs (bug 15105).

Among other localplt test failures when building with -Os, there are
libc.so PLT references for fputs.  fputs calls normally get redirected
to _IO_fputs by a macro in include/stdio.h (and _IO_fputs in turn uses
libc_hidden_proto), but GCC can convert an fprintf call with a
constant string argument into an fputs call, which of course is then
unaffected by the macro redirection.  (I don't know why this issue
only appears with -Os.)

This patch duly adds a use of libc_hidden_proto for fputs.  I see no
obvious reason why the fputs macro redirection is needed at all, but
this patch does not change it.

Tested for x86_64 (both that it removes this particular localplt
failure for -Os - but other such failures remain so the bug can't yet
be closed - and that the testsuite continues to pass without -Os).

[BZ #15105]
* include/stdio.h [!_ISOMAC && IS_IN (libc)] (fputs): Use
libc_hidden_proto.
* libio/iofputs.c (fputs): Use libc_hidden_weak.

(cherry picked from commit 499b315324519f8deb5b42a143a76319934a3ab0)

Fix -Os gnu_dev_* linknamespace, localplt issues (bug 15105, bug 19463).

Building with -Os produces linknamespace and localplt failures for,
among other functions, gnu_dev_major, gnu_dev_minor and
gnu_dev_makedev.

The issue is that those functions are not inlined when building with
-Os. While one could force them to be inlined in that case, it seems
more natural to fix this issue similarly to other namespace issues.
Thus, this patch makes gnu_dev_* into weak aliases for hidden symbols
__gnu_dev_*; __gnu_dev_* are then defined as inlines in the internal
include/sys/sysmacros.h, and uses of gnu_dev_* (often via the macros
major, minor and makedev) for which there are namespace issues are
changed to use __gnu_dev_*; where there are no namespace issues, use
of libc_hidden_proto serves to avoid unnecessary local PLT entry use.

Tested for x86_64, (a) without -Os, to verify the testsuite continues
to pass without problems and that the functions called under their new
names continue to be inlined as expected in that case; (b) with -Os,
to verify that the linknamespace and localplt failures in question go
away (but because of other such failures present, neither of the
relevant bugs can yet be closed).

[BZ #15105]
[BZ #19463]
* include/sys/sysmacros.h [!_ISOMAC]
(__SYSMACROS_NEED_IMPLEMENTATION): Define macro.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC]
(_SYS_SYSMACROS_H_WRAPPER): Likewise.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (gnu_dev_major): Use
libc_hidden_proto.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (gnu_dev_minor): Likewise.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (gnu_dev_makedev):
Likewise.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (__SYSMACROS_DECL_TEMPL):
Undefine and redefine to use __gnu_dev_ prefix.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (__SYSMACROS_IMPL_TEMPL):
Likewise.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (__gnu_dev_major): Declare
and define as hidden inline function.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (__gnu_dev_minor):
Likewise.
[!_SYS_SYSMACROS_H_WRAPPER && !_ISOMAC] (__gnu_dev_makedev):
Likewise.
* misc/makedev.c (OUT_OF_LINE_IMPL_TEMPL): Use __gnu_dev_ prefix.
(gnu_dev_major): Use weak_alias and libc_hidden_weak.
(gnu_dev_minor): Likewise.
(gnu_dev_makedev): Likewise.
* csu/check_fds.c (check_one_fd): Use __gnu_dev_makedev instead of
makedev.
* posix/wordexp.c (exec_comm_child): Likewise.
* sysdeps/mach/hurd/xmknodat.c (__xmknodat): Use __gnu_dev_minor
instead of minor and __gnu_dev_major instead of major.
* sysdeps/unix/sysv/linux/device-nrs.h (DEV_TTY_P): Use
__gnu_dev_major instead of major.
* sysdeps/unix/sysv/linux/pathconf.c (distinguish_extX): Use
__gnu_dev_major instead of gnu_dev_major and __gnu_dev_minor
instead of gnu_dev_minor.
* sysdeps/unix/sysv/linux/ptsname.c (MASTER_P): Likewise.
(SLAVE_P): Likewise.
(__ptsname_internal): Use __gnu_dev_minor instead of minor.
* sysdeps/unix/sysv/linux/ttyname.h (is_pty): Use __gnu_dev_major
instead of major.

(cherry picked from commit 8b4a118222c7ed41bc653943b542915946dff1dd)

install: Replace scripts/output-format.sed with objdump -f [BZ #26559]

GNU ld and gold have supported --print-output-format since 2011. glibc
requires binutils>=2.25 (2015), so if LD is GNU ld or gold, we can
assume the option is supported.

lld is by default a cross linker supporting multiple targets. It auto
detects the file format and does not need OUTPUT_FORMAT. It does not
support --print-output-format.

By parsing objdump -f, we can support all the three linkers.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 87d583c6e8cd0e49f64da76636ebeec033298b4d)

Use a better workaround for clang's lack of __builtin_va_arg_pack

Set the retain attribute on _elf_set_element if CC supports [BZ #27492]

So that text_set_element/data_set_element/bss_set_element defined
variables will be retained by the linker.

Note: 'used' and 'retain' are orthogonal: 'used' makes sure the variable
will not be optimized out; 'retain' prevents section garbage collection
if the linker supports SHF_GNU_RETAIN.

GNU ld 2.37 and LLD 13 will support -z start-stop-gc which allows C
identifier name sections to be GCed even if there are live
__start_/__stop_ references.

Without the change, there are some static linking problems, e.g.
_IO_cleanup (libio/genops.c) may be discarded by ld --gc-sections, so
stdout is not flushed on exit.
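
How 'used' and 'retain' combine on a set element can be sketched as
below.  This is an illustration under assumptions, not the
_elf_set_element macro: the section name hooks_sketch and all
identifiers are made up, and the __start_/__stop_ symbols rely on an
ELF linker that defines them for C identifier name sections.

```c
#include <assert.h>
#include <stddef.h>

/* Use 'retain' only when the compiler knows the attribute.  */
#if defined __has_attribute
# if __has_attribute (retain)
#  define ATTR_RETAIN_SKETCH __attribute__ ((retain))
# endif
#endif
#ifndef ATTR_RETAIN_SKETCH
# define ATTR_RETAIN_SKETCH
#endif

typedef void (*hook_fn_sketch) (void);

static int hook_calls_sketch;

static void
cleanup_hook_sketch (void)
{
  hook_calls_sketch++;
}

/* 'used' keeps the compiler from dropping the otherwise unreferenced
   variable; 'retain' asks the linker not to garbage-collect its
   section.  */
static hook_fn_sketch const hook_entry_sketch
  __attribute__ ((used, section ("hooks_sketch"))) ATTR_RETAIN_SKETCH
  = cleanup_hook_sketch;

/* Linker-provided bounds of the hooks_sketch section.  */
extern hook_fn_sketch const __start_hooks_sketch[];
extern hook_fn_sketch const __stop_hooks_sketch[];

/* Call every registered hook and return how many were run.  */
static size_t
run_hooks_sketch (void)
{
  size_t n = 0;
  for (hook_fn_sketch const *p = __start_hooks_sketch;
       p < __stop_hooks_sketch; p++, n++)
    {
      assert (*p != NULL);
      (*p) ();
    }
  return n;
}
```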

Note: GCC may warn that the 'retain' attribute is ignored even though
__has_attribute(retain) is 1 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99587).

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit cd6ae7ea5431c2b8f16201fb0e2c413bf8d2df06)

powerpc: Use --no-tls-get-addr-optimize in test only if the linker supports it

LLD doesn't support --{,no-}tls-get-addr-optimize.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
(cherry picked from commit f9cd7d5d194c652e9ec31634da3fc8ef1bf06780)

elf: Drop elf/tls-macros.h in favor of __thread and tls_model attributes [BZ #28152] [BZ #28205]

elf/tls-macros.h was added for TLS testing when GCC did not support
__thread. __thread and tls_model attributes are mature now and have been
used by many newer tests.

Also delete tst-tls2.c which tests .tls_common (unused by modern GCC and
unsupported by Clang/LLD). .tls_common and .tbss definition are almost
identical after linking, so the runtime test doesn't add additional
coverage. Assembler and linker tests should be on the binutils side.

When LLD 13.0.0 is allowed in configure.ac
(https://sourceware.org/pipermail/libc-alpha/2021-August/129866.html),
`make check` result is on par with glibc built with GNU ld on aarch64
and x86_64.

As a future clean-up, TLS_GD/TLS_LD/TLS_IE/TLS_LE macros can be removed from
sysdeps/*/tls-macros.h. We can add optional -mtls-dialect={gnu2,trad}
tests to ensure coverage.

Tested on aarch64-linux-gnu, powerpc64le-linux-gnu, and x86_64-linux-gnu.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 33c50ef42878b07ee6ead8b3f1a81d8c2c74697c)

aarch64: Make elf_machine_{load_address,dynamic} robust [BZ #28203]

The AArch64 ABI is largely platform agnostic and does not specify
_GLOBAL_OFFSET_TABLE_[0] ([1]). glibc ld.so is probably the only user
of _GLOBAL_OFFSET_TABLE_[0], and GNU ld sets its value to the
link-time address of _DYNAMIC. [2]

In 2012, __ehdr_start was implemented in GNU ld and gold in binutils
2.23. Using adrp+add / (-mcmodel=tiny) adr to access
__ehdr_start/_DYNAMIC gives us a robust way to get the load address and
the link-time address of _DYNAMIC.

[1]: From a psABI maintainer, https://bugs.llvm.org/show_bug.cgi?id=49672#c2
[2]: LLD's aarch64 port does not set _GLOBAL_OFFSET_TABLE_[0] to the
link-time address _DYNAMIC.
LLD is widely used on aarch64 Android and ChromeOS devices. Software
just works without the need for _GLOBAL_OFFSET_TABLE_[0].

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 43d06ed218fc8be58987bdfd00e21e5720f0b862)

elf: Unconditionally use __ehdr_start

We can consider __ehdr_start (from binutils 2.23 onwards)
unconditionally supported, since configure.ac requires binutils>=2.25.

The configure.ac check is related to an ia64 bug fixed by binutils 2.24.
See https://sourceware.org/pipermail/libc-alpha/2014-August/053503.html

Tested on x86_64-linux-gnu. Tested build-many-glibcs.py with
aarch64-linux-gnu and s390x-linux-gnu.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 302247c89121e8d4c7629e589edbb4974fff6edb)

wordexp: handle overflow in positional parameter number (bug 28011)

Use strtoul instead of atoi so that overflow can be detected.
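
The overflow check that strtoul makes possible can be sketched as
follows.  parse_param_number_sketch is a hypothetical helper and the
INT_MAX bound is an assumption of this example; the actual wordexp
change may be structured differently.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse a decimal positional parameter number.  Returns 0 and stores
   the value in *OUT on success, or -1 if S holds no digits or the
   value does not fit in an int.  atoi cannot report either case.  */
static int
parse_param_number_sketch (const char *s, int *out)
{
  assert (s != NULL && out != NULL);
  char *end;
  errno = 0;
  unsigned long n = strtoul (s, &end, 10);
  if (end == s || errno == ERANGE || n > INT_MAX)
    return -1;
  *out = (int) n;
  return 0;
}
```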

Disable tests that need more-recent infrastructure

intl: Handle translation output codesets with suffixes [BZ #26383]

Commit 91927b7c7643 (Rewrite iconv option parsing [BZ #19519]) did not
handle cases where the output codeset for translations (via the `gettext'
family of functions) might have a caller specified encoding suffix such as
TRANSLIT or IGNORE. This led to a regression where translations did not
work when the codeset had a suffix.

This commit fixes the above issue by parsing any suffixes passed to
__dcigettext and adds two new test-cases to intl/tst-codeset.c to
verify correct behaviour. The iconv-internal function __gconv_create_spec
and the static iconv-internal function gconv_destroy_spec are now visible
internally within glibc and used in intl/dcigettext.c.

Rewrite iconv option parsing [BZ #19519]

This commit replaces string manipulation during `iconv_open' and iconv_prog
option parsing with a structured, flag based conversion specification. In
doing so, it alters the internal `__gconv_open' interface and accordingly
adjusts its uses.

This change fixes several hangs in the iconv program and therefore includes
a new test to exercise iconv_prog options that originally led to these hangs.
It also includes a new regression test for option handling in the iconv
function.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

iconv: Accept redundant shift sequences in IBM1364 [BZ #26224]

The IBM1364, IBM1371, IBM1388, IBM1390 and IBM1399 character sets
share converter logic (iconvdata/ibm1364.c) which would reject
redundant shift sequences when processing input in these character
sets. This led to a hang in the iconv program (CVE-2020-27618).

This commit adjusts the converter to ignore redundant shift sequences
and adds test cases for iconv_prog hangs that would be triggered upon
their rejection. This brings the implementation in line with other
converters that also ignore redundant shift sequences (e.g. IBM930
etc., fixed in commit 692de4b3960d).

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

iconv: Fix incorrect UCS4 inner loop bounds (BZ#26923)

Previously, in UCS4 conversion routines we limited the number of
characters we examine to the minimum of the number of characters in the
input and the number of characters in the output. This is not the
correct behavior when __GCONV_IGNORE_ERRORS is set, as we do not consume
an output character when we skip a code unit. Instead, track the input
and output pointers and terminate the loop when either reaches its
limit.
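
The pointer-tracking loop can be sketched as below.  This is a
simplified model, not the code in the gconv UCS4 routines: the
function name is hypothetical, and treating values above 0x7fffffff
as the only invalid input is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy code units from [IN, IN_END) to [OUT, OUT_END), skipping
   invalid ones as if errors were being ignored.  A skipped unit
   advances the input but not the output, so the loop must test both
   pointers rather than a precomputed minimum count.  Returns the
   number of units written and stores the number consumed.  */
static size_t
copy_ucs4_skip_invalid_sketch (const uint32_t *in, const uint32_t *in_end,
                               uint32_t *out, uint32_t *out_end,
                               size_t *consumed)
{
  assert (consumed != NULL);
  const uint32_t *ip = in;
  uint32_t *op = out;
  while (ip < in_end && op < out_end)
    {
      uint32_t c = *ip++;
      if (c > 0x7fffffff)
        continue;   /* Skipped: no output slot is used.  */
      *op++ = c;
    }
  *consumed = (size_t) (ip - in);
  return (size_t) (op - out);
}
```

With four input units of which two are invalid and room for two
output units, a min(4, 2) bound would stop after two inputs, whereas
this loop consumes all four.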

This resolves assertion failures when resetting the input buffer in a step of
iconv, which assumes that the input will be fully consumed given sufficient
output space.

math/test-sinl-pseudo: Use stack protector only if available

This fixes commit 9333498794cde1d5cca518bad ("Avoid ldbl-96 stack
corruption from range reduction of pseudo-zero (bug 25487).").

Avoid ldbl-96 stack corruption from range reduction of pseudo-zero (bug 25487).

Bug 25487 reports stack corruption in ldbl-96 sinl on a pseudo-zero
argument (a representation where all the significand bits, including
the explicit high bit, are zero, but the exponent is not zero, which
is not a valid representation for the long double type).

Although this is not a valid long double representation, existing
practice in this area (see bug 4586, originally marked invalid but
subsequently fixed) is that we still seek to avoid invalid memory
accesses as a result, in case of programs that treat arbitrary binary
data as long double representations, although the invalid
representations of the ldbl-96 format do not need to be consistently
handled the same as any particular valid representation.

This patch makes the range reduction detect pseudo-zero and unnormal
representations that would otherwise go to __kernel_rem_pio2, and
returns a NaN for them instead of continuing with the range reduction
process. (Pseudo-zero and unnormal representations whose unbiased
exponent is less than -1 have already been safely returned from the
function before this point without going through the rest of range
reduction.) Pseudo-zero representations would previously result in
the value passed to __kernel_rem_pio2 being all-zero, which is
definitely unsafe; unnormal representations would previously result in
a value passed whose high bit is zero, which might well be unsafe
since that is not a form of input expected by __kernel_rem_pio2.
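
The classification described above can be sketched on the raw fields
of the ldbl-96 format.  This is an illustration, not the check added
to the range reduction: the struct and function names are
hypothetical, and excluding the all-ones exponent (pseudo-infinity
and pseudo-NaN encodings) is a choice made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Raw fields of an x87 extended-precision value (hypothetical type).  */
struct ldbl96_bits_sketch
{
  uint64_t significand;   /* Bit 63 is the explicit integer bit.  */
  uint16_t sign_exp;      /* Bit 15 is the sign, bits 0-14 the exponent.  */
};

/* Return 1 for a pseudo-zero or unnormal: a nonzero exponent whose
   explicit high significand bit is clear.  */
static int
is_pseudo_zero_or_unnormal_sketch (const struct ldbl96_bits_sketch *bits)
{
  assert (bits != NULL);
  unsigned int exponent = bits->sign_exp & 0x7fff;
  if (exponent == 0 || exponent == 0x7fff)
    return 0;
  return (bits->significand >> 63) == 0;
}
```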

Tested for x86_64.

posix: Sync gnulib regex implementation

This patch syncs the regex implementation with gnulib (commit 0ee5212).
Only two changes in GLIBC regex testing are required:

  1. posix/bug-regex28.c: as previously discussed [1] the change of
     expected results on the pattern should be safe.

  2. posix/PCRE.tests: the ERE (a)|\1 is malformed (in the sense that
     the \1 doesn't mean anything) and although current GLIBC accepts
     it, it has undefined behavior.  This patch removes the specific test.

This sync contains some patches from thread 'Regex: Make libc regex
more usable outside GLIBC.' [2] which have been pushed upstream in
gnulib.  This patch also fixes some regex issues (BZ #23233,
BZ #21163, BZ #18986, BZ #13762); I did not add testcases for
either #23233 or #13762 because I couldn't think of a simple way to
trigger the expected failure paths.

Checked on x86_64-linux-gnu and i686-linux-gnu.

[BZ #23233]
[BZ #21163]
[BZ #18986]
[BZ #13762]
* posix/Makefile (tests): Add bug-regex37 and bug-regex38.
* posix/PCRE.tests: Remove invalid test.
* posix/bug-regex28.c: Fix expected values for used syntax.
* posix/bug-regex37.c: New file.
* posix/bug-regex38.c: Likewise.
* posix/regcomp.c: Sync with gnulib.
* posix/regex.c: Likewise.
* posix/regex.h: Likewise.
* posix/regex_internal.c: Likewise.
* posix/regex_internal.h: Likewise.
* posix/regexec.c: Likewise.

[1] https://sourceware.org/ml/libc-alpha/2017-12/msg00807.html
[2] https://sourceware.org/ml/libc-alpha/2017-12/msg00237.html

Fix use-after-free in glob when expanding ~user (bug 25414)

The value of `end_name' points into the value of `dirname', thus don't
deallocate the latter before the last use of the former.

Fix a return type in elf unload test

Fix buffer overrun in EUC-KR conversion module (bz #24973)

The byte 0xfe as input to the EUC-KR conversion denotes a user-defined
area and is not allowed. The from_euc_kr function used to skip two bytes
when told to skip over the unknown designation, potentially running over
the buffer end.

gconv: Fix assertion failure in ISO-2022-JP-3 module (bug 27256)

The conversion loop to the internal encoding does not follow
the interface contract that __GCONV_FULL_OUTPUT is only returned
after the internal wchar_t buffer has been filled completely.  This
is enforced by the first of the two asserts in iconv/skeleton.c:

      /* We must run out of output buffer space in this
         rerun.  */
      assert (outbuf == outerr);
      assert (nstatus == __GCONV_FULL_OUTPUT);

This commit solves this issue by queuing a second wide character
which cannot be written immediately in the state variable, like
other converters already do (e.g., BIG5-HKSCS or TSCII).

Reported-by: Tavis Ormandy <taviso@gmail.com>

Read f->func.cxa under the lock.

Fix bug where ld.so hashtable would retain strings passed to dlopen().

Extend elf/unload8 to test an additional load/unload pattern

Don't crash if /var/tmp doesn't exist

`xstat` is a checked `stat64` that crashes the program if the latter
returns failure. In this loop, we are trying to find one folder that satisfies
the condition, so there is no reason to crash the program if one folder doesn't.
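
The tolerant loop can be sketched as follows.  This is an
illustration only: first_usable_dir_sketch is a hypothetical helper,
it uses plain stat rather than stat64, and the "is a directory" test
stands in for whatever condition the real loop checks.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Return the first candidate that exists and is a directory, or NULL
   if none does.  A failing stat just moves on to the next candidate
   instead of terminating the program.  */
static const char *
first_usable_dir_sketch (const char *const *candidates, size_t count)
{
  assert (candidates != NULL || count == 0);
  for (size_t i = 0; i < count; i++)
    {
      struct stat st;
      if (stat (candidates[i], &st) != 0)
        continue;   /* Missing folder: not fatal, try the next one.  */
      if (S_ISDIR (st.st_mode))
        return candidates[i];
    }
  return NULL;
}
```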

More aggressively prevent a buffer from being optimized out

The volatile global variable was first introduced in e86f9654c. I have
noticed the compiler still optimizing the buffer out on AArch64
presumably because the assignment is after all other observable
behaviors so it's still valid to eliminate it.

x86_64: Remove unneeded static PIE check for undefined weak diagnostic

https://sourceware.org/bugzilla/show_bug.cgi?id=21782 dropped an ld
diagnostic for R_X86_64_PC32 referencing an undefined weak symbol in
-pie links. Arguably keeping the diagnostic like other ports is more
correct, since statically resolving movl foo(%rip), %eax to the
link-time zero address produces a corrupted output.

It turns out that --enable-static-pie builds do not depend on the ld
behavior. GCC generates GOT indirection for weak declarations for
-fPIE/-fPIC, so what ld does with the PC-relative relocation doesn't
really matter.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

[PATCH 7/7] sin/cos slow paths: refactor sincos implementation

Refactor the sincos implementation - rather than rely on odd partial inlining
of preprocessed portions from sin and cos, explicitly write out the cases.
This makes sincos much easier to maintain and provides an additional 16-20%
speedup between 0 and 2^27. The overall speedup of sincos is 48% over this range.
Between 0 and PI it is 66% faster.

* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Cleanup ifdefs.
(__cos): Likewise.
* sysdeps/ieee754/dbl-64/s_sin.c (__sincos): Refactor using the same
logic as sin and cos.

[PATCH 6/7] sin/cos slow paths: refactor duplicated code into dosin

Refactor duplicated code into do_sin.  Since all calls to do_sin use copysign to
set the sign of the result, move it inside do_sin.  Small inputs use a separate
polynomial, so move this into do_sin as well (the check is based on the more
conservative case when doing large range reduction, but could be relaxed).

* sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Use TAYLOR_SIN for small
inputs.  Return correct sign.
(do_sincos): Remove small input check before do_sin, let do_sin set
the sign.
(__sin): Likewise.
(__cos): Likewise.

[PATCH 5/7] sin/cos slow paths: remove unused slowpath functions

Remove all unused slowpath functions.

* sysdeps/ieee754/dbl-64/s_sin.c (TAYLOR_SLOW): Remove.
(do_cos_slow): Likewise.
(do_sin_slow): Likewise.
(reduce_and_compute): Likewise.
(slow): Likewise.
(slow1): Likewise.
(slow2): Likewise.
(sloww): Likewise.
(sloww1): Likewise.
(sloww2): Likewise.
(bslow): Likewise.
(bslow1): Likewise.
(bslow2): Likewise.
(cslow2): Likewise.

[PATCH 4/7] sin/cos slow paths: remove slow paths from huge range reduction

For huge inputs use the improved do_sincos function as well. Now no cases use
the correction factor returned by do_sin, do_cos and TAYLOR_SIN, so remove it.

* sysdeps/ieee754/dbl-64/s_sin.c (TAYLOR_SIN): Remove cor parameter.
(do_cos): Remove corp parameter and calculations.
(do_sin): Likewise.
(do_sincos): Remove cor variable.
(__sin): Use do_sincos for huge inputs.
(__cos): Likewise.
* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.
(reduce_and_compute_sincos): Remove unused function.

[PATCH 3/7] sin/cos slow paths: remove slow paths from small range reduction

This patch improves the accuracy of the range reduction.  When the input is
large (2^27) and very close to a multiple of PI/2, using 110 bits of PI is not
enough.  Improve range reduction accuracy to 136 bits.  As a result the special
checks for results close to zero can be removed.  The ULP of the polynomials is
at worst 0.55ULP, so there is no reason for the slow functions, and they can be
removed.

* sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_1): Rename to
reduce_sincos, improve accuracy to 136 bits.
(do_sincos_1): Rename to do_sincos, remove fallbacks to slow functions.
(__sin): Use improved reduction and simplified do_sincos calculation.
(__cos): Likewise.
* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.

[PATCH 2/7] sin/cos slow paths: remove large range reduction

This patch removes the large range reduction code and defers to the huge range
reduction code.  The first level range reducer supports inputs up to 2^27,
which is way too large given that inputs for sin/cos are typically small
(< 10), and optimizing for a smaller range would give a significant speedup.

Input values above 2^27 are practically never used, so there is no reason for
supporting range reduction between 2^27 and 2^48.  Removing it significantly
simplifies code and enables further speedups.  There is about a 2.3x slowdown
in this range due to __branred being extremely slow (a better algorithm could
easily more than double performance).

* sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_2): Remove function.
(do_sincos_2): Likewise.
(__sin): Remove middle range reduction case.
(__cos): Likewise.
* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Remove middle range
reduction case.

[PATCH 1/7] sin/cos slow paths: avoid slow paths for small inputs

This series of patches removes the slow paths from sin, cos and sincos.
Besides greatly simplifying the implementation, the new version is also much
faster for inputs up to PI (41% faster) and for large inputs needing range
reduction (27% faster).

ULP is ~0.55 with no errors found after testing 1.6 billion inputs across most
of the range with mpsin and mpcos. The number of incorrectly rounded results
(i.e. ULP > 0.5) is at most ~2750 per million inputs between 0.125 and 0.5,
the average is ~850 per million between 0 and PI.

Tested on AArch64 and x86_64 with no regressions.

The first patch removes the slow paths for the cases where the input is small
and doesn't require range reduction. Update ULP tables for sin, cos and sincos
on AArch64 and x86_64.

* sysdeps/aarch64/libm-test-ulps: Update ULP for sin, cos, sincos.
* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Remove slow paths for small
inputs.
(__cos): Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sin, cos, sincos.

locale: Align _nl_C_LC_CTYPE_class and _nl_C_LC_CTYPE_class32

Otherwise, programs that use character classification macros such as
isspace may observe unaligned pointers.

Change this offsetof computation to use C89 offsetof. Tested:

Update build process to create libnsl stub

Forward-port google-nsl-stub

Fix memory leak in TLS allocation

Add a test of TLS support that will fail if leaky

Let time and gettimeofday use vdso by removing old clang workaround

Use crt*.o files from llvm compiler-rt when building with clang

Do not use ppc-specific long double pack/unpack when compiling with clang

Remove old workaround in power7 logb functions, clang no longer crashes on the inline assembly

Additional fixes for llvm-as

Unlike GCC, LLVM always uses an integrated assembler, which attempts to
recognize all `asm` statements written in the C code. glibc uses some
syntactically invalid asm statements to emit constants into assembly that
are later extracted with a sed or AWK script.

This change fixes two such invalid `asm` statements by wrapping the
output in a `.ascii` directive. This does not break the sed/AWK scripts (the same
special sequence is output) but it makes the statement syntactically valid.

See cf8e3f8757 for a previous fix for the same issue.

Add workaround for infinite looping in ppc vsyscall for sched_getcpu.

Add -Wno-incomplete-setjmp-declaration to prevent clang from unhelpfully complaining about __sigsetjmp, both in library build and testsuite runs.

Update passwd.borg handling to use passwd.borg.real

Add a case to async-signal-safe TLS to set static TLS instead of waiting for a dlopen that may not actually be happening.

Add an LD_DEBUG=tls option to help debug thread-local storage handling in ld.so

Remove an unneeded local refactor in _dl_update_slotinfo

Fix year 2039 bug for localtime with 64-bit time_t (bug 22639).

Bug 22639 reports localtime failing to handle time offset transitions
correctly in 2039 and later on platforms with 64-bit time_t.

The problem is the use of SECSPERDAY (constant 86400) in calculations
such as

    t = ((year - 1970) * 365
         + /* Compute the number of leapdays between 1970 and YEAR
              (exclusive).  There is a leapday every 4th year ...  */
         + ((year - 1) / 4 - 1970 / 4)
         /* ... except every 100th year ... */
         - ((year - 1) / 100 - 1970 / 100)
         /* ... but still every 400th year.  */
         + ((year - 1) / 400 - 1970 / 400)) * SECSPERDAY;

where t is of type time_t and year is of type int.  Before my commit
92bd70fb85bce57ac47ba5d8af008736832c955a (an update from tzcode,
included in 2.26 and later releases), SECSPERDAY was obtained from a
file imported from tzcode, where the value included a cast to
int_fast32_t.  On 64-bit platforms, glibc defines int_fast32_t to be
long int, so 64-bit, but my patch resulted in it changing to int.
(The bug would probably have existed even before my patch for x32,
which has 64-bit time_t but 32-bit int_fast32_t, but I haven't
verified that.)

This patch fixes the problem by including a cast to time_t in the
definition of SECSPERDAY.  (64-bit time support for 32-bit systems
should move such code that isn't a public interface to using the
internal 64-bit version of time_t throughout.)

Tested for x86_64 and x86.

[BZ #22639]
* time/tzset.c (SECSPERDAY): Cast to time_t.
* time/tst-y2039.c: New file.
* time/Makefile (tests): Add tst-y2039.
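
The overflow behind the localtime bug above can be reproduced with a small
sketch. It assumes a 32-bit int (as on x86_64 and x86); the function names are
hypothetical and this is not the code in time/tzset.c. The day count uses the
same formula as the message, and the product is formed in a 64-bit type so the
example itself never overflows; it then reports whether the same product would
have fit in int.

```c
#include <limits.h>
#include <stdint.h>

/* Days from 1970-01-01 to January 1 of YEAR, using the leap-day
   formula quoted in the commit message.  */
static int
days_before_year (int year)
{
  return (year - 1970) * 365
         + ((year - 1) / 4 - 1970 / 4)
         - ((year - 1) / 100 - 1970 / 100)
         + ((year - 1) / 400 - 1970 / 400);
}

/* Widen before multiplying, which is the effect of casting
   SECSPERDAY to a 64-bit time_t.  */
static int64_t
seconds_before_year (int year)
{
  return (int64_t) days_before_year (year) * 86400;
}

/* Nonzero if days * 86400 would not fit in int, i.e. if the
   all-int version of the expression would overflow.  */
static int
overflows_int (int year)
{
  return seconds_before_year (year) > INT_MAX;
}
```

Under the 32-bit int assumption, overflows_int is false for 2038 and true for
2039, which matches the year in which the bug first appears.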

Reduce __MAX_ALLOCA_CUTOFF to 8192

Make multi-arch ifunc support work with clang

Revert clang workaround for _begin that is no longer needed

Redesign the fastload support for additional performance

Add comments explaining the diff from cf8e3f8757

These comments should make it easier to see the (small) diff introduced
in cf8e3f8757. Without these comments, the diff may get lost on a future
upstream merge.

Make gen-XX-const scripts work with llvm-as

The gen-as-const and gen-py-const scripts are used to generate integer constant
definitions from a list of constant C-expressions. This is achieved by
generating a C program with inline `asm` statements that depend on
these constant expressions. During compilation, the constant expressions
are evaluated and included in the inline asm. The build process
generates only the assembly, and then uses `sed` to extract the values
from the assembly text.

This is clever. It allows the build process to extract the values of C
expressions as evaluated for the target architecture. The implementation is a
bit fragile, but it is not immediately obvious to me how it could be
improved.

This change slightly modifies `gen-as-const` and `gen-py-const` to emit
valid assembly directives instead of invalid directives that were
previously emitted. Since the values are extracted via string parsing,
this has no effect on the values extracted. This is needed because the
LLVM assembler validates all statements before emitting them, whereas it
appears GCC will literally emit any `asm` directives without validation
or recognition.
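
The claim that extraction by string parsing is unaffected by the surrounding
directive can be sketched as follows. This is an illustration in C rather than
the actual sed/AWK scripts, and the marker strings "@@@value@@@" and
"@@@end@@@", the optional '$' or '#' immediate prefix, and the function name
extract_value are assumptions made for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Find "@@@value@@@<number>@@@end@@@" anywhere in LINE and store
   the number in *OUT.  Returns 1 on success, 0 if the markers are
   absent or malformed.  Text before and after the markers is
   ignored, so a bare line and a line wrapped in .ascii "..."
   yield the same value.  Marker strings are assumed here.  */
static int
extract_value (const char *line, long *out)
{
  static const char start[] = "@@@value@@@";
  static const char end[] = "@@@end@@@";

  const char *p = strstr (line, start);
  if (p == NULL)
    return 0;
  p += sizeof start - 1;

  /* Some targets print immediates with a '$' or '#' prefix.  */
  if (*p == '$' || *p == '#')
    p++;

  char *stop;
  long v = strtol (p, &stop, 10);
  if (stop == p || strncmp (stop, end, sizeof end - 1) != 0)
    return 0;

  *out = v;
  return 1;
}
```

Feeding it "@@@name@@@FOO@@@value@@@$42@@@end@@@" and the same text wrapped as
.ascii "@@@name@@@FOO@@@value@@@$42@@@end@@@" gives 42 in both cases, while
only the second form is a statement an assembler would accept.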

Fix sense of a test in the static-linking version of ppc get_clockfreq

Makes it compile for AArch64

The de-nesting fix in 83c02e85 changed a function signature, but AArch64 was untested.

Makes AArch64 assembly acceptable to clang

According to ARMv8 architecture reference manual section C7.2.188, SIMD MOV (to
general) instruction format is

MOV <Xd>, <Vn>.D[<index>]

gas appears to accept "<Vn>.2D[<index>]" as well, but clang's assembler does
not. Cf. https://community.arm.com/developer/ip-products/processors/f/cortex-a-forum/5214/aarch64-assembly-syntax-for-armclang

Include STATIC_PIE_BOOTSTRAP with !NESTING in powerpc64/dl-machine.h