[PATCH 3/7] sin/cos slow paths: remove slow paths from small range reduction
This patch improves the accuracy of the range reduction. When the input is
large (2^27) and very close to a multiple of PI/2, using 110 bits of PI is not
enough. Improve range reduction accuracy to 136 bits. As a result the special
checks for results close to zero can be removed. The ULP of the polynomials is
at worst 0.55ULP, so there is no reason for the slow functions, and they can be
removed.
* sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_1): Rename to
reduce_sincos, improve accuracy to 136 bits.
(do_sincos_1): Rename to do_sincos, remove fallbacks to slow functions.
(__sin): Use improved reduction and simplified do_sincos calculation.
(__cos): Likewise.
* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.
[PATCH 2/7] sin/cos slow paths: remove large range reduction
This patch removes the large range reduction code and defers to the huge range
reduction code. The first level range reducer supports inputs up to 2^27,
which is way too large given that inputs for sin/cos are typically small
(< 10), and optimizing for a smaller range would give a significant speedup.
Input values above 2^27 are practically never used, so there is no reason for
supporting range reduction between 2^27 and 2^48. Removing it significantly
simplifies code and enables further speedups. There is about a 2.3x slowdown
in this range due to __branred being extremely slow (a better algorithm could
easily more than double performance).
[PATCH 1/7] sin/cos slow paths: avoid slow paths for small inputs
This series of patches removes the slow paths from sin, cos and sincos.
Besides greatly simplifying the implementation, the new version is also much
faster for inputs up to PI (41% faster) and for large inputs needing range
reduction (27% faster).
ULP is ~0.55 with no errors found after testing 1.6 billion inputs across most
of the range with mpsin and mpcos. The number of incorrectly rounded results
(i.e. ULP > 0.5) is at most ~2750 per million inputs between 0.125 and 0.5;
the average is ~850 per million between 0 and PI.
Tested on AArch64 and x86_64 with no regressions.
The first patch removes the slow paths for the cases where the input is small
and doesn't require range reduction. Update ULP tables for sin, cos and sincos
on AArch64 and x86_64.
* sysdeps/aarch64/libm-test-ulps: Update ULP for sin, cos, sincos.
* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Remove slow paths for small
inputs.
(__cos): Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sin, cos, sincos.
This patch assumes O_DIRECTORY works as defined by POSIX in the opendir
implementation (aligning with other glibc code, for instance pwd). This
allows removing both the fallback code that handles systems with missing or
broken O_DIRECTORY and the Linux-specific opendir.c, which just
advertises the working flag.
Checked on aarch64-linux-gnu, x86_64-linux-gnu, i686-linux-gnu,
sparcv9-linux-gnu, sparc64-linux-gnu, powerpc-linux-gnu, and
powerpc64le-linux-gnu.
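As a rough illustration of what relying on O_DIRECTORY means, the sketch
below opens a directory stream and lets open() itself reject anything that
is not a directory, with no stat-based fallback. The name demo_opendir is
hypothetical, and this is not glibc's actual opendir implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not glibc's opendir.  O_DIRECTORY makes open()
   fail with ENOTDIR when NAME is not a directory, so no separate stat()
   check or fallback path is needed.  */
DIR *
demo_opendir (const char *name)
{
  int fd = open (name, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
  if (fd < 0)
    return NULL;

  DIR *dir = fdopendir (fd);
  if (dir == NULL)
    close (fd);   /* fdopendir did not take ownership of fd.  */
  return dir;
}
```

A caller would use the result like any other DIR * and release it with
closedir.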
Samuel Thibault [Mon, 2 Apr 2018 18:08:37 +0000 (18:08 +0000)]
hurd: Avoid local PLTs in libpthread.
* htl/cthreads-compat.c (__cthread_detach): Call __pthread_detach
instead of pthread_detach.
(__cthread_fork): Call __pthread_create instead of pthread_create.
(__cthread_keycreate): Call __pthread_key_create instead of
pthread_key_create.
(__cthread_getspecific): Call __pthread_getspecific instead of
pthread_getspecific.
(__cthread_setspecific): Call __pthread_setspecific instead of
pthread_setspecific.
* htl/pt-alloc.c (__pthread_alloc): Call __pthread_mutex_lock and
__pthread_mutex_unlock instead of pthread_mutex_lock and
pthread_mutex_unlock.
* htl/pt-cleanup.c (__pthread_get_cleanup_stack): Rename to
___pthread_get_cleanup_stack.
(__pthread_get_cleanup_stack): New strong alias.
* htl/pt-create.c: Include <pthreadP.h>.
(entry_point): Call __pthread_exit instead of pthread_exit.
(pthread_create): Rename to __pthread_create.
(pthread_create): New strong alias.
* htl/pt-detach.c (pthread_detach): Rename to __pthread_detach.
(pthread_detach): New strong alias.
(__pthread_detach): Call __pthread_cond_broadcast instead of
pthread_cond_broadcast.
* htl/pt-exit.c (__pthread_exit): Call __pthread_setcancelstate
instead of pthread_setcancelstate.
* htl/pt-testcancel.c: Include <pthreadP.h>.
(pthread_testcancel): Call __pthread_exit instead of pthread_exit.
* sysdeps/htl/pt-attr-getstack.c: Include <pthreadP.h>
(__pthread_attr_getstack): Call __pthread_attr_getstackaddr and
__pthread_attr_getstacksize instead of pthread_attr_getstackaddr and
pthread_attr_getstacksize.
* sysdeps/htl/pt-attr-getstackaddr.c (pthread_attr_getstackaddr):
Rename to __pthread_attr_getstackaddr.
(pthread_attr_getstackaddr): New strong alias.
* sysdeps/htl/pt-attr-getstacksize.c (pthread_attr_getstacksize):
Rename to __pthread_attr_getstacksize.
(pthread_attr_getstacksize): New strong alias.
* sysdeps/htl/pt-attr-setstack.c: Include <pthreadP.h>.
(pthread_attr_setstack): Rename to __pthread_attr_setstack.
(pthread_attr_setstack): New strong alias.
(__pthread_attr_setstack): Call __pthread_attr_getstacksize,
__pthread_attr_setstacksize and __pthread_attr_setstackaddr instead of
pthread_attr_getstacksize, pthread_attr_setstacksize and
pthread_attr_setstackaddr.
* sysdeps/htl/pt-attr-setstackaddr.c (pthread_attr_setstackaddr):
Rename to __pthread_attr_setstackaddr.
(pthread_attr_setstackaddr): New strong alias.
* sysdeps/htl/pt-attr-setstacksize.c (pthread_attr_setstacksize):
Rename to __pthread_attr_setstacksize.
(pthread_attr_setstacksize): New strong alias.
* sysdeps/htl/pt-cond-timedwait.c: Include <pthreadP.h>.
(__pthread_cond_timedwait_internal): Use __pthread_exit instead of
pthread_exit.
* sysdeps/htl/pt-key-create.c: Include <pthreadP.h>.
(__pthread_key_create): New hidden def.
* sysdeps/htl/pt-key.h: Include <pthreadP.h>.
* sysdeps/htl/pthreadP.h (_pthread_mutex_init,
__pthread_cond_broadcast, __pthread_create, __pthread_detach,
__pthread_exit, __pthread_key_create, __pthread_getspecific,
__pthread_setspecific, __pthread_setcancelstate,
__pthread_attr_getstackaddr, __pthread_attr_setstackaddr,
__pthread_attr_getstacksize, __pthread_attr_setstacksize,
__pthread_attr_setstack, ___pthread_get_cleanup_stack): New
declarations.
(__pthread_key_create, _pthread_mutex_init): New hidden declarations.
* sysdeps/mach/hurd/htl/pt-attr-setstackaddr.c
(pthread_attr_setstackaddr): Rename to __pthread_attr_setstackaddr.
(pthread_attr_setstackaddr): New strong alias.
* sysdeps/mach/hurd/htl/pt-attr-setstacksize.c
(pthread_attr_setstacksize): Rename to __pthread_attr_setstacksize.
(pthread_attr_setstacksize): New strong alias.
* sysdeps/mach/hurd/htl/pt-docancel.c: Include <pthreadP.h>.
(call_exit): Call __pthread_exit instead of pthread_exit.
* sysdeps/mach/hurd/htl/pt-mutex-init.c: Include <pthreadP.h>.
(_pthread_mutex_init): New hidden definition.
* sysdeps/mach/hurd/htl/pt-sysdep.c: Include <pthreadP.h>.
(_init_routine): Call __pthread_attr_init and __pthread_attr_setstack
instead of pthread_attr_init and pthread_attr_setstack.
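The rename-plus-strong-alias pattern used throughout the entry above can be
sketched as follows. Calls to a public name from inside a shared library may
be routed through the PLT because the public symbol can be interposed, so the
code is defined under an internal name and the public name is added as an
alias. All names here are hypothetical, the sketch uses the GCC alias
attribute directly rather than glibc's own macros, and it assumes GCC or
Clang on an ELF target.

```c
/* Hypothetical names; assumes GCC or Clang on an ELF target.  */

/* The implementation lives under an internal name, so callers inside
   the library can reference it directly.  */
int
demo_detach_internal (int thread_id)
{
  return thread_id >= 0 ? 0 : -1;
}

/* Strong alias: the public name is a second symbol for the same code.  */
extern __typeof (demo_detach_internal) demo_detach
  __attribute__ ((alias ("demo_detach_internal")));

/* An internal caller uses the internal name rather than the public one.  */
int
demo_fork_and_detach (int thread_id)
{
  return demo_detach_internal (thread_id);
}
```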
Samuel Thibault [Sun, 1 Apr 2018 23:43:22 +0000 (01:43 +0200)]
hurd: Add hurd thread library
Contributed by
Agustina Arzille <avarzille@riseup.net>
Amos Jeffries <squid3@treenet.co.nz>
David Michael <fedora.dm0@gmail.com>
Marco Gerards <marco@gnu.org>
Marcus Brinkmann <marcus@gnu.org>
Neal H. Walfield <neal@gnu.org>
Pino Toscano <toscano.pino@tiscali.it>
Richard Braun <rbraun@sceen.net>
Roland McGrath <roland@gnu.org>
Samuel Thibault <samuel.thibault@ens-lyon.org>
Thomas DiModica <ricinwich@yahoo.com>
Thomas Schwinge <tschwinge@gnu.org>
* htl: New directory.
* sysdeps/htl: New directory.
* sysdeps/hurd/htl: New directory.
* sysdeps/i386/htl: New directory.
* sysdeps/mach/htl: New directory.
* sysdeps/mach/hurd/htl: New directory.
* sysdeps/mach/hurd/i386/htl: New directory.
* nscd/Depend, resolv/Depend, rt/Depend: Add htl dependency.
* sysdeps/mach/hurd/i386/Implies: Add mach/hurd/i386/htl imply.
* sysdeps/mach/hurd/i386/libpthread.abilist: New file.
This patch fixes 3dc214977 for sparc. Unlike other architectures, the
SPARC kernel Kconfig does not define CONFIG_CLONE_BACKWARDS; however, it
has the same ABI as if it did, implemented by sparc-specific code
(sparc_do_fork).
It also has a unique return value convention for clone:
Jesse Hathaway [Tue, 27 Mar 2018 21:17:59 +0000 (21:17 +0000)]
getlogin_r: return early when linux sentinel value is set
When there is no login uid, Linux sets /proc/self/loginuid to the sentinel
value (uid_t) -1. If this is set we can return early and avoid
needlessly looking up the sentinel value in any configured nss
databases.
Checked on aarch64-linux-gnu.
* sysdeps/unix/sysv/linux/getlogin_r.c (__getlogin_r_loginuid): Return
early when linux sentinel value is set.
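A minimal sketch of the early-return test, assuming the loginuid file has
already been read into a string. The helper name demo_loginuid_is_unset is
hypothetical and does not correspond to a function in getlogin_r.c.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper.  TEXT is the decimal string read from the
   loginuid file.  Returns 1 if it holds the sentinel (uid_t) -1, 0 if
   it holds a real uid, and -1 if it cannot be parsed.  */
int
demo_loginuid_is_unset (const char *text)
{
  char *end;
  errno = 0;
  unsigned long value = strtoul (text, &end, 10);
  if (errno != 0 || end == text)
    return -1;
  return (uid_t) value == (uid_t) -1;
}
```

When the helper returns 1, a caller can skip the nss lookup entirely.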
Joseph Myers [Mon, 26 Mar 2018 21:18:28 +0000 (21:18 +0000)]
Unify umount function implementations (bug 16552).
Linux kernel architectures have various arrangements for umount
syscalls. There is a syscall that takes flags, and an older one that
does not. Newer architectures have only the one taking flags, under
the name umount2 (or under the name umount, in the ia64 case). Older
architectures may have both, under the names umount2 and umount (or
under the names umount and oldumount, in the alpha case). glibc then
has several similar implementations of the umount function (no flags)
in terms of either the __umount2 function, or the corresponding
syscall, or in terms of the old syscall under either of its names.
This patch simplifies the implementations in glibc by always using the
__umount2 function to implement the umount function on all systems
using the Linux kernel. The linux/generic implementation is moved to
sysdeps/unix/sysv/linux (without any changes to code or comments) and
all the other variants are removed. (This will have the effect of
causing the new syscall to be used in some cases that previously used
the old one, but as discussed for previous changes, such a change to
the underlying syscalls used is OK.)
There remain two variants of how the __umount2 function is
implemented, either in umount2.S, or, for ia64, in syscalls.list.
Samuel Thibault [Sat, 17 Mar 2018 00:28:41 +0000 (01:28 +0100)]
hurd: Initialize TLS and libpthread before signal thread start
* sysdeps/generic/libc-start.h [!SHARED] (ARCH_SETUP_TLS): Define to
__libc_setup_tls.
* sysdeps/unix/sysv/linux/powerpc/libc-start.h [!SHARED]
(ARCH_SETUP_TLS): Likewise.
* sysdeps/mach/hurd/libc-start.h: New file copied from
sysdeps/generic/libc-start.h, but define ARCH_SETUP_TLS to empty.
* csu/libc-start.c [!SHARED] (LIBC_START_MAIN): Call ARCH_SETUP_TLS instead
of __libc_setup_tls.
* sysdeps/mach/hurd/i386/init-first.c [!SHARED] (init1): Call
__libc_setup_tls before initializing libpthread and running _hurd_init which
starts the signal thread.
Samuel Thibault [Sat, 24 Mar 2018 23:48:01 +0000 (00:48 +0100)]
hurd: Fix accessing errno from rtld
Letting rtld access errno through TLS cannot work at early stages since
TLS will not be initialized yet. When a private errno is not possible,
we thus have no other way than going through __errno_location.
* include/errno.h [IS_IN(rtld) && !RTLD_PRIVATE_ERRNO]: Do not use the
TLS declaration of errno.
The glibc-internal header frame.h was used in the old
debug/backtrace.c but is now unused. Similarly, there are some
sigcontextinfo.h macros that are used nowhere in glibc -
ADVANCE_STACK_FRAME and FIRST_FRAME_POINTER were used in the old
debug/backtrace.c, while SIGCONTEXT_EXTRA_ARGS, GET_FRAME, GET_STACK
and CALL_SIGHANDLER were unused even before the removal of that old
implementation (beyond uses of SIGCONTEXT_EXTRA_ARGS in definitions of
CALL_SIGHANDLER). This patch removes all the unused frame.h headers
and definitions of those macros.
Joseph Myers [Wed, 21 Mar 2018 17:25:30 +0000 (17:25 +0000)]
Use x86_64 backtrace as generic version.
No glibc configuration uses the present debug/backtrace.c, whereas
several #include the x86_64 version. The x86_64 version is
effectively a generic one (using _Unwind_Backtrace from libgcc, which
works much more reliably than the built-in functions used by
debug/backtrace.c). This patch moves it to debug/backtrace.c and
removes all the #includes of the x86_64 version from other
architectures which are no longer required.
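For readers unfamiliar with _Unwind_Backtrace, the sketch below shows the
general shape of a backtrace walk built on it: the unwinder calls a
user-supplied callback once per frame. The names count_frame and
demo_backtrace_depth are hypothetical; this is not the code in
debug/backtrace.c, and it assumes a toolchain whose <unwind.h> and unwinder
runtime come from GCC or a compatible implementation.

```c
#include <unwind.h>

/* Callback invoked by the unwinder for each stack frame.  ARG points
   to the running frame count.  */
static _Unwind_Reason_Code
count_frame (struct _Unwind_Context *ctx, void *arg)
{
  int *count = arg;
  if (_Unwind_GetIP (ctx) != 0)
    ++*count;
  return _URC_NO_REASON;   /* Keep walking.  */
}

/* Hypothetical helper: number of frames the unwinder can see from here.  */
int
demo_backtrace_depth (void)
{
  int count = 0;
  _Unwind_Backtrace (count_frame, &count);
  return count;
}
```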
I do not know whether all the other architecture-specific backtrace
implementations that are based on _Unwind_Backtrace are required, or
whether, where their differences from the generic version do something
useful, suitable hooks could be added to the generic version to reduce
the duplication involved.
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by this patch.
Joseph Myers [Tue, 20 Mar 2018 18:35:50 +0000 (18:35 +0000)]
Remove powerpc, sparc fdim inlines (bug 22987).
The powerpc and sparc bits/mathinline.h include inlines of fdim and
fdimf. These are not restricted to -fno-math-errno, but do not set
errno, and wrongly use ordered <= comparisons instead of the required
islessequal comparisons (this latter issue is latent on powerpc
because GCC wrongly uses unordered comparison instructions for
operations that should use ordered comparison instructions).
Since we wish to avoid such header inlines anyway, leaving it to the
compiler to inline such standard functions under appropriate
conditions, this patch fixes those issues by removing the inlines in
question (and thus removing the sparc bits/mathinline.h header which
had no other inlines left in it). I've filed
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85003> for adding
correct fdim inlines to GCC, since the function is simple enough that
a correct inline is a perfectly reasonable architecture-independent
optimization with -fno-math-errno and in the absence of implicit
excess precision.
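The comparison issue can be seen in a small sketch. The function name
demo_fdim is hypothetical, and the sketch deliberately leaves out the errno
handling that the real library function needs.

```c
#include <math.h>

/* Hypothetical sketch, not glibc's fdim.  islessequal is a quiet
   comparison: unlike x <= y, it does not raise the "invalid" exception
   when an operand is a quiet NaN.  Setting errno on overflow, which
   the library function must do under -fmath-errno, is omitted here.  */
double
demo_fdim (double x, double y)
{
  if (islessequal (x, y))
    return 0.0;
  /* Reached when x > y, or when either operand is a NaN (the
     subtraction then propagates the NaN).  */
  return x - y;
}
```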
Tested with build-many-glibcs.py for all its powerpc and sparc
configurations.
Joseph Myers [Tue, 20 Mar 2018 18:25:24 +0000 (18:25 +0000)]
Fix signed integer overflow in random_r (bug 17343).
Bug 17343 reports that stdlib/random_r.c has code with undefined
behavior because of signed integer overflow on int32_t. This patch
changes the code so that the possibly overflowing computations use
unsigned arithmetic instead.
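The general technique looks roughly like the sketch below: do the addition
on uint32_t, where wraparound is well defined, and convert back afterwards.
The function demo_feedback_step is a hypothetical simplification and not the
actual code in __random_r; the conversion back to int32_t is
implementation-defined rather than undefined, and wraps on GCC and Clang.

```c
#include <stdint.h>

/* Hypothetical, simplified additive-feedback step.  Adding two int32_t
   values directly could overflow, which is undefined behavior; the
   same addition on uint32_t wraps modulo 2^32.  */
int32_t
demo_feedback_step (int32_t *front, const int32_t *rear)
{
  uint32_t val = (uint32_t) *front + (uint32_t) *rear;
  *front = (int32_t) val;        /* Implementation-defined, wraps on GCC.  */
  return (int32_t) (val >> 1);   /* Always fits: at most 0x7fffffff.  */
}
```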
Note that the bug report refers to "Most code" in that file. The
places changed in this patch are the only ones I found where I think
such overflow can occur.
Tested for x86_64 and x86.
[BZ #17343]
* stdlib/random_r.c (__random_r): Use unsigned arithmetic for
possibly overflowing computations.
Samuel Thibault [Tue, 20 Mar 2018 02:10:57 +0000 (03:10 +0100)]
Fix errno values
* manual/errno.texi (EOWNERDEAD, ENOTRECOVERABLE): Remove errno
values from Linux-specific section now that it is in the GNU section.
* sysdeps/gnu/errlist.c: Regenerate.
Joseph Myers [Tue, 20 Mar 2018 00:34:52 +0000 (00:34 +0000)]
Add narrowing subtract functions.
This patch adds the narrowing subtract functions from TS 18661-1 to
glibc's libm: fsub, fsubl, dsubl, f32subf64, f32subf32x, f32xsubf64
for all configurations; f32subf64x, f32subf128, f64subf64x,
f64subf128, f32xsubf64x, f32xsubf128, f64xsubf128 for configurations
with _Float64x and _Float128; __nldbl_dsubl for ldbl-opt.
The changes are essentially the same as for the narrowing add
functions, so the description of those generally applies to this patch
as well.
Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.
Samuel Thibault [Sun, 18 Mar 2018 18:43:04 +0000 (19:43 +0100)]
hurd: Fix O_DIRECTORY | O_NOFOLLOW
Appending / to the path to be looked up would make us always follow a final
symlink, even with O_NOTRANS (since the final resolution is after the
'/'). In the O_DIRECTORY | O_NOFOLLOW case, we thus have to really open
the node and stat it, which we already do anyway, and check for
directory type.
* hurd/hurdlookup.c (__hurd_file_name_lookup): Do not append '/' to
path when flags contains O_NOFOLLOW.
* hurd/lookup-retry.c (__hurd_file_name_lookup_retry): Return ENOTDIR
if flags contains O_DIRECTORY and the result is not a directory.
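The "open the node, stat it, and check for directory type" approach can be
sketched in portable POSIX terms as follows. The helper
demo_open_dir_nofollow is hypothetical and is not the Hurd lookup code; it
only illustrates the order of the checks.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper.  Opens PATH without following a final symlink,
   then checks the type of the node that was actually opened.  Returns
   a file descriptor, or -1 with errno set.  */
int
demo_open_dir_nofollow (const char *path)
{
  int fd = open (path, O_RDONLY | O_NOFOLLOW);
  if (fd < 0)
    return -1;

  struct stat st;
  if (fstat (fd, &st) != 0)
    {
      int saved = errno;
      close (fd);
      errno = saved;
      return -1;
    }

  if (!S_ISDIR (st.st_mode))
    {
      close (fd);
      errno = ENOTDIR;
      return -1;
    }
  return fd;
}
```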
Agustina Arzille [Sun, 18 Mar 2018 17:22:55 +0000 (18:22 +0100)]
hurd: Reimplement libc locks using mach's gsync
* hurd/Makefile (routines): Add hurdlock.
* hurd/Versions (GLIBC_PRIVATE): Added new entry to export the above
interface.
(HURD_CTHREADS_0.3): Remove __libc_getspecific.
* hurd/hurdpid.c: Include <lowlevellock.h>.
(_S_msg_proc_newids): Use lll_wait to synchronize.
* hurd/hurdsig.c (reauth_proc): Use __mutex_lock and __mutex_unlock.
* hurd/setauth.c: Include <hurdlock.h>, use integer for synchronization.
* mach/Makefile (lock-headers): Remove machine-lock.h.
* mach/lock-intern.h: Include <lowlevellock.h> instead of
<machine-lock.h>.
(__spin_lock_t): New type.
(__SPIN_LOCK_INITIALIZER): New macro.
(__spin_lock, __spin_unlock, __spin_try_lock, __spin_lock_locked,
__mutex_init, __mutex_lock_solid, __mutex_unlock_solid, __mutex_lock,
__mutex_unlock, __mutex_trylock): Use lll to implement locks.
* mach/mutex-init.c: Include <lowlevellock.h> instead of <cthreads.h>.
(__mutex_init): Initialize with lll.
* manual/errno.texi (EOWNERDEAD, ENOTRECOVERABLE): New errno values.
* sysdeps/mach/Makefile: Add libmachuser as dependencies for libs
needing lll.
* sysdeps/mach/hurd/bits/errno.h: Regenerate.
* sysdeps/mach/hurd/cthreads.c (__libc_getspecific): Remove function.
* sysdeps/mach/hurd/bits/libc-lock.h: Remove file.
* sysdeps/mach/hurd/setpgid.c: Include <lowlevellock.h>.
(__setpgid): Use lll for synchronization.
* sysdeps/mach/hurd/setsid.c: Likewise with __setsid.
* sysdeps/mach/bits/libc-lock.h: Include <tls.h> and <lowlevellock.h>
instead of <cthreads.h>.
(_IO_lock_inexpensive): New macro
(__libc_lock_recursive_t, __rtld_lock_recursive_t): New structures.
(__libc_lock_self0): New declaration.
(__libc_lock_owner_self): New macro.
(__libc_key_t): Remove type.
(_LIBC_LOCK_INITIALIZER): New macro.
(__libc_lock_define_initialized, __libc_lock_init, __libc_lock_fini,
__libc_lock_fini_recursive, __rtld_lock_fini_recursive,
__libc_lock_lock, __libc_lock_trylock, __libc_lock_unlock,
__libc_lock_define_initialized_recursive,
__rtld_lock_define_initialized_recursive,
__libc_lock_init_recursive, __libc_lock_trylock_recursive,
__libc_lock_lock_recursive, __libc_lock_unlock_recursive,
__rtld_lock_initialize, __rtld_lock_trylock_recursive,
__rtld_lock_lock_recursive, __rtld_lock_unlock_recursive,
__libc_once_define, __libc_mutex_unlock): Reimplement with lll.
(__libc_lock_define_recursive, __rtld_lock_define_recursive,
_LIBC_LOCK_RECURSIVE_INITIALIZER, _RTLD_LOCK_RECURSIVE_INITIALIZER):
New macros.
Include <libc-lockP.h> to reimplement libc_key* with pthread_key*.
* hurd/hurdlock.c: New file.
* hurd/hurdlock.h: New file.
* mach/lowlevellock.h: New file
Samuel Thibault [Sun, 18 Mar 2018 01:11:56 +0000 (02:11 +0100)]
x86_64: Fix build with RTLD_PRIVATE_ERRNO defined to 1
* sysdeps/unix/sysv/linux/x86_64/sysdep.h: Always include
<dl-sysdep.h>. Test for value of RTLD_PRIVATE_ERRNO instead of
testing whether it is defined.
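The difference between the two preprocessor tests matters whenever the macro
is always defined, to either 0 or 1. A small sketch, using the hypothetical
macro DEMO_PRIVATE_ERRNO in place of RTLD_PRIVATE_ERRNO:

```c
/* Hypothetical stand-in for a macro that is always defined, here to a
   false value.  */
#define DEMO_PRIVATE_ERRNO 0

/* Testing for definedness takes the branch even though the value is 0.  */
#ifdef DEMO_PRIVATE_ERRNO
enum { demo_by_ifdef = 1 };
#else
enum { demo_by_ifdef = 0 };
#endif

/* Testing the value takes the branch only when it is nonzero.  */
#if DEMO_PRIVATE_ERRNO
enum { demo_by_value = 1 };
#else
enum { demo_by_value = 0 };
#endif
```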
Samuel Thibault [Sat, 17 Mar 2018 22:27:34 +0000 (23:27 +0100)]
hurd: Replace threadvars with TLS
This gets rid of a lot of kludge and gets closer to other ports.
* hurd/Makefile (headers): Remove threadvar.h.
(inline-headers): Remove threadvar.h.
* hurd/Versions (GLIBC_2.0): Remove __hurd_sigthread_stack_base,
__hurd_sigthread_stack_end, __hurd_sigthread_variables,
__hurd_threadvar_max, __hurd_errno_location.
(HURD_CTHREADS_0.3): Add pthread_getattr_np, pthread_attr_getstack.
* hurd/hurd/signal.h: Do not include <hurd/threadvar.h>.
(_hurd_self_sigstate): Use THREAD_SELF to get _hurd_sigstate.
(_HURD_SIGNAL_H_EXTERN_INLINE): Use THREAD_SELF to get _hurd_sigstate,
unless TLS is not initialized yet, in which case we do not need a
critical section yet anyway.
* hurd/hurd/threadvar.h: Include <tls.h>, do not include
<machine-sp.h>.
(__hurd_sigthread_variables, __hurd_threadvar_max): Remove variables
declarations.
(__hurd_threadvar_index): Remove enum.
(_HURD_THREADVAR_H_EXTERN_INLINE): Remove macro.
(__hurd_threadvar_location_from_sp,__hurd_threadvar_location): Remove
inlines.
(__hurd_reply_port0): New variable declaration.
(__hurd_local_reply_port): New macro.
* hurd/hurdsig.c (__hurd_sigthread_variables): Remove variable.
(interrupted_reply_port_location): Add thread_t parameter. Use it
with THREAD_TCB to access thread-local variables.
(_hurdsig_abort_rpcs): Pass ss->thread to
interrupted_reply_port_location.
(_hurd_internal_post_signal): Likewise.
(_hurdsig_init): Use presence of cthread_fork instead of
__hurd_threadvar_stack_mask to start signal thread by hand.
Remove signal thread threadvar initialization.
* hurd/hurdstartup.c: Do not include <hurd/threadvar.h>.
* hurd/sigunwind.c: Include <hurd/threadvar.h>.
(_hurdsig_longjmp_from_handler): Use __hurd_local_reply_port instead
of threadvar.
* sysdeps/mach/hurd/Versions (libc.GLIBC_PRIVATE): Add
__libc_lock_self0.
(ld.GLIBC_2.0): Remove __hurd_sigthread_stack_base,
__hurd_sigthread_stack_end, __hurd_sigthread_variables.
(ld.GLIBC_PRIVATE): Add __libc_lock_self0.
* sysdeps/mach/hurd/cthreads.c: Add __libc_lock_self0.
* sysdeps/mach/hurd/dl-sysdep.c (errno, __hurd_sigthread_stack_base,
__hurd_sigthread_stack_end, __hurd_sigthread_variables, threadvars,
__hurd_threadvar_stack_offset, __hurd_threadvar_stack_mask): Do not
define variables.
* sysdeps/mach/hurd/errno-loc.c: Do not include <errno.h> and
<hurd/threadvar.h>.
[IS_IN(rtld)] (rtld_errno): New variable.
[IS_IN(rtld)] (__errno_location): New weak function.
[!IS_IN(rtld)]: Include "../../../csu/errno-loc.c".
* sysdeps/mach/hurd/errno.c: Remove file.
* sysdeps/mach/hurd/fork.c: Include <hurd/threadvar.h>
(__fork): Remove THREADVAR_SPACE macro and its use.
* sysdeps/mach/hurd/i386/init-first.c (__hurd_threadvar_max): Remove
variable.
(init): Do not initialize threadvar.
* sysdeps/mach/hurd/i386/libc.abilist (__hurd_threadvar_max): Remove
symbol.
* sysdeps/mach/hurd/i386/sigreturn.c (__sigreturn): Use
__hurd_local_reply_port instead of threadvar.
* sysdeps/mach/hurd/i386/tls.h (tcbhead_t): Add reply_port and
_hurd_sigstate fields.
(HURD_DESC_TLS, __LIBC_NO_TLS, THREAD_TCB): New macro.
* sysdeps/mach/hurd/i386/trampoline.c: Remove outdated comment.
* sysdeps/mach/hurd/libc-lock.h: Do not include <hurd/threadvar.h>.
(__libc_lock_owner_self): Use &__libc_lock_self0 and THREAD_SELF
instead of threadvar.
* sysdeps/mach/hurd/libc-tsd.h: Remove file.
* sysdeps/mach/hurd/mig-reply.c (GETPORT, reply_port): Remove macros.
(use_threadvar, global_reply_port): Remove variables.
(__hurd_reply_port0): New variable.
(__mig_get_reply_port): Use __hurd_local_reply_port and
__hurd_reply_port0 instead of threadvar.
(__mig_dealloc_reply_port): Likewise.
(__mig_init): Do not initialize threadvar.
* sysdeps/mach/hurd/profil.c: Fix comment.
Samuel Thibault [Sat, 17 Mar 2018 02:17:36 +0000 (03:17 +0100)]
hurd: add TLS support
* sysdeps/generic/thread_state.h (MACHINE_NEW_THREAD_STATE_FLAVOR):
Define macro.
* sysdeps/mach/thread_state.h (MACHINE_THREAD_STATE_FIX_NEW): New macro.
* sysdeps/mach/i386/thread_state.h
(MACHINE_NEW_THREAD_STATE_FLAVOR): New macro, defined to
i386_THREAD_STATE.
(MACHINE_THREAD_STATE_FLAVOR): Define to i386_REGS_SEGS_STATE instead of
i386_THREAD_STATE.
(MACHINE_THREAD_STATE_FIX_NEW): New macro, reads segments.
* sysdeps/mach/hurd/i386/trampoline.c (_hurd_setup_sighandler): Use
i386_REGS_SEGS_STATE instead of i386_THREAD_STATE.
* sysdeps/mach/hurd/i386/tls.h (TCB_ALIGNMENT, HURD_SEL_LDT): New
macros.
(_hurd_tls_fork): Add original thread parameter. Duplicate existing LDT
descriptor instead of creating a new one.
(_hurd_tls_new): New function, creates a new descriptor and updates tcb.
* mach/setup-thread.c: Include <ldsodefs.h>.
(__mach_setup_thread): Call _dl_allocate_tls, pass
MACHINE_NEW_THREAD_STATE_FLAVOR to __thread_set_state instead of
MACHINE_THREAD_STATE_FLAVOR, before getting
MACHINE_THREAD_STATE_FLAVOR, calling _hurd_tls_new, and setting
MACHINE_THREAD_STATE_FLAVOR with the result.
* hurd/hurdfault.c (_hurdsig_fault_init): Call
MACHINE_THREAD_STATE_FIX_NEW.
* sysdeps/mach/hurd/fork.c (__fork): Call _hurd_tls_fork for sigthread
too. Add original thread parameter.
Rafal Luzynski [Fri, 16 Mar 2018 21:55:11 +0000 (22:55 +0100)]
NEWS: Mention the locale data changes (bug 22848, 22937, 22963).
Alternative (nominative/genitive) month names have been added to the
Catalan and Czech locale data and the abbreviated alternative names to
Catalan and Greek.
Continuing the removals of inline functions from the x86
bits/mathinline.h, this patch removes an inline of __finite (which was
not actually architecture-specific at all beyond its
endianness-dependence).
This inline is not normally used with GCC 4.4 or later, because
isfinite now uses __builtin_isfinite except for -fsignaling-nans.
Allowing __builtin_isfinite etc. to work properly even for
-fsignaling-nans, by implementing versions of those built-in functions
that use integer arithmetic in GCC, is
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66462> (a patch was
committed but had to be reverted because it caused problems, and that
patch didn't address all formats for all architectures, only some, so
by itself would not have been sufficient to allow glibc to use
__builtin_isfinite unconditionally for new-enough GCC).
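An integer-arithmetic finiteness test of the kind discussed above can be
sketched as follows. The name demo_finite is hypothetical; this is neither
the removed inline nor GCC's built-in, and it assumes double is IEEE 754
binary64.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch; assumes IEEE 754 binary64.  A double is finite
   exactly when its 11-bit exponent field is not all ones.  Because only
   integer operations touch the value, no floating-point exception is
   raised, even for a signaling NaN.  */
int
demo_finite (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return ((bits >> 52) & 0x7ff) != 0x7ff;
}
```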
Wilco Dijkstra [Thu, 15 Mar 2018 18:21:58 +0000 (18:21 +0000)]
Remove all target specific __ieee754_sqrt(f/l) inlines
Remove the now unused target specific __ieee754_sqrt(f/l) inlines.
Also remove inlines of sqrt which are for really old GCC versions.
Removing these is desirable, under the general principle of leaving
such inlining to the compiler rather than trying to do it in installed
headers, especially when only very old compilers are affected.
Note that removing inlines for __ieee754_sqrt disables inlining in the
sqrt wrapper functions. Given that the sqrt wrapper will typically only be
called for negative arguments, it doesn't matter whether the inlining
happens or not.
Wilco Dijkstra [Thu, 15 Mar 2018 17:57:03 +0000 (17:57 +0000)]
Add support for sqrt asm redirects
This patch series cleans up the many uses of __ieee754_sqrt(f/l) in GLIBC.
The goal is to enable GCC to do the inlining, and if this fails call the
__ieee754_sqrt function. This is done by internally declaring sqrt with asm
redirects. The compat symbols and sqrt wrappers need to disable the redirect.
The redirect is also disabled if there are already redirects defined when
using -ffinite-math-only.
All math functions (but not math tests, non-library code and libnldbl) are
built with -fno-math-errno which means GCC will typically inline sqrt as a
single instruction. This means targets are no longer forced to add a special
inline for sqrt.
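The asm redirect mechanism can be sketched with hypothetical names as
follows. Declaring a function with an asm label makes every out-of-line call
to it bind to the named symbol instead. This is not glibc's actual
declaration in include/math.h; the implementation below is a simplified
stand-in for finite non-negative inputs, and the sketch assumes GCC or Clang
on a target without a symbol prefix (for example Linux ELF).

```c
/* Hypothetical names throughout.  Calls to demo_sqrt are emitted as
   calls to the symbol demo_ieee754_sqrt.  */
double demo_sqrt (double x) __asm__ ("demo_ieee754_sqrt");

/* Counts how often the redirected target is reached.  */
int demo_impl_calls;

/* Simplified stand-in implementation (Newton iteration); handles
   finite non-negative inputs only.  */
double
demo_ieee754_sqrt (double x)
{
  demo_impl_calls++;
  if (x <= 0.0)
    return 0.0;
  double r = x >= 1.0 ? x : 1.0;
  for (int i = 0; i < 1100; i++)
    {
      double next = 0.5 * (r + x / r);
      if (next == r)
        break;
      r = next;
    }
  return r;
}

/* Library-internal caller: written against the plain name, but bound
   to the internal symbol by the redirect.  */
double
demo_hypot (double a, double b)
{
  return demo_sqrt (a * a + b * b);
}
```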
* include/math.h (sqrt): Declare with asm redirect.
(sqrtf): Likewise.
(sqrtl): Likewise.
(sqrtf128): Likewise.
* Makeconfig: Add -fno-math-errno for libc/libm, but build testsuite,
nonlib and libnldbl with -fmath-errno.
* math/w_sqrt_compat.c: Define NO_MATH_REDIRECT.
* math/w_sqrt_template.c: Likewise.
* math/w_sqrtf_compat.c: Likewise.
* math/w_sqrtl_compat.c: Likewise.
* sysdeps/i386/fpu/w_sqrt.c: Likewise.
* sysdeps/i386/fpu/w_sqrt_compat.c: Likewise.
* sysdeps/generic/math-type-macros-float128.h: Remove math.h and
complex.h.
Joseph Myers [Thu, 15 Mar 2018 18:26:35 +0000 (18:26 +0000)]
Remove more old-compilers parts of sysdeps/x86/fpu/bits/mathinline.h.
This patch removes further parts of sysdeps/x86/fpu/bits/mathinline.h
that are only of value for optimization with older compiler versions,
in accordance with general principles of preferring to let the
compiler deal with such inlining through built-in functions.
In general, GCC supports inlining all these functions as of version
4.3 or earlier. However, some inlines in GCC may have had excessively
restrictive conditions in past GCC versions (e.g. requiring
-ffast-math when the inline is valid under broader conditions). (In
particular, GCC had, before GCC 7, unnecessarily restrictive
conditions on when it could apply floor and ceil inlines corresponding
to the ones removed here. The same was true for rint, but
bits/mathinline.h *also* was excessively restrictive there.)
The removed sincos inlines are for __sincos etc. functions (not a
public interface and not currently used in this header either; not in
a part of the header ever used for building glibc itself). Likewise,
the atan2 inlines included one for __atan2l, also not a public
interface and not used for building glibc itself (calls inside glibc
generally use __ieee754_atan2l, for which there is a separate
__LIBC_INTERNAL_MATH_INLINES case in this header).
Wilco Dijkstra [Thu, 15 Mar 2018 15:44:58 +0000 (15:44 +0000)]
Use correct includes in benchtests
Currently the benchtests are run with internal GLIBC headers, which is incorrect.
Defining _ISOMAC in the makefile ensures the internal headers are bypassed.
Fix all tests which were relying on internal defines or includes.
As spotted by the GNOME translation team, the Greek language has a
visible difference between the abbreviated nominative and the abbreviated
genitive case for some month names. Examples:
A GNOME translator asked to use the same abbreviated month names
as provided by CLDR. This sounds reasonable. See the discussion:
https://bugzilla.gnome.org/show_bug.cgi?id=793645#c27
[BZ #22932]
* localedata/locales/lt_LT (abmon): Synchronize with CLDR.
Joseph Myers [Wed, 14 Mar 2018 18:26:03 +0000 (18:26 +0000)]
Remove old-GCC parts of x86 bits/mathinline.h.
In accordance with the general principle of preferring to let the
compiler optimize function calls based on their standard semantics
rather than putting inline definitions of such functions in installed
headers, this patch removes various such inline definitions in the x86
bits/mathinline.h that were already disabled for GCC 3.5 or later and
so were only used with very old compilers (for which good optimization
is particularly unimportant); along with those inlines, a definition
of __M_SQRT2, which was only used in such inline functions, is also
removed. This is similar to an early step in removing the string.h
inlines; I intend to follow up with further removals of
bits/mathinline.h inline definitions in appropriate logical groups
(with GCC bugs filed in cases where GCC doesn't already support
corresponding optimizations).
aarch64: Improve strncmp for mutually misaligned inputs
The mutually misaligned inputs on aarch64 are compared with a simple
byte-by-byte comparison, which is not very efficient. Enhance the
comparison, as in strcmp, by loading a double-word at a time. The peak
performance improvement (i.e. 4k maxlen comparisons) due to this on
the strncmp microbenchmark is as follows:
falkor: 3.5x (up to 72% time reduction)
cortex-a73: 3.5x (up to 71% time reduction)
cortex-a53: 3.5x (up to 71% time reduction)
All mutually misaligned inputs from 16 bytes maxlen onwards show
upwards of 15% improvement and there is no measurable effect on the
performance of aligned/mutually aligned inputs.
* sysdeps/aarch64/strncmp.S (count): New macro.
(strncmp): Store misaligned length in SRC1 in COUNT.
(mutual_align): Adjust.
(misaligned8): Load dword at a time when it is safe.
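The idea of comparing a double-word at a time can be sketched in C as
follows. This is only an illustration, not a translation of the aarch64
assembly: the name demo_strncmp_words is hypothetical, and the sketch assumes
both buffers have at least n readable bytes, whereas the real routine has to
establish when a wide load is safe.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ONES  0x0101010101010101ULL
#define DEMO_HIGHS 0x8080808080808080ULL

/* Hypothetical sketch.  Assumes S1 and S2 each have at least N
   readable bytes.  Returns <0, 0 or >0 like strncmp.  */
int
demo_strncmp_words (const char *s1, const char *s2, size_t n)
{
  size_t i = 0;

  /* Fast path: 8 bytes per step.  memcpy performs the loads, so any
     alignment of S1 and S2 is fine.  */
  while (n - i >= 8)
    {
      uint64_t w1, w2;
      memcpy (&w1, s1 + i, 8);
      memcpy (&w2, s2 + i, 8);
      /* Nonzero if any byte of w1 is zero.  */
      uint64_t has_nul = (w1 - DEMO_ONES) & ~w1 & DEMO_HIGHS;
      if (w1 != w2 || has_nul != 0)
        break;   /* Let the byte loop find the exact position.  */
      i += 8;
    }

  /* Slow path: locate the first difference or terminator.  */
  for (; i < n; i++)
    {
      unsigned char c1 = (unsigned char) s1[i];
      unsigned char c2 = (unsigned char) s2[i];
      if (c1 != c2)
        return c1 < c2 ? -1 : 1;
      if (c1 == 0)
        return 0;
    }
  return 0;
}
```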
Zack Weinberg [Thu, 22 Feb 2018 00:12:51 +0000 (19:12 -0500)]
[BZ 1190] Make EOF sticky in stdio.
C99 specifies that the EOF condition on a file is "sticky": once EOF
has been encountered, all subsequent reads should continue to return
EOF until the file is closed or something clears the "end-of-file
indicator" (e.g. fseek, clearerr). This is arguably a change from
C89, where the wording was ambiguous; the BSDs always had sticky EOF,
but the System V lineage would attempt to read from the underlying fd
again. GNU libc has followed System V for as long as we've been
using libio, but nowadays C99 conformance and BSD compatibility are
more important than System V compatibility.
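The sticky behaviour can be illustrated with a toy stream. Everything here is
hypothetical (struct toy_stream, TOY_EOF_SEEN, toy_getc, toy_clearerr) and far
simpler than libio: there is no buffering, and a read error is treated like
end of file. The point is only that once the flag is set, no further read is
attempted on the descriptor until the flag is cleared.

```c
#include <stdio.h>
#include <unistd.h>

#define TOY_EOF_SEEN 0x1u

/* Hypothetical unbuffered stream, for illustration only.  */
struct toy_stream
{
  int fd;
  unsigned int flags;
  int reads;          /* Number of read() calls issued so far.  */
};

int
toy_getc (struct toy_stream *s)
{
  /* Sticky EOF: do not go back to the descriptor once EOF was seen.  */
  if (s->flags & TOY_EOF_SEEN)
    return EOF;

  unsigned char c;
  s->reads++;
  if (read (s->fd, &c, 1) != 1)
    {
      /* Simplification: a read error is handled like end of file.  */
      s->flags |= TOY_EOF_SEEN;
      return EOF;
    }
  return c;
}

/* Clearing the indicator allows the next call to read again.  */
void
toy_clearerr (struct toy_stream *s)
{
  s->flags &= ~TOY_EOF_SEEN;
}
```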
You might wonder if changing the _underflow impls is sufficient to
apply the C99 semantics to all of the many stdio functions that
perform input. It should be enough to cover all paths to _IO_SYSREAD,
and the only other functions that call _IO_SYSREAD are the _seekoff
impls, which is OK because seeking clears EOF, and the _xsgetn impls,
which, as far as I can tell, are unused within glibc.
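A toy model may make the behaviour change easier to see. Everything below is hypothetical (toy_stream, toy_underflow, TOY_EOF_SEEN are stand-ins, not libio code): each string models one read on the underlying fd, and an empty string models a zero-byte read, as a terminal produces on ^D.

```c
#include <stddef.h>

#define TOY_EOF      (-1)
#define TOY_EOF_SEEN 0x1u   /* hypothetical stand-in for _IO_EOF_SEEN */

/* Toy input stream: reads[i] is the data returned by the i-th read on
   the underlying fd; "" models a zero-byte read.  */
struct toy_stream {
    const char *const *reads;
    size_t nreads;
    size_t next;        /* index of the next read to perform */
    const char *cur;    /* unread part of the current buffer */
    unsigned flags;
};

/* Refill the buffer.  With sticky EOF, a stream that has already seen
   EOF returns EOF at once instead of reading the fd again.  */
static int toy_underflow(struct toy_stream *s)
{
    if (s->flags & TOY_EOF_SEEN)
        return TOY_EOF;
    s->cur = (s->next < s->nreads) ? s->reads[s->next++] : "";
    if (*s->cur == '\0') {
        s->flags |= TOY_EOF_SEEN;
        return TOY_EOF;
    }
    return (unsigned char)*s->cur++;
}

static int toy_getc(struct toy_stream *s)
{
    if (*s->cur != '\0')
        return (unsigned char)*s->cur++;
    return toy_underflow(s);
}

/* Models clearerr: drop the end-of-file indicator.  */
static void toy_clearerr(struct toy_stream *s)
{
    s->flags &= ~TOY_EOF_SEEN;
}
```

With reads of "a", "", "b", successive toy_getc calls yield 'a', then EOF repeatedly; only after toy_clearerr does the next call reach "b". Without the flag check at the top of toy_underflow, the second EOF would instead have been 'b', which is the System V behaviour described above.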
The test programs in this patch use a pseudoterminal to set up the
necessary conditions. To facilitate this I added a new test-support
function that sets up a pair of pty file descriptors for you; it's
almost the same as BSD openpty, the only differences are that it
allocates the optionally-returned tty pathname with malloc, and that
it crashes if anything goes wrong.
[BZ #1190]
[BZ #19476]
* libio/fileops.c (_IO_new_file_underflow): Return EOF immediately
if the _IO_EOF_SEEN bit is already set; update commentary.
* libio/oldfileops.c (_IO_old_file_underflow): Likewise.
* libio/wfileops.c (_IO_wfile_underflow): Likewise.
* support/support_openpty.c, support/tty.h: New files.
* support/Makefile (libsupport-routines): Add support_openpty.
* libio/tst-fgetc-after-eof.c, wcsmbs/test-fgetwc-after-eof.c:
New test cases.
* libio/Makefile (tests): Add tst-fgetc-after-eof.
* wcsmbs/Makefile (tests): Add tst-fgetwc-after-eof.
David Michael [Sun, 11 Mar 2018 23:21:44 +0000 (00:21 +0100)]
Look up the startup server through /servers/startup
* sysdeps/mach/hurd/reboot.c: Include <hurd/paths.h>.
(reboot): Look up _SERVERS_STARTUP instead of calling proc_getmsgport to get a
port to the startup server.
Zack Weinberg [Sun, 11 Mar 2018 18:09:30 +0000 (14:09 -0400)]
nldbl-compat.c: Include math.h before nldbl-compat.h.
Jeff Law noticed that native PowerPC builds were broken by my having
made math_ldbl_opt.h not include math.h. nldbl-compat.c formerly got
math.h via libioP.h and math_ldbl_opt.h, *without* __NO_LONG_DOUBLE_MATH;
after my change it got it via nldbl-compat.h *with* __NO_LONG_DOUBLE_MATH,
but __NO_LONG_DOUBLE_MATH mode is forbidden on hosts that define
__HAVE_DISTINCT_FLOAT128, so the build breaks. This is the quick fix.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.c: Include math.h
before nldbl-compat.h.
Zack Weinberg [Wed, 7 Mar 2018 16:45:35 +0000 (16:45 +0000)]
Don't include math.h/math_private.h in math_ldbl_opt.h.
The sysdeps/ieee754/ldbl-opt version of math_ldbl_opt.h includes
math.h and math_private.h, despite not having any need for those
headers itself; the sysdeps/generic version doesn't. About 20 files
are relying on math_ldbl_opt.h to include math.h and/or math_private.h
for them, even though none of them is necessarily used on a platform that
needs ldbl-opt support.
* sysdeps/ieee754/ldbl-opt/math_ldbl_opt.h: Don't include
math.h or math_private.h.
Zack Weinberg [Fri, 9 Mar 2018 14:42:04 +0000 (09:42 -0500)]
alpha/clone.S: Invoke .set noat/.set at around explicit uses of $at
On Alpha, the register $at is, by default, reserved for use by the
assembler, in the expansion of pseudo-instructions. It's also used
by the special calling convention for _mcount. We get warnings from
Alpha clone.S because the code to call _mcount isn't properly marked
up to tell the assembler not to use $at itself.
* sysdeps/unix/sysv/linux/alpha/clone.S (__clone): Wrap manual
uses of $at in .set noat / .set at.
Aurelien Jarno [Thu, 8 Mar 2018 23:14:27 +0000 (00:14 +0100)]
sparc32: Add nop before __startcontext to stop unwinding [BZ #22919]
On sparc32 tst-makecontext fails, because backtrace, when called within
a context created by makecontext, yields an infinite backtrace.
Fix this the same way as on nios2, by adding a nop just before
__startcontext. This is needed as otherwise FDE lookup just repeatedly
finds __setcontext's FDE in an infinite loop, due to the convention of
using 'address - 1' for FDE lookup.
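The effect of the 'address - 1' convention can be shown with a toy FDE table. This is a simplified sketch with made-up addresses and hypothetical names (toy_fde, toy_find_fde); it is not the unwinder's code, and it assumes the return address seen by the unwinder is the first instruction of __startcontext.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy FDE: covers the half-open address range [start, end).  */
struct toy_fde {
    const char *name;
    uintptr_t start;
    uintptr_t end;
};

/* Look up the FDE for a return address.  As in the convention the
   commit message mentions, the lookup uses ret_addr - 1 so that the
   address falls inside the calling function.  Returns NULL if no FDE
   covers it, which is where unwinding stops.  */
static const char *toy_find_fde(const struct toy_fde *table, size_t n,
                                uintptr_t ret_addr)
{
    uintptr_t pc = ret_addr - 1;
    for (size_t i = 0; i < n; i++)
        if (pc >= table[i].start && pc < table[i].end)
            return table[i].name;
    return NULL;
}
```

If __setcontext covers [0x100, 0x140) and __startcontext starts directly at 0x140, a return address of 0x140 is looked up as 0x13f and lands in __setcontext. With a 4-byte nop in between, __startcontext starts at 0x144, the lookup uses 0x143, and no FDE matches.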
Some SPE opcodes clash with some recent PowerISA opcodes, and
until recently gas did not complain about it. However, binutils
recently changed this, and a VLE-configured gas no longer
assembles some instructions that might clash with VLE (HTM, for
instance). It also does not help that glibc builds hardware lock
elision support by default (regardless of assembler support).
Although the runtime will not actually enable TLE on SPE hardware
(since the kernel will not advertise it), I see little advantage in
adding HTM support to an SPE-built glibc. SPE uses an incompatible
ABI which does not allow sharing the same build with default
powerpc, and HTM code slows down SPE without any benefit.
This patch fixes it by only building HTM when SPE configuration
is not used.
Checked with a powerpc-linux-gnuspe build. I also did some sniff
tests on e500 hardware without any issue.
[BZ #22926]
* sysdeps/powerpc/powerpc32/sysdep.h (ABORT_TRANSACTION_IMPL): Define
empty for __SPE__.
* sysdeps/powerpc/sysdep.h (ABORT_TRANSACTION): Likewise.
* sysdeps/unix/sysv/linux/powerpc/elision-lock.c (__lll_lock_elision):
Do not build hardware transactional code for __SPE__.
* sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
(__lll_trylock_elision): Likewise.
* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
(__lll_unlock_elision): Likewise.
This patch refactors the ARCH_FORK macro and the required
architecture-specific header to simplify the definitions each
architecture needs to provide for fork syscall semantics, and to
properly document the current Linux clone ABI variants.
Instead of requiring each architecture to reimplement the arch-fork.h
header, this patch changes ARCH_FORK to an inline function, with the
clone ABI selected by a kernel-features.h define. The generic kernel
ABI meant for newer ports is used as the default and is overridden
where the architecture requires it.
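To illustrate what "clone ABI variant" means here, the following sketch builds the argument list a fork-style clone call would use under each ordering. The enum and function names are hypothetical, and the argument orders are an assumption based on the Linux kernel's default, CLONE_BACKWARDS, CLONE_BACKWARDS2 and CLONE_BACKWARDS3 conventions; this is not the code from the patch.

```c
#include <stddef.h>

/* Hypothetical tags for the kernel's clone argument orderings.  */
enum toy_clone_abi {
    TOY_CLONE_DEFAULT,     /* flags, stack, ptid, ctid, tls */
    TOY_CLONE_BACKWARDS,   /* flags, stack, ptid, tls, ctid */
    TOY_CLONE_BACKWARDS2,  /* stack, flags, ptid, ctid, tls */
    TOY_CLONE_BACKWARDS3   /* flags, stack, stack_size, ptid, ctid, tls */
};

/* Fill args[] for a fork-like clone (no new stack, no ptid, no tls) and
   return the number of arguments.  Only the positions of flags and
   ctid differ between variants; everything else stays zero.  */
static size_t toy_fork_args(enum toy_clone_abi abi, unsigned long flags,
                            unsigned long ctid, unsigned long args[6])
{
    for (size_t i = 0; i < 6; i++)
        args[i] = 0;
    switch (abi) {
    case TOY_CLONE_BACKWARDS:
        args[0] = flags;
        args[4] = ctid;
        return 5;
    case TOY_CLONE_BACKWARDS2:
        args[1] = flags;
        args[3] = ctid;
        return 5;
    case TOY_CLONE_BACKWARDS3:
        args[0] = flags;
        args[4] = ctid;
        return 6;
    case TOY_CLONE_DEFAULT:
    default:
        args[0] = flags;
        args[3] = ctid;
        return 5;
    }
}
```

Selecting the variant with a single define lets one shared inline function replace the per-architecture headers, since the only difference between architectures is this argument placement.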
Checked on x86_64-linux-gnu and i686-linux-gnu. Also with a build
for all the affected ABIs.