Florian Weimer [Thu, 31 Oct 2019 17:48:43 +0000 (18:48 +0100)]
Move _dl_open_check to its original place in dl_open_worker
This reverts the non-test change from commit d0093c5cefb7f7a4143f
("Call _dl_open_check after relocation [BZ #24259]"), given that
the underlying bug has been fixed properly in commit 61b74477fa7f63
("Remove all loaded objects if dlopen fails, ignoring NODELETE
[BZ #20839]").
Tested on x86-64-linux-gnu, with and without --enable-cet.
Florian Weimer [Thu, 31 Oct 2019 18:30:19 +0000 (19:30 +0100)]
Block signals during the initial part of dlopen
Lazy binding in a signal handler that interrupts a dlopen sees
intermediate dynamic linker state. This has likely always been
unsafe, but with the new pending NODELETE state, this is clearly
incorrect. Other threads are excluded via the loader lock, but the
current thread is not. Blocking signals until right before ELF
constructors run is the safe thing to do.
Florian Weimer [Wed, 13 Nov 2019 14:44:56 +0000 (15:44 +0100)]
Remove all loaded objects if dlopen fails, ignoring NODELETE [BZ #20839]
This introduces a “pending NODELETE” state in the link map, which is
flipped to the persistent NODELETE state late in dlopen, via
activate_nodelete. During initial relocation, symbol binding
records pending NODELETE state only. dlclose ignores pending NODELETE
state. Taken together, the result is that a partially completed dlopen
is rolled back completely, because new NODELETE mappings are unloaded.
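As a rough mental model only (simplified types and names, not the actual
glibc sources), the two-step promotion looks like this:

  enum nodelete_state { deletable, nodelete_pending, nodelete_active };

  struct map { enum nodelete_state nodelete; };  /* stand-in for struct link_map */

  /* Symbol binding during initial relocation records only the pending state.  */
  static void bind_marks_nodelete (struct map *m) { m->nodelete = nodelete_pending; }

  /* Late in dl_open_worker, after the point of no return, pending becomes
     permanent.  dlclose honors only nodelete_active, so a failed dlopen can
     still unload every mapping it created.  */
  static void activate_nodelete (struct map **maps, unsigned int n)
  {
    for (unsigned int i = 0; i < n; ++i)
      if (maps[i]->nodelete == nodelete_pending)
        maps[i]->nodelete = nodelete_active;
  }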
Florian Weimer [Wed, 27 Nov 2019 15:37:17 +0000 (16:37 +0100)]
Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]
This change splits the scope and TLS slotinfo updates in dlopen into
two parts: one to resize the data structures, and one to actually apply
the update. The call to add_to_global_resize in dl_open_worker is moved
before the demarcation point at which no further memory allocations are
allowed.
_dl_add_to_slotinfo is adjusted to make the list update optional. There
is some optimization possibility here because we could grow the slotinfo
list of arrays in a single call, once the largest TLS modid is known.
This commit does not fix the fatal memory allocation failure in
_dl_update_slotinfo. Ideally, this error during dlopen should be
recoverable.
The update order of scopes and TLS data structures is retained, although
it appears to be more correct to fully initialize TLS first, and then
expose symbols in the newly loaded objects via the scope update.
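A hypothetical sketch of the two-phase shape described above (simplified,
not the glibc code): the allocation that can fail happens while dlopen can
still back out, and the later list update cannot fail.

  #include <stdbool.h>
  #include <stdlib.h>

  struct slotinfo_list { size_t len, used; int *slots; };

  static bool
  add_to_slotinfo_sketch (struct slotinfo_list *l, int modid, bool do_add)
  {
    if (l->used == l->len)                       /* resize phase: may allocate */
      {
        size_t new_len = l->len == 0 ? 16 : 2 * l->len;
        int *p = realloc (l->slots, new_len * sizeof *p);
        if (p == NULL)
          return false;                          /* dlopen can still fail cleanly here */
        l->slots = p;
        l->len = new_len;
      }
    if (do_add)                                  /* update phase: cannot fail */
      l->slots[l->used++] = modid;
    return true;
  }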
Florian Weimer [Thu, 31 Oct 2019 17:25:39 +0000 (18:25 +0100)]
Avoid late failure in dlopen in global scope update [BZ #25112]
The call to add_to_global in dl_open_worker happens after running ELF
constructors for new objects. At this point, proper recovery from
malloc failure would be quite complicated: We would have to run the
ELF destructors and close all opened objects, something that we
currently do not do.
Instead, this change splits add_to_global into two phases,
add_to_global_resize (which can raise an exception, called before ELF
constructors run), and add_to_global_update (which cannot, called
after ELF constructors). A complication arises due to recursive
dlopen: After the inner dlopen consumes some space, the pre-allocation
in the outer dlopen may no longer be sufficient. A new member in the
namespace structure, _ns_global_scope_pending_adds keeps track of the
maximum number of objects that need to be added to the global scope.
This enables the inner add_to_global_resize call to take into account
the needs of an outer dlopen.
Most code in the dynamic linker assumes that the number of global
scope entries fits into an unsigned int (matching the r_nlist member
of struct r_scope_elem). Therefore, change the type of
_ns_global_scope_alloc to unsigned int (from size_t), and add overflow
checks.
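A hypothetical sketch of the two-phase pattern (simplified types, not the
glibc code), with _ns_global_scope_pending_adds modeled by the pending_adds
counter:

  #include <stdbool.h>
  #include <stdlib.h>

  struct global_scope_sketch
  {
    void **list;
    unsigned int nlist;          /* entries in use (fits in unsigned int by design) */
    unsigned int alloc;          /* entries allocated */
    unsigned int pending_adds;   /* outstanding adds, including outer dlopens */
  };

  /* Phase 1, before ELF constructors run: may allocate and may fail.  */
  static bool
  global_resize_sketch (struct global_scope_sketch *s, unsigned int to_add)
  {
    s->pending_adds += to_add;
    if (s->nlist + s->pending_adds <= s->alloc)
      return true;
    unsigned int new_alloc = s->nlist + s->pending_adds + 8;
    void **p = realloc (s->list, new_alloc * sizeof *p);
    if (p == NULL)
      return false;              /* still safe: no constructors have run yet */
    s->list = p;
    s->alloc = new_alloc;
    return true;
  }

  /* Phase 2, after ELF constructors: publication only, cannot fail.  */
  static void
  global_update_sketch (struct global_scope_sketch *s, void *new_map)
  {
    s->list[s->nlist++] = new_map;
    s->pending_adds--;
  }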
Florian Weimer [Wed, 27 Nov 2019 15:20:47 +0000 (16:20 +0100)]
Lazy binding failures during dlopen/dlclose must be fatal [BZ #24304]
If a lazy binding failure happens during the execution of an ELF
constructor or destructor, the dynamic loader catches the error
and reports it using the dlerror mechanism. This is undesirable
because there could be other constructors and destructors that
need processing (which are skipped), and the process is in an
inconsistent state at this point. Therefore, we have to issue
a fatal dynamic loader error and terminate the process.
Note that the _dl_catch_exception in _dl_open is just an inner catch,
to roll back some state locally. If called from dlopen, there is
still an outer catch, which is why calling _dl_init via call_dl_init
in a no-exception region is required and cannot be avoided by moving the
_dl_init call directly into _dl_open.
_dl_fini does not need changes because it does not install an error
handler, so errors are already fatal there.
Florian Weimer [Wed, 30 Oct 2019 16:26:58 +0000 (17:26 +0100)]
resolv: Implement trust-ad option for /etc/resolv.conf [BZ #20358]
This introduces a concept of trusted name servers, for which the
AD bit is passed through to applications. For untrusted name
servers (the default), the AD bit in responses is cleared, to
provide a safe default.
This approach is very similar to the one suggested by Pavel Šimerda
in <https://bugzilla.redhat.com/show_bug.cgi?id=1164339#c15>.
The DNS test framework in support/ is enhanced with support for
setting the AD bit in responses.
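For example, a configuration that trusts a local validating resolver could
look like this (illustrative snippet; without the option, the AD bit is
cleared by default):

  # /etc/resolv.conf
  nameserver 127.0.0.1
  options trust-ad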
Florian Weimer [Fri, 8 Nov 2019 14:48:51 +0000 (15:48 +0100)]
dlsym: Do not determine caller link map if not needed
Obtaining the link map is potentially very slow because it requires
iterating over all loaded objects in the current implementation. If
the caller supplied an explicit handle (i.e., not one of the RTLD_*
constants), the dlsym implementation does not need the identity of the
caller (except in the special case of auditing), so this change
avoids computing it in that case.
Even in the minimal case (dlsym called from a main program linked with
-ldl), this shows a small speedup, perhaps around five percent. The
performance improvement can be arbitrarily large in principle (if
_dl_find_dso_for_object has to iterate over many link maps).
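The shape of the change, as a hypothetical sketch (find_dso_for_object and
do_lookup stand in for the real internals):

  #define _GNU_SOURCE
  #include <dlfcn.h>
  #include <stdbool.h>

  /* Hypothetical stand-ins for the real dynamic linker internals.  */
  extern void *find_dso_for_object (const void *pc);   /* walks all link maps */
  extern void *do_lookup (void *handle, const char *name, void *caller_map);

  void *
  dlsym_sketch (void *handle, const char *name, const void *caller_pc,
                bool auditing)
  {
    void *caller_map = NULL;
    /* Only the pseudo-handles (and auditing) need the caller's link map,
       which is expensive to determine.  */
    if (handle == RTLD_DEFAULT || handle == RTLD_NEXT || auditing)
      caller_map = find_dso_for_object (caller_pc);
    return do_lookup (handle, name, caller_map);
  }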
Florian Weimer [Fri, 22 Nov 2019 21:10:42 +0000 (22:10 +0100)]
libio: Disable vtable validation for pre-2.1 interposed handles [BZ #25203]
Commit c402355dfa7807b8e0adb27c009135a7e2b9f1b0 ("libio: Disable
vtable validation in case of interposition [BZ #23313]") only covered
the interposable glibc 2.1 handles, in libio/stdfiles.c. The
parallel code in libio/oldstdfiles.c needs similar detection logic.
Similarly to __vfprintf_internal and __vfscanf_internal, the internal
implementation of syslog functions (__vsyslog_internal) takes a
'mode_flags' parameter used to select the format of long double
parameters. This patch adds variants of the syslog functions that set
'mode_flags' to PRINTF_LDBL_USES_FLOAT128, thus enabling the correct
printing of long double values on powerpc64le, when long double has IEEE
binary128 format (-mabi=ieeelongdouble).
Tested for powerpc64le.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
Similarly to the functions from the *printf family, this patch adds
implementations for __obstack_*printf* functions that set the
'mode_flags' parameter to PRINTF_LDBL_USES_FLOAT128, before making calls
to __vfprintf_internal (indirectly through __obstack_vprintf_internal).
Tested for powerpc64le.
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
ldbl-128ibm-compat: Reuse tests for err.h and error.h functions
Commit IDs 9771e6cb5102 and 7597b0c7f711 added tests for the functions
from err.h and error.h that can take long double parameters.
Afterwards, commit ID f0eaf8627654 reused them on architectures that
changed the long double format from the same as double to something else
(i.e.: architectures that imply ldbl-opt). This patch reuses them again
for IEEE long double on powerpc64le.
Tested for powerpc64le.
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
Use the recently added, internal functions, __error_at_line_internal and
__error_internal, to provide error.h functions that can take long double
arguments with IEEE binary128 format on platforms where long double can
also take double format or some non-IEEE format (currently, this means
powerpc64le).
Tested for powerpc64le.
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
Use the recently added, internal functions, __vwarnx_internal and
__vwarn_internal, to provide err.h functions that can take long double
arguments with IEEE binary128 format on platforms where long double can
also take double format or some non-IEEE format (currently, this means
powerpc64le).
Tested for powerpc64le.
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
ldbl-128ibm-compat: Add argp_error and argp_failure
Use the recently added, internal functions, __argp_error_internal and
__argp_failure_internal, to provide argp_error and argp_failure that can
take long double arguments with IEEE binary128 format on platforms where
long double can also take double format or some non-IEEE format
(currently, this means powerpc64le).
Tested for powerpc64le.
Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>
This patch removes the arch-specific atomic instruction, relying on
compiler builtins. The __sparc32_atomic_locks support is removed,
and a configure check is added to detect whether the compiler uses
libatomic to implement CAS.
It also removes the sparc-specific sem_* and pthread_barrier_*
implementations. This in turn allows building against a LEON3/LEON4
sparcv8 target, although it will still be incompatible with generic
sparcv9.
Checked on sparcv9-linux-gnu and sparc64-linux-gnu. I also checked
with a build against sparcv8-linux-gnu with -mcpu=leon3.
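The compiler-builtin approach boils down to something like the following
generic sketch (not the actual glibc atomic macros); on sparcv8/LEON the
compiler may lower this to a libatomic call, which is what the new configure
check detects:

  #include <stdbool.h>
  #include <stdint.h>

  /* Generic compare-and-swap expressed with the GCC/clang atomic builtins.  */
  static inline bool
  cas_u32 (uint32_t *mem, uint32_t *expected, uint32_t desired)
  {
    return __atomic_compare_exchange_n (mem, expected, desired,
                                        false /* strong */,
                                        __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
  }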
Stefan Liebler [Wed, 27 Nov 2019 11:35:40 +0000 (12:35 +0100)]
S390: Fix handling of needles crossing a page in strstr z15 ifunc-variant. [BZ #25226]
If the specified needle crosses a page-boundary, the s390-z15 ifunc variant of
strstr truncates the needle, which leads to invalid results.
This is fixed by loading the needle beyond the page boundary to v18 instead of v16.
The bug is sometimes observable in test-strstr.c in check1 and check2, as the
haystack and needle are stored on the stack; thus the needle can end up on a
page boundary.
check2 is now extended to test haystacks / needles located on the stack, at the
end of a page, and on two pages.
Sandra Loosemore [Thu, 21 Nov 2019 02:10:40 +0000 (19:10 -0700)]
Compile elf/rtld.c with -fno-tree-loop-distribute-patterns.
In GCC 10, the default at -O2 is now -ftree-loop-distribute-patterns.
This optimization causes GCC to "helpfully" convert the hand-written
loop in _dl_start into a call to memset, which is not available that
early in program startup. Similar problems in other places in GLIBC
have been addressed by explicitly building with
-fno-tree-loop-distribute-patterns, but this one may have been
overlooked previously because it only affects targets where
HAVE_BUILTIN_MEMSET is not defined.
This patch fixes a bug observed on nios2-linux-gnu target that caused
all programs to segv on startup.
Now that both the pthread_mutex_t and pthread_rwlock_t static initializers
are parametrized in their own headers, the HPPA pthread.h is identical to
the generic nptl one.
This patch adds a default pthread-offsets.h based on default
thread definitions from struct_mutex.h and struct_rwlock.h.
The idea is to simplify the addition of new ports.
This patch adds a default pthreadtypes-arch.h. The idea is to simplify
the addition of new ports; an override is required only if the architecture
adds some arch-specific extensions or requirements.
The default values in the new generic header are based on the values
currently defined by the architectures, and they are not optimal compared
to current code requirements, as shown below.
- On 64 bits, __SIZEOF_PTHREAD_BARRIER_T is defined as 32, while
sizeof (struct pthread_barrier) is 20 bytes.
- On 32 bits, __SIZEOF_PTHREAD_ATTR_T is defined as 36, while
sizeof (struct pthread_attr) is 32.
The default values are not changed so the generic header could be
used by some architectures.
This patch adds a new generic __pthread_rwlock_arch_t definition meant
to be used by new ports. Its layout mimics the current usage on some
64 bits ports and it allows some ports to use the generic definition.
The arch __pthread_rwlock_arch_t definition is moved from
pthreadtypes-arch.h to another arch-specific header (struct_rwlock.h).
Also, the static initialization macro for pthread_rwlock_t is set to use
an arch-defined macro (__PTHREAD_RWLOCK_INITIALIZER), which simplifies its
implementation.
The default pthread_rwlock_t layout differs from current ports as follows:
1. The internal layout is the same for 32 bits and 64 bits.
2. The internal flag is an unsigned short, so it should not require
additional padding to align to a word boundary (if the ABI requires it).
The current way of defining the common mutex definition for POSIX and
C11 on pthreadtypes-arch.h (added by commit 06be6368da16104be5) is
not really the best option for newer ports. It requires defining some
misleading flags that should always be defined as 0
(__PTHREAD_COMPAT_PADDING_MID and __PTHREAD_COMPAT_PADDING_END), it
exposes options used solely for linuxthreads compat mode
(__PTHREAD_MUTEX_USE_UNION and __PTHREAD_MUTEX_NUSERS_AFTER_KIND), and
requires newer ports to explicitly define them (adding more boilerplate
code).
This patch adds a new default __pthread_mutex_s definition meant to
be used by newer ports. Its layout mimics the current usage on both
32 and 64 bits ports and it allows most ports to use the generic
definition. Only ports that use some arch-specific definition (such
as hardware lock-elision or linuxthreads compat) requires specific
headers.
For 32 bits, the generic definition mimics the other 32-bit ports
by using a union to define the fields used by adaptive and robust
mutexes (thus not allowing both usages at the same time) and by using a
singly linked list for robust mutexes. Both decisions seemed to
follow what recent ports have done and make the resulting
pthread_mutex_t/mtx_t object smaller.
Also, the static initialization macro for pthread_mutex_t is set to use
a macro, __PTHREAD_MUTEX_INITIALIZER, which the architecture can redefine
in its struct_mutex.h if it requires additional fields to be
initialized.
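A rough illustration of the layout described above (field names are
illustrative, not the literal glibc definition):

  struct mutex_sketch
  {
    int lock;
    unsigned int count;
    int owner;
    int kind;                              /* the field static initializers set */
    unsigned int nusers;
    union                                  /* adaptive/elision data or robust list */
    {
      struct { short spins; short elision; } adaptive;
      struct mutex_sketch *robust_next;    /* singly linked robust list */
    };
  };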
The new rwlock implementation added by cc25c8b4c1196 (2.25) removed
support for lock-elision. This patch removes the remaining unused
arch-specific definitions.
nptl: Add tests for internal pthread_rwlock_t offsets
This patch adds new build tests to check the internal field offsets of the
internal pthread_rwlock_t definition. Although the '__data.__flags'
field layout should be preserved due to static initializers, the patch
also adds tests for the futexes that may be used in shared memory
(although using different libc versions in such a scenario is not really
supported).
The offsets of pthread_mutex_t __data.__nusers, __data.__spins,
__data.elision, __data.list are not required to be constant across
releases. Only the __data.__kind is used for static
initializers.
This patch also adds an additional size check for __data.__kind.
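Such a check can be written as a build-time assertion, roughly as follows
(illustrative; EXPECTED_KIND_OFFSET is a hypothetical per-architecture
constant, not a glibc macro):

  #include <pthread.h>
  #include <stddef.h>

  #define EXPECTED_KIND_OFFSET 16   /* hypothetical value for one architecture */

  _Static_assert (offsetof (pthread_mutex_t, __data.__kind) == EXPECTED_KIND_OFFSET,
                  "offset of __data.__kind must not change across releases");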
Rafał Lużyński [Tue, 1 Oct 2019 20:11:47 +0000 (22:11 +0200)]
ru_UA locale: use copy "ru_RU" in LC_TIME (bug 25044)
Replacing incorrect abbreviated weekday names "Пнд", "Вто", "Срд"...
with correct ones "Пн", "Вт", "Ср"... makes the LC_TIME sections in
those two locales almost identical. The only remaining difference
was that ab_alt_mon elements in ru_UA were lowercase while in ru_RU
they had the first letter uppercase; the latter was pointed out as
a better choice by a native speaker. This commit unifies LC_TIME
between ru_RU and ru_UA.
arm: Fix armv7 selection after 'Split BE/LE abilist'
It adds the missing Implies for armv7, armv6, armv6t2 after the
commit 1673ba87fefe019c. Without the Implies, a build with the
compiler targeting the aforementioned architectures does not select
the arch-specific optimizations, including the ifunc selectors.
I checked with a build against armv5, armv6, armv6t2, armv7, and
armv7-neon for both LE and BE. For armv6 and armv7 I also checked
that both the sysdeps selection and the resulting built implementations
are the expected ones.
ldbl-128ibm-compat: Add wide character scanning functions
Similarly to what was done for regular character scanning functions,
this patch uses the new mode mask, SCANF_LDBL_USES_FLOAT128, in the
'mode' argument of the wide characters scanning function,
__vfwscanf_internal (which is also extended to support scanning
floating-point values with IEEE binary128, by redirecting calls to
__wcstold_internal to __wcstof128_internal).
Tested for powerpc64le.
Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>
ldbl-128ibm-compat: Add regular character scanning functions
The 'mode' argument to __vfscanf_internal allows the selection of the
long double format for all long double arguments requested by the format
string. Currently, there are two possibilities: long double with the
same format as double or long double as something else. The 'something
else' format varies between architectures, and on powerpc64le, it means
IBM Extended Precision format.
In preparation for the third option of long double format on
powerpc64le, this patch uses the new mode mask,
SCANF_LDBL_USES_FLOAT128, which tells __vfscanf_internal to call
__strtof128_internal, instead of __strtold_internal, and save the output
into a _Float128 variable.
Tested for powerpc64le.
Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>
The format string can request positional parameters, instead of relying
on the order in which they appear as arguments. Since this has an
effect on how the type of each argument is determined, this patch
extends the test cases to use positional parameters with mixed double
and long double types, to verify that the IEEE long double
implementations of *printf work correctly in this scenario.
Tested for powerpc64le.
Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>
A single format string can take double and long double parameters at the
same time. Internally, these parameters are routed to the same
function, which correctly reads them and calls the underlying functions
responsible for the actual conversion to string. This patch adds a new
case to test this scenario.
Tested for powerpc64le.
Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>
Similarly to what was done for the regular character, fortified printing
functions, this patch combines the mode masks PRINTF_LDBL_USES_FLOAT128
and PRINTF_FORTIFY to provide wide character versions of fortified
printf functions. It also adds two flavors of test cases: one that
explicitly calls the fortified functions, and another that reuses the
non-fortified test, but defining _FORTIFY_SOURCE as 2. The first
guarantees that the implementations are actually being tested
(independently of what's in bits/wchar2.h), whereas the second
guarantees that the redirections call the correct function in the IBM
and IEEE long double cases.
Tested for powerpc64le.
Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>
Since the introduction of internal functions with explicit flags for the
printf family of functions, the 'mode' parameter can be used to select
which format long double parameters have (with the mode flags:
PRINTF_LDBL_IS_DBL and PRINTF_LDBL_USES_FLOAT128), as well as to select
whether to check for overflows (mode flag: PRINTF_FORTIFY).
This patch combines PRINTF_LDBL_USES_FLOAT128 and PRINTF_FORTIFY to
provide the IEEE binary128 version of printf-like function for platforms
where long double can take this format, in addition to the double format
and to some non-ieee format (currently, this means powerpc64le).
There are two flavors of test cases provided with this patch: one that
explicitly calls the fortified functions, for instance __asprintf_chk,
and another that reuses the non-fortified test, but defining
_FORTIFY_SOURCE as 2. The first guarantees that the implementations are
actually being tested (in bits/stdio2.h, vprintf gets redirected to
__vfprintf_chk, which would leave __vprintf_chk untested), whereas the
second guarantees that the redirections call the correct function in
the IBM and IEEE long double cases.
Tested for powerpc64le.
Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>
ldbl-128ibm-compat: Add wide character printing functions
Similarly to what was done for regular character printing functions,
this patch uses the new mode mask, PRINTF_LDBL_USES_FLOAT128, in the
'mode' argument of the wide characters printing function,
__vfwprintf_internal (which is also extended to support printing
floating-point values with IEEE binary128, by saving floating-point
values into variables of type __float128 and adjusting the parameters to
__printf_fp and __printf_fphex as if it were a call from a wide-character
version of strfromf128 (even though such a version does not exist)).
Tested for powerpc64le.
Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>
ldbl-128ibm-compat: Add regular character printing functions
The 'mode' argument to __vfprintf_internal allows the selection of the
long double format for all long double arguments requested by the format
string. Currently, there are two possibilities: long double with the
same format as double or long double as something else. The 'something
else' format varies between architectures, and on powerpc64le, it means
IBM Extended Precision format.
In preparation for the third option of long double format on
powerpc64le, this patch uses the new mode mask,
PRINTF_LDBL_USES_FLOAT128, which tells __vfprintf_internal to save the
floating-point values into variables of type __float128 and adjust the
parameters to __printf_fp and __printf_fphex as if it were a call from
strfromf128.
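A hypothetical, heavily simplified sketch of the dispatch (flag values
invented here, and a compiler/target with _Float128 support assumed; this is
not the actual __vfprintf_internal):

  #include <stdarg.h>

  #define PRINTF_LDBL_IS_DBL        0x0001   /* illustrative values */
  #define PRINTF_LDBL_USES_FLOAT128 0x0002

  union ldbl_arg { double dbl; long double ldbl; _Float128 f128; };

  /* How the mode flags could decide the way a long double argument is read.  */
  static union ldbl_arg
  fetch_long_double_arg (va_list *ap, unsigned int mode_flags)
  {
    union ldbl_arg v;
    if (mode_flags & PRINTF_LDBL_IS_DBL)
      v.dbl = va_arg (*ap, double);          /* long double has double format */
    else if (mode_flags & PRINTF_LDBL_USES_FLOAT128)
      v.f128 = va_arg (*ap, _Float128);      /* IEEE binary128 format */
    else
      v.ldbl = va_arg (*ap, long double);    /* e.g. IBM double-double */
    return v;
  }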
Many files from the stdio-common, wcsmbs, argp, misc, and libio
directories will have IEEE binary128 counterparts. Setting the correct
compiler options to these files (original and counterparts) would
produce a large amount of repetitive Makefile rules. To avoid this
repetition, this patch adds a Makefile routine that iterates over the
files adding or removing the appropriate flags.
Tested for powerpc64le.
Reviewed-By: Florian Weimer <fweimer@redhat.com>
Reviewed-By: Joseph Myers <joseph@codesourcery.com>
Reviewed-By: Paul E. Murphy <murphyp@linux.ibm.com>
When the commit "Use C99-compliant scanf under _GNU_SOURCE with modern
compilers." added the DEPRECATED_SCANF macro to select when redirections of
*scanf functions to their ISO C99-compliant versions should happen, it
accidentally missed doing it for vfwscanf, vwscanf, and vswscanf.
Tested for powerpc64le and with build-many-glibcs (i686-linux-gnu and
nios2-linux-gnu are failing with current master, and with this patch,
but I didn't see a regression).
The generic pselect implementation has the very specific race condition
that motivated the creation of the pselect syscall (no atomicity in
signal mask set/reset). Using it as the generic implementation is
counterproductive. Also, currently only microblaze uses it as a fallback
when running on kernels prior to 3.15.
This patch moves the generic implementation to a microblaze-specific
one, makes the generic internal implementation fail with ENOSYS, and
cleans up the Linux generic implementation.
The microblaze implementation mimics the previous Linux generic one,
where it either uses pselect6 directly if __ASSUME_PSELECT is defined,
or first tries pselect6 and then uses the fallback otherwise.
Checked on x86_64-linux-gnu and microblaze-linux-gnu.
Paul A. Clarke [Thu, 21 Nov 2019 17:57:41 +0000 (11:57 -0600)]
Remove duplicate inline implementation of issignalingf
Very recent commit 854e91bf6b4221f424ffa13b9ef50f35623b7b74 enabled
inlining of issignalingf() in general (__issignalingf in include/math.h).
There is another implementation for an inline use of issignalingf
(issignalingf_inline in sysdeps/ieee754/flt-32/math_config.h)
which could instead make use of the new enablement.
Replace the use of issignalingf_inline with __issignaling. Using
issignaling (instead of __issignalingf) will allow future enhancements
to the type-generic implementation, issignaling, to be automatically
adopted.
The implementations are slightly different, and compile to slightly
different code, but I measured no significant performance difference.
The second implementation was brought to my attention by:
Suggested-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Joseph Myers <joseph@codesourcery.com>
Don't use a custom wrapper macro around __has_include (bug 25189).
This causes issues when using clang with -frewrite-includes to, e.g.,
submit the translation unit to a distributed compiler.
In my case, I was building Firefox using sccache.
See [1] for a reduced test-case since I initially thought this was a
clang bug, and [2] for more context.
Apparently doing this is invalid C++ per [cpp.cond], which mentions [3]:
> The #ifdef and #ifndef directives, and the defined conditional
> inclusion operator, shall treat __has_include and __has_cpp_attribute
> as if they were the names of defined macros. The identifiers
> __has_include and __has_cpp_attribute shall not appear in any context
> not mentioned in this subclause.
Florian Weimer [Thu, 31 Oct 2019 12:28:49 +0000 (13:28 +0100)]
Introduce DL_LOOKUP_FOR_RELOCATE flag for _dl_lookup_symbol_x
This will allow changes in dependency processing during non-lazy
binding, for more precise processing of NODELETE objects: During
initial relocation in dlopen, the fate of NODELETE objects is still
unclear, so objects which are depended upon by NODELETE objects
cannot immediately be marked as NODELETE.
rtld: Check __libc_enable_secure before honoring LD_PREFER_MAP_32BIT_EXEC (CVE-2019-19126) [BZ #25204]
The problem was introduced in glibc 2.23, in commit b9eb92ab05204df772eb4929eccd018637c9f3e9
("Add Prefer_MAP_32BIT_EXEC to map executable pages with MAP_32BIT").
Florian Weimer [Thu, 31 Oct 2019 12:23:06 +0000 (13:23 +0100)]
Enhance _dl_catch_exception to allow disabling exception handling
In some cases, it is necessary to introduce noexcept regions
where raised dynamic loader exceptions (e.g., from lazy binding)
are fatal, despite being nested in a code region with an active
exception handler. This change enhances _dl_catch_exception to
provide such a capability. The existing function is reused,
so that it is not necessary to introduce yet another function with
a similar purpose.
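A simplified model of the mechanism (not the glibc implementation): a null
exception pointer means "do not catch", so an error raised inside the
callback terminates the process instead of unwinding to a catch site.

  #include <stddef.h>

  struct exc { const char *message; };

  /* Stand-in for the real catching machinery.  */
  extern int run_with_catch (struct exc *exception, void (*operate) (void *),
                             void *args);

  static int
  catch_exception_sketch (struct exc *exception, void (*operate) (void *),
                          void *args)
  {
    if (exception == NULL)
      {
        /* No handler installed: e.g. a lazy binding failure inside an ELF
           constructor becomes a fatal dynamic loader error.  */
        operate (args);
        return 0;
      }
    return run_with_catch (exception, operate, args);   /* previous behaviour */
  }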
Florian Weimer [Sun, 3 Nov 2019 10:20:23 +0000 (11:20 +0100)]
Avoid zero-length array at the end of struct link_map [BZ #25097]
l_audit ends up as an internal array within _rtld_global, and GCC 10
warns about this.
This commit does not change the layout of _rtld_global, so it is
suitable for backporting. Future changes could allocate more of the
audit state dynamically and remove it from always-allocated data
structures, to optimize the common case of inactive auditing.
Florian Weimer [Sat, 2 Nov 2019 19:04:02 +0000 (20:04 +0100)]
Introduce link_map_audit_state accessor function
To improve GCC 10 compatibility, it is necessary to remove the l_audit
zero-length array from the end of struct link_map. In preparation of
that, this commit introduces an accessor function for the audit state,
so that it is possible to change the representation of the audit state
without adjusting the code that accesses it.
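The accessor pattern, sketched with simplified types (not the actual glibc
declarations):

  #include <stddef.h>

  struct auditstate_sketch { unsigned long cookie; unsigned int bindflags; };
  struct link_map_sketch { struct auditstate_sketch *audit; };

  /* All accesses to the audit state go through one function, so the storage
     can later move out of the link map without touching the callers.  */
  static inline struct auditstate_sketch *
  link_map_audit_state_sketch (struct link_map_sketch *l, size_t index)
  {
    return &l->audit[index];
  }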
Florian Weimer [Sun, 3 Nov 2019 10:39:56 +0000 (11:39 +0100)]
Redefine _IO_iconv_t to store a single gconv step pointer [BZ #25097]
libio can only deal with gconv conversions which consist of a single
step. Not using __gconv_info simplifies the data structures somewhat.
This eliminates a new GCC 10 warning about subscripting an inner
zero-length array.
Tested on x86_64-linux-gnu with mainline GCC. Built with
build-many-glibcs.py, also with mainline GCC. Due to GCC PR 92039,
there are failures left on 32-bit architectures with float128 support.
Krzysztof Koch [Wed, 13 Nov 2019 11:57:17 +0000 (11:57 +0000)]
Add new script for plotting string benchmark JSON output
Add a script for visualizing the JSON output generated by existing
glibc string microbenchmarks.
Overview:
plot_strings.py is capable of plotting benchmark results in the
following formats, which are controlled with the -p or --plot argument:
1. absolute timings (-p time): plot the timings as they are in the
input benchmark results file.
2. relative timings (-p rel): plot relative timing difference with
respect to a chosen ifunc (controlled with -b argument).
3. performance relative to max (-p max): for each varied parameter
value, plot 1/timing as the percentage of the maximum value out of
the plotted ifuncs.
4. throughput (-p thru): plot varied parameter value over timing
For all types of graphs, there is an option to explicitly specify
the subset of ifuncs to plot using the --ifuncs parameter.
For plot types 1. and 4. one can hide/expose exact benchmark figures
using the --values flag.
When plotting relative timing differences between ifuncs, the first
ifunc listed in the input JSON file is the baseline, unless the
baseline implementation is explicitly chosen with the --baseline
parameter. For the ease of reading, the script marks the statistically
insignificant range on the graphs. The default is +-5% but this
value can be controlled with the --threshold parameter.
To accommodate the heterogeneity in benchmark results files,
one can control, for example, the x-axis scale, the resolution (dpi) of the
generated figures or the key to access the varied parameter value
in the JSON file. The corresponding options are --logarithmic,
--resolution or --key. The --key parameter ensures that plot_strings.py
works with all files which pass JSON schema validation. The schema
can be chosen with the --schema parameter.
If a window manager is available, one can enable interactive
figure display using the --display flag.
Finally, one can use the --grid flag to enable grid lines in the
generated figures.
Implementation:
plot_strings.py traverses the JSON tree until a 'results' array
is found and generates a separate figure for each such array.
The figure is then saved to a file in one of the available formats
(controlled with the --extension parameter).
As the tree is traversed, the recursive function tracks the metadata
about the test being run, so that each figure has a unique and
meaningful title and filename.
While plot_strings.py works with existing benchmarks, provisions
have been made to allow adding more structure and metadata to these
benchmarks. Currently, many benchmarks produce multiple timing values
for the same value of the varied parameter (typically 'length').
Multiple data points for the same parameter usually mean that some other
parameter was varied as well, for example, if memmove's src and dst
buffers overlap or not (see bench-memmove-walk.c and
bench-memmove-walk.out).
Unfortunately, this information is not exposed in the benchmark output
file, so plot_strings.py has to resort to computing the geometric mean
of these multiple values. In the process, useful information about the
benchmark configuration is lost. Also, averaging the timings for
different alignments can hide useful characteristics of the benchmarked
ifuncs.
Testing:
plot_strings.py has been tested on all existing string microbenchmarks
which produce results in JSON format. The script was tested on both
Windows 10 and Ubuntu 16.04.2 LTS. It runs on both python 2 and 3
(2.7.12 and 3.5.12 tested).
Useful commands:
1. Plot timings for all ifuncs in bench-strlen.out:
$ ./plot_strings.py bench-strlen.out
2. Display help:
$ ./plot_strings.py -h
3. Plot throughput for __memset_avx512_unaligned_erms and
__memset_avx512_unaligned. Save the generated figure in pdf format to
'results/'. Use logarithmic x-axis scale, show grid lines and expose
the performance numbers:
$ ./plot_strings.py bench.out -o results/ -lgv -e pdf -p thru \
-i __memset_avx512_unaligned_erms __memset_avx512_unaligned
4. Plot relative timings for all ifuncs in bench.out with __generic_memset
as baseline. Display percentage difference threshold of +-10%:
$ ./plot_strings.py bench.out -p rel -b __generic_memset -t 10
Discussion:
1. I would like to propose relaxing the benchout_strings.schema.json
to allow specifying either a 'results' array with 'timings' (as before)
or a 'variants' array. See below example:
'variants' array consists of objects such that each object has a 'name'
attribute to describe the configuration of a particular test in the
benchmark. This can be a description, for example, of how the parameter
was varied or what was the buffer alignment tested. The 'name' attribute
is then followed by another 'variants' array or a 'results' array.
The nesting of variants allows arbitrary grouping of benchmark timings,
while allowing description of these groups. Using recursion, it is
possible to procedurally create titles and filenames for the figures being
generated.
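A hypothetical illustration of the proposed layout (all names and numbers
invented for this example):

  {
    "variants": [
      { "name": "length varied, aligned buffers",
        "results": [ { "length": 16, "timings": [ 4.21, 4.19 ] } ] },
      { "name": "length varied, dst misaligned by 1",
        "variants": [
          { "name": "overlapping copy",
            "results": [ { "length": 16, "timings": [ 5.02, 4.97 ] } ] } ] }
    ]
  }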
Florian Weimer [Tue, 12 Nov 2019 11:25:49 +0000 (12:25 +0100)]
login: Use pread64 in utmp implementation
This reduces the possible error scenarios considerably because
a file seek can no longer fail, leaving the file descriptor in an
inconsistent state and out of sync with the cache.
As a result, it is possible to avoid setting file_offset to -1
to make an error persistent. Instead, subsequent calls will retry
the operation and report any errors returned by the kernel.
This change also avoids reading the file from the start if pututline
is called multiple times, to work around lock acquisition failures
due to timeouts.
Florian Weimer [Thu, 31 Oct 2019 12:28:26 +0000 (13:28 +0100)]
Clarify purpose of assert in _dl_lookup_symbol_x
Only one of the currently defined flags is incompatible with versioned
symbol lookups, so it makes sense to check for that flag and not its
complement.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
Change-Id: I3384349cef90cfd91862ebc34a4053f0c0a99404
Krzysztof Koch [Tue, 5 Nov 2019 17:35:18 +0000 (17:35 +0000)]
aarch64: Increase small and medium cases for __memcpy_generic
Increase the upper bound on medium cases from 96 to 128 bytes.
Now, up to 128 bytes are copied unrolled.
Increase the upper bound on small cases from 16 to 32 bytes so that
copies of 17-32 bytes are not impacted by the larger medium case.
Benchmarking:
The attached figures show relative timing difference with respect
to 'memcpy_generic', which is the existing implementation.
'memcpy_med_128' denotes the version of memcpy_generic with
only the medium case enlarged. The 'memcpy_med_128_small_32' numbers
are for the version of memcpy_generic submitted in this patch, which
has both medium and small cases enlarged. The figures were generated
using the script from:
https://www.sourceware.org/ml/libc-alpha/2019-10/msg00563.html
Depending on the platform, the performance improvement in the
bench-memcpy-random.c benchmark ranges from 6% to 20% between
the original and final version of memcpy.S
Tested against GLIBC testsuite and randomized tests.
Florian Weimer [Tue, 12 Nov 2019 11:02:57 +0000 (12:02 +0100)]
login: Introduce matches_last_entry to utmp processing
This simplifies internal_getut_nolock and fixes a regression,
introduced in commit be6b16d975683e6cca57852cd4cfe715b2a9d8b1
("login: Acquire write lock early in pututline [BZ #24882]")
in pututxline because __utmp_equal can only compare process-related
utmp entries.
Florian Weimer [Tue, 12 Nov 2019 11:41:34 +0000 (12:41 +0100)]
slotinfo in struct dtv_slotinfo_list should be flexible array [BZ #25097]
GCC 10 will warn about subscripting inner zero-length arrays. Use a GCC
extension in csu/libc-tls.c to allocate space for the static_slotinfo
variable. Adjust nptl_db so that the type description machinery does
not attempt to determine the size of the flexible array member slotinfo.
Joseph Myers [Mon, 11 Nov 2019 15:04:48 +0000 (15:04 +0000)]
Declare asctime_r, ctime_r, gmtime_r, localtime_r for C2X.
C2X adds the asctime_r, ctime_r, gmtime_r and localtime_r functions.
This patch duly adds __GLIBC_USE (ISOC2X) to the conditions under
which <time.h> declares them.
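The shape of the header change is roughly the following (an illustrative
excerpt, not the literal glibc header):

  /* In <time.h>: the *_r functions become visible under the C2X feature-test
     condition in addition to the existing POSIX conditions.  */
  #if defined __USE_POSIX || __GLIBC_USE (ISOC2X)
  extern struct tm *gmtime_r (const time_t *__restrict __timer,
                              struct tm *__restrict __tp);
  extern struct tm *localtime_r (const time_t *__restrict __timer,
                                 struct tm *__restrict __tp);
  #endif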
Lukasz Majewski [Tue, 29 Oct 2019 14:08:00 +0000 (15:08 +0100)]
y2038: linux: Provide __ppoll64 implementation
This patch provides a new explicit 64-bit function, __ppoll64, for handling
polling events (with a struct timespec specified timeout) for a set of file
descriptors.
Moreover, the 32-bit version - __ppoll - has been refactored to internally use
__ppoll64.
The __ppoll is now supposed to be used on systems still supporting 32 bit time
(__TIMESIZE != 64) - hence the necessary conversion to 64 bit struct
__timespec64.
The new ppoll_time64 syscall available from Linux 5.1+ has been used, when
applicable.
The Linux kernel checks whether the passed tv_nsec value overflows, so there is
no need to repeat the check in glibc.
When the ppoll syscall is used on systems supporting the 32-bit time ABI, a check
is performed that the passed data (which may have a 64-bit tv_sec) fits into the
32-bit range.
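A heavily simplified sketch of the described call flow, assuming a 32-bit
time_t port (not the glibc implementation; the real code uses internal
helpers and per-port syscall wrappers):

  #include <errno.h>
  #include <poll.h>
  #include <signal.h>
  #include <sys/syscall.h>
  #include <time.h>
  #include <unistd.h>

  struct timespec64_sketch { long long tv_sec; long long tv_nsec; };

  int
  ppoll64_sketch (struct pollfd *fds, nfds_t nfds,
                  const struct timespec64_sketch *timeout,
                  const sigset_t *sigmask)
  {
  #ifdef __NR_ppoll_time64
    /* Preferred path on Linux >= 5.1: pass the 64-bit timespec directly.  */
    long r = syscall (__NR_ppoll_time64, fds, nfds, timeout, sigmask,
                      8 /* kernel sigset size */);
    if (r >= 0 || errno != ENOSYS)
      return r;
  #endif
    /* Fallback to the old ppoll syscall: the 64-bit tv_sec must fit the
       32-bit ABI, otherwise report an overflow.  */
    struct timespec ts, *tsp = NULL;
    if (timeout != NULL)
      {
        if (timeout->tv_sec != (time_t) timeout->tv_sec)
          {
            errno = EOVERFLOW;
            return -1;
          }
        ts.tv_sec = timeout->tv_sec;
        ts.tv_nsec = timeout->tv_nsec;
        tsp = &ts;
      }
    return syscall (__NR_ppoll, fds, nfds, tsp, sigmask, 8);
  }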
Build tests:
- The code has been tested on x86_64/x86 (native compilation):
make PARALLELMFLAGS="-j8" && make check PARALLELMFLAGS="-j8" && \\
make xcheck PARALLELMFLAGS="-j8"
- The glibc has been build tested (make PARALLELMFLAGS="-j8") for
x86 (i386), x86_64-x32, and armv7
Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
https://github.com/lmajewski/meta-y2038 and run tests:
https://github.com/lmajewski/y2038-tests/commits/master
- Use of cross-test-ssh.sh for ARM (armv7):
make PARALLELMFLAGS="-j8" test-wrapper='./cross-test-ssh.sh root@192.168.7.2' xcheck
Linux kernel, headers and minimal kernel version for glibc build test
matrix:
- Linux v5.1 (with ppoll_time64) and glibc build with v5.1 as
minimal kernel version (--enable-kernel="5.1.0")
The __ASSUME_TIME64_SYSCALLS flag defined.
- Linux v5.1 and default minimal kernel version
The __ASSUME_TIME64_SYSCALLS not defined, but kernel supports ppoll_time64
syscall.
- Linux v4.19 (no ppoll_time64 support) with default minimal kernel version for
contemporary glibc
This kernel doesn't support ppoll_time64 syscall, so the fallback to ppoll is
tested.
Above tests were performed with Y2038 redirection applied as well as without
(so the __TIMESIZE != 64 execution path is checked as well).
Andreas Schwab [Wed, 30 Oct 2019 09:38:36 +0000 (10:38 +0100)]
Fix array bounds violation in regex matcher (bug 25149)
If the regex has more subexpressions than the number of elements allocated
in the regmatch_t array passed to regexec then proceed_next_node may
access the regmatch_t array outside its bounds.
No test case was added because even without this bug it would then crash in
pop_fail_stack, which is bug 11053.
Alistair Francis [Fri, 21 Jun 2019 20:00:23 +0000 (13:00 -0700)]
sysdeps/clock_nanosleep: Use clock_nanosleep_time64 if available
The clock_nanosleep syscall is not supported on newer 32-bit platforms (such
as RV32). To fix this issue let's use clock_nanosleep_time64 if it is
available.
Florian Weimer [Thu, 7 Nov 2019 17:15:18 +0000 (18:15 +0100)]
login: Acquire write lock early in pututline [BZ #24882]
It has been reported that due to lack of fairness in POSIX file
locking, the current reader-to-writer lock upgrade can result in
lack of forward progress. Acquiring the write lock directly
hopefully avoids this issue if there are only writers.
This also fixes bug 24882 due to the cache revalidation in
__libc_pututline.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Change-Id: I57e31ae30719e609a53505a0924dda101d46372e
nptl: Add missing placeholder abi symbol from nanosleep move
Adds the __libpthread_version_placeholder symbol with the same version
of nanosleep/__nanosleep that was removed by 79a547b162657b3f and that
is not provided by other symbols.
Florian Weimer [Thu, 7 Nov 2019 08:53:41 +0000 (09:53 +0100)]
login: Remove double-assignment of fl.l_whence in try_file_lock
Since l_whence is the second member of struct flock, it is written
twice. The double-assignment is technically undefined behavior due to
the lack of a sequence point.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Change-Id: I2baf9e70690e723c61051b25ccbd510aec15976c
liqingqing [Thu, 7 Nov 2019 00:26:54 +0000 (00:26 +0000)]
math: enhance the endloop condition of function handle_input_flag
In the function handle_input_flag, the end-loop condition is not
correct, because when the loop variable i equals 16
(num_input_flag_types), then input_flags[16] will be out of bounds.
(This issue is only relevant with invalid input files to
gen-auto-libm-tests.)
The generic version is straightforward. For Hurd, its nanosleep
implementation is moved to clock_nanosleep with adjustments from
generic unix implementation.
The generic clock_nanosleep unix version is also removed since
it calls nanosleep.
Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.
Checked on x86_64-linux-gnu and powerpc64le-linux-gnu. I also checked
the libpthread.so .gnu.version_d entries for every ABI affected and
all of them contains the required versions (including for architectures
which exports __nanosleep with a different version).
Stefan Liebler [Wed, 6 Nov 2019 07:07:40 +0000 (08:07 +0100)]
S390: Fp comparisons are now raising FE_INVALID with gcc 10.
The s390 gcc bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918
"S390: Floating point comparisons don't raise invalid for unordered operands."
is fixed with gcc 10. Thus we conditionally set FIX_COMPARE_INVALID
to 0 or 1.
Arjun Shankar [Tue, 5 Nov 2019 15:41:25 +0000 (16:41 +0100)]
Fix run-one-test so that it runs elf tests
The `test' make target passes a trailing slash in the subdir argument. This
does not play well with elf/rtld-Rules which looks for `elf' without any
trailing slash, and therefore doesn't find a match when running an elf test
individually. This commit removes the trailing slash from the invocation.
Mike Crowe [Mon, 4 Nov 2019 19:36:21 +0000 (16:36 -0300)]
nptl: Fix niggles with pthread_clockjoin_np
Joseph Myers spotted[1] that 69ca4b54c151cec42ccca5e05790efc1a8206b47 added
pthread_clockjoin_np to sysdeps/nptl/pthread.h but not to its hppa-specific
equivalent sysdeps/unix/sysv/linux/hppa/pthread.h.
Rafal Luzynski spotted[2] typos in the NEWS entry and manual updates too.
Florian Weimer spotted[3] that the clockid parameter was not using a
reserved identifier in pthread.h.
hppa: Align __clone stack argument to 8 bytes (Bug 25066)
The hppa architecture requires strict alignment for loads and stores.
As a result, the minimum stack alignment that will work is 8 bytes.
This patch adjusts __clone() to align the stack argument passed to it.
It also slightly adjusts some formatting.
Lukasz Majewski [Thu, 24 Oct 2019 12:34:46 +0000 (14:34 +0200)]
y2038: linux: Provide __futimens64 implementation
This patch provides a new explicit 64-bit function, __futimens64, for
setting the access and modification time of a file (by using its file
descriptor).
Moreover, the 32-bit version - __futimens - has been refactored to internally
use __futimens64.
The __futimens is now supposed to be used on systems still supporting
32 bit time (__TIMESIZE != 64) - hence the necessary conversions to 64 bit
struct __timespec64.
When the pointer to struct __timespec64 is NULL, the file access and
modification times are set to the current time (by the kernel) and no
conversions from struct timespec to __timespec64 are performed.
The __futimens64 reuses __utimensat64_helper defined for __utimensat64.
The test procedure for __futimens64 is the same as for the __utimensat64
conversion patch.
Lukasz Majewski [Thu, 24 Oct 2019 10:12:16 +0000 (12:12 +0200)]
y2038: linux: Provide __utimensat64 implementation
This patch provides a new explicit 64-bit function, __utimensat64, for
setting the access and modification time of a file. Moreover, the 32-bit
version - __utimensat - has been refactored to internally use __utimensat64.
The __utimensat is now supposed to be used on systems still supporting
32 bit time (__TIMESIZE != 64) - hence the necessary conversions to 64 bit
struct __timespec64.
When the pointer to struct __timespec64 is NULL, the file access and
modification times are set to the current time and no conversions from
struct timespec to __timespec64 are performed.
The new utimensat_time64 syscall available from Linux 5.1+ has been used,
when applicable.
The new helper function - __utimensat64_helper - has been introduced to
facilitate code reuse in the function providing futimens syscall handling.
The Linux kernel checks whether the passed tv_nsec value overflows, so there
is no need to repeat the check in glibc.
When the utimensat syscall is used on systems supporting the 32-bit time ABI,
a check is performed that the passed data (which may have a 64-bit tv_sec)
fits into the 32-bit range.
Build tests:
- The code has been tested on x86_64/x86 (native compilation):
make PARALLELMFLAGS="-j8" && make xcheck PARALLELMFLAGS="-j8"
- The glibc has been build tested (make PARALLELMFLAGS="-j8") for
x86 (i386), x86_64-x32, and armv7
Run-time tests:
- Run specific tests on ARM/x86 32bit systems (qemu):
https://github.com/lmajewski/meta-y2038 and run tests:
https://github.com/lmajewski/y2038-tests/commits/master
- Use of cross-test-ssh.sh for ARM (armv7):
make PARALLELMFLAGS="-j8" test-wrapper='./cross-test-ssh.sh root@192.168.7.2' xcheck
Linux kernel, headers and minimal kernel version for glibc build test
matrix:
- Linux v5.1 (with utimensat_time64) and glibc build with v5.1 as
minimal kernel version (--enable-kernel="5.1.0")
The __ASSUME_TIME64_SYSCALLS flag defined.
- Linux v5.1 and default minimal kernel version
The __ASSUME_TIME64_SYSCALLS not defined, but kernel supports utimensat_time64
syscall.
- Linux v4.19 (no utimensat_time64 support) with default minimal kernel
version for contemporary glibc
This kernel doesn't support utimensat_time64 syscall, so the fallback
to utimensat is tested.
The above tests were performed with Y2038 redirection applied as well as
without (so the __TIMESIZE != 64 execution path is checked as well).
Mike Crowe [Thu, 31 Oct 2019 13:05:17 +0000 (10:05 -0300)]
nptl: Add pthread_timedjoin_np, pthread_clockjoin_np NULL timeout test
Passing NULL as the timeout parameter to pthread_timedjoin_np has resulted
in it behaving like pthread_join for a long time. Since that is now the
documented behaviour, we ought to test that both it and the new
pthread_clockjoin_np support it.
Mike Crowe [Thu, 31 Oct 2019 13:03:21 +0000 (10:03 -0300)]
nptl: Add pthread_clockjoin_np
Introduce pthread_clockjoin_np as a version of pthread_timedjoin_np that
accepts a clockid_t parameter to indicate which clock the timeout should be
measured against. This mirrors the recently-added POSIX-proposed "clock"
wait functions.
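Usage is analogous to pthread_timedjoin_np, with the extra clock argument;
for example (a small sketch, assuming the declared glibc signature):

  #define _GNU_SOURCE
  #include <pthread.h>
  #include <time.h>

  /* Wait for THR for up to five seconds, measured against CLOCK_MONOTONIC so
     wall-clock adjustments do not affect the timeout.  Returns 0 on success
     or ETIMEDOUT if the thread is still running.  */
  static int
  join_with_monotonic_timeout (pthread_t thr, void **result)
  {
    struct timespec abstime;
    clock_gettime (CLOCK_MONOTONIC, &abstime);
    abstime.tv_sec += 5;
    return pthread_clockjoin_np (thr, result, CLOCK_MONOTONIC, &abstime);
  }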