sourceware.org Git - glibc.git/log

elf: Remove THREAD_GSCOPE_IN_TCB

All the ports now have THREAD_GSCOPE_IN_TCB set to 1. Remove all
support for !THREAD_GSCOPE_IN_TCB, along with the definition itself.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20210915171110.226187-4-bugaevc@gmail.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

htl: Reimplement GSCOPE

This is a new implementation of GSCOPE which largely mirrors its NPTL
counterpart. Same as in NPTL, instead of a global flag shared between
threads, there is now a per-thread GSCOPE flag stored in each thread's
TCB. This makes entering and exiting a GSCOPE faster at the expense of
making THREAD_GSCOPE_WAIT () slower.

The largest win is the elimination of many redundant gsync_wake () RPC
calls; previously, even simplest programs would make dozens of fully
redundant gsync_wake () calls.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20210915171110.226187-3-bugaevc@gmail.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

htl: Move thread table to ld.so

The next commit is going to introduce a new implementation of
THREAD_GSCOPE_WAIT which needs to access the list of threads.
Since it must be usable from the dynamic laoder, we have to move
the symbols for the list of threads into the loader.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20210915171110.226187-2-bugaevc@gmail.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

Redirect fma calls to __fma in libm

include/math.h has a mechanism to redirect internal calls to various
libm functions, that can often be inlined by the compiler, to call
non-exported __* names for those functions in the case when the calls
aren't inlined, with the redirection being disabled when
NO_MATH_REDIRECT.  Add fma to the functions to which this mechanism is
applied.

At present, libm-internal fma calls (generally to __builtin_fma*
functions) are only done when it's known the call will be inlined,
with alternative code not relying on an fma operation being used in
the caller otherwise.  This patch is in preparation for adding the TS
18661 / C2X narrowing fma functions to glibc; it will be natural for
the narrowing function implementations to call the underlying fma
functions unconditionally, with this either being inlined or resulting
in an __fma* call.  (Using two levels of round-to-odd computation like
that, in the case where there isn't an fma hardware instruction, isn't
optimal but is certainly a lot simpler for the initial implementation
than writing different narrowing fma implementations for all the
various pairs of formats.)

Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by the patch (using
<https://sourceware.org/pipermail/libc-alpha/2021-September/130991.html>
to fix installed library stripping in build-many-glibcs.py).  Also
tested for x86_64.

time: Fix compile error in itimer test affecting hurd

The recent change to use __KERNEL_OLD_TIMEVAL_MATCHES_TIMEVAL64 to avoid
doing 64-bit checks on some platforms broke the test for hurd where
__KERNEL_OLD_TIMEVAL_MATCHES_TIMEVAL64 is not defined.  With error:

    tst-itimer.c: In function 'do_test':
    tst-itimer.c:103:11: error: '__KERNEL_OLD_TIMEVAL_MATCHES_TIMEVAL64' undeclared (first use in this function)
      103 |       if (__KERNEL_OLD_TIMEVAL_MATCHES_TIMEVAL64)
  |           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    tst-itimer.c:103:11: note: each undeclared identifier is reported only once for each function it appears in

Define a support helper to detect when setitimer and getitimer support
64-bit time_t.

Fixes commit 6e8a0aac2f ("time: Fix overflow itimer tests on 32-bit
systems").

Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: Joseph Myers <joseph@codesourcery.com>

mach lll_lock/unlock: Explicitly request private locking

0 was actually LLL_PRIVATE, so this does not actually change the code.

elf: Replace most uses of THREAD_GSCOPE_IN_TCB

While originally this definition was indeed used to distinguish between
the cases where the GSCOPE flag was stored in TCB or not, it has since
become used as a general way to distinguish between HTL and NPTL.

THREAD_GSCOPE_IN_TCB will be removed in the following commits, as HTL,
which currently is the only port that does not put the flag into TCB,
will get ported to put the GSCOPE flag into the TCB as well. To prepare
for that change, migrate all code that wants to distinguish between HTL
and NPTL to use PTHREAD_IN_LIBC instead, which is a better choice since
the distinction mostly has to do with whether libc has access to the
list of thread structures and therefore can initialize thread-local
storage.

The parts of code that actually depend on whether the GSCOPE flag is in
TCB are left unchanged.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20210907133325.255690-2-bugaevc@gmail.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

Add MADV_POPULATE_READ and MADV_POPULATE_WRITE from Linux 5.14 to bits/mman-linux.h

Linux 5.14 adds constants MADV_POPULATE_READ and MADV_POPULATE_WRITE
(with the same values on all architectures). Add these to glibc's
bits/mman-linux.h.

Tested for x86_64.

Update kernel version to 5.14 in tst-mman-consts.py

This patch updates the kernel version in the test tst-mman-consts.py
to 5.14. (There are no new MAP_* constants covered by this test in
5.14 that need any other header changes.)

Tested with build-many-glibcs.py.

configure: Fix check for INSERT in linker script

GCC/Clang use local access when referencing a const variable,
so the conftest.so may have no dynamic relocation.
LLD reports `error: unable to insert .foo after .rela.dyn` when the
destination section does not exist.

Use a non-const int to ensure that .rela.dyn exists.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

iconvconfig: Fix behaviour with --prefix [BZ #28199]

The consolidation of configuration parsing broke behaviour with
--prefix, where the prefix bled into the modules cache. Accept a
prefix which, when non-NULL, is prepended to the path when looking for
configuration files but only the original directory is added to the
modules cache.

This has no effect on the codegen of gconv_conf since it passes NULL.

Reported-by: Patrick McCarty <patrick.mccarty@intel.com>
Reported-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Andreas Schwab <schwab@linux-m68k.org>

nptl: Fix race between pthread_kill and thread exit (bug 12889)

A new thread exit lock and flag are introduced. They are used to
detect that the thread is about to exit or has exited in
__pthread_kill_internal, and the signal is not sent in this case.

The test sysdeps/pthread/tst-pthread_cancel-select-loop.c is derived
from a downstream test originally written by Marek Polacek.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

nptl: pthread_kill, pthread_cancel should not fail after exit (bug 19193)

This closes one remaining race condition related to bug 12889: if
the thread already exited on the kernel side, returning ESRCH
is not correct because that error is reserved for the thread IDs
(pthread_t values) whose lifetime has ended. In case of a
kernel-side exit and a valid thread ID, no signal needs to be sent
and cancellation does not have an effect, so just return 0.

sysdeps/pthread/tst-kill4.c triggers undefined behavior and is
removed with this commit.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

benchtests: Remove redundant assert.h

This patch removed redundant "#include <assert.h>" from
bench-memset-large.c and bench-memset-walk.c.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

benchtests: Enable scripts/plot_strings.py to read stdin

This patch enables scripts/plot_strings.py to read a benchmark result
file from stdin.
To keep backward compatibility, that is to keep accepting multiple of
benchmark result files in argument, blank argument doesn't mean stdin,
but '-' does.
Therefore nargs parameter of ArgumentParser.add_argument() method is
not changed to '?', but keep '+'.

ex:
  $ jq '.' bench-memset.out | plot_strings.py -
  $ jq '.' bench-memset.out | plot_strings.py - bench-memset-large.out
  $ plot_strings.py bench-memset.out bench-memset-large.out

error ex:
  $ jq '.' bench-memset.out | plot_strings.py

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

Add narrowing square root functions

This patch adds the narrowing square root functions from TS 18661-1 /
TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64,
f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x,
f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128,
f64xsqrtf128 for configurations with _Float64x and _Float128;
__f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case
(for calls to fsqrtl and dsqrtl when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.

The changes are mostly similar to those for the other narrowing
functions previously added, so the description of those generally
applies to this patch as well.  However, the not-actually-narrowing
cases (where the two types involved in the function have the same
floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather
than needing a separately built not-actually-narrowing function such
as was needed for add / sub / mul / div.  Thus, there is no
__nldbl_dsqrtl name for ldbl-opt because no such name was needed
(whereas the other functions needed such a name since the only other
name for that entry point was e.g. f32xaddf64, not reserved by TS
18661-1); the headers are made to arrange for sqrt to be called in
that case instead.

The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because
they were observed to be needed in GCC 7 testing of
riscv32-linux-gnu-rv32imac-ilp32.  The other sysdeps/ieee754/soft-fp/
files added didn't need such DIAG_* in any configuration I tested with
build-many-glibcs.py, but if they do turn out to be needed in more
files with some other configuration / GCC version, they can always be
added there.

I reused the same test inputs in auto-libm-test-in as for
non-narrowing sqrt rather than adding extra or separate inputs for
narrowing sqrt.  The tests in libm-test-narrow-sqrt.inc also follow
those for non-narrowing sqrt.

Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float).  The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).

_Static_assert needs two arguments for compatibility with GCC before 9

This macro definition enforces two arguments even with newer compilers
that accept the single-argument form, too.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

testrun.sh: Add support for --tool=rpctrace

rpctrace(1) is a Hurd RPC tracer tool, which is used similar to how
strace(1) is used on GNU/Linux.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-Id: <20210907133325.255690-6-bugaevc@gmail.com>
Acked-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

Update syscall lists for Linux 5.14

Linux 5.14 has two new syscalls, memfd_secret (on some architectures
only) and quotactl_fd. Update syscall-names.list and regenerate the
arch-syscall.h headers with build-many-glibcs.py update-syscalls.

Tested with build-many-glibcs.py.

Fix failing nss/tst-nss-files-hosts-long with local resolver

When a local resolver like unbound is listening on the IPv4 loopback
address 127.0.0.1, the nss/tst-nss-files-hosts-long test fails. This is
due to:
- the default resolver in the absence of resolv.conf being 127.0.0.1
- the default DNS NSS database configuration in the absence of
nsswitch.conf being 'hosts: dns [!UNAVAIL=return] file'

This causes the requests for 'test4' and 'test6' to first be sent to the
local resolver, which responds with NXDOMAIN in the likely case those
records do no exist. In turn that causes the access to /etc/hosts to be
skipped, which is the purpose of that test.

Fix that by providing a simple nsswitch.conf file forcing access to
/etc/hosts for that test. I have tested that the only changed result in
the testsuite is that test.

MIPS: Setup errno for {f,l,}xstat

{f,l,}xstat stub for MIPS is using INTERNAL_SYSCALL
to do xstat syscall for glibc ver, However it leaves
errno untouched and thus giving bad errno output.

Setup errno properly when syscall returns non-zero.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Use Linux 5.14 in build-many-glibcs.py

This patch makes build-many-glibcs.py use Linux 5.14.

Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).

locale: Add missing second argument to _Static_assert in C-collate-seq.c

Update hppa libm-test-ulps

Add generic C.UTF-8 locale (Bug 17318)

We add a new C.UTF-8 locale. This locale is not builtin to glibc, but
is provided as a distinct locale. The locale provides full support for
UTF-8 and this includes full code point sorting via STRCMP-based
collation (strcmp or wcscmp).

The collation uses a new keyword 'codepoint_collation' which drops all
collation rules and generates an empty zero rules collation to enable
STRCMP usage in collation. This ensures that we get full code point
sorting for C.UTF-8 with a minimal 1406 bytes of overhead (LC_COLLATE
structure information and ASCII collating tables).

The new locale is added to SUPPORTED. Minimal test data for specific
code points (minus those not supported by collate-test) is provided in
C.UTF-8.in, and this verifies code point sorting is working reasonably
across the range. The locale was tested manually with the full set of
code points without failure.

The locale is harmonized with locales already shipping in various
downstream distributions. A new tst-iconv9 test is added which verifies
the C.UTF-8 locale is generally usable.

Testing for fnmatch, regexec, and recomp is provided by extending
bug-regex1, bugregex19, bug-regex4, bug-regex6, transbug, tst-fnmatch,
tst-regcomp-truncated, and tst-regex to use C.UTF-8.

Tested on x86_64 or i686 without regression.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Add 'codepoint_collation' support for LC_COLLATE.

Support a new directive 'codepoint_collation' in the LC_COLLATE
section of a locale source file. This new directive causes all
collation rules to be dropped and instead STRCMP (strcmp or
wcscmp) is used for collation of the input character set. This
is required to allow for a C.UTF-8 that contains zero collation
rules (minimal size) and sorts using code point sorting.

To date the only implementation of a locale with zero collation
rules is the C/POSIX locale. The C/POSIX locale provides
identity tables for _NL_COLLATE_COLLSEQMB and
_NL_COLLATE_COLLSEQWC that map to ASCII even though it has zero
rules. This has lead to existing fnmatch, regexec, and regcomp
implementations that require these tables. It is not correct
to use these tables when nrules == 0, but the conservative fix
is to provide these tables when nrules == 0. This assures that
existing static applications using a new C.UTF-8 locale with
'codepoint_collation' at least have functional range expressions
with ASCII e.g. [0-9] or [a-z]. Such static applications would
not have the fixes to fnmatch, regexec and regcomp that avoid
the use of the tables when nrules == 0. Future fixes to fnmatch,
regexec, and regcomp would allow range expressions to use the
full set of code points for such ranges.

Tested on x86_64 and i686 without regression.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

AArch64: Update A64FX memset not to degrade at 16KB

This patch updates unroll8 code so as not to degrade at the peak
performance 16KB for both FX1000 and FX700.

Inserted 2 instructions at the beginning of the unroll8 loop,
cmp and branch, are a workaround that is found heuristically.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

Revert "AArch64: Update A64FX memset not to degrade at 16KB"

Because of wrong commit author. Will recommit it with right author.

This reverts commit 23777232c23f80809613bdfa329f63aadf992922.

Remove "Contributed by" lines

We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date.  Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.

Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions.  These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.

The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively.  These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:

https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc
https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Port shared code information from the wiki

Since the shared code now has special status with respect to
copyrights, port them into a more structured format in the source tree
and add a python function that parses and returns a dictionary with
the information.

I need this to exclude these files from the Contributed-by changes and
I reckon it would be useful to know these files for future tooling.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

AArch64: Update A64FX memset not to degrade at 16KB

This patch updates unroll8 code so as not to degrade at the peak
performance 16KB for both FX1000 and FX700.

Inserted 2 instructions at the beginning of the unroll8 loop,
cmp and branch, are a workaround that is found heuristically.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>

posix: remove some iso-8859-encoded characters

With the increasing adoption of UTF-8, modern editors may (will?)
replace iso-8859-encoded characters in the range 0x80..0xff with
their UTF-8 equivalent, as will mailers and other tools. This breaks
our testsuite and corrupts patches.

So, this patch starts replacing these problematic characters with
\OCTal sequences instead (adding support for those in tst-fnmatch.c)
or with plain ASCII characters (PTESTS).

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

configure: Allow LD to be LLD 13.0.0 or above [BZ #26558]

When using LLD (LLVM linker) as the linker, configure prints a confusing
message.

*** These critical programs are missing or too old: GNU ld

LLD>=13.0.0 can build glibc --enable-static-pie. (8.0.0 needs one
workaround for -Wl,-defsym=_begin=0. 9.0.0 works with
--disable-static-pie).

XFAIL two tests sysdeps/x86/tst-ifunc-isa-* which have the BZ #28154
issue (LLD follows the PowerPC port of GNU ld for ifunc by placing
IRELATIVE relocations in .rela.dyn, triggering a glibc ifunc fragility).

The set of dynamic symbols is the same with GNU ld and LLD,
modulo unused SHN_ABS version node symbols.

For comparison, gold does not support --enable-static-pie
yet (--no-dynamic-linker is unsupported BZ #22221), yet
has 6 failures more than LLD. gold linked libc.so has
larger .dynsym differences with GNU ld and LLD
(non-default version symbols are changed to default versions
by a version script BZ #28196).

hurd msync: Drop bogus test

MS_SYNC is actually 0, so we cannot test that both MS_SYNC and MS_ASYNC
are set.

hurd: Fix typo in msync

== has higher priority than &

x86-64: Use testl to check __x86_string_control

Use testl, instead of andl, to check __x86_string_control to avoid
updating __x86_string_control.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

i686: Don't include multiarch memove in libc.a

On i686, there is no multiarch memove in libc.a, don't include multiarch
memove in ifunc-impl-list.c in libc.a.

support: Add support_wait_for_thread_exit

Allow #pragma GCC in headers in conformtest

No "#pragma GCC" pragma allows macro-expansion of its arguments, so no
namespace issues arise from use of such pragmas in installed headers.
Ignore them in conformtest tests of header namespace.

Tested for x86_64, in conjunction with Paul's patch
<https://sourceware.org/pipermail/libc-alpha/2021-August/130571.html>
adding use of such pragmas to installed headers shared with gnulib.

nptl: Fix tst-cancel7 and tst-cancelx7 race condition (BZ #14232)

A mapped temporary file and a semaphore is used to synchronize the
pid information on the created file, the semaphore is updated once
the file contents is flushed.

Checked on x86_64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Use support_open_dev_null_range io/tst-closefrom, misc/tst-close_range, and posix/tst-spawn5 (BZ #28260)

It ensures a continuous range of file descriptor and avoid hitting
the RLIMIT_NOFILE.

Checked on x86_64-linux-gnu.

support: Add support_open_dev_null_range

It returns a range of file descriptor referring to the '/dev/null'
pathname. The function takes care of restarting the open range
if a file descriptor is found within the specified range and
also increases RLIMIT_NOFILE if required.

Checked on x86_64-linux-gnu.

llio.texi: Wording fixes in description of closefrom()

Fix two problems.

Rather than "larger than", better English is "greater than".

Then there is a wordinig error on the following line: "then lowfd"
appears to be cruft.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Fix error message in memmove test to display correct src pointer

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

elf: Skip tst-auditlogmod-* if the linker doesn't support --depaudit [BZ #28151]

gold and ld.lld do not support --audit or --depaudit.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

powerpc: Use --no-tls-get-addr-optimize in test only if the linker supports it

LLD doesn't support --{,no-}tls-get-addr-optimize.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

x86-64: Remove assembler AVX512DQ check

The minimum GNU binutils requirement is 2.25 which supports AVX512DQ.
Remove assembler AVX512DQ check.

x86-64: Remove compiler -mavx512f check

The minimum GCC requirement is GCC 6.2 which supports -mavx512f. Remove
compiler -mavx512f check. Tested with GCC 6.4.1 on Linux/x86-64.

Use __executable_start as the lowest address for profiling [BZ #28153]

Glibc assumes that ENTRY_POINT is the lowest address for which we need
to keep profiling records and BFD linker uses a linker script to place
the input sections.

Starting from GCC 4.6, the main function is placed in .text.startup
section and starting from binutils 2.22, BFD linker with

commit add44f8d5c5c05e08b11e033127a744d61c26aee
Author: Alan Modra <amodra@gmail.com>
Date:   Thu Nov 25 03:03:02 2010 +0000

            * scripttempl/elf.sc: Group .text.exit, text.startup and .text.hot
            sections.

places .text.startup section before .text section, which leave the main
function out of profiling records.

Starting from binutils 2.15, linker provides __executable_start to mark
the lowest address of the executable.  Use __executable_start as the
lowest address to keep the main function in profiling records. This fixes
[BZ #28153].

Tested on Linux/x86-64, Linux/x32 and Linux/i686 as well as with
build-many-glibcs.py.

hurd: Fix errlist error mapping

On the Hurd, the errno values don't start at 0, so _sys_errlist_internal
needs index remapping. The _sys_errlist_internal definition already properly
uses ERR_MAP, but __get_errlist and __get_errname were not.

hurd: Remove old test-err_np.c file

This is not referenced any more and includes a non-existing file.

Fix iconv build with GCC mainline

Current GCC mainline produces -Wstringop-overflow errors building some
iconv converters, as discussed at
<https://gcc.gnu.org/pipermail/gcc/2021-July/236943.html>. Add an
__builtin_unreachable call as suggested so that GCC can see the case
that would involve a buffer overflow is unreachable; because the
unreachability depends on valid conversion state being passed into the
function from previous conversion steps, it's not something the
compiler can reasonably deduce on its own.

Tested with build-many-glibcs.py that, together with
<https://sourceware.org/pipermail/libc-alpha/2021-August/130244.html>,
it restores the glibc build for powerpc-linux-gnu.

rtld: copy terminating null in tunables_strdup (bug 28256)

Avoid triggering a false positive from valgrind by copying the terminating
null in tunables_strdup. At this point the heap is still clean, but
valgrind is stricter here.

mtrace: Fix output with PIE and ASLR [BZ #22716]

Record only the relative address of the caller in mtrace file. Use
LD_TRACE_PRELINKING to get the executable as well as binary vs
executable load offsets so that we may compute a base to add to the
relative address in the mtrace file. This allows us to get a valid
address to pass to addr2line in all cases.

Fixes BZ #22716.

Co-authored-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Andreas Schwab <schwab@linux-m68k.org>
Reviewed-by: DJ Delorie <dj@redhat.com>

x86-64: Optimize load of all bits set into ZMM register [BZ #28252]

Optimize loads of all bits set into ZMM register in AVX512 SVML codes
by replacing

vpbroadcastq .L_2il0floatpacket.16(%rip), %zmmX

and

vmovups .L_2il0floatpacket.13(%rip), %zmmX

with
vpternlogd $0xff, %zmmX, %zmmX, %zmmX

This fixes BZ #28252.

Update string/test-memmove.c to cover 16KB copy

elf: Fix missing colon in LD_SHOW_AUXV output [BZ #28253]

This commit adds a missing colon in the AT_MINSIGSTKSZ entry in
the _dl_show_auxv function.

x86: fix Autoconf caching of instruction support checks [BZ #27991]

The Autoconf documentation for the AC_CACHE_CHECK macro states:

The commands-to-set-it must have no side effects except for setting
the variable cache-id, see below.

However, the tests for support of -msahf and -mmovbe were embedded in
the commands-to-set-it for lib_cv_include_x86_isa_level. This had the
consequence that libc_cv_have_x86_lahf_sahf and libc_cv_have_x86_movbe
were not defined whenever lib_cv_include_x86_isa_level was read from
cache. These variables' being undefined meant that their unquoted use
in later test expressions led to the 'test' built-in's misparsing its
arguments and emitting errors like "test: =: unexpected operator" or
"test: =: unary operator expected", depending on the particular shell.

This commit refactors the tests for LAHF/SAHF and MOVBE instruction
support into their own AC_CACHE_CHECK macro invocations to obey the
rule that the commands-to-set-it must have no side effects other than
setting the variable named by cache-id.

Signed-off-by: Matt Whitlock <sourceware@mattwhitlock.name>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

arm: Simplify elf_machine_{load_address,dynamic}

and drop reliance on _GLOBAL_OFFSET_TABLE_[0] being the link-time
address of _DYNAMIC. &__ehdr_start is a better way to get the load address.

This is similar to commits b37b75d269883a2c553bb7019a813094eb4e2dd1
(x86-64) and 43d06ed218fc8be58987bdfd00e21e5720f0b862 (aarch64).

Reviewed-by: Joseph Myers <joseph@codesourcery.com>

riscv: Drop reliance on _GLOBAL_OFFSET_TABLE_[0]

&__ehdr_start is a better way to get the load address.

This is similar to commits b37b75d269883a2c553bb7019a813094eb4e2dd1
(x86-64) and 43d06ed218fc8be58987bdfd00e21e5720f0b862 (aarch64).

Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>

Remove sysdeps/*/tls-macros.h

They provide TLS_GD/TLS_LD/TLS_IE/TLS_IE macros for TLS testing. Now
that we have migrated to __thread and tls_model attributes, these macros
are unused and the tls-macros.h files can retire.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

x86_64: Simplify elf_machine_{load_address,dynamic}

and drop reliance on _GLOBAL_OFFSET_TABLE_[0] being the link-time
address of _DYNAMIC. &__ehdr_start is a better way to get the load address.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

elf: Drop elf/tls-macros.h in favor of __thread and tls_model attributes [BZ #28152] [BZ #28205]

elf/tls-macros.h was added for TLS testing when GCC did not support
__thread. __thread and tls_model attributes are mature now and have been
used by many newer tests.

Also delete tst-tls2.c which tests .tls_common (unused by modern GCC and
unsupported by Clang/LLD). .tls_common and .tbss definition are almost
identical after linking, so the runtime test doesn't add additional
coverage. Assembler and linker tests should be on the binutils side.

When LLD 13.0.0 is allowed in configure.ac
(https://sourceware.org/pipermail/libc-alpha/2021-August/129866.html),
`make check` result is on par with glibc built with GNU ld on aarch64
and x86_64.

As a future clean-up, TLS_GD/TLS_LD/TLS_IE/TLS_IE macros can be removed from
sysdeps/*/tls-macros.h. We can add optional -mtls-dialect={gnu2,trad}
tests to ensure coverage.

Tested on aarch64-linux-gnu, powerpc64le-linux-gnu, and x86_64-linux-gnu.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

hurd: Drop fmh kludge

Gnumach's 0650a4ee30e3 implements support for high bits being set in the
mask parameter of vm_map. This allows to remove the fmh kludge that was
masking away the address range by mapping a dumb area there.

time: Fix overflow itimer tests on 32-bit systems

On the port of OpenRISC I am working on and it appears the rv32 port
we have sets __TIMESIZE == 64 && __WORDSIZE == 32. This causes the
size of time_t to be 8 bytes, but the tv_sec in the kernel is still 32-bit
causing truncation.

The truncations are unavoidable on these systems so skip the
testing/failures by guarding with __KERNEL_OLD_TIMEVAL_MATCHES_TIMEVAL64.

Also, futher in the tests and in other parts of code checking for time_t
overflow does not work on 32-bit systems when time_t is 64-bit. As
suggested by Adhemerval, update the in_time_t_range function to assume
32-bits by using int32_t.

This also brings in the header for stdint.h so we can update other
usages of __int32_t to int32_t as suggested by Adhemerval.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

mips: increase stack alignment in clone to match the ABI

In "mips: align stack in clone [BZ #28223]"
(commit 1f51cd9a860ee45eee8a56fb2ba925267a2a7bfe) I made a mistake: I
misbelieved one "word" was 2-byte and "doubleword" should be 4-byte.
But in MIPS ABI one "word" is defined 32-bit (4-byte), so "doubleword" is
8-byte [1], and "quadword" is 16-byte [2].

[1]: "System V Application Binary Interface: MIPS(R) RISC Processor
Supplement, 3rd edition", page 3-31
[2]: "MIPSpro(TM) 64-Bit Porting and Transition Guide", page 23

mips: align stack in clone [BZ #28223]

The MIPS O32 ABI requires 4 byte aligned stack, and the MIPS N64 and N32
ABI require 8 byte aligned stack. Previously if the caller passed an
unaligned stack to clone the the child misbehaved.

Fixes bug 28223.

librt: add test (bug 28213)

This test implements following logic:
1) Create POSIX message queue.
   Register a notification with mq_notify (using NULL attributes).
   Then immediately unregister the notification with mq_notify.
   Helper thread in a vulnerable version of glibc
   should cause NULL pointer dereference after these steps.
2) Once again, register the same notification.
   Try to send a dummy message.
   Test is considered successfulif the dummy message
   is successfully received by the callback function.

Signed-off-by: Nikita Popov <npv1310@gmail.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

mtrace: Use a static buffer for printing [BZ #25947]

Use a static buffer for mtrace printing now that it no longer adds to
default libc footprint.

Reviewed-by: DJ Delorie <dj@redhat.com>

hurd mmap: Reduce the requested max vmprot

When the memory object is read-only, the kernel would be right in
refusing max vmprot containing VM_PROT_WRITE.

Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

hurd mmap: Factorize MAP_SHARED flag check

Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

aarch64: Make elf_machine_{load_address,dynamic} robust [BZ #28203]

The AArch64 ABI is largely platform agnostic and does not specify
_GLOBAL_OFFSET_TABLE_[0] ([1]). glibc ld.so turns out to be probably the
only user of _GLOBAL_OFFSET_TABLE_[0] and GNU ld defines the value
to the link-time address _DYNAMIC. [2]

In 2012, __ehdr_start was implemented in GNU ld and gold in binutils
2.23. Using adrp+add / (-mcmodel=tiny) adr to access
__ehdr_start/_DYNAMIC gives us a robust way to get the load address and
the link-time address of _DYNAMIC.

[1]: From a psABI maintainer, https://bugs.llvm.org/show_bug.cgi?id=49672#c2
[2]: LLD's aarch64 port does not set _GLOBAL_OFFSET_TABLE_[0] to the
link-time address _DYNAMIC.
LLD is widely used on aarch64 Android and ChromeOS devices. Software
just works without the need for _GLOBAL_OFFSET_TABLE_[0].

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

elf: Unconditionally use __ehdr_start

We can consider __ehdr_start (from binutils 2.23 onwards)
unconditionally supported, since configure.ac requires binutils>=2.25.

The configure.ac check is related to an ia64 bug fixed by binutils 2.24.
See https://sourceware.org/pipermail/libc-alpha/2014-August/053503.html

Tested on x86_64-linux-gnu. Tested build-many-glibcs.py with
aarch64-linux-gnu and s390x-linux-gnu.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

hurd: Add support for AT_NO_AUTOMOUNT

[5/5] AArch64: Improve A64FX memset medium loops

Simplify the code for memsets smaller than L1. Improve the unroll8 and
L1_prefetch loops.

Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>

[4/5] AArch64: Improve A64FX memset by removing unroll32

Remove unroll32 code since it doesn't improve performance.

Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>

[3/5] AArch64: Improve A64FX memset for remaining bytes

Simplify handling of remaining bytes. Avoid lots of taken branches and complex
whilelo computations, instead unconditionally write vectors from the end.

Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>

[2/5] AArch64: Improve A64FX memset for large sizes

Improve performance of large memsets. Simplify alignment code. For zero memset
use DC ZVA, which almost doubles performance. For non-zero memsets use the
unroll8 loop which is about 10% faster.

Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>

[1/5] AArch64: Improve A64FX memset for small sizes

Improve performance of small memsets by reducing instruction counts and
improving code alignment. Bench-memset shows 35-45% performance gain for
small sizes.

Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>

Use binutils 2.37 branch in build-many-glibcs.py

This patch makes build-many-glibcs.py use binutils 2.37 branch.

Tested with build-many-glibcs.py (compilers and glibcs builds).

Add PTRACE_GET_RSEQ_CONFIGURATION from Linux 5.13 to sys/ptrace.h

Linux 5.13 adds a PTRACE_GET_RSEQ_CONFIGURATION constant, with an
associated ptrace_rseq_configuration structure.

Add this constant to the various sys/ptrace.h headers in glibc, with
the structure in bits/ptrace-shared.h (named struct
__ptrace_rseq_configuration in glibc, as with other such structures).

Tested for x86_64, and with build-many-glibcs.py.

librt: fix NULL pointer dereference (bug 28213)

Helper thread frees copied attribute on NOTIFY_REMOVED message
received from the OS kernel.  Unfortunately, it fails to check whether
copied attribute actually exists (data.attr != NULL).  This worked
earlier because free() checks passed pointer before actually
attempting to release corresponding memory.  But
__pthread_attr_destroy assumes pointer is not NULL.

So passing NULL pointer to __pthread_attr_destroy will result in
segmentation fault.  This scenario is possible if
notification->sigev_notify_attributes == NULL (which means default
thread attributes should be used).

Signed-off-by: Nikita Popov <npv1310@gmail.com>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

powerpc64: Add checks for Altivec and VSX in ifunc selection

We'd like to support processors without Altivec or VSX, so check
the relevant hwcap bits before selecting them.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

powerpc64: Check cacheline size before using optimised memset routines

A number of optimised memset routines assume the cacheline size is 128B,
so we better check before using them.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

powerpc64: Replace some PPC_FEATURE_HAS_VSX with PPC_FEATURE_ARCH_2_06

We use PPC_FEATURE_HAS_VSX to select a number of POWER7 optimised
functions. These functions don't use any VSX instructions, so
PPC_FEATURE_ARCH_2_06 seems like a better fit.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

Linux: Fix fcntl, ioctl, prctl redirects for _TIME_BITS=64 (bug 28182)

__REDIRECT and __THROW are not compatible with C++ due to the ordering of the
__asm__ alias and the throw specifier. __REDIRECT_NTH has to be used
instead.

Fixes commit 8a40aff86ba5f64a3a84883e539cb67b ("io: Add time64 alias
for fcntl"), commit 82c395d91ea4f69120d453aeec398e30 ("misc: Add
time64 alias for ioctl"), commit b39ffab860cd743a82c91946619f1b8158
("Linux: Add time64 alias for prctl").

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Add INADDR_DUMMY from Linux 5.13 to netinet/in.h

Linux 5.13 adds an INADDR_DUMMY definition; add a corresponding
definition to glibc's netinet/in.h. (This isn't strictly a new kernel
interface, rather a value defined in RFC 7600.)

Tested for x86_64.

tst-mxfast: Don't run with mcheck

The test may not show predictable behaviour with -lmcheck since the
padding won't always guarantee fastbin usage.

rt: Set the correct message queue for tst-mqueue10

Checked on x86_64-linux-gnu.

Update sparc libm-test-ulps

linux: Add sparck brk implementation

It turned that the generic implementation of brk() does not work
for sparc, since on failure kernel will just return the previous
input value without setting the conditional register.

This patches adds back a sparc32 and sparc64 implementation removed
by 720480934ab9107.

Checked on sparc64-linux-gnu and sparcv9-linux-gnu.

test-dlclose-exit-race: avoid hang on pthread_create error

This test depends on the "last" function being called in a
different thread than the "first" function, as "last" posts
a semaphore that "first" is waiting on. However, if pthread_create
fails - for example, if running in an older container before
the clone3()-in-container-EPERM fixes - exit() is called in the
same thread as everything else, the semaphore never gets posted,
and first hangs.

The fix is to pre-post that semaphore before a single-threaded
exit.

Reviewed-by: Arjun Shankar <arjun@redhat.com>

gethosts: Remove unused argument _type

The generated code is unchanged.

hurd: Avoid spurious warning

Compilers missing some flow analysis may think ss may be used
uninitialized.

gaiconf_init: Avoid double-free in label and precedence lists

labellist and precedencelist could get freed a second time if there
are allocation failures, so set them to NULL to avoid a double-free.

Reviewed-by: Arjun Shankar <arjun@redhat.com>

copy_and_spawn_sgid: Avoid double calls to close()

If close() on infd and outfd succeeded, reset the fd numbers so that
we don't attempt to close them again.

Reviewed-by: Arjun Shankar <arjun@redhat.com>

iconv_charmap: Close output file when done

Reviewed-by: Arjun Shankar <arjun@redhat.com>

gconv_parseconfdir: Fix memory leak

The allocated `conf` would leak if we have to skip over the file due
to the underlying filesystem not supporting dt_type.

Reviewed-by: Arjun Shankar <arjun@redhat.com>

ldconfig: avoid leak on empty paths in config file

Reviewed-by: Arjun Shankar <arjun@redhat.com>

Fix build of nptl/tst-thread_local1.cc with GCC 12

The test nptl/tst-thread_local1.cc fails to build with GCC mainline
because of changes to what libstdc++ headers implicitly include what
other headers:

tst-thread_local1.cc: In function 'int do_test()':
tst-thread_local1.cc:177:5: error: variable 'std::array<std::pair<const char*, std::function<void(void* (*)(void*))> >, 2> do_thread_X' has initializer but incomplete type
177 | do_thread_X
| ^~~~~~~~~~~

Fix this by adding an explicit include of <array>.

Tested with build-many-glibcs.py for aarch64-linux-gnu.