sourceware.org Git - glibc.git/log

Linux: Include <dl-auxv.h> in dl-sysdep.c only for SHARED

Otherwise, <dl-auxv.h> on POWER ends up being included twice,
once in dl-sysdep.c, once in dl-support.c. That leads to a linker
failure due to multiple definitions of _dl_cache_line_size.

Fixes commit d96d2995c1121d3310102afda2deb1f35761b5e6
("Revert "Linux: Consolidate auxiliary vector parsing").

Revert "Linux: Consolidate auxiliary vector parsing"

This reverts commit 8c8510ab2790039e58995ef3a22309582413d3ff. The
revert is not perfect because the commit included a bug fix for
_dl_sysdep_start with an empty argv, introduced in commit
2d47fa68628e831a692cba8fc9050cef435afc5e ("Linux: Remove
DL_FIND_ARG_COMPONENTS"), and this bug fix is kept.

The revert is necessary because the reverted commit introduced an
early memset call on aarch64, which leads to crash due to lack of TCB
initialization.

String: Ensure 'MIN_PAGE_SIZE' is multiple of 'getpagesize'

When 'TEST_LEN' was defined as (4096 * 3) the allocation size Would
not be a multiple of system page size if system page size > 4096.

Use binutils 2.38 branch in build-many-glibcs.py

This patch makes build-many-glibcs.py use binutils 2.38 branch.

Tested with build-many-glibcs.py (compilers and glibcs builds).

elf: Remove LD_USE_LOAD_BIAS

It is solely for prelink with PIE executables [1].

[1] https://sourceware.org/legacy-ml/libc-hacker/2003-11/msg00127.html

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

malloc: Remove LD_TRACE_PRELINKING usage from mtrace

The fix for BZ#22716 replacde LD_TRACE_LOADED_OBJECTS with
LD_TRACE_PRELINKING so mtrace could record executable address
position.

To provide the same information, LD_TRACE_LOADED_OBJECTS is
extended where a value or '2' also prints the executable address
as well.  It avoid adding another loader environment variable
to be used solely for mtrace.  The vDSO will be printed as
a default library (with '=>' pointing the same name), which is
ok since both mtrace and ldd already handles it.

The mtrace script is changed to also parse the new format.  To
correctly support PIE and non-PIE executables, both the default
mtrace address and the one calculated as used (it fixes mtrace
for non-PIE exectuable as for BZ#22716 for PIE).

Checked on x86_64-linux-gnu.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

elf: Remove prelink support

Prelinked binaries and libraries still work, the dynamic tags
DT_GNU_PRELINKED, DT_GNU_LIBLIST, DT_GNU_CONFLICT just ignored
(meaning the process is reallocated as default).

The loader environment variable TRACE_PRELINKING is also removed,
since it used solely on prelink.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

Linux: Consolidate auxiliary vector parsing

And optimize it slightly.

The large switch statement in _dl_sysdep_start can be replaced with
a large array.  This reduces source code and binary size.  On
i686-linux-gnu:

Before:

   text    data     bss     dec     hex filename
   7791      12       0    7803    1e7b elf/dl-sysdep.os

After:

   text    data     bss     dec     hex filename
   7135      12       0    7147    1beb elf/dl-sysdep.os

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Linux: Assume that NEED_DL_SYSINFO_DSO is always defined

The definition itself is still needed for generic code.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Linux: Remove DL_FIND_ARG_COMPONENTS

The generic definition is always used since the Native Client
port has been removed.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Linux: Remove HAVE_AUX_SECURE, HAVE_AUX_XID, HAVE_AUX_PAGESIZE

They are always defined.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

elf: Merge dl-sysdep.c into the Linux version

The generic version is the de-facto Linux implementation. It
requires an auxiliary vector, so Hurd does not use it.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

hppa: Fix bind-now audit (BZ #28857)

On hppa, a function pointer returned by la_symbind is actually a function
descriptor has the plabel bit set (bit 30).  This must be cleared to get
the actual address of the descriptor.  If the descriptor has been bound,
the first word of the descriptor is the physical address of theA function,
otherwise, the first word of the descriptor points to a trampoline in the
PLT.

This patch also adds a workaround on tests because on hppa (and it seems
to be the only ABI I have see it), some shared library adds a dynamic PLT
relocation to am empty symbol name:

$ readelf -r elf/tst-audit25mod1.so
[...]
Relocation section '.rela.plt' at offset 0x464 contains 6 entries:
Offset     Info    Type            Sym.Value  Sym. Name + Addend
00002008  00000081 R_PARISC_IPLT                508
[...]

It breaks some assumptions on the test, where a symbol with an empty
name ("") is passed on la_symbind.

Checked on x86_64-linux-gnu and hppa-linux-gnu.

x86-64: Optimize bzero

memset with zero as the value to set is by far the majority value (99%+
for Python3 and GCC).

bzero can be slightly more optimized for this case by using a zero-idiom
xor for broadcasting the set value to a register (vector or GPR).

Co-developed-by: Noah Goldstein <goldstein.w.n@gmail.com>

benchtests: Add benches for bzero

Add bench-bzero-large.c, bench-bzero-walk.c and bench-bzero.c.

linux: fix accuracy of get_nprocs and get_nprocs_conf [BZ #28865]

get_nprocs() and get_nprocs_conf() use various methods to obtain an
accurate number of processors.  Re-introduce __get_nprocs_sched() as
a source of information, and fix the order in which these methods are
used to return the most accurate information.  The primary source of
information used in both functions remains unchanged.

This also changes __get_nprocs_sched() error return value from 2 to 0,
but all its users are already prepared to handle that.

Old fallback order:
  get_nprocs:
    /sys/devices/system/cpu/online -> /proc/stat -> 2
  get_nprocs_conf:
    /sys/devices/system/cpu/ -> /proc/stat -> 2

New fallback order:
  get_nprocs:
    /sys/devices/system/cpu/online -> /proc/stat -> sched_getaffinity -> 2
  get_nprocs_conf:
    /sys/devices/system/cpu/ -> /proc/stat -> sched_getaffinity -> 2

Fixes: 342298278e ("linux: Revert the use of sched_getaffinity on get_nproc")
Closes: BZ #28865
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

x86: Remove SSSE3 instruction for broadcast in memset.S (SSE2 Only)

commit b62ace2740a106222e124cc86956448fa07abf4d
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Sun Feb 6 00:54:18 2022 -0600

x86: Improve vec generation in memset-vec-unaligned-erms.S

Revert usage of 'pshufb' in broadcast logic as it is an SSSE3
instruction and memset.S is restricted to only SSE2 instructions.

benchtests: Sort benches in Makefile

Put one bench per line and sort them.

Benchtests: Add length zero benchmark for memset in bench-memset.c

Zero is a relevant size for some workloads (roughly 5% of uses for
GCC) so we should be testing it's performance as well.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86: Improve vec generation in memset-vec-unaligned-erms.S

No bug.

Split vec generation into multiple steps. This allows the
broadcast in AVX2 to use 'xmm' registers for the L(less_vec)
case. This saves an expensive lane-cross instruction and removes
the need for 'vzeroupper'.

For SSE2 replace 2x 'punpck' instructions with zero-idiom 'pxor' for
byte broadcast.

Results for memset-avx2 small (geomean of N = 20 benchset runs).

size, New Time, Old Time, New / Old
   0,    4.100,    3.831,     0.934
   1,    5.074,    4.399,     0.867
   2,    4.433,    4.411,     0.995
   4,    4.487,    4.415,     0.984
   8,    4.454,    4.396,     0.987
  16,    4.502,    4.443,     0.987

All relevant string/wcsmbs tests are passing.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector tan/tanf to libmvec microbenchmark

Add vector tan/tanf and input files to libmvec microbenchmark.

libmvec-tan-inputs:
  90% Normal random distribution
  range: (-DBL_MAX, DBL_MAX)
  mean: 0.0
  sigma: 5.0
  10% uniform random distribution in range (-1000.0, 1000.0)

libmvec-tanf-inputs:
  90% Normal random distribution
  range: (-FLT_MAX, FLT_MAX)
  mean: 0.0f
  sigma: 5.0f
  10% uniform random distribution in range (-1000.0f, 1000.0f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector erfc/erfcf to libmvec microbenchmark

Add vector erfc/erfcf and input files to libmvec microbenchmark.

libmvec-erfc-inputs:
  90% Normal random distribution
  range: (-6.0, 6.0)
  mean: 0.0
  sigma: 1.0
  10% uniform random distribution in range (-5.9, 5.9)

libmvec-erfcf-inputs:
  90% Normal random distribution
  range: (-4.0f, 4.0f)
  mean: 0.0f
  sigma: 1.0f
  10% uniform random distribution in range (-3.9f, 3.9f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector asinh/asinhf to libmvec microbenchmark

Add vector asinh/asinhf and input files to libmvec microbenchmark.

libmvec-asinh-inputs:
  90% Normal random distribution
  range: (-DBL_MAX, DBL_MAX)
  mean: 0.0
  sigma: 2.0
  10% uniform random distribution in range (-1.0e6, 1.0e6)

libmvec-asinhf-inputs:
  90% Normal random distribution
  range: (-FLT_MAX, FLT_MAX)
  mean: 0.0f
  sigma: 2.0f
  10% uniform random distribution in range (-1.0e6f, 1.0e6f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector tanh/tanhf to libmvec microbenchmark

Add vector tanh/tanhf and input files to libmvec microbenchmark.

libmvec-tanh-inputs:
  90% Normal random distribution
  range: (-19.0, 19.0)
  mean: 0.0
  sigma: 2.0
  10% uniform random distribution in range (-16.0, 16.0)

libmvec-tanhf-inputs:
  90% Normal random distribution
  range: (-10.0f, 10.0f)
  mean: 0.0f
  sigma: 2.0f
  10% uniform random distribution in range (-8.0f, 8.0f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector erf/erff to libmvec microbenchmark

Add vector erf/erff and input files to libmvec microbenchmark.

libmvec-erf-inputs:
  90% Normal random distribution
  range: (-6.0, 6.0)
  mean: 0.0
  sigma: 1.0
  10% uniform random distribution in range (-5.9, 5.9)

libmvec-erff-inputs:
  90% Normal random distribution
  range: (-4.0f, 4.0f)
  mean: 0.0f
  sigma: 1.0f
  10% uniform random distribution in range (-3.9f, 3.9f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector acosh/acoshf to libmvec microbenchmark

Add vector acosh/acoshf and input files to libmvec microbenchmark.

libmvec-acosh-inputs:
  90% Normal random distribution
  range: (1.0, DBL_MAX)
  mean: 1.0
  sigma: 8.0
  10% uniform random distribution in range (1.0, 1.0e6)

libmvec-acoshf-inputs:
  90% Normal random distribution
  range: (1.0f, FLT_MAX)
  mean: 1.0f
  sigma: 4.0f
  10% uniform random distribution in range (1.0f, 1.0e6f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector atanh/atanhf to libmvec microbenchmark

Add vector atanh/atanhf and input files to libmvec microbenchmark.

libmvec-atanh-inputs:
  90% Normal random distribution
  range: (-1.0, 1.0)
  mean: 0.0
  sigma: 1.0
  10% uniform random distribution in range (-1.0, 1.0)

libmvec-atanhf-inputs:
  90% Normal random distribution
  range: (-1.0f, 1.0f)
  mean: 0.0f
  sigma: 1.0f
  10% uniform random distribution in range (-1.0f, 1.0f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector log1p/log1pf to libmvec microbenchmark

Add vector log1p/log1pf and input files to libmvec microbenchmark.

libmvec-log1p-inputs:
  70% Normal random distribution
  range: (-1.0, DBL_MAX)
  mean: 0.0
  sigma: 50.0
  30% uniform random distribution in range (-1.0, 1.0e6)

libmvec-log1pf-inputs:
  70% Normal random distribution
  range: (-1.0f, FLT_MAX)
  mean: 0.0f
  sigma: 50.0f
  30% uniform random distribution in range (-1.0f, 1.0e6f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector log2/log2f to libmvec microbenchmark

Add vector log2/log2f and input files to libmvec microbenchmark.

libmvec-log2-inputs:
  70% Normal random distribution
  range: (0.0, DBL_MAX)
  mean: 1.0
  sigma: 50.0
  30% uniform random distribution in range (0.0, 1.0e6)

libmvec-log2f-inputs:
  70% Normal random distribution
  range: (0.0f, FLT_MAX)
  mean: 1.0f
  sigma: 50.0f
  30% uniform random distribution in range (0.0f, 1.0e6f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector log10/log10f to libmvec microbenchmark

Add vector log10/log10f and input files to libmvec microbenchmark.

libmvec-log10-inputs:
  70% Normal random distribution
  range: (0.0, DBL_MAX)
  mean: 1.0
  sigma: 50.0
  30% uniform random distribution in range (0.0, 1.0e6)

libmvec-log10f-inputs:
  70% Normal random distribution
  range: (0.0f, FLT_MAX)
  mean: 1.0f
  sigma: 50.0f
  30% uniform random distribution in range (0.0f, 1.0e6f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector atan2/atan2f to libmvec microbenchmark

Add vector atan2/atan2f and input files to libmvec microbenchmark.

libmvec-atan2-inputs:
  arg1:
    90% Normal random distribution
    range: (-DBL_MAX, DBL_MAX)
    mean: 0.0
    sigma: 4.0
    10% uniform random distribution in range (-1.0e6, 1.0e6)
  arg2:
    90% Normal random distribution
    range: (-DBL_MAX, DBL_MAX)
    mean: 0.0
    sigma: 4.0
    10% uniform random distribution in range (-1.0e6, 1.0e6)

libmvec-atan2f-inputs:
  arg1:
    90% Normal random distribution
    range: (-FLT_MAX, FLT_MAX)
    mean: 0.0f
    sigma: 4.0f
    10% uniform random distribution in range (-1.0e6f, 1.0e6f)
  arg2:
    90% Normal random distribution
    range: (-FLT_MAX, FLT_MAX)
    mean: 0.0f
    sigma: 4.0f
    10% uniform random distribution in range (-1.0e6f, 1.0e6f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector cbrt/cbrtf to libmvec microbenchmark

Add vector cbrt/cbrtf and input files to libmvec microbenchmark.

libmvec-cbrt-inputs:
  90% Normal random distribution
  range: (-DBL_MAX, DBL_MAX)
  mean: 0.0
  sigma: 10.0
  10% uniform random distribution in range (-1000.0, 1000.0)

libmvec-cbrtf-inputs:
  90% Normal random distribution
  range: (-FLT_MAX, FLT_MAX)
  mean: 0.0f
  sigma: 10.0f
  10% uniform random distribution in range (-1000.0f, 1000.0f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector sinh/sinhf to libmvec microbenchmark

Add vector sinh/sinhf and input files to libmvec microbenchmark.

libmvec-sinh-inputs:
  90% Normal random distribution
  range: (-710.0, 710.0)
  mean: 0.0
  sigma: 32.0
  10% uniform random distribution in range (-500.0, 500.0)

libmvec-sinhf-inputs:
  90% Normal random distribution
  range: (-89.0f, 89.0f)
  mean: 0.0f
  sigma: 16.0f
  10% uniform random distribution in range (-50.0f, 50.0f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector expm1/expm1f to libmvec microbenchmark

Add vector expm1/expm1f and input files to libmvec microbenchmark.

libmvec-expm1-inputs:
  90% Normal random distribution
  range: (-708.0, 709.0)
  mean: 0.0
  sigma: 16.0
  10% uniform random distribution in range (-500.0, 500.0)

libmvec-expm1f-inputs:
  90% Normal random distribution
  range: (-87.0f, 88.0f)
  mean: 0.0f
  sigma: 8.0f
  10% uniform random distribution in range (-50.0f, 50.0f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector cosh/coshf to libmvec microbenchmark

Add vector cosh/coshf and input files to libmvec microbenchmark.

libmvec-cosh-inputs:
  90% Normal random distribution
  range: (-710.0, 710.0)
  mean: 0.0
  sigma: 32.0
  10% uniform random distribution in range (-500.0, 500.0)

libmvec-coshf-inputs:
  90% Normal random distribution
  range: (-89.0f, 89.0f)
  mean: 0.0f
  sigma: 16.0f
  10% uniform random distribution in range (-50.0f, 50.0f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector exp10/exp10f to libmvec microbenchmark

Add vector exp10/exp10f and input files to libmvec microbenchmark.

libmvec-exp10-inputs:
  90% Normal random distribution
  range: (-307.0, 308.0)
  mean: 0.0
  sigma: 16.0
  10% uniform random distribution in range (-250.0, 250.0)

libmvec-exp10f-inputs:
  90% Normal random distribution
  range: (-37.0f, 38.0f)
  mean: 0.0f
  sigma: 8.0f
  10% uniform random distribution in range (-25.0f, 25.0f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector exp2/exp2f to libmvec microbenchmark

Add vector exp2/exp2f and input files to libmvec microbenchmark.

libmvec-exp2-inputs:
  90% Normal random distribution
  range: (-1022.0, 1024.0)
  mean: 0.0
  sigma: 16.0
  10% uniform random distribution in range (-1000.0, 1000.0)

libmvec-exp2f-inputs:
  90% Normal random distribution
  range: (-126.0f, 128.0f)
  mean: 0.0f
  sigma: 8.0f
  10% uniform random distribution in range (-100.0f, 100.0f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector hypot/hypotf to libmvec microbenchmark

Add vector hypot/hypotf and input files to libmvec microbenchmark.

libmvec-hypot-inputs:
  arg1:
    90% Normal random distribution
    range: (-DBL_MAX, DBL_MAX)
    mean: 0.0
    sigma: 10.0
    10% uniform random distribution in range (-1000.0, 1000.0)
  arg1:
    90% Normal random distribution
    range: (-DBL_MAX, DBL_MAX)
    mean: 0.0
    sigma: 10.0
    10% uniform random distribution in range (-1000.0, 1000.0)

libmvec-hypotf-inputs:
  arg1:
    90% Normal random distribution
    range: (-FLT_MAX, FLT_MAX)
    mean: 0.0f
    sigma: 10.0f
    10% uniform random distribution in range (-1000.0f, 1000.0f)
  arg2:
    90% Normal random distribution
    range: (-FLT_MAX, FLT_MAX)
    mean: 0.0f
    sigma: 10.0f
    10% uniform random distribution in range (-1000.0f, 1000.0f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector asin/asinf to libmvec microbenchmark

Add vector asin/asinf and input files to libmvec microbenchmark.

libmvec-asin-inputs:
  90% Normal random distribution
  range: (-1.0, 1.0)
  mean: 0.0
  sigma: 1.0
  10% uniform random distribution in range (-1.0, 1.0)

libmvec-asinf-inputs:
  90% Normal random distribution
  range: (-1.0f, 1.0f)
  mean: 0.0f
  sigma: 1.0f
  10% uniform random distribution in range (-1.0f, 1.0f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector atan/atanf to libmvec microbenchmark

Add vector atan/atanf and input files to libmvec microbenchmark.

libmvec-atan-inputs:
  arg1:
    90% Normal random distribution
    range: (-DBL_MAX, DBL_MAX)
    mean: 0.0
    sigma: 4.0
    10% uniform random distribution in range (-1.0e6, 1.0e6)
  arg2:
    90% Normal random distribution
    range: (-DBL_MAX, DBL_MAX)
    mean: 0.0
    sigma: 4.0
    10% uniform random distribution in range (-1.0e6, 1.0e6)

libmvec-atanf-inputs:
  arg1:
    90% Normal random distribution
    range: (-FLT_MAX, FLT_MAX)
    mean: 0.0f
    sigma: 4.0f
    10% uniform random distribution in range (-1.0e6f, 1.0e6f)
  arg2:
    90% Normal random distribution
    range: (-FLT_MAX, FLT_MAX)
    mean: 0.0f
    sigma: 4.0f
    10% uniform random distribution in range (-1.0e6f, 1.0e6f)

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

elf: Replace tst-audit24bmod2.so with tst-audit24bmod2

Replace tst-audit24bmod2.so with tst-audit24bmod2 to silence:

make[2]: Entering directory '/export/gnu/import/git/gitlab/x86-glibc/elf'
Makefile:2201: warning: overriding recipe for target '/export/build/gnu/tools-build/glibc-gitlab/build-x86_64-linux/elf/tst-audit24bmod2.so'
../Makerules:765: warning: ignoring old recipe for target '/export/build/gnu/tools-build/glibc-gitlab/build-x86_64-linux/elf/tst-audit24bmod2.so'

x86_64/multiarch: Sort sysdep_routines and put one entry per line

string: Sort headers, routines, tests and tests-translation

Sort headers, routines, tests and tests-translation. Put one entry per
line.

x86: Improve L to support L(XXX_SYMBOL (YYY, ZZZ))

Benchtests: move 'alloc_bufs' from loop in bench-memset.c

One buf allocation is sufficient. Calling `alloc_bufs' in the loop
just adds unnecessary syscall overhead.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Fix strcmp-evex.S

Change "movl %edx, %rdx" to "movl %edx, %edx" in:

commit 8418eb3ff4b781d31c4ed5dc6c0bd7356bc45db9
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Mon Jan 10 15:35:39 2022 -0600

x86: Optimize strcmp-evex.S

x86-64: Fix strcmp-avx2.S

Change "movl %edx, %rdx" to "movl %edx, %edx" in:

commit b77b06e0e296f1a2276c27a67e1d44f2cfa38d45
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Mon Jan 10 15:35:38 2022 -0600

x86: Optimize strcmp-avx2.S

x86-64: Add vector acos/acosf to libmvec microbenchmark

Add vector acos/acosf and input files to libmvec microbenchmark.

libmvec-acos-inputs:
  90% Normal random distribution
  range: (-1.0, 1.0)
  mean: 0.0
  sigma: 1.0
  10% uniform random distribution in range (-1.0, 1.0)

libmvec-acosf-inputs:
  90% Normal random distribution
  range: (-1.0f, 1.0f)
  mean: 0.0f
  sigma: 1.0f
  10% uniform random distribution in range (-1.0f, 1.0f)

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>

benchtests: Add more coverage for strcmp and strncmp benchmarks

Add more small and medium sized tests for strcmp and strncmp.

As well for strcmp add option for more direct control of
alignment. Previously alignment was being pushed to the end of the
page. While this is the most difficult case to implement, it is far
from the common case and so shouldn't be the only benchmark.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>

x86: Optimize strcmp-evex.S

Optimization are primarily to the loop logic and how the page cross
logic interacts with the loop.

The page cross logic is at times more expensive for short strings near
the end of a page but not crossing the page. This is done to retest
the page cross conditions with a non-faulty check and to improve the
logic for entering the loop afterwards. This is only particular cases,
however, and is general made up for by more than 10x improvements on
the transition from the page cross -> loop case.

The non-page cross cases as well are nearly universally improved.

test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>

x86: Optimize strcmp-avx2.S

Optimization are primarily to the loop logic and how the page cross
logic interacts with the loop.

The page cross logic is at times more expensive for short strings near
the end of a page but not crossing the page. This is done to retest
the page cross conditions with a non-faulty check and to improve the
logic for entering the loop afterwards. This is only particular cases,
however, and is general made up for by more than 10x improvements on
the transition from the page cross -> loop case.

The non-page cross cases are improved most for smaller sizes [0, 128]
and go about even for (128, 4096]. The loop page cross logic is
improved so some more significant speedup is seen there as well.

test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>

string: Improve coverage in test-strcmp.c and test-strncmp.c

Add additional test cases for small / medium sizes.

Add tests in test-strncmp.c where `n` is near ULONG_MAX or LONG_MIN to
test for overflow bugs in length handling.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>

string/test-str*cmp: remove stupid_[strcmp, strncmp, wcscmp, wcsncmp].

These implementations just add to test duration. Since we have
simple_* implementations we already have a safe reference
implementation.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>

linux: Fix missing __convert_scm_timestamps (BZ #28860)

Commit 948ce73b31 made recvmsg/recvmmsg to always call
__convert_scm_timestamps for 64 bit time_t symbol, so adjust it to
always build it for __TIMESIZE != 64.

It fixes build for architecture with 32 bit time_t support when
configured with minimum kernel of 5.1.

linux: __get_nprocs_sched: do not feed CPU_COUNT_S with garbage [BZ #28850]

Pass the actual number of bytes returned by the kernel.

Fixes: 33099d72e41c ("linux: Simplify get_nprocs")
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>

posix: Fix tst-spawn6 terminal handling (BZ #28853)

The test changes the current foreground process group, which might
break testing depending of how the make check is issued.  For instance:

  nohup make -j1 test t=posix/tst-spawn6 | less

Will set 'make' and 'less' to be in the foreground process group in
the current session.  When tst-spawn6 new child takes over it becomes
the foreground process and 'less' is stopped and backgrounded which
interrupts the 'make check' command.

To fix it a pseudo-terminal is allocated, the test starts in new
session (so there is no controlling terminal associated), and the
pseudo-terminal is set as the controlling one (similar to what
login_tty does).

Checked on x86_64-linux-gnu.

Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Regenerate configure

Open master branch for glibc 2.36 development

Create ChangeLog.old/ChangeLog.24.

Prepare for glibc 2.35 release.

Update version.h, and include/features.h.

Regenerate configure.

Update install.texi, and regenerate INSTALL.

Update NEWS bug list.

Update NEWS.

Moved LD_AUDIT notes into requirements section since the LAV_CURRENT
bump is a requirements change that impacts loading old audit modules
or new audit modules on older loaders.

Update translations.

Linux: Use ptrdiff_t for __rseq_offset

This matches the data size initial-exec relocations use on most
targets.

Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Fix elf/tst-audit25a with default bind now toolchains

This test relies on lazy binding for the executable so request that
explicitly in case the toolchain defaults to bind now.

posix: Replace posix_spawnattr_tc{get,set}pgrp_np with posix_spawn_file_actions_addtcsetpgrp_np

The posix_spawnattr_tcsetpgrp_np works on a file descriptor (the
controlling terminal), so it would make more sense to actually fit
it on the file actions API.

Also, POSIX_SPAWN_TCSETPGROUP is not really required since it is
implicit by the presence of tcsetpgrp file action.

The posix/tst-spawn6.c is also fixed when TTY can is not present.

Checked on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>

or1k: Define PI_STATIC_AND_HIDDEN

PI_STATIC_AND_HIDDEN means that references to static functions, data
and symbols with hidden visibility do not need any run-time relocations
after the final link, with the build flags used by glibc.

OpenRISC follows this so enabled PI_STATIC_AND_HIDDEN by adding
configure.ac and generating configure.

Suggested-by: Florian Weimer <fweimer@redhat.com>

SET_RELHOOK: merge i386 and x86_64, and move to sysdeps/mach/hurd/x86

It is not Hurd-specific, but H.J. Lu wants it there.

Also, dc.a can be used to avoid hardcoding .long vs .quad and thus use
the same implementation for i386 and x86_64.

elf: Fix runtime linker auditing on aarch64 (BZ #26643)

The rtld audit support show two problems on aarch64:

  1. _dl_runtime_resolve does not preserve x8, the indirect result
      location register, which might generate wrong result calls
      depending of the function signature.

  2. The NEON Q registers pushed onto the stack by _dl_runtime_resolve
     were twice the size of D registers extracted from the stack frame by
     _dl_runtime_profile.

While 2. might result in wrong information passed on the PLT tracing,
1. generates wrong runtime behaviour.

The aarch64 rtld audit support is changed to:

  * Both La_aarch64_regs and La_aarch64_retval are expanded to include
    both x8 and the full sized NEON V registers, as defined by the
    ABI.

  * dl_runtime_profile needed to extract registers saved by
    _dl_runtime_resolve and put them into the new correctly sized
    La_aarch64_regs structure.

  * The LAV_CURRENT check is change to only accept new audit modules
    to avoid the undefined behavior of not save/restore x8.

  * Different than other architectures, audit modules older than
    LAV_CURRENT are rejected (both La_aarch64_regs and La_aarch64_retval
    changed their layout and there are no requirements to support multiple
    audit interface with the inherent aarch64 issues).

  * A new field is also reserved on both La_aarch64_regs and
    La_aarch64_retval to support variant pcs symbols.

Similar to x86, a new La_aarch64_vector type to represent the NEON
register is added on the La_aarch64_regs (so each type can be accessed
directly).

Since LAV_CURRENT was already bumped to support bind-now, there is
no need to increase it again.

Checked on aarch64-linux-gnu.

Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>

elf: Issue la_symbind for bind-now (BZ #23734)

The audit symbind callback is not called for binaries built with
-Wl,-z,now or when LD_BIND_NOW=1 is used, nor the PLT tracking callbacks
(plt_enter and plt_exit) since this would change the expected
program semantics (where no PLT is expected) and would have performance
implications (such as for BZ#15533).

LAV_CURRENT is also bumped to indicate the audit ABI change (where
la_symbind flags are set by the loader to indicate no possible PLT
trace).

To handle powerpc64 ELFv1 function descriptor, _dl_audit_symbind
requires to know whether bind-now is used so the symbol value is
updated to function text segment instead of the OPD (for lazy binding
this is done by PPC64_LOAD_FUNCPTR on _dl_runtime_resolve).

Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
powerpc64-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>

elf: Fix initial-exec TLS access on audit modules (BZ #28096)

For audit modules and dependencies with initial-exec TLS, we can not
set the initial TLS image on default loader initialization because it
would already be set by the audit setup. However, subsequent thread
creation would need to follow the default behaviour.

This patch fixes it by setting l_auditing link_map field not only
for the audit modules, but also for all its dependencies. This is
used on _dl_allocate_tls_init to avoid the static TLS initialization
at load time.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>

elf: Add la_activity during application exit

la_activity is not called during application exit, even though
la_objclose is.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>

localedata: Adjust C.UTF-8 to align with C/POSIX.

We have had one downstream report from Canonical [1] that
an rrdtool test was broken by the differences in LC_TIME
that we had in the non-builtin C locale (C.UTF-8). If one
application has an issue there are going to be others, and
so with this commit we review and fix all the issues that
cause the builtin C locale to be different from C.UTF-8,
which includes:
* mon_decimal_point should be empty e.g. ""
- Depends on mon_decimal_point_wc fix.
* negative_sign should be empty e.g. ""
* week should be aligned with the builtin C/POSIX locale
* d_fmt corrected with escaped slashes e.g. "%m//%d//%y"
* yesstr and nostr should be empty e.g. ""
* country_ab2 and country_ab3 should be empty e.g. ""

We bump LC_IDENTIFICATION version and adjust the date to
indicate the change in the locale.

A new tst-c-utf8-consistency test is added to ensure
consistency between C/POSIX and C.UTF-8.

Tested on x86_64 and i686 without regression.

[1] https://sourceware.org/pipermail/libc-alpha/2022-January/135703.html

Co-authored-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>

localedef: Fix handling of empty mon_decimal_point (Bug 28847)

The handling of mon_decimal_point is incorrect when it comes to
handling the empty "" value.  The existing parser in monetary_read()
will correctly handle setting the non-wide-character value and the
wide-character value e.g. STR_ELEM_WC(mon_decimal_point) if they are
set in the locale definition.  However, in monetary_finish() we have
conflicting TEST_ELEM() which sets a default value (if the locale
definition doesn't include one), and subsequent code which looks for
mon_decimal_point to be NULL to issue a specific error message and set
the defaults. The latter is unused because TEST_ELEM() always sets a
default.  The simplest solution is to remove the TEST_ELEM() check,
and allow the existing check to look to see if mon_decimal_point is
NULL and set an appropriate default.  The final fix is to move the
setting of mon_decimal_point_wc so it occurs only when
mon_decimal_point is being set to a default, keeping both values
consistent. There is no way to tell the difference between
mon_decimal_point_wc having been set to the empty string and not
having been defined at all, for that distinction we must use
mon_decimal_point being NULL or "", and so we must logically set
the default together with mon_decimal_point.

Lastly, there are more fixes similar to this that could be made to
ld-monetary.c, but we avoid that in order to fix just the code
required for mon_decimal_point, which impacts the ability for C.UTF-8
to set mon_decimal_point to "", since without this fix we end up with
an inconsistent setting of mon_decimal_point set to "", but
mon_decimal_point_wc set to "." which is incorrect.

Tested on x86_64 and i686 without regression.
Reviewed-by: Florian Weimer <fweimer@redhat.com>

malloc: Fix tst-mallocalign1 macro spacing.

Reported by Andreas Schwab <schwab@linux-m68k.org>

elf: Add <dl-r_debug.h>

Add <dl-r_debug.h> to get the adddress of the r_debug structure after
relocation and its offset before relocation from the PT_DYNAMIC segment
to support DT_DEBUG, DT_MIPS_RLD_MAP_REL and DT_MIPS_RLD_MAP.

Co-developed-by: Xi Ruoyao <xry111@mengyan1223.wang>

Mention _FORTIFY_SOURCE=3 for gcc12 in NEWS

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

malloc: Fix -Wuse-after-free warning in tst-mallocalign1 [BZ #26779]

The test leaks bits from the freed pointer via the return value
in ret, and the compiler correctly identifies this issue.
We switch the test to use TEST_VERIFY and terminate the test
if any of the pointers return an unexpected alignment.

This fixes another -Wuse-after-free error when compiling glibc
with gcc 12.

Tested on x86_64 and i686 without regression.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

Update libc.pot for 2.35 release.

tst-socket-timestamp-compat.c: Check __TIMESIZE [BZ #28837]

time_t size is defined by __TIMESIZE, not __WORDSIZE. Check __TIMESIZE,
instead of __WORDSIZE, for time_t size. This fixes BZ #28837.

Add prelink removal plan on NEWS

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Linux: Only generate 64 bit timestamps for 64 bit time_t recvmsg/recvmmsg

The timestamps created by __convert_scm_timestamps only make sense for
64 bit time_t programs, 32 bit time_t programs will ignore 64 bit time_t
timestamps since SO_TIMESTAMP will be defined to old values (either by
glibc or kernel headers).

Worse, if the buffer is not suffice MSG_CTRUNC is set to indicate it
(which breaks some programs [1]).

This patch makes only 64 bit time_t recvmsg and recvmmsg to call
__convert_scm_timestamps. Also, the assumption to called it is changed
from __ASSUME_TIME64_SYSCALLS to __TIMESIZE != 64 since the setsockopt
might be called by libraries built without __TIME_BITS=64. The
MSG_CTRUNC is only set for the 64 bit symbols, it should happen only
if 64 bit time_t programs run older kernels.

Checked on x86_64-linux-gnu and i686-linux-gnu.

[1] https://github.com/systemd/systemd/pull/20567

Reviewed-by: Florian Weimer <fweimer@redhat.com>

linux: Fix ancillary 64-bit time timestamp conversion (BZ #28349, BZ#28350)

The __convert_scm_timestamps only updates the control message last
pointer for SOL_SOCKET type, so if the message control buffer contains
multiple ancillary message types the converted timestamp one might
overwrite a valid message.

The test checks if the extra ancillary space is correctly handled
by recvmsg/recvmmsg, where if there is no extra space for the 64-bit
time_t converted message the control buffer should be marked with
MSG_TRUNC. It also check if recvmsg/recvmmsg handle correctly multiple
ancillary data.

Checked on x86_64-linux and on i686-linux-gnu on both 5.11 and
4.15 kernel.

Co-authored-by: Fabian Vogt <fvogt@suse.de>
Reviewed-by: Florian Weimer <fweimer@redhat.com>

support: Add support_socket_so_timestamp_time64

Check if the socket support 64-bit network packages timestamps
(SO_TIMESTAMP and SO_TIMESTAMPNS). This will be used on recvmsg
and recvmmsg tests to check if the timestamp should be generated.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Fix elf/loadfail test build dependencies

There was no direct or indirect make dependency on testobj3.so so the
test could fail with

/B/elf/loadfail: failed to load shared object: testobj3.so: cannot open
shared object file: No such file or directory

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Fix glibc 2.34 ABI omission (missing GLIBC_2.34 in dynamic loader)

The glibc 2.34 release really should have added a GLIBC_2.34
symbol to the dynamic loader. With it, we could move functions such
as dlopen or pthread_key_create that work on process-global state
into the dynamic loader (once we have fixed a longstanding issue
with static linking). Without the GLIBC_2.34 symbol, yet another
new symbol version would be needed because old glibc will fail to
load binaries due to the missing symbol version in ld.so that newly
linked programs will require.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86: Use CHECK_FEATURE_PRESENT to check HLE [BZ #27398]

HLE is disabled on blacklisted CPUs. Use CHECK_FEATURE_PRESENT, instead
of CHECK_FEATURE_ACTIVE, to check HLE.

Guard tst-valgrind-smoke.out with run-built-tests

Prevent tst-valgrind-smoke from running when run-built-tests is not yes.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

hurd: Add posix_spawnattr_tc{get,set}pgrp_np on libc.abilist

Commit 342cc934a3bf74ac missed the update-abi for the ABI.

Avoid -Wuse-after-free in tests [BZ #26779].

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

elf: Replace tst-p_alignmod1-editX with a python script

This avoid the cross-compiling breakage when the test should not run
($(run-built-tests) equal to no).

Checked on x86_64-linux-gnu and i686-linux-gnu as well with a cross
compile to aarch64-linux-gnu and powerpc64-linux-gnu.

stdlib: Avoid -Wuse-after-free in __add_to_environ [BZ #26779]

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

io: Fix use-after-free in ftw [BZ #26779]

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

intl: Avoid -Wuse-after-free [BZ #26779]

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

elf: Fix use-after-free in ldconfig [BZ #26779]

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

posix: Add terminal control setting support for posix_spawn

Currently there is no proper way to set the controlling terminal through
posix_spawn in race free manner [1].  This forces shell implementations
to keep using fork+exec when launching background process groups,
even when using posix_spawn yields better performance.

This patch adds a new GNU extension so the creating process can
configure the created process terminal group.  This is done with a new
flag, POSIX_SPAWN_TCSETPGROUP, along with two new attribute functions:
posix_spawnattr_tcsetpgrp_np, and posix_spawnattr_tcgetpgrp_np.
The function sets a new attribute, spawn-tcgroupfd, that references to
the controlling terminal.

The controlling terminal is set after the spawn-pgroup attribute, and
uses the spawn-tcgroupfd along with current creating process group
(so it is composable with POSIX_SPAWN_SETPGROUP).

To create a process and set the controlling terminal, one can use the
following sequence:

    posix_spawnattr_t attr;
    posix_spawnattr_init (&attr);
    posix_spawnattr_setflags (&attr, POSIX_SPAWN_TCSETPGROUP);
    posix_spawnattr_tcsetpgrp_np (&attr, tcfd);

If the idea is also to create a new process groups:

    posix_spawnattr_t attr;
    posix_spawnattr_init (&attr);
    posix_spawnattr_setflags (&attr, POSIX_SPAWN_TCSETPGROUP
     | POSIX_SPAWN_SETPGROUP);
    posix_spawnattr_tcsetpgrp_np (&attr, tcfd);
    posix_spawnattr_setpgroup (&attr, 0);

The controlling terminal file descriptor is ignored if the new flag is
not set.

This interface is slight different than the one provided by QNX [2],
which only provides the POSIX_SPAWN_TCSETPGROUP flag.  The QNX
documentation does not specify how the controlling terminal is obtained
nor how it iteracts with POSIX_SPAWN_SETPGROUP.  Since a glibc
implementation is library based, it is more straightforward and avoid
requires additional file descriptor operations to request the caller
to setup the controlling terminal file descriptor (and it also allows
a bit less error handling by posix_spawn).

Checked on x86_64-linux-gnu and i686-linux-gnu.

[1] https://github.com/ksh93/ksh/issues/79
[2] https://www.qnx.com/developers/docs/7.0.0/index.html#com.qnx.doc.neutrino.lib_ref/topic/p/posix_spawn.html

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>

Linux: Detect user namespace support in io/tst-getcwd-smallbuff

Otherwise the test fails with certain container runtimes.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

Fix handling of unterminated bracket expressions in fnmatch (bug 28792)

When fnmatch processes a bracket expression, and eventually finds it to be
unterminated, it should rescan it, treating the starting bracket as a
normal character. That didn't happen when a matching character was found
while scanning the bracket expression.