sourceware.org Git - glibc.git/log

x86: Fix __wcsncmp_evex in strcmp-evex.S [BZ# 28755]

Fixes [BZ# 28755] for wcsncmp by redirecting length >= 2^56 to
__wcscmp_evex. For x86_64 this covers the entire address range so any
length larger could not possibly be used to bound `s1` or `s2`.

test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>

x86: Fix __wcsncmp_avx2 in strcmp-avx2.S [BZ# 28755]

Fixes [BZ# 28755] for wcsncmp by redirecting length >= 2^56 to
__wcscmp_avx2. For x86_64 this covers the entire address range so any
length larger could not possibly be used to bound `s1` or `s2`.

test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>

math: Fix float conversion regressions with gcc-12 [BZ #28713]

Converting double precision constants to float is now affected by the
runtime dynamic rounding mode instead of being evaluated at compile
time with default rounding mode (except static object initializers).

This can change the computed result and cause performance regression.
The known correctness issues (increased ulp errors) are already fixed,
this patch fixes remaining cases of unnecessary runtime conversions.

Add float M_* macros to math.h as new GNU extension API.  To avoid
conversions the new M_* macros are used and instead of casting double
literals to float, use float literals (only required if the conversion
is inexact).

The patch was tested on aarch64 where the following symbols had new
spurious conversion instructions that got fixed:

  __clog10f
  __gammaf_r_finite@GLIBC_2.17
  __j0f_finite@GLIBC_2.17
  __j1f_finite@GLIBC_2.17
  __jnf_finite@GLIBC_2.17
  __kernel_casinhf
  __lgamma_negf
  __log1pf
  __y0f_finite@GLIBC_2.17
  __y1f_finite@GLIBC_2.17
  cacosf
  cacoshf
  casinhf
  catanf
  catanhf
  clogf
  gammaf_positive

Fixes bug 28713.

Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>

elf: Simplify software TM implementation in _dl_find_object

With the current set of fences, the version update at the start
of the TM write operation is redundant, and the version update
at the end does not need to use an atomic read-modify-write
operation.

Also use relaxed MO stores during the dlclose update, and skip any
version changes there.

Suggested-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

Restore ENTRY_POINT definition on hppa, ia64 (bug 28749)

ENTRY_POINT is still needed for elf/rtld.c. Fixes commit 4fb4e7e821e3
("csu: Always use __executable_start in gmon-start.c").

elf: Fix fences in _dl_find_object_update (bug 28745)

As explained in Hans Boehm, Can Seqlocks Get Along with Programming
Language Memory Models?, an acquire fence is needed in
_dlfo_read_success. The lack of a fence resulted in an observable
bug on powerpc64le compile-time load reordering.

The fence in _dlfo_mappings_begin_update has been reordered, turning
the fence/store sequence into a release MO store equivalent.

Relaxed MO loads are used on the reader side, and relaxed MO stores
on the writer side for the shared data, to avoid formal data races.
This is just to be conservative; it should not actually be necessary
given how the data is used.

This commit also fixes the test run time. The intent was to run it
for 3 seconds, but 0.3 seconds was enough to uncover the bug very
occasionally (while 3 seconds did not reliably show the bug on every
test run).

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

ttydefaults.h: Fix CSTATUS to control-t

4.4BSD actually defaults CSTATUS to control-t, so our generic header should
as well.

AArch64: Check for SVE in ifuncs [BZ #28744]

Add a check for SVE in the A64FX ifuncs for memcpy, memset and memmove.
This fixes BZ #28744.

debug: Remove catchsegv and libSegfault (BZ #14913)

Trapping SIGSEGV within the process is error-prone, adds security
issues, and modern analysis design tends to happen out of the
process (either by attaching a debugger or by post-mortem analysis).

The libSegfault also has some design problems, it uses non
async-signal-safe function (backtrace) on signal handler.

There are multiple alternatives if users do want to use similar
functionality, such as sigsegv gnulib module or libsegfault.

Documentation for OpenRISC port

OpenRISC architecture specification:

https://raw.githubusercontent.com/openrisc/doc/master/openrisc-arch-1.3-rev1.pdf

Currently the port as of the 2022-01-03 rebasing there are no known
architecture specific test failures.

Writing credits for the port are:

Stafford Horne <shorne@gmail.com>
Christian Svensson <blue@cmd.nu>

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

build-many-glibcs.py: add OpenRISC support

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

or1k: Build Infrastructure

Here we define the minumum linux kernel version at 5.4.0, as that is the
long term support version where 32-bit architectures start to support
64-bit time API's. The OpenRISC kernel had some bugs up until version 5.8
which caused issues with glibc fork/clone, they have been backported to
5.4 but not previous versions.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

or1k: ABI lists

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

or1k: Linux ABI

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

or1k: Linux Syscall Interface

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

or1k: math soft float support

OpenRISC support hard float but I will like to submit that after glibc
soft float goes upstream. The hard float support depends on adding user
access to the FPCSR, which is not supported by the kernel yet.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

or1k: Atomics and Locking primitives

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

or1k: Thread Local Storage support

OpenRISC includes 3 TLS addressing models. Local Dynamic optimizations
are not done in the linker and therefore use the same code sequences as
Global Dynamic.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

or1k: startup and dynamic linking code

Code for C runtime startup and dynamic loading including PLT layout.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

or1k: ABI Implementation

This code deals with the OpenRISC ABI.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

linux/syscalls: Add or1k_atomic syscall for OpenRISC

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

elf: Add reloc for OpenRISC

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

elf: Add a comment after trailing backslashes

elf: Also try DT_RUNPATH for LD_AUDIT dlopen [BZ #28455]

DT_RUNPATH is only used to find the immediate dependencies of the
executable or shared object containing the DT_RUNPATH entry. Update
LD_AUDIT dlopen call to try the DT_RUNPATH entry of the executable.

Add tst-audit14a, which is copied from tst-audit14, to DT_RUNPATH and
build tst-audit14 with -Wl,--disable-new-dtags to test DT_RPATH.

This partially fixes BZ #28455.

elf: Fix tst-linkall-static link when pthread is not in libc

In that case we want to link in libanl.a, thus providing getaddrinfo_a.

elf: Sort tests and modules-names

Sort tests and modules-names to reduce future conflicts.

hurd: nuke all unknown ports on exec

Ports which are not in the ports table or dtable will not make sense for the
new program, so we can nuke them. Actually we shall, otherwise we would
be leaking various ports, for instance the file_t of the executed program
itself.

hurd: Fix auth port leak

If access() was used before exec, _hurd_id.rid_auth would cache an
"effective" auth port. We do not want this to leak into the executed
program.

Remove stale reference to libanl.a

Since dbb949f53d4 ("resolv: Move libanl into libc (if libpthread is in
libc)") libanl.a is empty, so linking against it no longer necessary.

elf: Add <dl-debug.h>

Add <dl-debug.h> to setup debugging entry in PT_DYNAMIC segment to support
DT_DEBUG, DT_MIPS_RLD_MAP_REL and DT_MIPS_RLD_MAP.

Tested on x86-64, x32 and i686 as well as with build-many-glibcs.py.

Properly check linker option in LIBC_LINKER_FEATURE [BZ #28738]

Update LIBC_LINKER_FEATURE to also check linker warning message since
unknown linker -z option may be ignored by linker:

$ touch x.c
$ gcc -shared -Wl,-z,foobar x.c
/usr/bin/ld: warning: -z foobar ignored
$ echo $?
0
$

This fixes BZ #28738.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

hurd: Implement _S_msg_get_dtable

This will be needed for implementing lsof.

Update automatically-generated copyright dates

These were updated simply by running "make" to regen the files.

Sync move-if-change from Gnulib, updating copyright

Update copyright dates not handled by scripts/update-copyrights.

I've updated copyright dates in glibc for 2022. This is the patch for
the changes not generated by scripts/update-copyrights and subsequent
build / regeneration of generated files. As well as the usual annual
updates, mainly dates in --version output (minus csu/version.c which
previously had to be handled manually but is now successfully updated
by update-copyrights), there is a small change to the copyright notice
in NEWS which should let NEWS get updated automatically next year.

Please remember to include 2022 in the dates for any new files added
in future (which means updating any existing uncommitted patches you
have that add new files to use the new copyright dates in them).

Update copyright dates with scripts/update-copyrights

I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.

I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.

remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines

hurd: Use __trivfs_server_name instead of trivfs_server_name

The latter violates namespace contraints

hurd: Bump BRK_START to 0x20000000

By nowadays uses, 256MiB is not that large for the program+libraries.
Let's push the heap further to leave room for e.g. clang.

hurd: Avoid overzealous shared objects constraints

407765e9f24f ("hurd: Fix ELF_MACHINE_USER_ADDRESS_MASK value") switched
ELF_MACHINE_USER_ADDRESS_MASK from 0xf8000000UL to 0xf0000000UL to let
libraries etc. get loaded at 0x08000000. But
ELF_MACHINE_USER_ADDRESS_MASK is actually only meaningful for the main
program anyway, so keep it at 0xf8000000UL to prevent the program loader
from putting ld.so beyond 0x08000000. And conversely, drop the use of
ELF_MACHINE_USER_ADDRESS_MASK for shared objects, which don't need any
constraints since the program will have already be loaded by then.

time: Refactor timesize.h for some ABIs

Commit a4b413135535c83a changed default __TIMESIZE to 64, however
it added sub-architecture timesize.h for powerpc, s390, and
sparc.

Also simplify mips by removing _MIPS_SIM usage (which would require
to add sgidefs inclusion.

hurd: Make getrandom a stub inside the random translator

glibc uses /dev/urandom for getrandom(), and from version 2.34 malloc
initialization uses it. We have to detect when we are running the random
translator itself, in which case we can't read ourself.

open64: Force O_LARGEFILE on all architectures

When running tests on OpenRISC which has 32-bit wordsize but 64-bit
timesize it was found that O_LARGEFILE is not being set when calling
open64.  For 64-bit architectures the O_LARGEFILE flag is generally
implied by the kernel according to force_o_largefile.  However, for
32-bit architectures this is not done.

For this patch we unconditionally now set the O_LARGEFILE flag for
open64 class syscalls as there is no harm in doing so.

Tested on the OpenRISC the build works and timezone/tst-tzset passes
which was failing before.  I would expect this also would fix arc.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

x86-64: Add vector tan/tanf implementation to libmvec

Implement vectorized tan/tanf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector tan/tanf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector erfc/erfcf implementation to libmvec

Implement vectorized erfc/erfcf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector erfc/erfcf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

resolv: Do not install libanl.so symbolic link

resolv: Do not build libanl.so for ABIs starting at 2.35

timezone: test-case for BZ #28707

This test-case is the tzfile for Asuncion generated by
tzlib-2021e as follows, using the tzlib-2021e zic: "zic -d
DEST -r @1546300800 -L /dev/null -b slim
SOURCE/southamerica". Note that in its type 2 header, it
has two entries in its "time-types" array (types), but only
one entry in its "transition types" array (type_idxs).

* timezone/Makefile, timezone/tst-pr28707.c,
timezone/testdata/gen-XT5.sh: New test.

Co-authored-by: Christopher Wong <Christopher.Wong@axis.com>

timezone: handle truncated timezones from tzcode-2021d and later (BZ #28707)

When using a timezone file with a truncated starting time,
generated by the zic in IANA tzcode-2021d a.k.a. tzlib-2021d
(also in tzlib-2021e; current as of this writing), glibc
asserts in __tzfile_read (on e.g. tzset() for this file) and
you may find lines matching "tzfile.c:435: __tzfile_read:
Assertion `num_types == 1' failed" in your syslog.

One example of such a file is the tzfile for Asuncion
generated by tzlib-2021e as follows, using the tzlib-2021e zic:
"zic -d DEST -r @1546300800 -L /dev/null -b slim
SOURCE/southamerica".  Note that in its type 2 header, it has
two entries in its "time-types" array (types), but only one
entry in its "transition types" array (type_idxs).

This is valid and expected already in the published RFC8536, and
not even frowned upon: "Local time for timestamps before the
first transition is specified by the first time type (time type
0)" ... "every nonzero local time type index SHOULD appear at
least once in the transition type array".  Note the "nonzero ...
index".  Until the 2021d zic, index 0 has been shared by the
first valid transition but with 2021d it's separate, set apart
as a placeholder and only "implicitly" indexed.  (A draft update
of the RFC mandates that the entry at index 0 is a placeholder
in this case, hence can no longer be shared.)

* time/tzfile.c (__tzfile_read): Don't assert when no transitions
are found.

Co-authored-by: Christopher Wong <Christopher.Wong@axis.com>

x86-64: Add vector asinh/asinhf implementation to libmvec

Implement vectorized asinh/asinhf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector asinh/asinhf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector tanh/tanhf implementation to libmvec

Implement vectorized tanh/tanhf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector tanh/tanhf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector erf/erff implementation to libmvec

Implement vectorized erf/erff containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector erf/erff with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector acosh/acoshf implementation to libmvec

Implement vectorized acosh/acoshf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector acosh/acoshf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector atanh/atanhf implementation to libmvec

Implement vectorized atanh/atanhf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector atanh/atanhf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector log1p/log1pf implementation to libmvec

Implement vectorized log1p/log1pf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector log1p/log1pf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector log2/log2f implementation to libmvec

Implement vectorized log2/log2f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector log2/log2f with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector log10/log10f implementation to libmvec

Implement vectorized log10/log10f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector log10/log10f with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector atan2/atan2f implementation to libmvec

Implement vectorized atan2/atan2f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector atan2/atan2f with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector cbrt/cbrtf implementation to libmvec

Implement vectorized cbrt/cbrtf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector cbrt/cbrtf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector sinh/sinhf implementation to libmvec

Implement vectorized sinh/sinhf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector sinh/sinhf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector expm1/expm1f implementation to libmvec

Implement vectorized expm1/expm1f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector expm1/expm1f with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector cosh/coshf implementation to libmvec

Implement vectorized cosh/coshf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector cosh/coshf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector exp10/exp10f implementation to libmvec

Implement vectorized exp10/exp10f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector exp10/exp10f with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector exp2/exp2f implementation to libmvec

Implement vectorized exp2/exp2f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector exp2/exp2f with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector hypot/hypotf implementation to libmvec

Implement vectorized hypot/hypotf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector hypot/hypotf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector asin/asinf implementation to libmvec

Implement vectorized asin/asinf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector asin/asinf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86-64: Add vector atan/atanf implementation to libmvec

Implement vectorized atan/atanf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector atan/atanf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

elf: Add _dl_find_object function

It can be used to speed up the libgcc unwinder, and the internal
_dl_find_dso_for_object function (which is used for caller
identification in dlopen and related functions, and in dladdr).

_dl_find_object is in the internal namespace due to bug 28503.
If libgcc switches to _dl_find_object, this namespace issue will
be fixed. It is located in libc for two reasons: it is necessary
to forward the call to the static libc after static dlopen, and
there is a link ordering issue with -static-libgcc and libgcc_eh.a
because libc.so is not a linker script that includes ld.so in the
glibc build tree (so that GCC's internal -lc after libgcc_eh.a does
not pick up ld.so).

It is necessary to do the i386 customization in the
sysdeps/x86/bits/dl_find_object.h header shared with x86-64 because
otherwise, multilib installations are broken.

The implementation uses software transactional memory, as suggested
by Torvald Riegel. Two copies of the supporting data structures are
used, also achieving full async-signal-safety.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

malloc: Remove memusage.h

And use machine-sp.h instead. The Linux implementation is based on
already provided CURRENT_STACK_FRAME (used on nptl code) and
STACK_GROWS_UPWARD is replaced with _STACK_GROWS_UP.

malloc: Use hp-timing on libmemusage

Instead of reimplemeting on GETTIME macro.

Remove atomic-machine.h atomic typedefs

Now that memusage.c uses generic types we can remove them.

malloc: Remove atomic_* usage

These typedef are used solely on memusage and can be replaced with
generic types.

microblaze: Add missing implementation when !__ASSUME_TIME64_SYSCALLS

In commit a92f4e6299fe0e3cb6f77e79de00817aece501ce ("linux: Add time64
pselect support"), a Microblaze specific implementation of
__pselect32() was added to cover the case of kernels < 3.15 which lack
the pselect6 system call.

This new file sysdeps/unix/sysv/linux/microblaze/pselect32.c takes
precedence over the default implementation
sysdeps/unix/sysv/linux/pselect32.c.

However sysdeps/unix/sysv/linux/pselect32.c provides an implementation
of __pselect32() which is needed when __ASSUME_TIME64_SYSCALLS is not
defined. On Microblaze, which is a 32-bit architecture,
__ASSUME_TIME64_SYSCALLS is only true for kernels >= 5.1.

Due to sysdeps/unix/sysv/linux/microblaze/pselect32.c taking
precedence over sysdeps/unix/sysv/linux/pselect32.c, it means that
when we are with a kernel >= 3.15 but < 5.1, we need a __pselect32()
implementation, but sysdeps/unix/sysv/linux/microblaze/pselect32.c
doesn't provide it, and sysdeps/unix/sysv/linux/pselect32.c which
would provide it is not compiled in.

This causes the following build failure on Microblaze with for example
Linux kernel headers 4.9:

[...]/build/libc_pic.os: in function `__pselect64':
(.text+0x120b44): undefined reference to `__pselect32'
collect2: error: ld returned 1 exit status

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

elf: Do not fail for failed dlmopen on audit modules (BZ #28061)

The dl_main sets the LM_ID_BASE to RT_ADD just before starting to
add load new shared objects. The state is set to RT_CONSISTENT just
after all objects are loaded.

However if a audit modules tries to dlmopen an inexistent module,
the _dl_open will assert that the namespace is in an inconsistent
state.

This is different than dlopen, since first it will not use
LM_ID_BASE and second _dl_map_object_from_fd is the sole responsible
to set and reset the r_state value.

So the assert on _dl_open can not really be seen if the state is
consistent, since _dt_main resets it. This patch removes the assert.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: Issue audit la_objopen for vDSO

The vDSO is is listed in the link_map chain, but is never the subject of
an la_objopen call. A new internal flag __RTLD_VDSO is added that
acts as __RTLD_OPENEXEC to allocate the required 'struct auditstate'
extra space for the 'struct link_map'.

The return value from the callback is currently ignored, since there
is no PLT call involved by glibc when using the vDSO, neither the vDSO
are exported directly.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: Add audit tests for modules with TLSDESC

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: Avoid unnecessary slowdown from profiling with audit (BZ#15533)

The rtld-audit interfaces introduces a slowdown due to enabling
profiling instrumentation (as if LD_AUDIT implied LD_PROFILE).
However, instrumenting is only necessary if one of audit libraries
provides PLT callbacks (la_pltenter or la_pltexit symbols). Otherwise,
the slowdown can be avoided.

The following patch adjusts the logic that enables profiling to iterate
over all audit modules and check if any of those provides a PLT hook.
To keep la_symbind to work even without PLT callbacks, _dl_fixup now
calls the audit callback if the modules implements it.

Co-authored-by: Alexander Monakov <amonakov@ispras.ru>
Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: Add _dl_audit_pltexit

It consolidates the code required to call la_pltexit audit
callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: Add _dl_audit_pltenter

It consolidates the code required to call la_pltenter audit
callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: Add _dl_audit_preinit

It consolidates the code required to call la_preinit audit
callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: Add _dl_audit_symbind_alt and _dl_audit_symbind

It consolidates the code required to call la_symbind{32,64} audit
callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: Add _dl_audit_objclose

It consolidates the code required to call la_objclose audit
callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: Add _dl_audit_objsearch

It consolidates the code required to call la_objsearch audit
callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: Add _dl_audit_activity_map and _dl_audit_activity_nsid

It consolidates the code required to call la_activity audit
callback.

Also for a new Lmid_t the namespace link_map list are empty, so it
requires to check if before using it. This can happen for when audit
module is used along with dlmopen.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: Add _dl_audit_objopen

It consolidates the code required to call la_objopen audit callback.

Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

hurd: Fix static-PIE startup

hurd initialization stages use RUN_HOOK to run various initialization
functions. That is however using absolute addresses which need to be
relocated, which is done later by csu. We can however easily make the
linker compute relative addresses which thus don't need a relocation.
The new SET_RELHOOK and RUN_RELHOOK macros implement this.

hurd: let csu initialize tls

Since 9cec82de715b ("htl: Initialize later"), we let csu initialize
pthreads. We can thus let it initialize tls later too, to better align
with the generic order. Initialization however accesses ports which
links/unlinks into the sigstate for unwinding. We can however easily
skip that during initialization.

hurd: Fix XFAIL-ing mallocfork2 tests

They are using setpshared but are outside the htl directory.

hurd: XFAIL more tests that require setpshared support

malloc: Add missing shared thread library flags

stdio-common: Fix %m sprintf test output for GNU/Hurd

GNU/Hurd has slightly different error messages for undefined numbers,
due to the notion of error subsystems.

x86: Optimize L(less_vec) case in memcmpeq-evex.S

No bug.
Optimizations are twofold.

1) Replace page cross and 0/1 checks with masked load instructions in
   L(less_vec). In applications this reduces branch-misses in the
   hot [0, 32] case.
2) Change controlflow so that L(less_vec) case gets the fall through.

Change 2) helps copies in the [0, 32] size range but comes at the cost
of copies in the [33, 64] size range.  From profiles of GCC and
Python3, 94%+ and 99%+ of calls are in the [0, 32] range so this
appears to the the right tradeoff.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

x86: Optimize L(less_vec) case in memcmp-evex-movbe.S

No bug.
Optimizations are twofold.

1) Replace page cross and 0/1 checks with masked load instructions in
   L(less_vec). In applications this reduces branch-misses in the
   hot [0, 32] case.
2) Change controlflow so that L(less_vec) case gets the fall through.

Change 2) helps copies in the [0, 32] size range but comes at the cost
of copies in the [33, 64] size range.  From profiles of GCC and
Python3, 94%+ and 99%+ of calls are in the [0, 32] range so this
appears to the the right tradeoff.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

elf: Remove AArch64 from comment for AT_MINSIGSTKSZ

Remove AArch64 from comment for AT_MINSIGSTKSZ to match

commit 7cd60e43a6def40ecb75deb8decc677995970d0b
Author: Chang S. Bae <chang.seok.bae@intel.com>
Date:   Tue May 18 13:03:15 2021 -0700

    uapi/auxvec: Define the aux vector AT_MINSIGSTKSZ

    Define AT_MINSIGSTKSZ in the generic uapi header. It is already used
    as generic ABI in glibc's generic elf.h, and this define will prevent
    future namespace conflicts. In particular, x86 is also using this
    generic definition.

in Linux kernel 5.14.

math: Properly cast X_TLOSS to float [BZ #28713]

Add

#define AS_FLOAT_CONSTANT_1(x) x##f
#define AS_FLOAT_CONSTANT(x) AS_FLOAT_CONSTANT_1(x)

to cast X_TLOSS to float at compile-time to fix:

FAIL: math/test-float-j0
FAIL: math/test-float-jn
FAIL: math/test-float-y0
FAIL: math/test-float-y1
FAIL: math/test-float-yn
FAIL: math/test-float32-j0
FAIL: math/test-float32-jn
FAIL: math/test-float32-y0
FAIL: math/test-float32-y1
FAIL: math/test-float32-yn

when compiling with GCC 12.

Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>

Set default __TIMESIZE default to 64

This is expected size for newer ABIs.

stdio: Implement %#m for vfprintf and related functions

%#m prints errno as an error constant if one is available, or
a decimal number as a fallback. This intends to address the gap
that strerrorname_np does not work well with printf for unknown
error codes due to its NULL return values in those cases.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

elf: Remove unused NEED_DL_BASE_ADDR and _dl_base_addr

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

x86-64: Add vector acos/acosf implementation to libmvec

Implement vectorized acos/acosf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI. It also contains
accuracy and ABI tests for vector acos/acosf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

intl/plural.y: Avoid conflicting declarations of yyerror and yylex

bison-3.8 includes these lines in the generated intl/plural.c:

  #if !defined __gettexterror && !defined YYERROR_IS_DECLARED
  void __gettexterror (struct parse_args *arg, const char *msg);
  #endif
  #if !defined __gettextlex && !defined YYLEX_IS_DECLARED
  int __gettextlex (YYSTYPE *yylvalp, struct parse_args *arg);
  #endif

Those default prototypes provided by bison conflict with the
declarations later on in plural.y.  This patch solves the issue.

Reviewed-by: Arjun Shankar <arjun@redhat.com>

elf: Remove excessive p_align check on PT_LOAD segments [BZ #28688]

p_align does not have to be a multiple of the page size. Only PT_LOAD
segment layout should be aligned to the page size.

1: Remove p_align check against the page size.
2. Use the page size, instead of p_align, to check PT_LOAD segment layout.

Reviewed-by: Florian Weimer <fweimer@redhat.com>