Add a test to check for duplicate definitions in the static library
This change follows two previous fixes addressing multiple definitions
of __memcpy_chk and __mempcpy_chk functions on i586, and __memmove_chk
and __memset_chk functions on i686. The test is intended to prevent
such issues from occurring in the future.
Signed-off-by: Gabi Falk <gabifalk@gmx.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
i686: Fix multiple definitions of __memmove_chk and __memset_chk
Commit c73c96a4a1af1326df7f96eec58209e1e04066d8 updated memcpy.S and
mempcpy.S, but omitted memmove.S and memset.S. As a result, the static
library built as PIC, whether with or without multiarch support,
contains two definitions for each of the __memmove_chk and __memset_chk
symbols.
/usr/lib/gcc/i686-pc-linux-gnu/14/../../../../i686-pc-linux-gnu/bin/ld: /usr/lib/gcc/i686-pc-linux-gnu/14/../../../../lib/libc.a(memset-ia32.o): in function `__memset_chk':
/var/tmp/portage/sys-libs/glibc-2.39-r3/work/glibc-2.39/string/../sysdeps/i386/i686/memset.S:32: multiple definition of `__memset_chk'; /usr/lib/gcc/i686-pc-linux-gnu/14/../../../../lib/libc.a(memset_chk.o):/var/tmp/portage/sys-libs/glibc-2.39-r3/work/glibc-2.39/debug/../sysdeps/i386/i686/multiarch/memset_chk.c:24: first defined here
After this change, regardless of PIC options, the static library, built
for i686 with multiarch contains implementations of these functions
respectively from debug/memmove_chk.c and debug/memset_chk.c, and
without multiarch contains implementations of these functions
respectively from sysdeps/i386/memmove_chk.S and
sysdeps/i386/memset_chk.S. This ensures that memmove and memset won't
pull in __chk_fail and the routines it calls.
Reported-by: Sam James <sam@gentoo.org> Tested-by: Sam James <sam@gentoo.org> Fixes: c73c96a4a1 ("i686: Fix build with --disable-multiarch") Signed-off-by: Gabi Falk <gabifalk@gmx.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
i586: Fix multiple definitions of __memcpy_chk and __mempcpy_chk
/home/bmg/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/13.2.1/../../../../x86_64-glibc-linux-gnu/bin/ld: /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(memcpy_chk.o): in function `__memcpy_chk':
/home/bmg/src/glibc/debug/../sysdeps/i386/memcpy_chk.S:29: multiple definition of `__memcpy_chk';/home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(memcpy.o):/home/bmg/src/glibc/string/../sysdeps/i386/i586/memcpy.S:31: first defined here /home/bmg/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/13.2.1/../../../../x86_64-glibc-linux-gnu/bin/ld: /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(mempcpy_chk.o): in function `__mempcpy_chk': /home/bmg/src/glibc/debug/../sysdeps/i386/mempcpy_chk.S:28: multiple definition of `__mempcpy_chk'; /home/bmg/build/glibcs/i586-linux-gnu/glibc/libc.a(mempcpy.o):/home/bmg/src/glibc/string/../sysdeps/i386/i586/memcpy.S:31: first defined here
After this change, the static library built for i586, regardless of PIC
options, contains implementations of these functions respectively from
sysdeps/i386/memcpy_chk.S and sysdeps/i386/mempcpy_chk.S. This ensures
that memcpy and mempcpy won't pull in __chk_fail and the routines it
calls.
Reported-by: Florian Weimer <fweimer@redhat.com> Signed-off-by: Gabi Falk <gabifalk@gmx.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
Sergey Bugaev [Mon, 6 Nov 2023 13:50:51 +0000 (16:50 +0300)]
hurd: Stop mapping AT_NO_AUTOMOUNT to O_NOTRANS
While AT_NO_AUTOMOUNT is similar in function to the Hurd's O_NOTRANS,
there are significant enough differences in semantics:
1. AT_NO_AUTOMOUNT has no effect on already established mounts,
whereas O_NOTRANS causes the lookup to ignore both passive and active
translators. A better approximation of the AT_NO_AUTOMOUNT behavior
would be to honor active translators, but avoid starting passive
ones; like what the file_name_lookup_carefully () routine from
sutils/clookup.c in the Hurd source tree does.
2. On GNU/Hurd, translators are used much more pervasively than mounts
on "traditional" Unix systems: among other things, translators
underlie features like symlinks, device nodes, and sockets. And while
on a "traditional" Unix system, the mountpoint and the root of the
mounted tree may look similar enough for many purposes (they're both
directories, for one thing), the Hurd allows for any combination of
the two node types, and indeed it is common to have e.g. a device
node "mounted" on top of a regular file node on the underlying
filesystem. Ignoring the translator and stat'ing the underlying node
is therefore likely to return very different results from what you'd
get if you stat the translator's root node.
In practice, mapping AT_NO_AUTOMOUNT to O_NOTRANS was breaking GNU
Coreutils, including stat(1) and ls(1):
This was also breaking GNOME's glib, where a g_local_file_stat () call
that is supposed to stat () a file through a symlink uses
AT_NO_AUTOMOUNT, which gets mapped to O_NOTRANS, which then causes the
stat () call to stat symlink itself like lstat () would, rather then the
file it points to, which is what the logic expects to happen.
Mark Wielaard [Sun, 28 Apr 2024 14:59:39 +0000 (16:59 +0200)]
Make sure INSTALL is ASCII plaintext again
This reverts commit 84e93afc7 ("Switch to UTF-8 for INSTALL") and
reinstates commit c14f2e4aa ("Make sure INSTALL is ASCII plaintext")
and regenerates INSTALL.
It turns out that different versions of makeinfo (texinfo/texi2any),
at least versions 7.0.3 and 7.1, put unicode quote glyphs in different
places (specifically whether contractions like you'd, don't, aren't or
you'll use ’ or '). This breaks the make dist target as used for
(snapshot) releases, which have a check on the regenerated INSTALL
file. Using --disable-encoding generates the same plaintext ASCII on
all versions.
An alternative would be to regenerate INSTALL with texinfo 7.1 and
require at least that version. But that seems too soon while various
distros don't have 7.1 yet. We can try again to use UTF-8 for INSTALL
in a couple of years.
x86: In ld.so, diagnose missing APX support in APX-only builds
At this point, this is mainly a tool for testing the early ld.so
CPU compatibility diagnostics: GCC uses the new instructions in most
functions, so it's easy to spot if some of the early code is not
built correctly.
H.J. Lu [Thu, 25 Apr 2024 15:06:52 +0000 (08:06 -0700)]
elf: Also compile dl-misc.os with $(rtld-early-cflags)
Also compile dl-misc.os with $(rtld-early-cflags) to avoid
Program received signal SIGILL, Illegal instruction.
0x00007ffff7fd36ea in _dl_strtoul (nptr=nptr@entry=0x7fffffffe2c9 "2",
endptr=endptr@entry=0x7fffffffd728) at dl-misc.c:156
156 bool positive = true;
(gdb) bt
#0 0x00007ffff7fd36ea in _dl_strtoul (nptr=nptr@entry=0x7fffffffe2c9 "2",
endptr=endptr@entry=0x7fffffffd728) at dl-misc.c:156
#1 0x00007ffff7fdb1a9 in tunable_initialize (
cur=cur@entry=0x7ffff7ffbc00 <tunable_list+2176>,
strval=strval@entry=0x7fffffffe2c9 "2", len=len@entry=1)
at dl-tunables.c:131
#2 0x00007ffff7fdb3a2 in parse_tunables (valstring=<optimized out>)
at dl-tunables.c:258
#3 0x00007ffff7fdb5d9 in __GI___tunables_init (envp=0x7fffffffdd58)
at dl-tunables.c:288
#4 0x00007ffff7fe44c3 in _dl_sysdep_start (
start_argptr=start_argptr@entry=0x7fffffffdcb0,
dl_main=dl_main@entry=0x7ffff7fe5f80 <dl_main>)
at ../sysdeps/unix/sysv/linux/dl-sysdep.c:110
#5 0x00007ffff7fe5cae in _dl_start_final (arg=0x7fffffffdcb0) at rtld.c:494
#6 _dl_start (arg=0x7fffffffdcb0) at rtld.c:581
#7 0x00007ffff7fe4b38 in _start ()
(gdb)
when setting GLIBC_TUNABLES in glibc compiled with APX. Reviewed-by: Florian Weimer <fweimer@redhat.com>
CVE-2024-33601, CVE-2024-33602: nscd: netgroup: Use two buffers in addgetnetgrentX (bug 31680)
This avoids potential memory corruption when the underlying NSS
callback function does not use the buffer space to store all strings
(e.g., for constant strings).
Instead of custom buffer management, two scratch buffers are used.
This increases stack usage somewhat.
Scratch buffer allocation failure is handled by return -1
(an invalid timeout value) instead of terminating the process.
This fixes bug 31679.
The addgetnetgrentX call in addinnetgrX may have failed to produce
a result, so the result variable in addinnetgrX can be NULL.
Use db->negtimeout as the fallback value if there is no result data;
the timeout is also overwritten below.
Also avoid sending a second not-found response. (The client
disconnects after receiving the first response, so the data stream did
not go out of sync even without this fix.) It is still beneficial to
add the negative response to the mapping, so that the client can get
it from there in the future, instead of going through the socket.
H.J. Lu [Tue, 23 Apr 2024 20:59:50 +0000 (13:59 -0700)]
x86: Define MINIMUM_X86_ISA_LEVEL in config.h [BZ #31676]
Define MINIMUM_X86_ISA_LEVEL at configure time to avoid
/usr/bin/ld: …/build/elf/librtld.os: in function `init_cpu_features':
…/git/elf/../sysdeps/x86/cpu-features.c:1202: undefined reference to `_dl_runtime_resolve_fxsave'
/usr/bin/ld: …/build/elf/librtld.os: relocation R_X86_64_PC32 against undefined hidden symbol `_dl_runtime_resolve_fxsave' can not be used when making a shared object
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
when glibc is built with -march=x86-64-v3 and configured with
--with-rtld-early-cflags=-march=x86-64, which is used to allow ld.so to
print an error message on unsupported CPUs:
Fatal glibc error: CPU does not support x86-64-v3
This fixes BZ #31676. Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
The current IFUNC selection is always using the most recent
features which are available via AT_HWCAP. But in
some scenarios it is useful to adjust this selection.
The environment variable:
GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,zzz,....
can be used to enable HWCAP feature yyy, disable HWCAP feature xxx,
where the feature name is case-sensitive and has to match the ones
used in sysdeps/loongarch/cpu-tunables.c.
Carlos O'Donell [Mon, 22 Apr 2024 12:16:09 +0000 (08:16 -0400)]
locale: Handle loading a missing locale twice (Bug 14247)
Delay setting file->decided until the data has been successfully loaded
by _nl_load_locale(). If the function fails to load the data then we
must return and error and leave decided untouched to allow the caller to
attempt to load the data again at a later time. We should not set
decided to 1 early in the function since doing so may prevent attempting
to load it again. We want to try loading it again because that allows an
open to fail and set errno correctly.
On the other side of this problem is that if we are called again with
the same inputs we will fetch the cached version of the object and carry
out no open syscalls and that fails to set errno so we must set errno to
ENOENT in that case. There is a second code path that has to be handled
where the name of the locale matches but the codeset doesn't match.
These changes ensure that errno is correctly set on failure in all the
return paths in _nl_find_locale().
Adds tst-locale-loadlocale to cover the bug.
No regressions on x86_64.
Co-authored-by: Jeff Law <law@redhat.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
elf: Do not check for loader mmap on tst-decorate-maps (BZ 31553)
On some architectures and depending on the page size, the loader can
also allocate some memory during dependencies loading and it will be
marked as 'loader malloc'. However, if the system page size is
large enough, the initial data page will be enough for all required
allocation and there will be no extra loader mmap. To avoid false
negatives, the test does not check for such pages.
Checked on powerpc64le-linux-gnu with 64k pagesize. Reviewed-by: Simon Chopin <simon.chopin@canonical.com>
Joseph Myers [Fri, 19 Apr 2024 17:03:56 +0000 (17:03 +0000)]
Use --enable-obsolete in build-many-glibcs.py for nios2-linux-gnu
Until GCC removes Nios II support (at which point we should do so as
well), this is now needed for GCC 14 / mainline to build for
nios2-linux-gnu target.
Tested with build-many-glibcs.py (GCC mainline) for nios2-linux-gnu.
login: Use unsigned 32-bit types for seconds-since-epoch
These fields store timestamps when the system was running. No Linux
systems existed before 1970, so these values are unused. Switching
to unsigned types allows continued use of the existing struct layouts
beyond the year 2038.
The intent is to give distributions more time to switch to improved
interfaces that also avoid locking/data corruption issues.
These structs describe file formats under /var/log, and should not
depend on the definition of _TIME_BITS. This is achieved by
defining __WORDSIZE_TIME64_COMPAT32 to 1 on 32-bit ports that
support 32-bit time_t values (where __time_t is 32 bits).
Wilco Dijkstra [Tue, 28 Nov 2023 13:33:56 +0000 (13:33 +0000)]
benchtests: Add random() benchmark
Add a simple benchmark to measure the overhead of internal libc locks in
the random() implementation on both single- and multi-threaded cases.
This relies on the implementation of random using internal locks to
access shared global data, and that the runtime uses multi-threaded
locking once a thread has been created (even after it finishes).
Charles Fol [Thu, 28 Mar 2024 15:25:38 +0000 (12:25 -0300)]
iconv: ISO-2022-CN-EXT: fix out-of-bound writes when writing escape sequence (CVE-2024-2961)
ISO-2022-CN-EXT uses escape sequences to indicate character set changes
(as specified by RFC 1922). While the SOdesignation has the expected
bounds checks, neither SS2designation nor SS3designation have its;
allowing a write overflow of 1, 2, or 3 bytes with fixed values:
'$+I', '$+J', '$+K', '$+L', '$+M', or '$*H'.
Checked on aarch64-linux-gnu.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
elf/rtld: Count skipped environment variables for enable_secure
When using the glibc.rtld.enable_secure tunable we need to keep track of
the count of environment variables we skip due to __libc_enable_secure
being set and adjust the auxv section of the stack. This fixes an
assertion when running ld.so directly with glibc.rtld.enable_secure set.
Add a testcase that ensures the assert is not hit.
powerpc: Fix ld.so address determination for PCREL mode (bug 31640)
This seems to have stopped working with some GCC 14 versions,
which clobber r2. With other compilers, the kernel-provided
r2 value is still available at this point.
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
The test failure is a real valgrind bug that needs to be fixed before
valgrind is usable with a glibc that has been built with
CC="gcc -march=x86-64-v3". The proposed valgrind patch teaches
valgrind to replace ld.so strcmp with an unoptimized scalar
implementation, thus avoiding any AVX2-related problems.
wcsmbs: Ensure wcstr worst-case linear execution time (BZ 23865)
It uses the same two-way algorithm used on strstr, strcasestr, and
memmem. Different than strstr, neither the "shift table" optimization
nor the self-adapting filtering check is used because it would result in
a too-large shift table (and it also simplifies the implementation bit).
Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>
The gnulib version contains an important change (9ce573cde), which
fixes some problems with multithreading, entropy loss, and ASLR leak
nfo. It also fixes an issue where getrandom is not being used
on some new files generation (only for __GT_NOCREATE on first try).
The 044bf893ac removed __path_search, which is now moved to another
gnulib shared files (stdio-common/tmpdir.{c,h}). Tthis patch
also fixes direxists to use __stat64_time64 instead of __xstat64,
and move the include of pathmax.h for !_LIBC (since it is not used
by glibc). The license is also changed from GPL 3.0 to 2.1, with
permission from the authors (Bruno Haible and Paul Eggert).
The sync also removed the clock fallback, since clock_gettime
with CLOCK_REALTIME is expected to always succeed.
elf: Add ld.so test with non-existing program name
None of the existing tests seem to cover the case where
_dl_signal_error is called without an active error handler.
The new elf/tst-rtld-does-not-exist test triggers such a
_dl_signal_error call from _dl_map_object.
H.J. Lu [Mon, 8 Apr 2024 16:06:09 +0000 (09:06 -0700)]
elf: Check objname before calling fatal_error
_dl_signal_error may be called with objname == NULL. _dl_exception_create
checks objname == NULL. But fatal_error doesn't. Check objname before
calling fatal_error. This fixes BZ #31596. Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
H.J. Lu [Fri, 5 Apr 2024 23:42:57 +0000 (16:42 -0700)]
Use crtbeginT.o and crtend.o for non-PIE static executables
When static PIE is enabled by default, we shouldn't use crtbeginS.o and
crtendS.o for non-PIE static executables. Check $($(@F)-no-pie) to use
crtbeginT.o and crtend.o to create non-PIE static executables. Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
x86: Add generic CPUID data dumper to ld.so --list-diagnostics
This is surprisingly difficult to implement if the goal is to produce
reasonably sized output. With the current approaches to output
compression (suppressing zeros and repeated results between CPUs,
folding ranges of identical subleaves, dealing with the %ecx
reflection issue), the output is less than 600 KiB even for systems
with 256 logical CPUs.
Paul Eggert [Sun, 7 Apr 2024 06:39:53 +0000 (23:39 -0700)]
timezone: sync to TZDB 2024a
Sync tzselect, zdump, zic to TZDB 2024a.
This patch incorporates the following TZDB source code changes,
listed roughly in descending order of importance.
zic now supports links to links, needed for future tzdata
zic now defaults to '-b slim'
zic now updates output files atomically
zic has new options -R, -l -, -p -
zic -r now uses -00 for unspecified timestamps
zdump now uses [lo,hi) for both -c and -t
Fix several integer overflow bugs
zic now checks input bytes more carefully
Simplify and fix new TZDIR setup
Default time_t to 64 bits on glibc 2.34+ 32-bit
zic now generates TZ strings that conform to POSIX when all-year DST
zic -v now shows extreme-int tm_year transitions
Fix zic bug in last time type of Asia/Gaza etc.
Fix zic bug with Palestine after 2075
Fix bug uncovered by recent change to Iran history
Fix 'zic -b fat' bug with Port Moresby 32-bit data
Fix zic bug with -r @X where X is deduced from TZ
Fix bug with zic -r cutoff before 1st transition
Fix leap second expiry and truncation
Fix zic bug on Linux 2.6.16 and 2.6.17
Fix bug with 'zic -d /a/b/c' if /a is unwriteable
Don't mistruncate TZif files at leap seconds
Fix zdump undefined behavior if !USE_LTZ
zdump -v reports localtime+gmtime failures better
Fix zdump diagnostic for missing timezone
Don't assume nonempty argv
Port better to C23
Do not assume negative >> behavior
I18nize zdump a bit better
Port zdump to right_only installations
New tzselect menu option 'now'
tzselect can now use current time to help choose
Improve tzselect behavior for Turkey etc.
tzselect: do not create temporary files
tzselect: work around mawk bug with {2,}
tzselect: Port to POSIX awk, which prohibits -v newlines
Do not use empty RE in tzselect
Don't set TZ in tzselect
Avoid sed, expr in tzselect
tzselect: Fix problems with spaces in TZDIR
Improve tzselect diagnostics
Remove zic workaround for Qt bug 53071
Remove zic support for "min" in Rule lines
Remove zic support for zic -y, Rule TYPEs, pacificnew
Remove tzselect workaround for Bash 1.14.7 bug
* SHARED-FILES: Update to match current sync.
* config.h.in (HAVE_STRERROR): Remove; no longer needed.
* timezone/Makefile ($(objpfx)zic.o): Depend on tzdir.h.
($(objpfx)tzdir.h): New rule to build a placeholder.
* timezone/private.h, timezone/tzfile.h, timezone/version:
* timezone/zdump.c, timezone/zic.c: Copy verbatim from TZDB 2024a.
Paul Eggert [Sat, 6 Apr 2024 15:44:01 +0000 (08:44 -0700)]
Fix bsearch, qsort doc to match POSIX better
* manual/search.texi (Array Search Function):
Correct the statement about lfind’s mean runtime:
it is proportional to a number (not that number),
and this is true only if random elements are searched for.
Relax the constraint on bsearch’s array argument:
POSIX says it need not be sorted, only partially sorted.
Say that the first arg passed to bsearch’s comparison function
is the key, and the second arg is an array element, as
POSIX requires. For bsearch and qsort, say that the
comparison function should not alter the array, as POSIX
requires. For qsort, say that the comparison function
must define a total order, as POSIX requires, that
it should not depend on element addresses, that
the original array index can be used for stable sorts,
and that if qsort still works if memory allocation fails.
Be more consistent in calling the array elements
“elements” rather than “objects”.
H.J. Lu [Thu, 4 Apr 2024 22:43:50 +0000 (15:43 -0700)]
x86-64: Exclude FMA4 IFUNC functions for -mapxf
When -mapxf is used to build glibc, the resulting glibc will never run
on FMA4 machines. Exclude FMA4 IFUNC functions when -mapxf is used.
This requires GCC which defines __APX_F__ for -mapxf with commit:
math: x86 trunc traps when FE_INEXACT is enabled (BZ 31603)
The implementations of trunc functions using x87 floating point (i386 and
x86_64 long double only) traps when FE_INEXACT is enabled. Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.
The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).
Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
math: x86 floor traps when FE_INEXACT is enabled (BZ 31601)
The implementations of floor functions using x87 floating point (i386 and
86_64 long double only) traps when FE_INEXACT is enabled. Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.
The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).
Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
math: x86 ceill traps when FE_INEXACT is enabled (BZ 31600)
The implementations of ceil functions using x87 floating point (i386 and
x86_64 long double only) traps when FE_INEXACT is enabled. Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.
The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).
Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
misc: Add support for Linux uio.h RWF_NOAPPEND flag
In Linux 6.9 a new flag is added to allow for Per-io operations to
disable append mode even if a file was opened with the flag O_APPEND.
This is done with the new RWF_NOAPPEND flag.
This caused two test failures as these tests expected the flag 0x00000020
to be unused. Adding the flag definition now fixes these tests on Linux
6.9 (v6.9-rc1).
powerpc: Add missing arch flags on rounding ifunc variants
The ifunc variants now uses the powerpc implementation which in turn
uses the compiler builtin. Without the proper -mcpu switch the builtin
does not generate the expected optimization.
Checked on powerpc-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
Always define __USE_TIME_BITS64 when 64 bit time_t is used
It was raised on libc-help [1] that some Linux kernel interfaces expect
the libc to define __USE_TIME_BITS64 to indicate the time_t size for the
kABI. Different than defined by the initial y2038 design document [2],
the __USE_TIME_BITS64 is only defined for ABIs that support more than
one time_t size (by defining the _TIME_BITS for each module).
The 64 bit time_t redirects are now enabled using a different internal
define (__USE_TIME64_REDIRECTS). There is no expected change in semantic
or code generation.
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, and
arm-linux-gnueabi
Use same strategy as bench-strstr.c (93eebae5168e5cf2 and 80b2bfb53504)
and use json_ctx for output to help standardize format across all
benchtests. Reviewed-by: Arjun Shankar <arjun@redhat.com>
As indicated in a recent thread, this it is a simple brute-force
algorithm that checks the whole needle at a matching character pair
(and does so 1 byte at a time after the first 64 bytes of a needle).
Also it never skips ahead and thus can match at every haystack
position after trying to match all of the needle, which generic
implementation avoids.
As indicated by Wilco, a 4x larger needle and 16x larger haystack gives
a clear 65x slowdown both basic_strstr and __strstr_avx512:
PS: I don't have an AVX512 capable machine to verify this issues, but
skimming through the code it does seems to follow what Wilco has
described. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Palmer Dabbelt [Thu, 22 Feb 2024 23:24:00 +0000 (15:24 -0800)]
RISC-V: Fix the static-PIE non-relocated object check
The value of l_scope is only valid post relocation, so this original
check was triggering undefined behavior. Instead just directly check to
see if the object has been relocated, at which point using l_scope is
safe.
Sergey Bugaev [Sat, 23 Mar 2024 17:32:47 +0000 (20:32 +0300)]
htl: Respect GL(dl_stack_flags) when allocating stacks
Previously, HTL would always allocate non-executable stacks. This has
never been noticed, since GNU Mach on x86 ignores VM_PROT_EXECUTE and
makes all pages implicitly executable. Since GNU Mach on AArch64
supports non-executable pages, HTL forgetting to pass VM_PROT_EXECUTE
immediately breaks any code that (unfortunately, still) relies on
executable stacks.
Sergey Bugaev [Sat, 23 Mar 2024 17:32:44 +0000 (20:32 +0300)]
Allow glibc to be compiled without EXEC_PAGESIZE
We would like to avoid statically defining any specific page size on
aarch64-gnu, and instead make sure that everything uses the dynamic
page size, available via vm_page_size and GLRO(dl_pagesize).
There are currently a few places in glibc that require EXEC_PAGESIZE
to be defined. Per Roland's suggestion [0], drop the static
GLRO(dl_pagesize) initializers (for now, only if EXEC_PAGESIZE is not
defined), and don't require EXEC_PAGESIZE definition for libio to
enable mmap usage.
Sergey Bugaev [Sat, 23 Mar 2024 17:32:43 +0000 (20:32 +0300)]
hurd: Stop relying on VM_MAX_ADDRESS
We'd like to avoid committing to a specific size of virtual address
space (i.e. the value of VM_AARCH64_T0SZ) on AArch64. While the current
version of GNU Mach still exports VM_MAX_ADDRESS for compatibility, we
should try to avoid relying on it when we can. This piece of logic in
_hurdsig_getenv () doesn't actually care about the size of user-
accessible virtual address space, it just wants to preempt faults on any
addresses starting from the value of the P pointer and above. So, use
(unsigned long int) -1 instead of VM_MAX_ADDRESS.
While at it, change the casts to (unsigned long int) and not just
(long int), since the type of struct hurd_signal_preemptor.{first,last}
is unsigned long int.
Sergey Bugaev [Sat, 23 Mar 2024 17:32:42 +0000 (20:32 +0300)]
hurd: Move internal functions to internal header
Move _hurd_self_sigstate (), _hurd_critical_section_lock (), and
_hurd_critical_section_unlock () inline implementations (that were
already guarded by #if defined _LIBC) to the internal version of the
header. While at it, add <tls.h> to the includes, and use
__LIBC_NO_TLS () unconditionally.
Stafford Horne [Tue, 19 Mar 2024 21:01:24 +0000 (21:01 +0000)]
or1k: Add prctl wrapper to unwrap variadic args
On OpenRISC variadic functions and regular functions have different
calling conventions so this wrapper is needed to translate. This
wrapper is copied from x86_64/x32. I don't know the build system enough
to find a cleaner way to share the code between x86_64/x32 and or1k
(maybe Implies?), so I went with the straight copy.
Stafford Horne [Tue, 19 Mar 2024 20:53:37 +0000 (20:53 +0000)]
or1k: Only define fpu rouding and exceptions with hard-float
This test failure:
math/test-fenv
If rounding mode and exception macros are defined then the fenv tests
run and always fail. This patch adds an ifdef using the
__or1k_hard_float__ macro provided by gcc to avoid defining these fenv
macros when they cnnot be used. This is similar to what is done in csky.
Note, I will post the or1k hard-float support soon. So, I prefer to
leave the hard-float bits here for now.
Wilco Dijkstra [Thu, 21 Mar 2024 16:48:33 +0000 (16:48 +0000)]
AArch64: Check kernel version for SVE ifuncs
Old Linux kernels disable SVE after every system call. Calling the
SVE-optimized memcpy afterwards will then cause a trap to reenable SVE.
As a result, applications with a high use of syscalls may run slower with
the SVE memcpy. This is true for kernels between 4.15.0 and before 6.2.0,
except for 5.14.0 which was patched. Avoid this by checking the kernel
version and selecting the SVE ifunc on modern kernels.
Parse the kernel version reported by uname() into a 24-bit kernel.major.minor
value without calling any library functions. If uname() is not supported or
if the version format is not recognized, assume the kernel is modern.
Tested-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Amrita H S [Wed, 20 Mar 2024 00:08:47 +0000 (19:08 -0500)]
powerpc: Placeholder and infrastructure/build support to add Power11 related changes.
The following three changes have been added to provide initial Power11 support.
1. Add the directories to hold Power11 files.
2. Add support to select Power11 libraries based on AT_PLATFORM.
3. Let submachine=power11 be set automatically.
Reviewed-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
Manjunath Matti [Tue, 19 Mar 2024 20:29:48 +0000 (15:29 -0500)]
powerpc: Add HWCAP3/HWCAP4 data to TCB for Power Architecture.
This patch adds a new feature for powerpc. In order to get faster
access to the HWCAP3/HWCAP4 masks, similar to HWCAP/HWCAP2 (i.e. for
implementing __builtin_cpu_supports() in GCC) without the overhead of
reading them from the auxiliary vector, we now reserve space for them
in the TCB.
This is an ABI change for GLIBC 2.39.
Suggested-by: Peter Bergner <bergner@linux.ibm.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
The aarch64 uses 'trad' for traditional tls and 'desc' for tls
descriptors, but unlike other targets it defaults to 'desc'. The
gnutls2 configure check does not set aarch64 as an ABI that uses
TLS descriptors, which then disable somes stests.
Also rename the internal machinery fron gnu2 to tls descriptors.
Checked on aarch64-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
arm: Update _dl_tlsdesc_dynamic to preserve caller-saved registers (BZ 31372)
ARM _dl_tlsdesc_dynamic slow path has two issues:
* The ip/r12 is defined by AAPCS as a scratch register, and gcc is
used to save the stack pointer before on some function calls. So it
should also be saved/restored as well. It fixes the tst-gnu2-tls2.
* None of the possible VFP registers are saved/restored. ARM has the
additional complexity to have different VFP bank sizes (depending of
VFP support by the chip).
The tst-gnu2-tls2 test is extended to check for VFP registers, although
only for hardfp builds. Different than setcontext, _dl_tlsdesc_dynamic
does not have HWCAP_ARM_IWMMXT (I don't have a way to properly test
it and it is almost a decade since newer hardware was released).
With this patch there is no need to mark tst-gnu2-tls2 as XFAIL.
Checked on arm-linux-gnueabihf. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
H.J. Lu [Mon, 18 Mar 2024 13:40:16 +0000 (06:40 -0700)]
x86-64: Allocate state buffer space for RDI, RSI and RBX
_dl_tlsdesc_dynamic preserves RDI, RSI and RBX before realigning stack.
After realigning stack, it saves RCX, RDX, R8, R9, R10 and R11. Define
TLSDESC_CALL_REGISTER_SAVE_AREA to allocate space for RDI, RSI and RBX
to avoid clobbering saved RDI, RSI and RBX values on stack by xsave to
STATE_SAVE_OFFSET(%rsp).
+==================+<- stack frame start aligned at 8 or 16 bytes
| |<- RDI saved in the red zone
| |<- RSI saved in the red zone
| |<- RBX saved in the red zone
| |<- paddings for stack realignment of 64 bytes
|------------------|<- xsave buffer end aligned at 64 bytes
| |<-
| |<-
| |<-
|------------------|<- xsave buffer start at STATE_SAVE_OFFSET(%rsp)
| |<- 8-byte padding for 64-byte alignment
| |<- 8-byte padding for 64-byte alignment
| |<- R11
| |<- R10
| |<- R9
| |<- R8
| |<- RDX
| |<- RCX
+==================+<- RSP aligned at 64 bytes
Define TLSDESC_CALL_REGISTER_SAVE_AREA, the total register save area size
for all integer registers by adding 24 to STATE_SAVE_OFFSET since RDI, RSI
and RBX are saved onto stack without adjusting stack pointer first, using
the red-zone. This fixes BZ #31501. Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>