The REP MOVSB usage on memcpy/memmove does not show much performance
improvement on Zen3/Zen4 cores compared to the vectorized loops. Also,
as from BZ 30994, if the source is aligned and the destination is not
the performance can be 20x slower.
The performance difference is noticeable with small buffer sizes, closer
to the lower bounds limits when memcpy/memmove starts to use ERMS. The
performance of REP MOVSB is similar to vectorized instruction on the
size limit (the L2 cache). Also, there is no drawback to multiple cores
sharing the cache.
Checked on x86_64-linux-gnu on Zen3. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Joseph Myers [Thu, 8 Feb 2024 12:57:24 +0000 (12:57 +0000)]
Add SOL_VSOCK from Linux 6.7 to bits/socket.h
Linux 6.7 adds a constant SOL_VSOCK (recall that various constants in
include/linux/socket.h are in fact part of the kernel-userspace API
despite that not being a uapi header). Add it to glibc's
bits/socket.h.
arm: Remove wrong ldr from _dl_start_user (BZ 31339)
The commit 49d877a80b29d3002887b084eec6676d9f5fec18 (arm: Remove
_dl_skip_args usage) removed the _SKIP_ARGS literal, which was
previously loader to r4 on loader _start. However, the cleanup did not
remove the following 'ldr r4, [sl, r4]' on _dl_start_user, used to check
to skip the arguments after ld self-relocations.
In my testing, the kernel initially set r4 to 0, which makes the
ldr instruction just read the _GLOBAL_OFFSET_TABLE_. However, since r4
is a callee-saved register; a different runtime might not zero
initialize it and thus trigger an invalid memory access.
Checked on arm-linux-gnu.
Reported-by: Adrian Ratiu <adrian.ratiu@collabora.com> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Xi Ruoyao [Sun, 4 Feb 2024 00:27:50 +0000 (08:27 +0800)]
LoongArch: Use builtins for ffs and ffsll
On LoongArch GCC compiles __builtin_ffs{,ll} to basically
`(x ? __builtin_ctz (x) : -1) + 1`. Since a hardware ctz instruction is
available, this is much better than the table-driven generic
implementation.
Tested on loongarch64.
Signed-off-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Stefan Liebler [Tue, 16 Jan 2024 08:44:30 +0000 (09:44 +0100)]
Fix stringop-overflow warning in tst-strlcat2.
On s390x, I get warnings like this when do_one_test is inlined with SIZE_MAX:
In function ‘do_one_test’,
inlined from ‘do_overflow_tests’ at tst-strlcat2.c:184:2:
tst-strlcat2.c:49:18: error: ‘strnlen’ specified bound [18446744073709550866, 18446744073709551615] exceeds maximum object size 9223372036854775807 [-Werror=stringop-overflow=]
49 | # define STRNLEN strnlen
| ^
tst-strlcat2.c:89:23: note: in expansion of macro ‘STRNLEN’
89 | size_t dst_length = STRNLEN (dst, n);
| ^~~~~~~
This patch just marks the do_one_test function as noinline as also done in test-strncat.c:
Fix stringop-overflow warning in test-strncat.
https://sourceware.org/git/?p=glibc.git;a=commit;h=51aeab9a363a0d000d0912aa3d6490463a26fba2
For o32 we need to setup a minimal stack frame to allow cprestore
on __thread_start_clone3 (which instruct the linker to save the
gp for PIC). Also, there is no guarantee by kABI that $8 will be
preserved after syscall execution, so we need to save it on the
provided stack.
Jakub Jelinek [Thu, 1 Feb 2024 15:58:49 +0000 (16:58 +0100)]
soft-fp: Add brain format support
In https://gcc.gnu.org/r13-3292 I've added brain format support
(std::bfloat16_t) on the GCC side, but as glibc has the master copy
of soft-fp, the following patch adds the files from that commit
and from https://gcc.gnu.org/r13-6598 and https://gcc.gnu.org/r13-6622
The files are not used by glibc right now.
Jakub Jelinek [Thu, 1 Feb 2024 15:36:55 +0000 (16:36 +0100)]
manual: Fix up stdbit.texi
My recent change broke make pdf and in other documentation formats
results in weird rendering and invalid URL, all because of a forgotten
comma to separate @uref arguments.
misc: tst-poll: Proper synchronize with child before sending the signal
When running the testsuite in parallel, for instance running make -j
$(nproc) check, occasionally tst-epoll fails with a timeout. It happens
because it sometimes takes a bit more than 10ms for the process to get
cloned and blocked by the syscall. In that case the signal is
sent to early, and the test fails with a timeout.
The exp10, exp10l, fma, fmaf, and fmal default implementation do not
implement the appropriate semantics nor with an reasonable accuracy.
They are also not used by any supported port.
Joseph Myers [Thu, 1 Feb 2024 11:02:01 +0000 (11:02 +0000)]
Refer to C23 in place of C2X in glibc
WG14 decided to use the name C23 as the informal name of the next
revision of the C standard (notwithstanding the publication date in
2024). Update references to C2X in glibc to use the C23 name.
This is intended to update everything *except* where it involves
renaming files (the changes involving renaming tests are intended to
be done separately). In the case of the _ISOC2X_SOURCE feature test
macro - the only user-visible interface involved - support for that
macro is kept for backwards compatibility, while adding
_ISOC23_SOURCE.
Fangrui Song [Wed, 31 Jan 2024 23:46:23 +0000 (15:46 -0800)]
build-many-glibcs: relax version check to allow non-digit characters
A version string may contain non-digit characters, commonly found in
built-from-VCS tools, e.g.
```
git version 2.39.GIT
git version 2.43.0.493.gbc7ee2e5e1
```
`int()` will raise a ValueError, leading to a spurious 'missing'.
Jakub Jelinek [Wed, 31 Jan 2024 18:17:27 +0000 (19:17 +0100)]
Use gcc __builtin_stdc_* builtins in stdbit.h if possible
The following patch uses the GCC 14 __builtin_stdc_* builtins in stdbit.h
for the type-generic macros, so that when compiled with GCC 14 or later,
it supports not just 8/16/32/64-bit unsigned integers, but also 128-bit
(if target supports them) and unsigned _BitInt (any supported precision).
And so that the macros don't expand arguments multiple times and can be
evaluated in constant expressions.
The new testcase is gcc's gcc/testsuite/gcc.dg/builtin-stdc-bit-1.c
adjusted to test stdbit.h and the type-generic macros in there instead
of the builtins and adjusted to use glibc test framework rather than
gcc style tests with __builtin_abort ().
Signed-off-by: Jakub Jelinek <jakub@redhat.com> Reviewed-by: Joseph Myers <josmyers@redhat.com>
building glibc on s390x with --disable-multi-arch fails if only
the C-variant of strchrnul / memrchr is used. This is the case
if gcc uses -march < z13.
The build fails with:
../sysdeps/s390/strchrnul-c.c:28:49: error: ‘__strchrnul_c’ undeclared here (not in a function); did you mean ‘__strchrnul’?
28 | __hidden_ver1 (__strchrnul_c, __GI___strchrnul, __strchrnul_c);
With --disable-multi-arch, __strchrnul_c is not available as string/strchrnul.c
is just included without defining STRCHRNUL and thus we also don't have to create
the internal hidden symbol.
Tested-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Update advisory format and introduce some automation
Simplify the advisory format by dropping the -Backport tags and instead
stick to using just the -Commit tags. To identify backports, put a
substring of git-describe into the release version in the brackets next
to the commit ref. This way, it not only identifies that the fix (or
regression) is on the release/2.YY/master branch, it also disambiguates
regressions/fixes in the branch from those in the tarball.
Add a README to make it easier for consumers to understand the format.
Additionally, the Release wiki needs to be updated to inform the release
manager to:
1. Generate a NEWS snipped from the advisories directory
AND
2. on release/2.YY/master, replace the advisories directory with a text
file pointing to the advisories directory in master so that we don't
have to update multiple locations.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Arjun Shankar [Mon, 15 Jan 2024 16:44:44 +0000 (17:44 +0100)]
syslog: Fix heap buffer overflow in __vsyslog_internal (CVE-2023-6779)
__vsyslog_internal used the return value of snprintf/vsnprintf to
calculate buffer sizes for memory allocation. If these functions (for
any reason) failed and returned -1, the resulting buffer would be too
small to hold output. This commit fixes that.
All snprintf/vsnprintf calls are checked for negative return values and
the function silently returns upon encountering them.
Arjun Shankar [Mon, 15 Jan 2024 16:44:43 +0000 (17:44 +0100)]
syslog: Fix heap buffer overflow in __vsyslog_internal (CVE-2023-6246)
__vsyslog_internal did not handle a case where printing a SYSLOG_HEADER
containing a long program name failed to update the required buffer
size, leading to the allocation and overflow of a too-small buffer on
the heap. This commit fixes that. It also adds a new regression test
that uses glibc.malloc.check.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
Carlos O'Donell [Thu, 18 Jan 2024 17:28:20 +0000 (12:28 -0500)]
Relicense IBM portions of resolv/base64.c resolv/res_debug.c.
This change relicenses the IBM portions of resolv/base64.c and
resolv/res_debug.c to a new license that does not have use-limited
patent language. The top-level LICENSE file is updated with the
license.
The relicensing was approved by IBM.
Signed-off-by: Brad Topol, IBM Director of Open Technologies <btopol@us.ibm.com> Signed-off-by: Richard Fontana <rfontana@redhat.com> Signed-off-by: Carlos O'Donell <carlos@redhat.com>
string: Disable stack protector for memset in early static initialization
For ports that use the default memset, the compiler might generate early
calls before the stack protector is initialized (for instance, riscv
with -fstack-protector-all on _dl_aux_init).
Xi Ruoyao [Mon, 22 Jan 2024 20:29:18 +0000 (04:29 +0800)]
qsort: Fix a typo causing unnecessary malloc/free (BZ 31276)
In qsort_r we allocate a buffer sized QSORT_STACK_SIZE (1024) on stack
and we intend to use it if all elements can fit into it. But there is a
typo:
if (total_size < sizeof buf)
buf = tmp;
else
/* allocate a buffer on heap and use it ... */
Here "buf" is a pointer, thus sizeof buf is just 4 or 8, instead of
1024. There is also a minor issue that we should use "<=" instead of
"<".
This bug is detected debugging some strange heap corruption running the
Ruby-3.3.0 test suite (on an experimental Linux From Scratch build using
Binutils-2.41.90 and Glibc trunk, and also Fedora Rawhide [1]). It
seems Ruby is doing some wild "optimization" by jumping into somewhere
in qsort_r instead of calling it normally, resulting in a double free of
buf if we allocate it on heap. The issue can be reproduced
deterministically with:
in Ruby-3.3.0 tree after building it. This change would hide the issue
for Ruby, but Ruby is likely still buggy (if using this "optimization"
sorting larger arrays).
The small counts copy bytes comparsion should be unsigned (as the
memmove size argument). It fixes string/tst-memmove-overflow on
sparcv9, where the input size triggers an invalid code path.
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
Joseph Myers [Fri, 19 Jan 2024 13:30:34 +0000 (13:30 +0000)]
Further build-many-glibcs.py fixes for utcnow() deprecation
It turns out that the replacement of datetime.datetime.utcnow(), for a
warning produced early in running build-many-glibcs.py with Python
3.12, (a) wasn't complete (there were other uses elsewhere in the
script also needing updating) and (b) broke reading of build-time from
build-state.json, because an aware datetime was written out including
+00:00 for the timezone, which was not expected by the strptime call.
Fix the first by making the change to
datetime.datetime.now(datetime.timezone.utc) for all the remaining
utcnow() calls. Fix the second by using strftime with an explicit
format instead of just str() when formatting build times for
build-state.json and and email subjects, and then setting the timezone
explicitly when reading from build-state.json. (Other uses, in
particular messages output by the bot, continue to use str() as the
precise format should not matter in those cases; it shouldn't actually
matter for email subjects either but it seems a good idea to keep
those short.)
Tested with a bot-cycle run and checking the format of times in
build-state.json afterwards.
Daniel Cederman [Tue, 16 Jan 2024 08:31:40 +0000 (09:31 +0100)]
sparc: Fix llrint and llround missing exceptions on SPARC V8
Conversions from a float to a long long on SPARC v8 uses a libgcc function
that may not raise the correct exceptions on overflow. It also may raise
spurious "inexact" exceptions on non overflow cases. This patch fixes the
problem in the same way as for RV32.
Signed-off-by: Daniel Cederman <cederman@gaisler.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Daniel Cederman [Tue, 16 Jan 2024 08:31:41 +0000 (09:31 +0100)]
sparc: Remove unwind information from signal return stubs [BZ #31244]
The functions were previously written in C, but were not compiled
with unwind information. The ENTRY/END macros includes .cfi_startproc
and .cfi_endproc which adds unwind information. This caused the
tests cleanup-8 and cleanup-10 in the GCC testsuite to fail.
This patch adds a version of the ENTRY/END macros without the
CFI instructions that can be used instead.
sigaction registers a restorer address that is located two instructions
before the stub function. This patch adds a two instruction padding to
avoid that the unwinder accesses the unwind information from the function
that the linker has placed right before it in memory. This fixes an issue
with pthread_cancel that caused tst-mutex8-static (and other tests) to fail.
Signed-off-by: Daniel Cederman <cederman@gaisler.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Daniel Cederman [Mon, 15 Jan 2024 14:53:45 +0000 (15:53 +0100)]
sparc: Prevent stfsr from directly following floating-point instruction
On LEON, if the stfsr instruction is immediately following a floating-point
operation instruction in a running program, with no other instruction in
between the two, the stfsr might behave as if the order was reversed
between the two instructions and the stfsr occurred before the
floating-point operation.
Add a nop instruction before the stfsr to prevent this from happening.
Signed-off-by: Daniel Cederman <cederman@gaisler.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Daniel Cederman [Mon, 15 Jan 2024 14:53:44 +0000 (15:53 +0100)]
sparc: Use existing macros to avoid code duplication
Macros for using inline assembly to access the fp state register exists
in both fenv_private.h and in fpu_control.h. Let fenv_private.h use the
macros from fpu_control.h
Signed-off-by: Daniel Cederman <cederman@gaisler.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Joseph Myers [Wed, 17 Jan 2024 21:15:37 +0000 (21:15 +0000)]
Update kernel version to 6.7 in header constant tests
This patch updates the kernel version in the tests tst-mman-consts.py,
tst-mount-consts.py and tst-pidfd-consts.py to 6.7. (There are no new
constants covered by these tests in 6.7 that need any other header
changes.)
Joseph Myers [Wed, 17 Jan 2024 15:38:54 +0000 (15:38 +0000)]
Update syscall lists for Linux 6.7
Linux 6.7 adds the futex_requeue, futex_wait and futex_wake syscalls,
and enables map_shadow_stack for architectures previously missing it.
Update syscall-names.list and regenerate the arch-syscall.h headers
with build-many-glibcs.py update-syscalls.
Kuan-Wei Chiu [Tue, 16 Jan 2024 02:16:56 +0000 (10:16 +0800)]
stdlib: Fix heapsort for cases with exactly two elements
When malloc fails to allocate a buffer and falls back to heapsort, the
current heapsort implementation does not perform sorting when there are
exactly two elements. Heapsort is now skipped only when there is
exactly one element.
Mike FABIAN [Mon, 15 Jan 2024 22:12:48 +0000 (23:12 +0100)]
localedata: anp_IN: Fix abbreviated month names
Resolves: BZ # 31239
The correct abbreviated month names were apparently given in the comment above `abmon`.
But the value of `abmon` was apparently just copied from the value of `mon` and this
mistake was hard to see because code point notation <Uxxxx> was used. After converting
to UTF-8 it was obvious that there was apparently a copy and paste mistake.
stdlib: Reinstate stable mergesort implementation on qsort
The mergesort removal from qsort implementation (commit 03bf8357e8)
had the side-effect of making sorting nonstable. Although neither
POSIX nor C standard specify that qsort should be stable, it seems
that it has become an instance of Hyrum's law where multiple programs
expect it.
Also, the resulting introsort implementation is not faster than
the previous mergesort (which makes the change even less appealing).
This patch restores the previous mergesort implementation, with the
exception of machinery that checks the resulting allocation against
the _SC_PHYS_PAGES (it only adds complexity and the heuristic not
always make sense depending on the system configuration and load).
The alloca usage was replaced with a fixed-size buffer.
For the fallback mechanism, the implementation uses heapsort. It is
simpler than quicksort, and it does not suffer from adversarial
inputs. With memory overcommit, it should be rarely triggered.
The drawback is mergesort requires O(n) extra space, and since it is
allocated with malloc the function is AS-signal-unsafe. It should be
feasible to change it to use mmap, although I am not sure how urgent
it is. The heapsort is also nonstable, so programs that require a
stable sort would still be subject to this latent issue.
The tst-qsort5 is removed since it will not create quicksort adversarial
inputs with the current qsort_r implementation.
Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>
H.J. Lu [Fri, 12 Jan 2024 18:19:41 +0000 (10:19 -0800)]
x86-64: Check if mprotect works before rewriting PLT
Systemd execution environment configuration may prohibit changing a memory
mapping to become executable:
MemoryDenyWriteExecute=
Takes a boolean argument. If set, attempts to create memory mappings
that are writable and executable at the same time, or to change existing
memory mappings to become executable, or mapping shared memory segments
as executable, are prohibited.
When it is set, systemd service stops working if PLT rewrite is enabled.
Check if mprotect works before rewriting PLT. This fixes BZ #31230.
This also works with SELinux when deny_execmem is on. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Ffsll function randomly regress by ~20%, depending on how code gets
aligned in memory. Ffsll function code size is 17 bytes. Since default
function alignment is 16 bytes, it can load on 16, 32, 48 or 64 bytes
aligned memory. When ffsll function load at 16, 32 or 64 bytes aligned
memory, entire code fits in single 64 bytes cache line. When ffsll
function load at 48 bytes aligned memory, it splits in two cache line,
hence random regression.
Ffsll function size reduction from 17 bytes to 12 bytes ensures that it
will always fit in single 64 bytes cache line.
This patch fixes ffsll function random performance regression.
Yanzhang Wang [Tue, 2 Jan 2024 10:54:15 +0000 (18:54 +0800)]
RISC-V: Enable static-pie.
This patch referents the commit 374cef3 to add static-pie support. And
because the dummy link map is used when relocating ourselves, so need
not to set __global_pointer$ at this time.
It will also check whether toolchain supports to build static-pie. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The 551101e8240b7514fc646d1722f8b79c90362b8f change is incorrect for
alpha and sparc, since __NR_stat is defined by both kABI. Use
__NR_newfstat to check whether to fallback to __NR_fstat64 (similar
to what fstatat64 does).
Checked on sparc64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Wilco Dijkstra [Tue, 9 Jan 2024 15:32:08 +0000 (15:32 +0000)]
math: remove exp10 wrappers
Remove the error handling wrapper from exp10. This is very similar to
the changes done to exp and exp2, except that we also need to handle
pow10 and pow10l.
Wilco Dijkstra [Tue, 2 Jan 2024 17:08:02 +0000 (17:08 +0000)]
Benchtests: Increase benchmark iterations
Increase benchmark iterations for math and vector math functions to improve
timing accuracy. Vector math benchmarks now take 1-3 seconds on a modern CPU.
Xi Ruoyao [Thu, 4 Jan 2024 13:41:20 +0000 (21:41 +0800)]
Make __getrandom_nocancel set errno and add a _nostatus version
The __getrandom_nocancel function returns errors as negative values
instead of errno. This is inconsistent with other _nocancel functions
and it breaks "TEMP_FAILURE_RETRY (__getrandom_nocancel (p, n, 0))" in
__arc4random_buf. Use INLINE_SYSCALL_CALL instead of
INTERNAL_SYSCALL_CALL to fix this issue.
But __getrandom_nocancel has been avoiding from touching errno for a
reason, see BZ 29624. So add a __getrandom_nocancel_nostatus function
and use it in tcache_key_initialize.
Signed-off-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
H.J. Lu [Wed, 10 Jan 2024 16:48:47 +0000 (08:48 -0800)]
x86-64/cet: Make CET feature check specific to Linux/x86
CET feature bits in TCB, which are Linux specific, are used to check if
CET features are active. Move CET feature check to Linux/x86 directory. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
Stefan Liebler [Thu, 11 Jan 2024 13:01:18 +0000 (14:01 +0100)]
resolv: Fix endless loop in __res_context_query
Starting with commit 40c0add7d48739f5d89ebba255c1df26629a76e2
"resolve: Remove __res_context_query alloca usage"
there is an endless loop in __res_context_query if
__res_context_mkquery fails e.g. if type is invalid. Then the
scratch buffer is resized to MAXPACKET size and it is retried again.
Before the mentioned commit, it was retried only once and with the
mentioned commit, there is no check and it retries in an endless loop.
This is observable with xtest resolv/tst-resolv-qtypes which times out
after 300s.
This patch retries mkquery only once as before the mentioned commit.
Furthermore, scratch_buffer_set_array_size is now only called with
nelem=2 if type is T_QUERY_A_AND_AAAA (also see mentioned commit).
The test tst-resolv-qtypes is also adjusted to verify that <func>
is really returning with -1 in case of an invalid type. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>