Debugging interfaces: p_*, fp_*, and sym_* could conceivably be
used to produce debug out, but these functions have not been
updated to parse more resource records, so they are not very useful
today. Likewise for ns_sprintrr and ns_sprintrrf. ns_format_ttl and
ns_parse_ttl are related to these.
Internal implementation details: res_isourserver is probably only
useful in the implementation of a stub resolver, and so is
res_nameinquery.
Unclear semantics and bad performance: ns_samedomain, ns_subdomain,
ns_makecanon, ns_samename do textual converions & copies instead of
checking equivalence of the wire format.
inet_neta cannot handle IPv6 addresses.
res_hostalias has been superseded by getaddrinfo with AI_CANONNAME.
hostalias is not thread-safe.
Some functions have int as size arguments instead of size_t, so they
do not follow current coding practices. However, dn_expand and
b64_ntop are somewhat widely used (to name just two examples), so
deprecating them seems problematic.
Reviewed-by: Carlos O'Donell <carlos@systemhalted.org>
tst-safe-linking: make false positives even more improbable
There is a 1 in 16 chance of a corruption escaping safe-linking and to
guard against spurious failures, tst-safe-linking runs each subtest 10
times to ensure that the chance is reduced to 1 in 2^40. However, in
the 1 in 16 chance that a corruption does escape safe linking, it
could well be caught by other sanity checks we do in malloc, which
then results in spurious test failures like below:
test test_fastbin_consolidate failed with a different error
expected: malloc_consolidate(): unaligned fastbin chunk detected
actual: malloc_consolidate(): invalid chunk size
This failure is seen more frequently on i686; I was able to reproduce
it in about 5 min of running it in a loop.
Guard against such failures by recording them and retrying the test.
Also, do not fail the test if we happened to get defeated by the 1 in
2^40 odds if in at least one of the instances it was detected by other
checks.
Finally, bolster the odds to 2^64 by running 16 times instead of 10.
The test still has a chance of failure so it is still flaky in theory.
However in practice if we see a failure here then it's more likely
that there's a bug than it being an issue with the test. Add more
printfs and also dump them to stdout so that in the event the test
actually fails, we will have some data to try and understand why it
may have failed.
resolv: Move ns_name_ntop to its own file and into libc
Reformat to GNU style. Avoid out-of-bounds pointer arithmetic
(e.g., use eom - dn < 2 instead of dn + 1 >= eom). Inline the
labellen function and fold the compression pointer check into
the length check (l >= 64). Assume ASCII encoding.
The symbol was moved using scripts/move-symbol-to-libc.py.
Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>
This is updated version of the 572bd547d57a (reverted by 40ebfd016ad2)
that fixes the _dl_next_tls_modid issues.
This issue with 572bd547d57a patch is the DTV entry will be only
update on dl_open_worker() with the update_tls_slotinfo() call after
all dependencies are being processed by _dl_map_object_deps(). However
_dl_map_object_deps() itself might call _dl_next_tls_modid(), and since
the _dl_tls_dtv_slotinfo_list::map is not yet set the entry will be
wrongly reused.
This patch fixes by renaming the _dl_next_tls_modid() function to
_dl_assign_tls_modid() and by passing the link_map so it can set
the slotinfo value so a subsequente _dl_next_tls_modid() call will
see the entry as allocated.
The intermediary value is cleared up on remove_slotinfo() for the case
a library fails to load with RTLD_NOW.
This patch fixes BZ #27135.
Checked on x86_64-linux-gnu.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Stefan Liebler [Wed, 14 Jul 2021 13:58:08 +0000 (15:58 +0200)]
Fix linknamespace errors and local-plt-usages in nss_files.
After commit f9c8b11ed7726b858cd7b7cea0d3d7c5233d78cf
"nss: Access nss_files through direct references",
when building with -Os, multiple conform/.../linknamespace tests
and elf/check-localplt are failing:
Extra PLT reference: libc.so: fgetc_unlocked
Extra PLT reference: libc.so: getline
This patch is using the hidden symbols where possible.
Instead of fputc_unlocked, __putc_unlocked is used.
(Compare to commit eeaa19f75e52d2d48074ae0c423f2311d67c42c6
"mntent: Use __putc_unlocked instead of fputc_unlocked")
H.J. Lu [Sat, 13 Feb 2021 19:47:46 +0000 (11:47 -0800)]
Add an internal wrapper for clone, clone2 and clone3
The clone3 system call (since Linux 5.3) provides a superset of the
functionality of clone and clone2. It also provides a number of API
improvements, including the ability to specify the size of the child's
stack area which can be used by kernel to compute the shadow stack size
when allocating the shadow stack. Add:
extern int __clone_internal (struct clone_args *__cl_args,
int (*__func) (void *__arg), void *__arg);
to provide an abstract interface for clone, clone2 and clone3.
1. Simplify stack management for thread creation by passing both stack
base and size to create_thread.
2. Consolidate clone vs clone2 differences into a single file.
3. Call __clone3 if HAVE_CLONE3_WAPPER is defined. If __clone3 returns
-1 with ENOSYS, fall back to clone or clone2.
4. Use only __clone_internal to clone a thread. Since the stack size
argument for create_thread is now unconditional, always pass stack size
to create_thread.
5. Enable the public clone3 wrapper in the future after it has been
added to all targets.
NB: Sandbox will return ENOSYS on clone3 in both Chromium:
Because clone3 uses a pointer argument rather than a flags argument, we
cannot examine the contents with seccomp, which is essential to
preventing sandboxed processes from starting other processes. So, we
won't be able to support clone3 in Chromium. This CL modifies the
BPF policy to return ENOSYS for clone3 so glibc always uses the fallback
to clone.
Bug: 1213452
Change-Id: I7c7c585a319e0264eac5b1ebee1a45be2d782303
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2936184 Reviewed-by: Robert Sesek <rsesek@chromium.org>
Commit-Queue: Matthew Denton <mpdenton@chromium.org>
Cr-Commit-Position: refs/heads/master@{#888980}
Cooper Qu [Tue, 13 Jul 2021 12:50:40 +0000 (20:50 +0800)]
nss: Fix build error with --disable-nscd
The error is as follows:
nss_module.c: In function 'module_load_nss_files':
nss_module.c:117:7: error: 'is_nscd' undeclared (first use in this function)
117 | if (is_nscd)
| ^~~~~~~
nss_module.c:117:7: note: each undeclared identifier is reported only once for each function it appears in
nss_module.c:119:51: error: 'nscd_init_cb' undeclared (first use in this function); did you mean 'nscd_init'?
119 | void (*cb) (size_t, struct traced_file *) = nscd_init_cb;
| ^~~~~~~~~~~~
| nscd_init
The make program might open a pipe for its job server, which triggers
an invalid check on the spawned process. This patch now passes the
lowest file descriptor as ithe first argument, so only the range
that was actually opened is checked.
Checked on x86_64-linux-gnu and i686-linux-gnu and centos7 (which
triggers the issue).
H.J. Lu [Mon, 12 Jul 2021 21:36:39 +0000 (14:36 -0700)]
mcheck: Align struct hdr to MALLOC_ALIGNMENT bytes [BZ #28068]
1. Align struct hdr to MALLOC_ALIGNMENT bytes so that malloc hooks in
libmcheck align memory to MALLOC_ALIGNMENT bytes.
2. Remove tst-mallocalign1 from tests-exclude-mcheck for i386 and x32.
3. Add tst-pvalloc-fortify and tst-reallocarray to tests-exclude-mcheck
since they use malloc_usable_size (see BZ #22057).
Linux: Use 32-bit vDSO for clock_gettime, gettimeofday, time (BZ# 28071)
The previous approach defeats the vDSO optimization on older kernels
because a failing clock_gettime64 system call is performed on every
function call. It also results in a clobbered errno value, exposing
an OpenJDK bug (JDK-8270244).
This patch fixes by open-code INLINE_VSYSCALL macro and replace all
INLINE_SYSCALL_CALL with INTERNAL_SYSCALL_CALLS. Now for
__clock_gettime64x, the 64-bit vDSO is used and the 32-bit vDSO is
tried before falling back to 64-bit syscalls.
The previous code preferred 64-bit syscall for the case where the kernel
provides 64-bit time_t syscalls *and* also a 32-bit vDSO (in this case
the *64-bit* syscall should be preferable over the vDSO). All
architectures that provides 32-bit vDSO (i386, mips, powerpc, s390)
modulo sparc; but I am not sure if some kernels versions do provide
only 32-bit vDSO while still providing 64-bit time_t syscall.
Regardless, for such cases the 64-bit time_t syscall is used if the
vDSO returns overflowed 32-bit time_t.
Tested on i686-linux-gnu (with a time64 and non-time64 kernel),
x86_64-linux-gnu. Built with build-many-glibcs.py.
Reduce <limits.h> pollution due to dynamic PTHREAD_STACK_MIN
<limits.h> used to be a header file with no declarations.
GCC's libgomp includes it in a #pragma GCC visibility hidden block.
Including <unistd.h> from <limits.h> (indirectly) declares everything
in <unistd.h> with hidden visibility, resulting in linker failures.
This commit avoids C declarations in assembler mode and only declares
__sysconf in <limits.h> (and not the entire contents of <unistd.h>).
The __sysconf symbol is already part of the ABI. PTHREAD_STACK_MIN
is no longer defined for __USE_DYNAMIC_STACK_SIZE && __ASSEMBLER__
because there is no possible definition.
Additionally, PTHREAD_STACK_MIN is now defined by <pthread.h> for
__USE_MISC because this is what developers expect based on the macro
name. It also helps to avoid libgomp linker failures in GCC because
libgomp includes <pthread.h> before its visibility hacks.
Stefan Liebler [Mon, 12 Jul 2021 09:00:53 +0000 (11:00 +0200)]
Fix failing nss/tst-nss-files-hosts-long.
Sometimes the test nss/tst-nss-files-hosts-long is failing as getent
fails with exit-code 2.
This happens if tst-reload1 was run just before this test:
make t=nss/tst-reload1 test
make t=nss/tst-nss-files-hosts-long test
Then the test fails as /etc/nsswitch.conf contains "hosts: test2"
and the hosts are not searched in /etc/hosts at all.
Thus this patch just requests a post cleanup after nss/tst-reload1
has run.
Samuel Thibault [Sun, 11 Jul 2021 17:51:12 +0000 (17:51 +0000)]
hurd _Fork: Drop duplicate malloc_fork_lock calls
This was put in __libc_fork by c32c868ab8b2 ("posix: Add _Fork [BZ #4737]")
so we need to avoid locking them again in _Fork called by __libc_lock, otherwise
we deadlock.
H.J. Lu [Sat, 10 Jul 2021 17:56:50 +0000 (10:56 -0700)]
support: Replace _SC_MINSIGSTKSZ with _SC_SIGSTKSZ
Replace _SC_MINSIGSTKSZ with _SC_SIGSTKSZ since sysconf (_SC_MINSIGSTKSZ)
returns the minimum number of bytes of free stack space required in order
to guarantee successful, non-nested handling of a single signal whose
handler is an empty function while sysconf (_SC_SIGSTKSZ) returns the
suggested minimum number of bytes of stack space required for a signal
stack.
H.J. Lu [Mon, 21 Jun 2021 19:42:56 +0000 (12:42 -0700)]
Define PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN)
The constant PTHREAD_STACK_MIN may be too small for some processors.
Rename _SC_SIGSTKSZ_SOURCE to _DYNAMIC_STACK_SIZE_SOURCE. When
_DYNAMIC_STACK_SIZE_SOURCE or _GNU_SOURCE are defined, define
PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN) which is changed
to MIN (PTHREAD_STACK_MIN, sysconf(_SC_MINSIGSTKSZ)).
Consolidate <bits/local_lim.h> with <bits/pthread_stack_min.h> to
provide a constant target specific PTHREAD_STACK_MIN value.
H.J. Lu [Fri, 9 Jul 2021 03:48:14 +0000 (20:48 -0700)]
Add a generic malloc test for MALLOC_ALIGNMENT
1. Add sysdeps/generic/malloc-size.h to define size related macros for
malloc.
2. Move x86_64/tst-mallocalign1.c to malloc and replace ALIGN_MASK with
MALLOC_ALIGN_MASK.
3. Add tst-mallocalign1 to tests-exclude-mcheck for i386 and x32 since
mcheck doesn't honor MALLOC_ALIGNMENT.
H.J. Lu [Fri, 9 Jul 2021 12:57:51 +0000 (05:57 -0700)]
Properly run tst-spawn5 directly [BZ #28067]
Change tst-spawn5.c to handle tst-spawn5 without optional path to ld.so,
--library-path nor the library path when glibc is configured with
--enable-hardcoded-path-in-tests. This fixes BZ #28067.
nptl: Use out-of-line wake function in __libc_lock_unlock slow path
This slightly reduces code size, as can be seen below.
__libc_lock_unlock is usually used along with __libc_lock_lock in
the same function. __libc_lock_lock already has an out-of-line
slow path, so this change should not introduce many additional
non-leaf functions.
This change also fixes a link failure in 32-bit Arm thumb mode
because commit 1f9c804fbd699104adefbce9e56d2c8aa711b6b9
("nptl: Use internal low-level lock type for !IS_IN (libc)")
introduced __libc_do_syscall calls outside of libc.
Before x86-64:
text data bss dec hex filename 1937748 20456 54896 2013100 1eb7ac libc.so.6
25601 856 12768 39225 9939 nss/libnss_db.so.2
40310 952 25144 66406 10366 nss/libnss_files.so.2
After x86-64:
text data bss dec hex filename 1935312 20456 54896 2010664 1eae28 libc.so.6
25559 864 12768 39191 9917 nss/libnss_db.so.2
39764 960 25144 65868 1014c nss/libnss_files.so.2
Anton Blanchard [Tue, 6 Jul 2021 10:19:36 +0000 (20:19 +1000)]
powerpc64le: Fix typo in configure
The configure script checks for -mlong-double-128 but mentions -mlongdouble
when it fails. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
5 years ago, commit 8f1b841e452dbb083112fd036033b7f4af506ba0
unintentionally added an ifunc to the loader.
That modification has not caused any harm so far, but it doesn't add any
value either, because the hwcap information is available later during
libc initialization.
Added wcsnlen-sse4.1 to the wcslen ifunc implementation list and did
not add wcslen-sse4.1 to wcslen ifunc implementation list. This commit
fixes that by removing wcsnlen-sse4.1 from the wcslen ifunc
implementation list and adding wcslen-sse4.1 to the ifunc
implementation list.
Testing:
test-wcslen.c, test-rsi-wcslen.c, and test-rsi-strlen.c are passing as
well as all other tests in wcsmbs and string.
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
added wcsnlen-sse4.1 to the wcslen ifunc implementation list. Since the
random value in the the RSI register is larger than the wide-character
string length in the existing wcslen test, it didn't trigger the wcslen
test failure. Add a test to force 0 into the RSI register before calling
wcslen.
Fangrui Song [Thu, 8 Jul 2021 21:26:22 +0000 (14:26 -0700)]
x86_64: Remove unneeded static PIE check for undefined weak diagnostic
https://sourceware.org/bugzilla/show_bug.cgi?id=21782 dropped an ld
diagnostic for R_X86_64_PC32 referencing an undefined weak symbol in
-pie links. Arguably keeping the diagnostic like other ports is more
correct, since statically resolving movl foo(%rip), %eax to the
link-time zero address produces a corrupted output.
It turns out that --enable-static-pie builds do not depend on the ld
behavior. GCC generates GOT indirection for weak declarations for
-fPIE/-fPIC, so what ld does with the PC-relative relocation doesn't
really matter.
This patch adds a way to close a range of file descriptors on
posix_spawn as a new file action. The API is similar to the one
provided by Solaris 11 [1], where the file action causes the all open
file descriptors greater than or equal to input on to be closed when
the new process is spawned.
The function posix_spawn_file_actions_addclosefrom_np is safe to be
implemented by iterating over /proc/self/fd, since the Linux spawni.c
helper process does not use CLONE_FILES, so its has own file descriptor
table and any failure (in /proc operation) aborts the process creation
and returns an error to the caller.
I am aware that this file action might be redundant to the current
approach of POSIX in promoting O_CLOEXEC in more interfaces. However
O_CLOEXEC is still not the default and for some specific usages, the
caller needs to close all possible file descriptors to avoid them
leaking. Some examples are CPython (discussed in BZ#10353) and OpenJDK
jspawnhelper [2] (where OpenJDK spawns a helper process to exactly
closes all file descriptors). Most likely any environment which calls
functions that might open file descriptor under the hood and aim to use
posix_spawn might face the same requirement.
Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15.
The function closes all open file descriptors greater than or equal to
input argument. Negative values are clamped to 0, i.e, it will close
all file descriptors.
As indicated by the bug report, this is a common symbol provided by
different systems (Solaris, OpenBSD, NetBSD, FreeBSD) and, although
its has inherent issues with not taking in consideration internal libc
file descriptors (such as syslog), this is also a common feature used
in multiple projects [1][2][3][4][5].
The Linux fallback implementation iterates over /proc and close all
file descriptors sequentially. Although it was raised the questioning
whether getdents on /proc/self/fd might return disjointed entries
when file descriptor are closed; it does not seems the case on my
testing on multiple kernel (v4.18, v5.4, v5.9) and the same strategy
is used on different projects [1][2][3][5].
Also, the interface is set a fail-safe meaning that a failure in the
fallback results in a process abort.
Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15.
It was added on Linux 5.9 (278a5fbaed89) with CLOSE_RANGE_CLOEXEC
added on 5.11 (582f1fb6b721f). Although FreeBSD has added the same
syscall, this only adds the symbol on Linux ports. This syscall is
required to provided a fail-safe way to implement the closefrom
symbol (BZ #10353).
Checked on x86_64-linux-gnu and i686-linux-gnu on kernel 5.11 and 4.15.
The code to allocate a stack from xsigstack is refactored so it can
be more generic. The new support_stack_alloc() also set PROT_EXEC
if DEFAULT_STACK_PERMS has PF_X. This is required on some
architectures (hppa for instance) and trying to access the rtld
global from testsuite will require more intrusive refactoring
in the ldsodefs.h header.
Checked on x86_64-linux-gnu and i686-linux-gnu. I also ran
tst-xsigstack on both hppa and ia64.
_int_realloc is correctly declared at the top to be static, but
incorrectly defined without the static keyword. Fix that. The
generated binaries have identical code.
The tcache allocator layer uses the tcache pointer as a key to
identify a block that may be freed twice. Since this is in the
application data area, an attacker exploiting a use-after-free could
potentially get access to the entire tcache structure through this
key. A detailed write-up was provided by Awarau here:
Replace this static pointer use for key checking with one that is
generated at malloc initialization. The first attempt is through
getrandom with a fallback to random_bits(), which is a simple
pseudo-random number generator based on the clock. The fallback ought
to be sufficient since the goal of the randomness is only to make the
key arbitrary enough that it is very unlikely to collide with user
data.
This partially fixes static-only NSS support (bug 27959): The files
module no longer needs dlopen. Support for the dns module remains
to be added, and also support for disabling dlopen altogether.
nss_files: Add generic code for set*ent, end*ent and file open
This reduces RSS usage if nss_files is not actually used, and can
be used later to make NSS data thread-specific. It also results in
a small code size reduction.
nss_files: Allocate nscd file registration data on the heap
This is only needed if nss_files is loaded by nscd.
Before:
text data bss dec hex filename
767 0 24952 25719 6477 nss/files-init.os
After:
text data bss dec hex filename
666 0 0 666 29a nss/files-init.os
Using PATH_MAX bytes unconditionally for the directory name
is wasteful, but fixing that would constitute another break
of this semi-public ABI. (The other issue is that with
symbolic links, an arbitrary set of parent directories may need
watching, not just a single one.)
1. Add __extendhfdf2/__extendhfsf2 to return an IEEE half converted to IEEE double/single.
2. Add __truncdfhf2/__extendsfhf2 to truncate IEEE double/single into IEEE half.
3. Add __eqhf2/__nehf2 to return 0 if a == b and a,b are not NAN, otherwise return 1.
Joseph Myers [Wed, 7 Jul 2021 13:24:05 +0000 (13:24 +0000)]
Update kernel version to 5.13 in tst-mman-consts.py
This patch updates the kernel version in the test tst-mman-consts.py
to 5.13. (There are no new MAP_* constants covered by this test in
5.13 that need any other header changes.)
elf: Clean up GLIBC_PRIVATE exports of internal libdl symbols
They are no longer needed after everything has been moved into
libc. The _dl_vsym test has to be removed because the symbol
cannot be used outside libc anymore.
nptl: Use internal low-level lock type for !IS_IN (libc)
This avoids an ABI hazard (types changing between different modules
of glibc) without introducing linknamespace issues. In particular,
NSS modules now call __lll_lock_wait_private@@GLIBC_PRIVATE to wait
on internal locks (the unlock path is inlined and performs a direct
system call).
The realloc (NULL, 0) test in tst-realloc fails with gcc 7.x but
passes with newer gcc. This is because a newer gcc transforms the
realloc call to malloc (0), thus masking the bug in mcheck.
Disable the test with mcheck for now. The malloc removal patchset
will fix this and then remove this test from the exclusion list.
Reported-by: Stefan Liebler <stli@linux.ibm.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
linux: Consolidate Linux setsockopt implementation
This patch consolidates the setsockopt implementation on
sysdeps/unix/sysv/linux/getsockopt.c. The changes are:
1. Remove it from auto-generation syscalls.list on all architectures.
2. Add __ASSUME_SETSOCKOPT_SYSCALL as default and undef if for
specific kernel versions on some architectures.
This also fix a potential issue where 32-bit time_t ABI should use the
linux setsockopt which overrides the underlying SO_* constants used for
socket timestamping for _TIME_BITS=64.
linux: Consolidate Linux getsockopt implementation
This patch consolidates the getsockopt Linux syscall implementation on
sysdeps/unix/sysv/linux/getsockopt.c. The changes are:
1. Remove it from auto-generation syscalls.list on all architectures.
2. Add __ASSUME_GETSOCKOPT_SYSCALL as default and undef if for
specific kernel versions on some architectures.
This also fix a potential issue where 32-bit time_t ABI should use the
linux getsockopt which overrides the underlying SO_* constants used for
socket timestamping for _TIME_BITS=64.
elf: Call free from base namespace on error in dl-libc.c [BZ #27646]
In dlerror_run, free corresponds to the local malloc in the
namespace, but GLRO (dl_catch_error) uses the malloc from the base
namespace. elf/tst-dlmopen-gethostbyname triggers this mismatch,
but it does not crash, presumably because of a fastbin deallocation.
The variable and function pair appear to provide a way for users to
set conditional breakpoints in mtrace when a specific address is
returned by the allocator. This can be achieved by using conditional
breakpoints in gdb so it is redundant. There is no documentation of
this interface in the manual either, so it appears to have been a hack
that got added to debug malloc. Deprecate these symbols and do not
call tr_break anymore.
Reviewed-by: DJ Delorie <dj@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
malloc: Initiate tcache shutdown even without allocations [BZ #28028]
After commit 1e26d35193efbb29239c710a4c46a64708643320 ("malloc: Fix
tcache leak after thread destruction [BZ #22111]"),
tcache_shutting_down is still not early enough. When we detach a
thread with no tcache allocated, tcache_shutting_down would still be
false.
Like malloc-check, add generic rules to run all tests in malloc by
linking with libmcheck.a so as to provide coverage for mcheck().
Currently the following 12 tests fail:
and they have been added to tests-exclude-mcheck for now to keep
status quo. At least the last two can be attributed to bugs in
mcheck() but I haven't fixed them here since they should be fixed by
removing malloc hooks. Others need to be triaged to check if they're
due to mcheck bugs or due to actual bugs.
Build of iconvconfig failed with CFLAGS=-Os since __feof_unlocked is
not a public symbol. Replace with feof_unlocked (defined to
__feof_unlocked when IS_IN (libc)) to fix this.
Reported-by: Szabolcs Nagy <szabolcs.nagy@arm.com> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
resolv: Move libanl into libc (if libpthread is in libc)
The symbols gai_cancel, gai_error, gai_suspend, getaddrinfo_a,
__gai_suspend_time64 were moved using scripts/move-symbol-to-libc.py.
For Hurd (which remains !PTHREAD_IN_LIBC), a few #define redirects
had to be added because several pthread functions are not available
under __. (Linux uses __ prefixes for most hidden aliases, and has
to in some cases to avoid linknamespace issues.)
This patch modifies the current POWER9 implementation of strcpy and
stpcpy to optimize it for POWER9/10.
Since no new POWER10 instructions are used, the original POWER9 strcpy is
modified instead of creating a new implementation for POWER10. This
implementation is based on both the original POWER9 implementation of
strcpy and the preamble of the new POWER10 implementation of strlen.
The changes also affect stpcpy, which uses the same implementation with
some additional code before returning.
On POWER9, averaging improvements across the benchmark
inputs (length/source alignment/destination alignment), for an
experiment that ran the benchmark five times, bench-strcpy showed an
improvement of 5.23%, and bench-stpcpy showed an improvement of 6.59%.
On POWER10, bench-strcpy showed 13.16%, and bench-stpcpy showed 13.59%.
The changes are:
1. Removed the null string optimization.
Although this results in a few extra cycles for the null string, in
combination with the second change, this resulted in improvements for
for other cases.
2. Adapted the preamble from strlen for POWER10.
This is the part of the function that handles up to the first 16 bytes
of the string.
3. Increased number of unrolled iterations in the main loop to 6.
* Intel TSX will be disabled by default.
* The processor will force abort all Restricted Transactional Memory (RTM)
transactions by default.
* A new CPUID bit CPUID.07H.0H.EDX[11](RTM_ALWAYS_ABORT) will be enumerated,
which is set to indicate to updated software that the loaded microcode is
forcing RTM abort.
* On processors that enumerate support for RTM, the CPUID enumeration bits
for Intel TSX (CPUID.07H.0H.EBX[11] and CPUID.07H.0H.EBX[4]) continue to
be set by default after microcode update.
* Workloads that were benefited from Intel TSX might experience a change
in performance.
* System software may use a new bit in Model-Specific Register (MSR) 0x10F
TSX_FORCE_ABORT[TSX_CPUID_CLEAR] functionality to clear the Hardware Lock
Elision (HLE) and RTM bits to indicate to software that Intel TSX is
disabled.
1. Add RTM_ALWAYS_ABORT to CPUID features.
2. Set RTM usable only if RTM_ALWAYS_ABORT isn't set. This skips the
string/tst-memchr-rtm etc. testcases on the affected processors, which
always fail after a microcde update.
3. Check RTM feature, instead of usability, against /proc/cpuinfo.
Joseph Myers [Thu, 1 Jul 2021 17:37:36 +0000 (17:37 +0000)]
Update syscall lists for Linux 5.13
Linux 5.13 has three new syscalls (landlock_create_ruleset,
landlock_add_rule, landlock_restrict_self). Update syscall-names.list
and regenerate the arch-syscall.h headers with build-many-glibcs.py
update-syscalls.
Stefan Liebler [Tue, 29 Jun 2021 09:37:28 +0000 (11:37 +0200)]
s390: Fix MEMCHR_Z900_G5 ifunc-variant if n>=0x80000000 [BZ #28024]
On s390 (31bit), the pointer to the first byte after s always wraps
around with n >= 0x80000000 and can lead to stop searching before
end of s.
Thus this patch just use NULL as byte after s in this case and
the srst instruction stops searching with "not found" when wrapping
around from top address to zero.
This is observable with testcase string/test-memchr
starting with commit "String: Add overflow tests for strnlen, memchr,
and strncat [BZ #27974]"
https://sourceware.org/git/?p=glibc.git;a=commit;h=da5a6fba0febbfc90896ce1b2eb75c6d8a88a72d
Stefan Liebler [Wed, 30 Jun 2021 14:17:37 +0000 (16:17 +0200)]
Fix extra PLT reference in libc.so due to __glob64_time64 if build with gcc 7.5 on 32bit.
Starting with recent commit 84f7ce84474c1648ce96884f1c91ca7b97ca3fc2
"posix: Add glob64 with 64-bit time_t support", elf/check-localplt
fails due to extra PLT reference __glob64_time64 in __glob64_time64
itself.
This is observable with gcc 7.5 on x86_64 with -m32 or s390x with
-m31. E.g. if build with gcc 10, gcc is generating a call to
__glob64_time64.localalias.
This patch is adding a hidden version of __glob64_time64 in the
same way as for __globfree64_time64.
Add hp-timing.h using the cntvct_el0 counter. Return timing in nanoseconds
so it is fully compatible with generic hp-timing. Don't set HP_TIMING_INLINE
in the dynamic linker since it adds unnecessary overheads and some ancient
kernels may not handle emulating cntcvt correctly. Currently cntvct_el0 is
only used for timing in the benchtests.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Optimize strnlen by avoiding UMINV which is slow on most cores. On Neoverse N1
large strings are 1.8x faster than the current version, and bench-strnlen is
50% faster overall. This version is MTE compatible.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>