H.J. Lu [Fri, 31 Jul 2009 18:53:35 +0000 (11:53 -0700)]
Support multiarch for i686.
This patch adds multiarch support when configured for i686. I modified
some x86-64 functions to support 32bit. I will contribute 32bit SSE string
and memory functions later.
Jakub Jelinek [Fri, 31 Jul 2009 14:26:36 +0000 (07:26 -0700)]
Fix obstack* on i?86
obstack calls several callbacks, so on i?86 it'd better be compiled
without -mpreferred-stack-boundary=2, otherwise the callbacks are called
with misaligned stack.
We use sigaltstack internally which on some systems is a syscall
and should be used as such. Move the x86-64 version to the Linux
specific directory and create in its place a file which always
causes compile errors.
We use a callback function into libc.so to get access to the data
structure with the information and have special versions of the test
macros which automatically use this function.
Preserve SSE registers in runtime relocations on x86-64.
SSE registers are used for passing parameters and must be preserved
in runtime relocations. This is inside ld.so enforced through the
tests in tst-xmmymm.sh. But the malloc routines used after startup
come from libc.so and can be arbitrarily complex. It's overkill
to save the SSE registers all the time because of that. These calls
are rare. Instead we save them on demand. The new infrastructure
put in place in this patch makes this possible and efficient.
Refine testing for xmm/ymm register use in x86-64 ld.so.
The test now takes the callgraph into account. Only code called
during runtime relocation is affected by the limitation. We now
determine the affected object files as closely as possible from
the outside. This allowed to remove some the specializations
for some of the string functions as they are only used in other
code paths.
Jakub Jelinek [Mon, 27 Jul 2009 14:25:57 +0000 (07:25 -0700)]
Fix STB_GNU_UNIQUE handling for > 30 unique symbols.
There were several issues when the initial 31 entries hashtab filled up.
size * 3 <= tab->n_elements is always false, table can't have more elements
than its size. I assume from libiberty/hashtab.c this meant to be check for
3/4 full. Even after fixing that, _dl_higher_prime_number (31) apparently
returns 31, only _dl_higher_prime_number (32) returns 61. And, size
variable wasn't updated during reallocation, which means during reallocation
the insertion of the new entry was done into a wrong spot.
All this lead to a hang in ld.so, because a search with n_elements 31 size
31 wouldn't ever terminate.
Make sure no code in ld.so uses xmm/ymm registers on x86-64.
This patch introduces a test to make sure no function modifies the
xmm/ymm registers. With the exception of the auditing functions.
The test is probably too pessimistic. All code linked into ld.so
is checked. Perhaps at some point the callgraph starting from
_dl_fixup and _dl_profile_fixup is checked and we can start using
faster SSE-using functions in parts of ld.so.
Perform test for Arom x86-64 in central place and handle it.
There will be more than one function which, in multiarch mode, wants
to use SSSE3. We should not test in each of them for Atoms with
slow SSSE3. Instead, disable the SSSE3 bit in the startup code for
such machines.
The posix/tst-rfc3484* test cases caused warnings in newer gccs
because the unused but copied sin_zero part of sockaddr_in wasn't
explicitly initialized.
References to unique symbols from copy relocations can only come
from executables which cannot be unloaded anyway. Optimize the
code to set the unload flag a bit.
Check generated locale for non-ASCII 8-bit characters with case conversion.
If a locale does not have 8-bit characters with case conversion which
are different from the ASCII conversion (±0x20) then we can perform
some optimizations. These will follow later.
It just happens that __pthread_enable_asynccancel doesn't modify the $rdi
register. But this isn't guaranteed. Hence we reload the register after
the calls.
In EDNS0 records the maximum result size is transmitted in a 16
bit value. Large buffer sizes were handled incorrectly by using
only the low 16 bits. Fix this by limiting the size to 0xffff.
Petr Baudis [Thu, 16 Jul 2009 17:10:10 +0000 (10:10 -0700)]
Fix lock handling in memory hander of nscd.
The commit 20e498bd removes the pthread_mutex_rdlock() calls, but not the
corresponding pthread_mutex_unlock() calls. Also, the database lock is never
unlocked in one branch of the mempool_alloc() if.
I think unreproducible random assert(dh->usable) crashes in prune_cache() were
caused by this. But an easy way to make nscd threads hang with the broken
locking was.
With atomic fastbins the checks performed can race with concurrent
modifications of the arena. If we detect a problem re-do the test
after getting the lock.
Jakub Jelinek [Thu, 16 Jul 2009 14:24:50 +0000 (07:24 -0700)]
Use rel semantics of cas instead of acq semantics with full barrier before it in _int_free
The following patch fixes catomic_compare_and_exchange_*_rel definitions
(which were never used and weren't correct) and uses
catomic_compare_and_exchange_val_rel in _int_free. Comparing to the
pre-2009-07-02 --enable-experimental-malloc state the generated code should
be identical on all arches other than ppc/ppc64 and on ppc/ppc64 should use
lwsync instead of isync barrier.
The original AVX patch used a function pointer to handle the difference
between machines with and without AVX support. This is insecure. A
well-placed memory exploit could lead to redirection of the execution.
Using a variable and several tests is a bit slower but cannot be
exploited in this way.
Some symbols have to be identified process-wide by their name. This is
particularly important for some C++ features (e.g., class local static data
and static variables in inline functions). This cannot completely be
implemented with ELF functionality so far. The STB_GNU_UNIQUE binding
helps by ensuring the dynamic linker will always use the same definition for
all symbols with the same name and this binding.