It was noted in 2005 (BZ #832), 2006 (BZ #3266), and 2007 [1] that ldd
fails on shells other than Bash >= 3.0 because of the pipefail option
around try_trace (added on 2004-12-08). EGLIBC was patched in 2008 [2]
(r6912) to make the pipefail check run only on shells that support it,
but RTLD output would still be lost on other shells with certain SELinux
policies.
This patch rewrites try_trace to work on any POSIX-conformant shell in
such a way as to also work with such SELinux policies. It also obviates
one difference between glibc and EGLIBC.
[BZ #832]
* elf/ldd.bash.in (try_trace): More robustly and portably work around
SELinux terminal write permissions by using a command substitution
instead of a pipeline and pipefail option.
Add systemtap probes to various slow paths in libm so that application
developers may use systemtap to find out if their applications are
hitting these slow paths. We have added probes for pow, exp, log,
tan, atan and atan2.
Eric Biggers [Fri, 11 Oct 2013 16:59:38 +0000 (22:29 +0530)]
Fix fwrite() reading beyond end of buffer in error path
Partially revert commits 2b766585f9b4ffabeef2f36200c275976b93f2c7 and de2fd463b1c0310d75084b6d774fb974075a4ad9, which were intended to fix BZ#11741
but caused another, likely worse bug, namely that fwrite() and fputs() could,
in an error path, read data beyond the end of the specified buffer, and
potentially even write this data to the file.
Fix BZ#11741 properly by checking the return value from _IO_padn() in
stdio-common/vfprintf.c.
Will Newton [Wed, 9 Oct 2013 13:41:57 +0000 (14:41 +0100)]
malloc/hooks.c: Correct check for overflow in memalign_check.
A large value of bytes passed to memalign_check can cause an integer
overflow in _int_memalign and heap corruption. This issue can be
exposed by running tst-memalign with MALLOC_CHECK_=3.
ChangeLog:
2013-10-10 Will Newton <will.newton@linaro.org>
* malloc/hooks.c (memalign_check): Ensure the value of bytes
passed to _int_memalign does not overflow.
Torvald Riegel [Tue, 8 Oct 2013 11:04:10 +0000 (14:04 +0300)]
benchtests: Add include-sources directive.
This adds the "include-sources" directive to scripts/bench.pl. This
allows for including source code (vs including headers, which might get
a different search path) after the inclusion of any headers.
This patch adds some more directives to the benchmark inputs file,
moving functionality from the Makefile and making the code generation
script a bit cleaner. The function argument and return types that
were earlier added as variables in the makefile and passed to the
script via command line arguments are now the 'args' and 'ret'
directive respectively. 'args' should be a colon separated list of
argument types (skipped if the function doesn't accept any arguments)
and 'ret' should be the return type.
Additionally, an 'includes' directive may have a comma separated list
of headers to include in the source. For example, the pow input file
now looks like this:
I did this to unclutter the benchtests Makefile a bit and eventually
eliminate dependency of the tests on the Makefile and have tests
depend on their respective include files only.
Will Newton [Thu, 3 Oct 2013 10:26:34 +0000 (11:26 +0100)]
malloc/tst-valloc.c: Tidy up code.
Add some comments and call free on all potentially allocated pointers.
Also remove duplicate check for NULL pointer.
ChangeLog:
2013-10-04 Will Newton <will.newton@linaro.org>
* malloc/tst-valloc.c: Add comments.
(do_test): Add comments and call free on all potentially
allocated pointers. Remove duplicate check for NULL pointer.
Add space after cast.
Will Newton [Thu, 3 Oct 2013 10:24:38 +0000 (11:24 +0100)]
malloc/tst-pvalloc.c: Tidy up code.
Add some comments and call free on all potentially allocated pointers.
Also remove duplicate check for NULL pointer.
ChangeLog:
2013-10-04 Will Newton <will.newton@linaro.org>
* malloc/tst-pvalloc.c: Add comments.
(do_test): Add comments and call free on all potentially
allocated pointers. Remove duplicate check for NULL pointer.
Add space after cast.
Alan Modra [Fri, 4 Oct 2013 03:18:51 +0000 (12:48 +0930)]
Use stdint.h types in union unaligned.
* sysdeps/powerpc/powerpc32/dl-machine.c (__process_machine_rela):
Use stdint types in rather than __attribute__((mode())).
* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise.
Alan Modra [Sat, 17 Aug 2013 09:18:36 +0000 (18:48 +0930)]
PowerPC LE memchr and memrchr
http://sourceware.org/ml/libc-alpha/2013-08/msg00105.html
Like strnlen, memchr and memrchr had a number of defects fixed by this
patch as well as adding little-endian support. The first one I
noticed was that the entry to the main loop needlessly checked for
"are we done yet?" when we know the size is large enough that we can't
be done. The second defect I noticed was that the main loop count was
wrong, which in turn meant that the small loop needed to handle an
extra word. Thirdly, there is nothing to say that the string can't
wrap around zero, except of course that we'd normally hit a segfault
on trying to read from address zero. Fixing that simplified a number
of places:
- /* Are we done already? */
- addi r9,r8,8
- cmpld r9,r7
- bge L(null)
becomes
+ cmpld r8,r7
+ beqlr
However, the exit gets an extra test because I test for being on the
last word then if so whether the byte offset is less than the end.
Overall, the change is a win.
Lastly, memrchr used the wrong cache hint.
* sysdeps/powerpc/powerpc64/power7/memchr.S: Replace rlwimi with
insrdi. Make better use of reg selection to speed exit slightly.
Schedule entry path a little better. Remove useless "are we done"
checks on entry to main loop. Handle wrapping around zero address.
Correct main loop count. Handle single left-over word from main
loop inline rather than by using loop_small. Remove extra word
case in loop_small caused by wrong loop count. Add little-endian
support.
* sysdeps/powerpc/powerpc32/power7/memchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memrchr.S: Likewise. Use proper
cache hint.
* sysdeps/powerpc/powerpc32/power7/memrchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/rawmemchr.S: Add little-endian
support. Avoid rlwimi.
* sysdeps/powerpc/powerpc32/power7/rawmemchr.S: Likewise.
Alan Modra [Sat, 17 Aug 2013 09:17:59 +0000 (18:47 +0930)]
PowerPC LE memset
http://sourceware.org/ml/libc-alpha/2013-08/msg00104.html
One of the things I noticed when looking at power7 timing is that rlwimi
is cracked and the two resulting insns have a register dependency.
That makes it a little slower than the equivalent rldimi.
Alan Modra [Sat, 17 Aug 2013 09:17:22 +0000 (18:47 +0930)]
PowerPC LE memcpy
http://sourceware.org/ml/libc-alpha/2013-08/msg00103.html
LIttle-endian support for memcpy. I spent some time cleaning up the
64-bit power7 memcpy, in order to avoid the extra alignment traps
power7 takes for little-endian. It probably would have been better
to copy the linux kernel version of memcpy.
* sysdeps/powerpc/powerpc32/power4/memcpy.S: Add little endian support.
* sysdeps/powerpc/powerpc32/power6/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/mempcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/mempcpy.S: Likewise. Make better
use of regs. Use power7 mtocrf. Tidy function tails.
Alan Modra [Sat, 17 Aug 2013 09:16:47 +0000 (18:46 +0930)]
PowerPC LE memcmp
http://sourceware.org/ml/libc-alpha/2013-08/msg00102.html
This is a rather large patch due to formatting and renaming. The
formatting changes were to make it possible to compare power7 and
power4 versions of memcmp. Using different register defines came
about while I was wrestling with the code, trying to find spare
registers at one stage. I found it much simpler if we refer to a reg
by the same name throughout a function, so it's better if short-term
multiple use regs like rTMP are referred to using their register
number. I made the cr field usage changes when attempting to reload
rWORDn regs in the exit path to byte swap before comparing when
little-endian. That proved a bad idea due to the pipelining involved
in the main loop; Offsets to reload the regs were different first
time around the loop.. Anyway, I left the cr field usage changes in
place for consistency.
Aside from these more-or-less cosmetic changes, I fixed a number of
places where an early exit path restores regs unnecessarily, removed
some dead code, and optimised one or two exits.
* sysdeps/powerpc/powerpc64/power7/memcmp.S: Add little-endian support.
Formatting. Consistently use rXXX register defines or rN defines.
Use early exit labels that avoid restoring unused non-volatile regs.
Make cr field use more consistent with rWORDn compares. Rename
regs used as shift registers for unaligned loop, using rN defines
for short lifetime/multiple use regs.
* sysdeps/powerpc/powerpc64/power4/memcmp.S: Likewise.
* sysdeps/powerpc/powerpc32/power7/memcmp.S: Likewise. Exit with
addi 1,1,64 to pop stack frame. Simplify return value code.
* sysdeps/powerpc/powerpc32/power4/memcmp.S: Likewise.
Alan Modra [Sat, 17 Aug 2013 09:16:05 +0000 (18:46 +0930)]
PowerPC LE strchr
http://sourceware.org/ml/libc-alpha/2013-08/msg00101.html
Adds little-endian support to optimised strchr assembly. I've also
tweaked the big-endian code a little. In power7/strchr.S there's a
check in the tail of the function that we didn't match 0 before
finding a c match, done by comparing leading zero counts. It's just
as valid, and quicker, to compare the raw output from cmpb.
Another little tweak is to use rldimi/insrdi in place of rlwimi for
the power7 strchr functions. Since rlwimi is cracked, it is a few
cycles slower. rldimi can be used on the 32-bit power7 functions
too.
* sysdeps/powerpc/powerpc64/power7/strchr.S (strchr): Add little-endian
support. Correct typos, formatting. Optimize tail. Use insrdi
rather than rlwimi.
* sysdeps/powerpc/powerpc32/power7/strchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strchrnul.S (__strchrnul): Add
little-endian support. Correct typos.
* sysdeps/powerpc/powerpc32/power7/strchrnul.S: Likewise. Use insrdi
rather than rlwimi.
* sysdeps/powerpc/powerpc64/strchr.S (rTMP4, rTMP5): Define. Use
in loop and entry code to keep "and." results.
(strchr): Add little-endian support. Comment. Move cntlzd
earlier in tail.
* sysdeps/powerpc/powerpc32/strchr.S: Likewise.
Alan Modra [Sat, 17 Aug 2013 09:11:17 +0000 (18:41 +0930)]
PowerPC LE strcmp and strncmp
http://sourceware.org/ml/libc-alpha/2013-08/msg00099.html
More little-endian support. I leave the main strcmp loops unchanged,
(well, except for renumbering rTMP to something other than r0 since
it's needed in an addi insn) and modify the tail for little-endian.
I noticed some of the big-endian tail code was a little untidy so have
cleaned that up too.
Alan Modra [Sat, 17 Aug 2013 09:10:48 +0000 (18:40 +0930)]
PowerPC LE strnlen
http://sourceware.org/ml/libc-alpha/2013-08/msg00098.html
The existing strnlen code has a number of defects, so this patch is more
than just adding little-endian support. The changes here are similar to
those for memchr.
* sysdeps/powerpc/powerpc64/power7/strnlen.S (strnlen): Add
little-endian support. Remove unnecessary "are we done" tests.
Handle "s" wrapping around zero and extremely large "size".
Correct main loop count. Handle single left-over word from main
loop inline rather than by using small_loop. Correct comments.
Delete "zero" tail, use "end_max" instead.
* sysdeps/powerpc/powerpc32/power7/strnlen.S: Likewise.
Alan Modra [Sat, 17 Aug 2013 09:10:11 +0000 (18:40 +0930)]
PowerPC LE strlen
http://sourceware.org/ml/libc-alpha/2013-08/msg00097.html
This is the first of nine patches adding little-endian support to the
existing optimised string and memory functions. I did spend some
time with a power7 simulator looking at cycle by cycle behaviour for
memchr, but most of these patches have not been run on cpu simulators
to check that we are going as fast as possible. I'm sure PowerPC can
do better. However, the little-endian support mostly leaves main
loops unchanged, so I'm banking on previous authors having done a
good job on big-endian.. As with most code you stare at long enough,
I found some improvements for big-endian too.
Little-endian support for strlen. Like most of the string functions,
I leave the main word or multiple-word loops substantially unchanged,
just needing to modify the tail.
Removing the branch in the power7 functions is just a tidy. .align
produces a branch anyway. Modifying regs in the non-power7 functions
is to suit the new little-endian tail.
* sysdeps/powerpc/powerpc64/power7/strlen.S (strlen): Add little-endian
support. Don't branch over align.
* sysdeps/powerpc/powerpc32/power7/strlen.S: Likewise.
* sysdeps/powerpc/powerpc64/strlen.S (strlen): Add little-endian support.
Rearrange tmp reg use to suit. Comment.
* sysdeps/powerpc/powerpc32/strlen.S: Likewise.
This copies the sparc version of sigstack.h, which gives powerpc
#define MINSIGSTKSZ 4096
#define SIGSTKSZ 16384
Before the VSX changes, struct rt_sigframe size was 1920 plus 128 for
__SIGNAL_FRAMESIZE giving ppc64 exactly the default MINSIGSTKSZ of
2048.
After VSX, ucontext increased by 256 bytes. Oops, we're over
MINSIGSTKSZ, so powerpc has been using the wrong value for quite a
while. Add another ucontext for TM and rt_sigframe is now at 3872,
giving actual MINSIGSTKSZ of 4000.
The glibc testcase that I was looking at was tst-cancel21, which
allocates 2*SIGSTKSZ (not because the test is trying to be
conservative, but because the test actually has nested signal stack
frames). We blew the allocation by 48 bytes when using current
mainline gcc to compile glibc (le ppc64).
The required stack depth in _dl_lookup_symbol_x from the top of the
next signal frame was 10944 bytes. I guess you'd want to add 288 to
that, implying an actual SIGSTKSZ of 11232.
* sysdeps/unix/sysv/linux/powerpc/bits/sigstack.h: New file.
Use conditional form of branch and link to avoid destroying the cpu
link stack used to predict blr return addresses.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/makecontext.S: Use
conditional form of branch and link when obtaining pc.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/makecontext.S: Likewise.
Alan Modra [Sat, 17 Aug 2013 09:05:40 +0000 (18:35 +0930)]
PowerPC ugly symbol versioning
http://sourceware.org/ml/libc-alpha/2013-08/msg00090.html
This patch fixes symbol versioning in setjmp/longjmp. The existing
code uses raw versions, which results in wrong symbol versioning when
you want to build glibc with a base version of 2.19 for LE.
Note that the merging the 64-bit and 32-bit versions in novmx-lonjmp.c
and pt-longjmp.c doesn't result in GLIBC_2.0 versions for 64-bit, due
to the base in shlib_versions.
Anton Blanchard [Sat, 17 Aug 2013 09:04:40 +0000 (18:34 +0930)]
PowerPC LE setjmp/longjmp
http://sourceware.org/ml/libc-alpha/2013-08/msg00089.html
Little-endian fixes for setjmp/longjmp. When writing these I noticed
the setjmp code corrupts the non volatile VMX registers when using an
unaligned buffer. Anton fixed this, and also simplified it quite a
bit.
The current code uses boilerplate for the case where we want to store
16 bytes to an unaligned address. For that we have to do a
read/modify/write of two aligned 16 byte quantities. In our case we
are storing a bunch of back to back data (consective VMX registers),
and only the start and end of the region need the read/modify/write.
Alan Modra [Sat, 17 Aug 2013 09:02:18 +0000 (18:32 +0930)]
PowerPC floating point little-endian [13 of 15]
http://sourceware.org/ml/libc-alpha/2013-08/msg00088.html
* sysdeps/powerpc/powerpc32/fpu/s_roundf.S: Increase alignment of
constants to usual value for .cst8 section, and remove redundant
high address load.
* sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S: Use float
constant for 0x1p52. Load little-endian words of double from
correct stack offsets.