Alan Modra [Sat, 17 Aug 2013 09:02:18 +0000 (18:32 +0930)]
PowerPC floating point little-endian [13 of 15]
http://sourceware.org/ml/libc-alpha/2013-08/msg00088.html
* sysdeps/powerpc/powerpc32/fpu/s_roundf.S: Increase alignment of
constants to usual value for .cst8 section, and remove redundant
high address load.
* sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S: Use float
constant for 0x1p52. Load little-endian words of double from
correct stack offsets.
Alan Modra [Sat, 17 Aug 2013 09:00:23 +0000 (18:30 +0930)]
PowerPC floating point little-endian [10 of 15]
http://sourceware.org/ml/libc-alpha/2013-07/msg00201.html
These two functions oddly test x+1>0 when a double x is >= 0.0, and
similarly when x is negative. I don't see the point of that since the
test should always be true. I also don't see any need to convert x+1
to integer rather than simply using xr+1. Note that the standard
allows these functions to return any value when the input is outside
the range of long long, but it's not too hard to prevent xr+1
overflowing so that's what I've done.
(With rounding mode FE_UPWARD, x+1 can be a lot more than what you
might naively expect, but perhaps that situation was covered by the
x - xrf < 1.0 test.)
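
For illustration only, the idea of stepping xr directly while guarding against overflow can be sketched in portable C. This is not the glibc source: the name round_half_away is hypothetical, and the sketch assumes the input already lies within the range of long long (outside that range the initial conversion is undefined in C).

```c
#include <limits.h>

/* Hypothetical helper, not the glibc implementation: round to nearest
   with halfway cases away from zero.  Assumes x is within the range
   of long long.  */
long long
round_half_away (double x)
{
  long long xr = (long long) x;	/* Conversion truncates toward zero.  */
  double xrf = (double) xr;

  if (x >= 0.0)
    {
      /* Step away from zero, but never past LLONG_MAX.  */
      if (x - xrf >= 0.5 && xr < LLONG_MAX)
	xr += 1;
    }
  else
    {
      /* Mirror image for negative inputs, guarded at LLONG_MIN.  */
      if (xrf - x >= 0.5 && xr > LLONG_MIN)
	xr -= 1;
    }
  return xr;
}
```

The guards make the increment and decrement safe without converting x+1 to an integer a second time.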
Alan Modra [Sat, 17 Aug 2013 08:59:43 +0000 (18:29 +0930)]
PowerPC floating point little-endian [9 of 15]
http://sourceware.org/ml/libc-alpha/2013-07/msg00200.html
This works around the fact that vsx is disabled in current
little-endian gcc. Also, float constants take 4 bytes in memory
vs. 16 bytes for vector constants, and we don't need to write one lot
of masks for double (register format) and another for float (mem
format).
* sysdeps/powerpc/fpu/s_float_bitwise.h (__float_and_test28): Don't
use vector int constants.
(__float_and_test24, __float_and8, __float_get_exp): Likewise.
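
As a generic illustration of masking a float in its 4-byte memory format, the following sketch extracts the biased exponent with a plain 32-bit integer mask. It is not the technique used in s_float_bitwise.h, whose contents are not shown here; the helper name float_biased_exp is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the 8-bit biased exponent field of an
   IEEE single-precision value.  memcpy avoids aliasing problems and
   works the same way on big- and little-endian hosts.  */
static inline uint32_t
float_biased_exp (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (bits & UINT32_C (0x7f800000)) >> 23;
}
```

A single 32-bit mask suffices because the value is handled in its memory format rather than in the double-width register format.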
Alan Modra [Sat, 17 Aug 2013 08:57:19 +0000 (18:27 +0930)]
PowerPC floating point little-endian [6 of 15]
http://sourceware.org/ml/libc-alpha/2013-07/msg00197.html
A rewrite to make this code correct for little-endian.
* sysdeps/ieee754/ldbl-128ibm/e_sqrtl.c (mynumber): Replace
union 32-bit int array member with 64-bit int array.
(t515, tm256): Double rather than long double.
(__ieee754_sqrtl): Rewrite using 64-bit arithmetic.
Alan Modra [Sat, 17 Aug 2013 08:56:39 +0000 (18:26 +0930)]
PowerPC floating point little-endian [5 of 15]
http://sourceware.org/ml/libc-alpha/2013-08/msg00085.html
Rid ourselves of ieee854.
* sysdeps/ieee754/ldbl-128ibm/ieee754.h (union ieee854_long_double):
Delete.
(IEEE854_LONG_DOUBLE_BIAS): Delete.
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h: Don't include ieee854
version of math_ldbl.h.
Alan Modra [Sat, 17 Aug 2013 08:55:51 +0000 (18:25 +0930)]
PowerPC floating point little-endian [4 of 15]
http://sourceware.org/ml/libc-alpha/2013-08/msg00084.html
Another batch of ieee854 macros and union replacement. These four
files also have bugs fixed with this patch. The fact that the two
doubles in an IBM long double may have different signs means that
negation and absolute value operations can't just twiddle one sign bit
as you can with ieee854 style extended double. fmodl, remainderl,
erfl and erfcl all had errors of this type. erfl also returned +1 for
large magnitude negative input where it should return -1. The hypotl
error is innocuous since the value adjusted twice is only used as a
flag. The e_hypotl.c tests for large "a" and small "b" are mutually
exclusive because we've already exited when x/y > 2**120. That allows
some further small simplifications.
[BZ #15734], [BZ #15735]
* sysdeps/ieee754/ldbl-128ibm/e_fmodl.c (__ieee754_fmodl): Rewrite
all uses of ieee854 long double macros and unions. Simplify test
for 0.0L. Correct |x|<|y| and |x|=|y| test. Use
ldbl_extract_mantissa value for ix,iy exponents. Properly
normalize after ldbl_extract_mantissa, and don't add hidden bit
already handled. Don't treat low word of ieee854 mantissa like
low word of IBM long double and mask off bit when testing for
zero.
* sysdeps/ieee754/ldbl-128ibm/e_hypotl.c (__ieee754_hypotl): Rewrite
all uses of ieee854 long double macros and unions. Simplify tests
for 0.0L and inf. Correct double adjustment of k. Delete dead code
adjusting ha,hb. Simplify code setting kld. Delete two600 and
two1022, instead use their values. Recognise that tests for large
"a" and small "b" are mutually exclusive. Rename vars. Comment.
* sysdeps/ieee754/ldbl-128ibm/e_remainderl.c (__ieee754_remainderl):
Rewrite all uses of ieee854 long double macros and unions. Simplify
test for 0.0L and nan. Correct negation.
* sysdeps/ieee754/ldbl-128ibm/s_erfl.c (__erfl): Rewrite all uses of
ieee854 long double macros and unions. Correct output for large
magnitude x. Correct absolute value calculation.
(__erfcl): Likewise.
* math/libm-test.inc: Add tests for errors discovered in IBM long
double versions of fmodl, remainderl, erfl and erfcl.
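
The sign issue described above can be shown with a small model. The struct dd below is a hypothetical stand-in for an IBM long double (value = hi + lo), not a glibc type, and the helpers are a sketch rather than the patched code.

```c
#include <math.h>

/* Hypothetical model of an IBM long double: the value is hi + lo,
   and the two doubles may carry different signs.  */
struct dd
{
  double hi;
  double lo;
};

/* Negation must flip both components; flipping only hi would change
   the magnitude whenever lo is non-zero.  */
static struct dd
dd_neg (struct dd x)
{
  x.hi = -x.hi;
  x.lo = -x.lo;
  return x;
}

/* Absolute value negates the whole pair when the high double is
   negative, instead of clearing a single sign bit.  */
static struct dd
dd_abs (struct dd x)
{
  return signbit (x.hi) ? dd_neg (x) : x;
}
```

For example, the pair (1.0, -0x1p-60) is slightly less than 1. Negating only the high double would give (-1.0, -0x1p-60), which has a larger magnitude than the original value.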
Alan Modra [Sat, 17 Aug 2013 08:54:58 +0000 (18:24 +0930)]
PowerPC floating point little-endian [3 of 15]
http://sourceware.org/ml/libc-alpha/2013-08/msg00083.html
Further replacement of ieee854 macros and unions. These files also
have some optimisations for comparison against 0.0L, infinity and nan.
Since the ABI specifies that the high double of an IBM long double
pair is the value rounded to double, a high double of 0.0 means the
low double must also be 0.0. The ABI also says that infinity and
nan are encoded in the high double, with the low double unspecified.
This means that tests for 0.0L, +/-Infinity and +/-NaN need only check
the high double.
* sysdeps/ieee754/ldbl-128ibm/e_atan2l.c (__ieee754_atan2l): Rewrite
all uses of ieee854 long double macros and unions. Simplify tests
for long doubles that are fully specified by the high double.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_ilogbl.c (__ieee754_ilogbl): Likewise.
Remove dead code too.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
(__ieee754_ynl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_log10l.c (__ieee754_log10l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_logl.c (__ieee754_logl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise.
Remove dead code too.
* sysdeps/ieee754/ldbl-128ibm/k_tanl.c (__kernel_tanl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_isinf_nsl.c (__isinf_nsl): Likewise.
Simplify.
* sysdeps/ieee754/ldbl-128ibm/s_isinfl.c (___isinfl): Likewise.
Simplify.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_modfl.c (__modfl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c (__nextafterl): Likewise.
Comment on variable precision.
* sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c (__nexttoward): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c (__nexttowardf):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c (__scalblnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (__scalbnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_tanhl.c (__tanhl): Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Adjust tan_towardzero ulps.
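
A minimal sketch of the high-double-only tests, assuming a hypothetical struct ibm_pair that models the two doubles of an IBM long double. The helper names are illustrative and do not appear in the patch.

```c
#include <math.h>

/* Hypothetical model: hi is the value rounded to double, lo is the
   remainder.  Not a glibc type.  */
struct ibm_pair
{
  double hi;
  double lo;
};

/* A high double of 0.0 implies the low double is 0.0 as well.  */
static int
pair_is_zero (struct ibm_pair x)
{
  return x.hi == 0.0;
}

/* Infinity and NaN are encoded in the high double; the low double is
   unspecified and therefore ignored.  */
static int
pair_is_inf (struct ibm_pair x)
{
  return isinf (x.hi) != 0;
}

static int
pair_is_nan (struct ibm_pair x)
{
  return isnan (x.hi) != 0;
}
```

Each test reads one double, so none of them depends on whatever happens to be stored in the low half.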
Alan Modra [Sat, 17 Aug 2013 08:54:05 +0000 (18:24 +0930)]
PowerPC floating point little-endian [2 of 15]
http://sourceware.org/ml/libc-alpha/2013-08/msg00082.html
This patch replaces occurrences of GET_LDOUBLE_* and SET_LDOUBLE_*
macros, and union ieee854_long_double_shape_type in ldbl-128ibm/,
and a stray one in the 32-bit fpu support. These files have no
significant changes apart from rewriting the long double bit access.
Alan Modra [Sat, 17 Aug 2013 08:51:58 +0000 (18:21 +0930)]
PowerPC floating point little-endian [1 of 15]
http://sourceware.org/ml/libc-alpha/2013-08/msg00081.html
This is the first of a series of patches to ban ieee854_long_double
and the ieee854_long_double macros when using IBM long double. union
ieee854_long_double just isn't correct for IBM long double, especially
when little-endian, and pretending it is OK has allowed a number of
bugs to remain undetected in sysdeps/ieee754/ldbl-128ibm/.
This changes the few places in generic code that use it.
* stdio-common/printf_size.c (__printf_size): Don't use
union ieee854_long_double in fpnum union.
* stdio-common/printf_fphex.c (__printf_fphex): Likewise. Use
signbit macro to retrieve sign from long double.
* stdio-common/printf_fp.c (___printf_fp): Use signbit macro to
retrieve sign from long double.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c: Adjust for fpnum change.
* sysdeps/ieee754/ldbl-128/printf_fphex.c: Likewise.
* sysdeps/ieee754/ldbl-96/printf_fphex.c: Likewise.
* sysdeps/x86_64/fpu/printf_fphex.c: Likewise.
* math/test-misc.c (main): Don't use union ieee854_long_double.
ports/
* sysdeps/ia64/fpu/printf_fphex.c: Adjust for fpnum change.
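
As a small illustration of why the signbit macro is a format-independent way to read the sign, the sketch below wraps it in a hypothetical helper, ld_is_negative. It relies only on the C99 signbit macro from <math.h>.

```c
#include <math.h>

/* Hypothetical helper: report the sign of a long double without
   touching its representation.  Unlike "x < 0", this also reports
   the sign of negative zero.  */
static int
ld_is_negative (long double x)
{
  return signbit (x) != 0;
}
```

Because no union or bit-field layout is involved, the same source works for any long double format.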
Alan Modra [Sat, 17 Aug 2013 08:49:44 +0000 (18:19 +0930)]
Fix for [BZ #15680] IBM long double inaccuracy
http://sourceware.org/ml/libc-alpha/2013-06/msg00919.html
I discovered a number of places where denormals and other corner cases
were being handled wrongly.
- printf_fphex.c: Testing for the low double exponent being zero is
unnecessary. If the difference in exponents is less than 53 then the
high double exponent must be nearing the low end of its range, and the
low double exponent hit rock bottom.
- ldbl2mpn.c: A denormal (ie. exponent of zero) value is treated as
if the exponent was one, so shift mantissa left by one. Code handling
normalisation of the low double mantissa lacked a test for shift count
greater than bits in type being shifted, and lacked anything to handle
the case where the difference in exponents is less than 53 as in
printf_fphex.c.
- math_ldbl.h (ldbl_extract_mantissa): Same as above, but worse, with
code testing for exponent > 1 for some reason, probably a typo for >= 1.
- math_ldbl.h (ldbl_insert_mantissa): Round the high double as per
mpn2ldbl.c (hi is odd or explicit mantissas non-zero) so that the
number we return won't change when applying ldbl_canonicalize().
Add missing overflow checks and normalisation of high mantissa.
Correct misleading comment: "The hidden bit of the lo mantissa is
zero" is not always true as can be seen from the code rounding the hi
mantissa. Also by inspection, lzcount can never be less than zero so
remove that test. Lastly, masking bitfields to their widths can be
left to the compiler.
- mpn2ldbl.c: The overflow checks here on rounding of high double were
just plain wrong. Incrementing the exponent must be accompanied by a
shift right of the mantissa to keep the value unchanged. Above notes
for ldbl_insert_mantissa are also relevant.
[BZ #15680]
* sysdeps/ieee754/ldbl-128ibm/e_rem_pio2l.c: Comment fix.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c
(PRINT_FPHEX_LONG_DOUBLE): Tidy code by moving -53 into ediff
calculation. Remove unnecessary test for denormal exponent.
* sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c (__mpn_extract_long_double):
Correct handling of denormals. Avoid undefined shift behaviour.
Correct normalisation of low mantissa when low double is denormal.
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h
(ldbl_extract_mantissa): Likewise. Comment. Use uint64_t* for hi64.
(ldbl_insert_mantissa): Make both hi64 and lo64 parms uint64_t.
Correct normalisation of low mantissa. Test for overflow of high
mantissa and normalise.
(ldbl_nearbyint): Use more readable constant for two52.
* sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c
(__mpn_construct_long_double): Fix test for overflow of high
mantissa and correct normalisation. Avoid undefined shift.
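
Two of the points above lend themselves to a short sketch: guarding a shift whose count may reach the width of the type, and keeping the value unchanged when rounding carries out of the mantissa. The names shl64, sig53 and sig53_increment are hypothetical, and the code is a simplified model rather than the glibc routines.

```c
#include <stdint.h>

/* Shifting a 64-bit value by 64 or more is undefined in C, so the
   count is tested first.  */
static uint64_t
shl64 (uint64_t v, unsigned int n)
{
  return n < 64 ? v << n : 0;
}

/* Hypothetical model: value = mant * 2^exp with mant below 2^53.  */
struct sig53
{
  uint64_t mant;
  int exp;
};

/* Round up by one unit in the last place.  If the increment carries
   out of the 53-bit field, shift the mantissa right and bump the
   exponent together so the represented value is preserved.  */
static struct sig53
sig53_increment (struct sig53 v)
{
  v.mant += 1;
  if (v.mant >> 53)
    {
      v.mant >>= 1;
      v.exp += 1;
    }
  return v;
}
```

Incrementing a mantissa of 2^53 - 1 yields 2^53, which no longer fits; after the paired shift and exponent bump it becomes 2^52 with the exponent raised by one, the same value.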
Alan Modra [Sat, 17 Aug 2013 08:42:56 +0000 (18:12 +0930)]
IBM long double mechanical changes to support little-endian
http://sourceware.org/ml/libc-alpha/2013-07/msg00001.html
This patch starts the process of supporting powerpc64 little-endian
long double in glibc. IBM long double is an array of two ieee
doubles, so making union ibm_extended_long_double reflect this fact is
the correct way to access fields of the doubles.
* sysdeps/ieee754/ldbl-128ibm/ieee754.h
(union ibm_extended_long_double): Define as an array of ieee754_double.
(IBM_EXTENDED_LONG_DOUBLE_BIAS): Delete.
* sysdeps/ieee754/ldbl-128ibm/printf_fphex.c: Update all references
to ibm_extended_long_double and IBM_EXTENDED_LONG_DOUBLE_BIAS.
* sysdeps/ieee754/ldbl-128ibm/e_exp10l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_expl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/ldbl2mpn.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/math_ldbl.h: Likewise.
* sysdeps/ieee754/ldbl-128ibm/mpn2ldbl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nearbyintl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/strtold_l.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c: Likewise.
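
The array-of-two-doubles view can be sketched as follows. The union pair_view and its helpers are hypothetical and only model the idea; the real union ibm_extended_long_double is built from union ieee754_double and is not reproduced here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: element 0 is the high double and element 1 the
   low double.  The element order does not depend on byte order; only
   the bytes inside each double do.  */
union pair_view
{
  double d[2];
};

/* Biased exponent of one IEEE double, read through its 64-bit
   pattern.  */
static unsigned int
double_biased_exp (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  return (unsigned int) ((bits >> 52) & 0x7ff);
}

static unsigned int
pair_hi_exp (const union pair_view *u)
{
  return double_biased_exp (u->d[0]);
}

static unsigned int
pair_lo_exp (const union pair_view *u)
{
  return double_biased_exp (u->d[1]);
}
```

Accessing each half as a complete double leaves the per-double byte order to the compiler, which is what makes the approach endian-neutral.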
PowerPC: fix backtrace to handle signal trampolines
This patch fixes backtrace for PPC32 and PPC64 to correctly handle
signal trampolines. The 'debug/tst-backtrace6.c' test also checks
SA_SIGINFO handling, where it triggers another vDSO symbol for PPC32.
David S. Miller [Tue, 12 Nov 2013 20:48:01 +0000 (12:48 -0800)]
Fix sparc 64-bit GMP ifunc resolution in static builds.
[BZ #16150]
* sysdeps/sparc/sparc64/multiarch/add_n.S: Resolve to the correct generic
symbol in the non-vis3 case in static builds.
* sysdeps/sparc/sparc64/multiarch/addmul_1.S: Likewise.
* sysdeps/sparc/sparc64/multiarch/mul_1.S: Likewise.
* sysdeps/sparc/sparc64/multiarch/sub_n.S: Likewise.
* sysdeps/sparc/sparc64/multiarch/submul_1.S: Likewise.
Fix PI mutex check in pthread_cond_broadcast and pthread_cond_signal
Fixes BZ #15996.
The check had a typo - it checked for PTHREAD_MUTEX_ROBUST_NP instead
of PTHREAD_MUTEX_ROBUST_NORMAL_NP. It has now been replaced by the
already existing convenience macro USE_REQUEUE_PI.
PowerPC: use _dl_static_init to set GLRO(gl_pagesize)
This patch fixes dlfcn/tststatic5 for PowerPC where pagesize
variable was not properly initialized in certain cases. This patch
is based on other architecture code.
Carlos O'Donell [Fri, 19 Jul 2013 06:42:03 +0000 (02:42 -0400)]
CVE-2013-2207, BZ #15755: Disable pt_chown.
The helper binary pt_chown could be tricked into granting access to
another user's pseudo-terminal.
Pre-conditions for the attack:
* Attacker with local user account
* Kernel with FUSE support
* "user_allow_other" in /etc/fuse.conf
* Victim with allocated slave in /dev/pts
Using the setuid installed pt_chown and a weak check on whether a file
descriptor is a tty, an attacker could fake a pty check using FUSE and
trick pt_chown to grant ownership of a pty descriptor that the current
user does not own. It cannot access /dev/pts/ptmx, however.
In most modern distributions pt_chown is not needed because devpts
is enabled by default. The fix for this CVE is to disable building
and using pt_chown by default. We still provide a configure option
to enable the use of pt_chown but distributions do so at their own
risk.
Carlos O'Donell [Tue, 16 Jul 2013 21:55:43 +0000 (17:55 -0400)]
BZ #15711: Avoid circular dependency for syscall.h
The generated header is compiled with `-ffreestanding' to avoid any
circular dependencies against the installed implementation headers.
Such a dependency would require the implementation header to be
installed before the generated header could be built (See bug 15711).
In current practice the generated header dependencies do not include
any of the implementation headers removed by the use of `-ffreestanding'.
---
2013-07-15 Carlos O'Donell <carlos@redhat.com>
[BZ #15711]
* sysdeps/unix/sysv/linux/Makefile ($(objpfx)bits/syscall%h):
Avoid system header dependency with -ffreestanding.
($(objpfx)bits/syscall%d): Likewise.
Chris Metcalf [Wed, 3 Jul 2013 18:48:39 +0000 (14:48 -0400)]
tile: use _dl_static_init to set GLRO(gl_pagesize)
A recently-added test (dlfcn/tststatic5) pointed out that tile was not
properly initializing the variable pagesize in certain cases. This
change just copies the existing code from MIPS.
Chris Metcalf [Wed, 3 Jul 2013 15:23:01 +0000 (11:23 -0400)]
tile: use soft-fp for fma() and fmaf()
The sfp-machine.h is based on the gcc version, but extended with
required new macros by comparison with other architectures and by
investigating the hardware support for FP on tile.
Andi Kleen [Thu, 27 Jun 2013 18:15:06 +0000 (11:15 -0700)]
Disable elision for any pthread_mutexattr_settype call
PTHREAD_MUTEX_NORMAL requires deadlock for nesting, DEFAULT
does not. Since glibc uses the same value (0) for both, disable
elision for any call to pthread_mutexattr_settype() with a 0 value.
This implies that a program can disable elision by doing
pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_NORMAL)
Andi Kleen [Sat, 22 Dec 2012 09:03:04 +0000 (01:03 -0800)]
Add elision to pthread_mutex_{try,timed,un}lock
Add elision paths to the basic mutex locks.
The normal path has a check for RTM and upgrades the lock
to RTM when available. Trylocks cannot automatically upgrade,
so they check for elision every time.
We use a 4 byte value in the mutex to store the lock
elision adaptation state. This is separate from the adaptive
spin state and uses a separate field.
Condition variables currently do not support elision.
Recursive mutexes and condition variables may be supported at some point,
but are not in the current implementation. Also "trylock" will
not automatically enable elision unless some other lock call
has been already called on the lock.
This version does not use IFUNC, so it means every lock has one
additional check for elision. Benchmarking showed the overhead
to be negligible.
Andi Kleen [Fri, 28 Jun 2013 12:19:37 +0000 (05:19 -0700)]
Add minimal test suite changes for elision enabled kernels
tst-mutex5 and tst-mutex8 test some behaviour that POSIX does not
require and that elision changes. This changes these tests to skip
those checks when elision is enabled at configure time.
Andi Kleen [Sat, 10 Nov 2012 08:51:26 +0000 (00:51 -0800)]
Add the low level infrastructure for pthreads lock elision with TSX
Lock elision using TSX is a technique to optimize lock scaling.
It allows locks to run in parallel using hardware support for
a transactional execution mode in 4th generation Intel Core CPUs.
See http://www.intel.com/software/tsx for more information.
This patch implements a simple adaptive lock elision algorithm based
on RTM. It enables elision for the pthread mutexes and rwlocks.
The algorithm keeps track of whether a mutex successfully elides or not,
and stops eliding for some time when it does not.
When the CPU supports RTM the elision path is automatically tried,
otherwise any elision is disabled.
The adaptation algorithm and its tuning are currently preliminary.
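
The adaptive behaviour described above can be modelled without any TSX instructions. Everything in this sketch is hypothetical: the struct, the SKIP_AFTER_ABORT value and the callback standing in for a hardware transaction are illustrative assumptions, not the lll_ code or its tuning.

```c
#include <stdbool.h>

/* Hypothetical per-lock adaptation state.  */
struct elide_state
{
  int skip_count;	/* Remaining lock calls that bypass elision.  */
};

/* Arbitrary illustrative value, not the glibc tuning.  */
enum { SKIP_AFTER_ABORT = 3 };

/* Stand-ins for a transaction attempt that aborts or commits.  */
static bool
always_abort (void)
{
  return false;
}

static bool
always_commit (void)
{
  return true;
}

/* Returns true if the lock was elided, false if the caller must take
   the real lock.  After a failed attempt, elision is skipped for the
   next SKIP_AFTER_ABORT calls.  */
bool
lock_with_elision (struct elide_state *st, bool (*try_txn) (void))
{
  if (st->skip_count > 0)
    {
      st->skip_count--;
      return false;
    }
  if (try_txn ())
    return true;
  st->skip_count = SKIP_AFTER_ABORT;
  return false;
}
```

With these stand-ins, one aborted attempt makes the next three calls take the real lock, after which elision is tried again.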
The code adds some checks to the lock fast paths. Micro-benchmarks
show little to no difference without RTM.
This patch implements the low level "lll_" code for lock elision.
Follow-on patches hook this into the pthread implementation.
Changes with the RTM mutexes:
-----------------------------
Lock elision in pthreads is generally compatible with existing programs.
There are some obscure exceptions, which are expected to be uncommon.
See the manual for more details.
- A broken program that unlocks a free lock will crash.
There are ways around this with some tradeoffs (more code in hot paths).
I'm still undecided on what approach to take here; have to wait for testing reports.
- pthread_mutex_destroy of a locked mutex will not return EBUSY but 0.
- There's also a similar situation with trylock outside the mutex,
"knowing" that the mutex must be held due to some other condition.
In this case an assert failure cannot be recovered. This situation is
usually an existing bug in the program.
- Same applies to the rwlocks. Some of the return values change
(for example there is no EDEADLK for an elided lock, unless it aborts.
However, when elided it will also never deadlock, of course).
- Timing changes, so broken programs that make assumptions about specific timing
may expose already existing latent problems. Note that these broken programs will
break in other situations too (loaded system, new faster hardware, compiler
optimizations etc.)
- Programs with non recursive mutexes that take them recursively in a thread and
which would always deadlock without elision may not always see a deadlock.
The deadlock will only happen on an early or delayed abort (which typically
happens at some point)
This only happens for mutexes not explicitly set to PTHREAD_MUTEX_NORMAL
or PTHREAD_MUTEX_ADAPTIVE_NP. PTHREAD_MUTEX_NORMAL mutexes do not elide.
The elision default can be set at configure time.
This patch implements the basic infrastructure for elision.