This is the mail archive of the
`libc-alpha@sourceware.org`
mailing list for the glibc project.

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |

Other format: | [Raw text] |

*From*: "Joseph S. Myers" <joseph at codesourcery dot com>*To*: <libc-alpha at sourceware dot org>*Cc*: <olga dot kupriianova at lip6 dot fr>, <christoph dot lauter at ens-lyon dot org>*Date*: Tue, 23 Jul 2013 16:45:21 +0000*Subject*: libm function variants in glibc

The following are some comments on variants of libm functions in glibc, following reading the metalibm slides now the Cauldron slides have been posted to the GCC wiki. I'm talking about libm in general, although some areas such as <complex.h> functions don't currently seem to be considered by metalibm. 1. Discussions relating to what goes in glibc libm should of course take place on the libc-alpha mailing list; discussions elsewhere, as referred to on slide 24, are not particularly relevant to establishing community consensus. 2. See <http://sourceware.org/ml/libc-alpha/2013-05/msg00132.html> for a detailed description of what I think the accuracy goals are for glibc libm functions at present. 3. The principle of generating function implementations, for the cases using polynomial approximations, automatically using metalibm, certainly makes sense and seems better than the existing cases where tables of polynomial coefficients (or other precomputed values) were probably computed with a range of software, not all free software, and no documented way to recompute or verify those values exists, and there is no good way to determine error bounds either. 4. A large part of libm doesn't involve polynomial approximation or other precomputed values at all. Such tools as metalibm would seem to help much less for implementing all the new functions from TS 18661-1 (bindings to IEEE 754-2008), if there were interest in adding those to glibc, for example; the only ones for which polynomial approximation could be relevant at all are the formatOf sqrt variants (fsqrt, fsqrtl, dsqrtl - compute a square root as if to infinite precision and round once to a narrower type without double rounding). (This does not mean such tools are of no use in such cases - the ability to produce C code with a correctness proof would certainly be of value in cases such as fma - look at the number of bugs I found in the various fma implementations when reviewing/fixing them for correctness in all cases including non-default rounding modes, even though the basic algorithm, ignoring overflow and underflow cases, was proved correct by Boldo and Melquiond. But there you have the further issue that it's not quite just standard C being used - it's C with additional macros to ensure things get evaluated on the correct side of rounding-mode changes or exception tests, given the limitations on optimizations in GCC not respecting those things at present. That issue of interactions of GCC optimizations with rounding-mode changes and exception tests could also apply to other code generated by metalibm, I suppose.) These functions - functions that are in general fully specified for ISO C (+ Annex F), and for which self-contained theoretical arguments for correctness can be given that do not rely on exhaustive searches for worst cases for correct rounding - are also the most likely to have architecture-specific variants in glibc's sysdeps directories, because processors are quite likely to have instructions for some or all of them. But there are plenty of architecture-specific variants for other functions as well - quite a few for x86/x86_64, and almost the entire libm for ia64. Any comparison of new function implementations with old (whether for speed or for accuracy) needs to take the architecture-specific variants into account; in some cases it may be appropriate for an architecture to stop using an architecture-specific variant, in other cases not. 5. To the extent that *new* functions are referred to at the user API level - functions not currently provided by libm at all - I think we should be guided by ISO C API extensions from WG14; functions not in such API extensions are probably better added elsewhere (e.g. GSL) rather than speculatively provided in glibc. I think it *would* be reasonable to add any of the functions from ISO 24747 ("Extensions to the C Library, to Support Mathematical Special Functions"), although those are likely to be particularly hard to implement (involving many numerically sensitive cases). It would also be reasonable to add any of the functions listed in IEEE 754-2008 that aren't currently in the standard C library, *once a draft of TS 18661-4 is available to indicate what the C APIs for such functions, and special cases, should look like*. (This refers to functions such as sinpi. In the case of exp10, we already have it as an extension to ISO C. Most of these functions are straightforward to implement if you don't mind errors of a few ulp.) It would also be reasonable to add any of the functions C11 reserves for addition to <complex.h> (we already have clog10), with the caveat that it's not immediately obvious what branch cuts should look like for clgamma. The difficulty of implementing such functions varies. For any new functions, it would definitely help to have them implemented first in MPFR/MPC for convenience in generating testcases, although I don't think we can require that. 6. An important axis of variation in libm function implementations - though not in the interface provided to users - that doesn't seem to be mentioned in the metalibm slides (and I couldn't find anything about it in the metalibm implementation either) is the choice of the type used for arithmetic within the implementation, as distinct from the type used for arguments and return values. Thus, when implementing a float function (for example - similar issues apply to double functions as well), say sinf (and presuming float is IEEE binary32, as it always is in glibc), there are several choices, and metalibm would need to allow for this issue when being used to implement functions for glibc: (a) Implement using float as the internal implementation type. This is common in glibc, but maybe not such a good idea in reality. For most architectures it's likely use of double internally is more efficient for a given accuracy requirement than using float internally, as it may reduce the need for precision-extension techniques (Dekker etc.). In certain cases where float is supported in hardware but double isn't, it's conceivable using float internally could be better. (b) Implement using double as the internal implementation type (with double+double if correct rounding is required *and* just one double isn't accurate enough for the given inputs). This is likely to be appropriate in most cases. (c) Implement using long double as the internal implementation type, where long double is the x86/ia64/m68k "extended" type (the m68k version being slightly different from the others). This is indicated on 32-bit x86, where you simply can't use float or double as internal type reliably because of the workings of the x87 FPU. (You can set and restore the precision control bits, and some glibc libm functions do so, but the handling of values with out-of-range exponent will still depend on which computed values GCC decides to spill to memory, so any proofs would need to allow for that, and setting and restoring precision control has its own performance cost.) It may be indicated on some m68k systems, for similar reasons. I don't know whether it's appropriate on ia64 (that probably depends on the performance characteristics of float / double / long double arithmetic, since the hardware properly supports all of those). It's probably not appropriate on x86_64, since the SSE floating-point unit is supposed to be faster than the x87 one. And on 32-bit x86, it may be appropriate to use STT_GNU_IFUNC so that functions can be implemented using SSE2 on hardware supporting that and only using x87 for the "long double" functions or for hardware without SSE2. (d) Implement using long double as the internal implementation type, where long double is binary128. Most platforms with binary128 implement it in software, meaning this is not the most efficient approach. I think S/390 may implement it in hardware, however (at least, GCC has instruction patterns for it; I don't know if the instructions are actually implemented in hardware or through kernel emulation). (e) Implement with software floating point (using soft-fp interfaces directly). Although more efficient on systems without hardware floating-point support than using precision-extension techniques layered on top of software floating point, I don't expect anyone actually to implement this, as users seriously concerned about efficiency of floating-point functions are unlikely to be using such processors. So this option is just listed for completeness. 7. A persistent source of issues in glibc's libm is powerpc with IBM long double. I don't know if there's anything metalibm can do there; such a non-IEEE type is generally problematic for lots of things that work well with IEEE floating-point types. One might imagine it could be used to generate polynomial coefficients, even if precise error bounds can't usefully be proved when this type is involved. 8. 12000 function variants (80 per (operation, type) pair), as referred to in the metalibm slides, would clearly be absurd for glibc. We need to keep the size of the libm.so binary under control. Also, the interface provided by libm.so under a particular symbol name needs to be well-defined, not depending on how glibc is configured, and once such a symbol is exported we need to support it forever. Right now, we have *two* variants for most functions: (i) The default version, with a name such as "sin" - see the message I referred to under 2. above for the goals for such a function. (ii) A version __<func>_finite, used when a user program is built with -ffinite-math-only. For glibc libm, -ffinite-math-only is considered to imply -fno-math-errno (all this means is no errno settings on underflow, since -ffinite-math-only implies that no other inputs that could involve errno setting are valid, and ISO C leaves it implementation-defined whether errno is set on underflow even when (math_errhandling & MATH_ERRNO) is nonzero). The relevant existing GCC options are -ffast-math (an option implying several others), -funsafe-math-optimizations (which itself implies a few others), -ffinite-math-only, -fno-math-errno, -fno-signaling-nans (default), -fno-rounding-math (default), -fcx-limited-range, -fno-trapping-math, -fno-signed-zeros, -fassociative-math, -freciprocal-math. Trying to define libm functions variants for all combinations of these would be excessive. I suggest that the *most* that makes sense is the following eight variants, and this could reasonably be cut further. (a) Correctly rounded. These should be named e.g. crsin following TS 18661-4, when provided at all. They should handle all rounding modes, produce accurate exceptions including "inexact", and accurate errno. (b) -ffast-math. This would imply no error or exception guarantees, only finite arguments and results need be handled, and the result is within a few ulp of the correct result for inputs within a few ulp of the input value actually passed to the function (so there could be large errors in numerically unstable regions). (c) General-purpose, with goals as described before in 2.; these would get the linkage names such as "sin", and would handle all rounding modes and errno and exception (except "inexact") setting. If automatically generated with error guarantees, say < 1ulp in round-to-nearest and < 2ulp in other modes. The same goals would apply for all the following variants with the implied limitations (-ffinite-math-only means no need to handle non-finite inputs or outputs or to set errno on underflow; -fno-math-errno means no need to set errno for any inputs; -fno-rounding-math means the function can assume it is called with round-to-nearest mode). (d) -ffinite-math-only (these entry points already exist). (e) -fno-math-errno (in most cases, these could in fact alias the -ffinite-math-only entry points as all those do at present is bypass errno-setting wrappers). (f) -fno-rounding-math. (g) -fno-math-errno -fno-rounding-math. (h) -ffinite-math-only -fno-math-errno -fno-rounding-math. One might reasonably exclude the (e), (f) and (g) entry points to reduce the interface further, to five entry points per function (and many in practice wouldn't have the correctly-rounded entry point, at least for long double). To keep libm code size down, I'd expect all of (c) to (h) to be implemented as alternative entry points (maybe aliases in some cases) to a common core implementation rather than duplicating code for them. At present this might mean using some form of wrapper functions, as are presently used for errno setting, but one might imagine adding a GCC extension to support other forms of multiple entry points without such separate functions. To keep the dynamic symbol table size down, entry points other than for (a) and (c) should only be provided where actually needed; all other cases require header magic to map a function to an appropriate linkage name, and such magic should use a minimal number of linkage names; once such a name is added, we're then stuck with supporting it even if a future implementation can use an alias. For example, on platforms where long double has the same representation as double, glibc needs to provide symbols such as "sinl" (ISO C allows declaring such functions yourself rather than using <math.h>), although it's an alias to "sin". But there is no need to provide __sinl_finite, and the ABI size is kept down by mapping sinl calls with -ffinite-math-only directly to __sin_finite rather than putting __sinl_finite in the ABI. This applies to all cases where different functions can be aliases: don't add an entry point to the ABI unless and until it's needed, whether because it is actually providing something for which an alias would not suffice, or because it's required by ISO C or an ISO C extension. Note that GCC does not predefine macros for -fno-math-errno or -fno-rounding-math. If you wanted to define entry points to glibc for those options, you might want to make GCC predefine such macros first. The same applies for -fno-trapping-math, where I haven't proposed entry points in the list above but they might make sense in a limited number of cases where glibc needs to go to special lengths to *avoid* generating spurious "inexact" exceptions (so -fno-trapping-math could use entry points that don't worry about spurious exceptions). 9. It would be reasonable to implement the TS 18661-3 functions for particular IEEE 754-2008 floating-point types. The main thing needed there would be GCC support for _Float32 etc.; then it would "just" be a matter of adding aliases and header declarations (without making glibc's ABI depend on the compiler support). The points made under 8. above apply to determining what does or does not need an alias in the ABI. Such an addition could eventually obsolete GCC's libquadmath by providing the _Float128 functions directly in glibc's libm under the names proposed for TS 18661-3. It would however be necessary to have the ldbl-128 functions implemented in such a way that they can be used both for long double (_Float128 functions being aliases) and for _Float128 when that type is not long double. The fact that functions for binary128 might need to call the binary128 type by different C names (and use different suffixes on constants etc.) depending on the architecture is something to consider in making metalibm generate such functions. (Right now there is an error-prone manual editing process involved in adapting the ldbl-128 functions for libquadmath.) 10. None of the above considers entry points whose ABI, other than the symbol name, is different from that for standard libm functions. The main case for that is probably entry points taking and returning vectors. The general issue of keeping down the ABI size applies to any such entry points, as does the issue of avoiding the libm ABI depending on what vector ISA extensions are supported by the processor for which glibc is built. Al Grant has suggested the consideration of vector types for C as part of a discussion of parallel language extensions (discussion starting at <http://www.open-std.org/pipermail/cplex/2013-June/000010.html>); it's not clear what if anything might eventually come of that. -- Joseph S. Myers joseph@codesourcery.com

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |