What is an indirect function (IFUNC)?
The GNU indirect function support (IFUNC) is a feature of the GNU toolchain that allows a developer to create multiple implementations of a given function and to select amongst them at runtime using a resolver function which is also written by the developer. The resolver function is called by the dynamic loader during early startup to resolve which of the implementations will be used by the application. Once an implementation choice is made it is fixed and may not be changed for the lifetime of the process.
How do indirect functions work?
In a traditional binary there are compile-time and runtime choices about which function a source-level call ends up invoking. Your source code may call memcpy, but that call might be handled by a compiler builtin (no library call results) or by a library call (a call to an implementation of memcpy in a library). Further, at runtime, ELF symbol interposition means that any preloaded library might define memcpy and so become the definition that is bound to the program and used by every caller of that named function (ignoring dlmopen isolated namespaces, where the choice can be made again). ELF interposition, however, is decided purely by library load order and symbol resolution.
IFUNC takes this a step further: once a symbol has been selected, the symbol itself can further differentiate which implementation is used. Symbols of type STT_GNU_IFUNC (a GNU-specific extension) are treated differently from normal symbols. An IFUNC symbol points to the resolver function rather than to an implementation, and the choice of implementation is deferred until runtime. References to the function are handled indirectly via R_*_IRELATIVE relocations, which evaluate to the result of running the resolver, i.e. a function pointer to the chosen implementation. Thus, when the function is bound (at startup, or on first call under lazy binding), the dynamic loader runs the resolver to determine the best implementation to use, that choice is remembered (the PLT is updated), and the function is then called.
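The "resolve once, remember the choice, then call" behaviour can be imitated in portable C with a function pointer. The following is only a sketch of the idea and not how the dynamic loader is implemented: the pointer sum_slot stands in for the PLT entry, and the names sum, sum_simple, sum_unrolled, resolve_sum and resolver_runs are all hypothetical. The selection test in the resolver is a placeholder for a real hardware check.

```c
#include <assert.h>
#include <stddef.h>

typedef long sum_fn(const int *v, size_t n);

/* Two interchangeable implementations of the same function. */
static long sum_simple(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

static long sum_unrolled(const int *v, size_t n)
{
    long a = 0, b = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        a += v[i];
        b += v[i + 1];
    }
    if (i < n)
        a += v[i];
    return a + b;
}

int resolver_runs; /* counts how often the resolver is invoked */

/* Plays the role of the developer-written resolver. */
static sum_fn *resolve_sum(void)
{
    resolver_runs++;
    /* Placeholder condition, standing in for a real CPU feature test. */
    return sizeof(void *) >= 8 ? sum_unrolled : sum_simple;
}

static long sum_stub(const int *v, size_t n);

/* Plays the role of the PLT entry: initially it points at the stub. */
static sum_fn *sum_slot = sum_stub;

/* First call lands here: run the resolver, remember the answer, call it. */
static long sum_stub(const int *v, size_t n)
{
    sum_slot = resolve_sum();
    assert(sum_slot != sum_stub); /* a resolver must return a real implementation */
    return sum_slot(v, n);
}

/* What callers see: an ordinary function. */
long sum(const int *v, size_t n)
{
    return sum_slot(v, n);
}
```

After the first call to sum the slot points directly at the chosen implementation, so later calls never reach the resolver again. With real IFUNCs the equivalent bookkeeping is done by the dynamic loader through relocations rather than by code in the program.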
Design goals
The following is a statement of intent in the design, and does not reflect what is currently implemented. This entire document is up for discussion.
- It would be generally useful to allow IFUNC resolvers to call code in other translation units, so that more complex expressions can be used to decide which function to return. The problem is that such calls create implicit dependencies on the symbols being called, which requires the object defining each symbol to be initialized first. This makes IFUNC resolution order equivalent to a topological depth-first sort over library DT_NEEDED and relocation data. Such a sort would ensure that the objects being depended upon are initialized first. Any cycles are user errors, e.g. resolvers that call functions that need the resolver. Any missing edges in the graph are also user errors, e.g. failing to link against libc when calling libc functions.
- Symbol interposition, either via LD_PRELOAD or via a library that precedes another in load order so that an unexpected symbol is used instead of the originally intended one, should not create any new ordering issues for IFUNCs. That is to say, the runtime knows at load time which symbol is being selected and must adjust the sorting appropriately to meet the initialization requirements.
- The requirements for the topological depth-first sort over library DT_NEEDED and relocation data should not destroy the benefits of lazy symbol resolution, but may require some symbols that were previously never resolved at startup to be resolved at startup because the IFUNC resolvers need them.
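The ordering described in these goals can be illustrated with a small depth-first topological sort. This is a toy model written for this page, not code from glibc, and as noted above it does not reflect what is currently implemented. The struct dep_graph type, the fixed MAX_OBJS limit and the function resolution_order are hypothetical; a real loader would derive the edges from DT_NEEDED entries and relocation data.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJS 8

enum mark { UNVISITED, IN_PROGRESS, DONE };

struct dep_graph {
    size_t count;
    /* needs[a][b] != 0 means object a depends on object b. */
    unsigned char needs[MAX_OBJS][MAX_OBJS];
};

/* Post-order DFS: an object is emitted only after everything it needs. */
static int visit(const struct dep_graph *g, size_t obj, enum mark *marks,
                 size_t *order, size_t *filled)
{
    if (marks[obj] == DONE)
        return 0;
    if (marks[obj] == IN_PROGRESS)
        return -1; /* cycle: treated as a user error */

    marks[obj] = IN_PROGRESS;
    for (size_t dep = 0; dep < g->count; dep++)
        if (g->needs[obj][dep] && visit(g, dep, marks, order, filled) != 0)
            return -1;

    marks[obj] = DONE;
    order[(*filled)++] = obj;
    return 0;
}

/* Fills order[] with a valid resolution order; returns -1 on a cycle. */
int resolution_order(const struct dep_graph *g, size_t *order)
{
    enum mark marks[MAX_OBJS] = { UNVISITED };
    size_t filled = 0;

    assert(g->count <= MAX_OBJS);
    for (size_t obj = 0; obj < g->count; obj++)
        if (visit(g, obj, marks, order, &filled) != 0)
            return -1;
    return 0;
}
```

For a graph in which an application (object 0) needs libfoo (object 1) and libc (object 2), and libfoo needs libc, the function yields the order libc, libfoo, application. Adding an edge from libc back to the application creates a cycle and the function returns -1.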
How do I use indirect functions in my own code?
The GNU Compiler Collection documentation describes how IFUNC should be used under the ifunc function attribute (linked under Resources). That is all there is to it.
Unfortunately there are actually a lot of restrictions placed on IFUNC usage which aren't entirely clear and the documentation needs to be updated.
Firstly, as you can see, there are already a few restrictions on the usage of IFUNC:
- Requirement (a): Resolver must be defined in the same translation unit as the implementations.
- Requirement (b): Cannot be weakly defined functions.
The restrictions are actually stricter than this because the resolution of an IFUNC can happen very early in the dynamic loader startup process, particularly if -Wl,-z,now is used to link the application (which requires all symbols to be resolved before application startup).
When LD_BIND_NOW=1 or -Wl,-z,now is in effect, symbols must be resolved immediately at startup. If the resolver needs to make an external function call, that call may fail when the function it targets has not been relocated yet (PLT-based relocations are processed later). For example, calling strlen in an IFUNC resolver built with -Wl,-z,now may lead to a segfault because the PLT is not yet resolved. This may work on x86_64, where the R_*_IRELATIVE relocations happen in DT_JMPREL after the DT_REL* relocations, but that is no guarantee that it will work on AArch64, PPC64, or other architectures that are slightly different. Such fundamental limitations may be lifted at a future point.
- The resolver must not be compiled with -fstack-protector-all or any similar protections e.g. asan, since they may require early setup which has not yet completed.
Lazy binding (no LD_BIND_NOW=1 or -Wl,-z,now) means that the resolver may be called at any point, and therefore must be as safe as the function it implements. If the implemented function is async-signal-safe and thread-safe, then the resolver must be too, because it could be called in those same contexts to select an implementation.
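Taking the restrictions above together, a minimal sketch of an IFUNC might look like the following. It assumes GCC or a compatible compiler targeting ELF with glibc. The names sum, sum_generic, sum_unrolled, resolve_sum and self_check are hypothetical, and sum_unrolled is an ordinary C stand-in for what would normally be a hardware-specific implementation. The resolver lives in the same translation unit as the implementations, the symbol is not weak, and the resolver makes no calls into libc. On x86 it relies on the GCC builtins __builtin_cpu_init and __builtin_cpu_supports, which GCC documents for use in resolvers; on other architectures this sketch simply falls back to the generic version.

```c
#include <assert.h>
#include <stddef.h>

typedef long sum_fn(const int *v, size_t n);

static long sum_generic(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Stand-in for a tuned variant; plain C so it runs on any CPU. */
static long sum_unrolled(const int *v, size_t n)
{
    long a = 0, b = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        a += v[i];
        b += v[i + 1];
    }
    if (i < n)
        a += v[i];
    return a + b;
}

/* Resolver: same translation unit, no libc calls, no global state. */
static sum_fn *resolve_sum(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* Must be called explicitly here: resolvers can run before constructors. */
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return sum_unrolled;
#endif
    return sum_generic;
}

/* The IFUNC itself: a normal (non-weak) declaration bound to the resolver. */
long sum(const int *v, size_t n) __attribute__((ifunc("resolve_sum")));

/* Usage: callers call sum() like any other function. */
void self_check(void)
{
    static const int v[] = { 1, 2, 3, 4, 5 };
    assert(sum(v, 5) == sum_generic(v, 5));
    assert(sum(v, 5) == sum_unrolled(v, 5));
}
```

Both implementations must be interchangeable from the caller's point of view, since the caller cannot tell which one the resolver picked. Whether a given resolver is safe still depends on the architecture and on how the program is linked, as described above.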
If your application uses IFUNCs and has problems, please reach out to libc-alpha@sourceware.org to discuss your use cases and how we can extend the existing framework to better support them. As noted above, some of these limitations can be lifted with more careful ordering of relocations for IFUNC across all architectures.
Work that needs to be done
The IFUNC implementation in glibc needs the following things to become sufficiently robust for real users to use:
- Document in the glibc manual all the resolver function restrictions that exist for all machines that are supported by glibc.
- Document in the gcc manual that the glibc manual for the version of glibc you're using should be consulted to determine resolver function restrictions.
- Incrementally enhance IFUNCs to work around some of the above restrictions.
Resources
http://www.x86-64.org/documentation_folder/abi-0.99.pdf (4.3 Symbol Table, STT_GNU_IFUNC)
https://sites.google.com/site/x32abi/documents (ifunc.txt)
https://gcc.gnu.org/onlinedocs/gcc-5.3.0/gcc/Function-Attributes.html#index-g_t_0040code_007bifunc_007d-function-attribute-3095 (ifunc attribute)
https://sourceware.org/ml/libc-alpha/2015-11/msg00108.html (Szabolcs Nagy: using IFUNC outside the libc)