This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



using IFUNC outside the libc


ifunc resolvers outside of the libc:

(this area is not well documented, so i am dumping my understanding of
the current situation here.  it would be nice to get confirmation that
this is indeed the expected behaviour.)

there are two sets of relocation entries in an elf shared object:
DT_JMPREL (plt relocations) and DT_REL + DT_RELA relocations.

plt relocs only need a minimal setup in case of lazy binding, but both
sets may have relocs for which an ifunc resolver has to be called at
relocation processing time.

(among the plt relocs these are R_*_IRELATIVE relocs, which appear when
a function with an ifunc resolver binds locally, e.g. because it has
protected visibility.  in the other set they are symbolic relocations
that refer to an STT_GNU_IFUNC symbol, e.g. an R_*_GLOB_DAT relocation
for a func ptr to a function with an ifunc resolver.)

it seems that both sets of relocations are sorted so the ones which
need ifunc calls are at the end:
https://sourceware.org/bugzilla/show_bug.cgi?id=13302
https://sourceware.org/bugzilla/show_bug.cgi?id=18841

but since the two sets of relocations are processed one after the other
(first DT_REL* then DT_JMPREL in case of glibc, but the ordering is
outside the elf spec) this is not enough to guarantee that all normal
relocs are processed before an ifunc resolver is called.
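
to make the ordering problem concrete, here is a small sketch (mine, not
taken from glibc or binutils) that classifies x86_64 rela entries with
the <elf.h> macros.  it checks two things: that ifunc relocs come last
within one table, and whether a resolver from the table processed first
can run before a normal reloc in the table processed second has been
applied.  the function names are made up for illustration, and only the
x86_64 R_X86_64_IRELATIVE type is handled.

```c
#include <elf.h>
#include <stddef.h>

/* nonzero if applying this reloc makes the dynamic linker call a resolver:
   either an IRELATIVE reloc or a symbolic reloc against an ifunc symbol */
int needs_ifunc_call(const Elf64_Rela *r, const Elf64_Sym *symtab)
{
    if (ELF64_R_TYPE(r->r_info) == R_X86_64_IRELATIVE)
        return 1;
    size_t sym = ELF64_R_SYM(r->r_info);
    return sym != 0 && ELF64_ST_TYPE(symtab[sym].st_info) == STT_GNU_IFUNC;
}

/* nonzero if, within one table, no normal reloc follows an ifunc reloc */
int ifunc_relocs_last(const Elf64_Rela *rel, size_t n,
                      const Elf64_Sym *symtab)
{
    int seen_ifunc = 0;
    for (size_t i = 0; i < n; i++) {
        if (needs_ifunc_call(&rel[i], symtab))
            seen_ifunc = 1;
        else if (seen_ifunc)
            return 0;
    }
    return 1;
}

/* nonzero if a resolver can be called while processing `first` although
   `second` still contains normal relocs that have not been applied */
int cross_table_hazard(const Elf64_Rela *first, size_t nfirst,
                       const Elf64_Rela *second, size_t nsecond,
                       const Elf64_Sym *symtab)
{
    int ifunc_in_first = 0, normal_in_second = 0;
    for (size_t i = 0; i < nfirst; i++)
        if (needs_ifunc_call(&first[i], symtab))
            ifunc_in_first = 1;
    for (size_t i = 0; i < nsecond; i++)
        if (!needs_ifunc_call(&second[i], symtab))
            normal_in_second = 1;
    return ifunc_in_first && normal_in_second;
}
```

with a DT_RELA table that ends in a GLOB_DAT reloc against an ifunc
symbol and a DT_JMPREL table that holds an ordinary JUMP_SLOT reloc,
both tables pass ifunc_relocs_last() and cross_table_hazard() still
reports a problem.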

an ifunc resolver will crash if it is called while processing the
first set but relies on normal relocs from the second set, which have
not been applied yet:

  // dso.c
  void bar(void);
  static void foo_impl() {}
  static void *foo_resolver() { bar(); return foo_impl; }
  void foo() __attribute__((ifunc("foo_resolver")));
  void *test() { return foo; } // R_*_GLOB_DAT reloc for foo calls the
                               // resolver before bar's GOTPLT setup

  // main.c
  void bar(){}
  void *test(void);
  int main() { return !test(); }

(1) despite all the ordering efforts, calling an extern function from
an ifunc resolver may not work.  (this led to the broken x86
multiversioning in gcc-5: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65612 )
in particular an ifunc resolver cannot call into the libc.  running user
code before relocations are done is the most problematic part of ifuncs.
(the compiler should be free to add random relocations to the ifunc
resolver if it is written in c.)

note that calling the extern function through a func ptr happens to
work because there is no plt reloc then (this allows a simpler fix for
the gcc multiversioning issue that relies on the binutils 18841 fix),
but is this guaranteed dynamic linker behaviour?
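
a sketch of what the func ptr variant of the resolver could look like.
it is written as a single file so that it can be compiled on its own
(in the dso.c / main.c test case bar would live in main.c), and it
assumes a toolchain and libc with gnu ifunc support.  the counters and
the bar_ptr name are mine and only exist so the behaviour can be
checked.  whether the dynamic linker guarantees that this works is
exactly the open question.

```c
/* counters only exist to observe what ran; a real resolver should avoid
   touching global state */
int bar_calls;
int foo_calls;

void bar(void) { bar_calls++; }

typedef void (*foo_fn)(void);

static void foo_impl(void) { foo_calls++; }

/* data pointer to bar: it is set up by a normal (non-plt) reloc, or needs
   no reloc at all in a non-pic executable.  volatile keeps the compiler
   from turning the indirect call back into a direct call. */
static void (*volatile bar_ptr)(void) = bar;

static foo_fn foo_resolver(void)
{
    bar_ptr();          /* indirect call, no plt slot involved */
    return foo_impl;
}

/* foo is bound to whatever foo_resolver returns */
void foo(void) __attribute__((ifunc("foo_resolver")));
```

calling foo() ends up in foo_impl, and bar has been called from the
resolver by then.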

in the static linking case i see that ifunc resolvers are called before
thread ptr setup (so no errno, tls, pthread_self etc.), and before ssp
canary setup, ptr mangling setup and vdso setup.

(2) if the resolver is compiled with -fstack-protector-all it may
crash (any similar instrumentation that requires early setup may
crash the resolver e.g. the asan shadow mapping will not be available).

(3) libc functions are not safe to call from an ifunc resolver with
static linking either.  (of course it would be risky to call libc
functions anyway: they might need ifunc resolution themselves.)

so the resolver has very limited methods to get information about the
underlying machine.  (e.g. sysfs can only be accessed with raw syscalls
and vdso access needs manual elf symbol lookups; i think auxv is not
available either, neither for the vdso address nor for other information
from the os.)

(4) an ifunc resolver can get information via
- cpu id instructions in the isa (only some archs have them, inflexible)
- raw syscall or other trap into the os (cannot use libc, slow)
- arguments the libc passes to the resolver (the arguments are abi,
cannot be changed, usually just hwcap, but currently undocumented).
for a future-proof dispatch both machine and os information may be
necessary.
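
as an illustration of the third option, the dispatch decision can be
kept as a pure function of the hwcap word, so that it needs no libc
calls and no global state.  this is only a sketch: the hwcap bit, the
function names and the two implementations are made up, and whether and
how a given libc passes hwcap to the resolver is target specific.

```c
#include <stddef.h>

typedef long (*sum_fn)(const int *, size_t);

/* baseline implementation that works everywhere */
long sum_generic(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* stand-in for an implementation that would need a cpu feature */
long sum_unrolled(const int *v, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2)
        s += (long)v[i] + v[i + 1];
    if (i < n)
        s += v[i];
    return s;
}

/* made-up feature bit, not a real AT_HWCAP value */
#define HYPOTHETICAL_HWCAP_FAST (1UL << 5)

/* pure selection logic: only reads its argument */
sum_fn select_sum(unsigned long hwcap)
{
    return (hwcap & HYPOTHETICAL_HWCAP_FAST) ? sum_unrolled : sum_generic;
}

/* on a target whose libc passes hwcap as the first resolver argument,
   a resolver could then be as small as:
       static sum_fn sum_resolver(unsigned long hwcap)
       { return select_sum(hwcap); }                                  */
```

keeping the selection separate from the resolver also makes it testable
outside of relocation processing.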

(5) i think it's not possible to safely cache the results in userspace
if the machine identification is slow. (x86 gcc multiversioning caches
the cpuid results once per dso which may mean a large number of dirty
pages and slow start up.)

(6) because of lazy binding, the resolver must be async-signal-safe and
thread-safe, since it may be called in a signal handler or
in a multi-threaded process.  (e.g. the resolver used in x86 gcc for
multiversioning now is neither and it is possible to construct a
program where that is observably broken.)  i think a callback in the
dynamic linker while doing lazy binding is the second most problematic
part of ifuncs.
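
to show what these two requirements mean for selection code, here is a
plain c11 sketch of lazy dispatch (not an ifunc, and not what gcc does):
the selection is idempotent and takes no locks, and the result is
published with a single atomic store, so running it concurrently or from
a signal handler at worst repeats the selection.  all names are
hypothetical, and the counter is only there to make the behaviour
observable.

```c
#include <stdatomic.h>

typedef int (*op_fn)(int);

int op_generic(int x) { return x + 1; }

/* null until the first selection has been published */
static _Atomic(op_fn) cached_op;

/* test-only counter of how often the selection ran */
atomic_int select_runs;

/* idempotent selection: no locks, no allocation, same answer every time */
static op_fn select_op(void)
{
    atomic_fetch_add_explicit(&select_runs, 1, memory_order_relaxed);
    return op_generic;
}

/* returns the cached choice, selecting on first use.  relaxed ordering is
   enough here because only the pointer value itself is shared and the
   code it points to is already in place. */
op_fn get_op(void)
{
    op_fn f = atomic_load_explicit(&cached_op, memory_order_relaxed);
    if (!f) {
        f = select_op();
        atomic_store_explicit(&cached_op, f, memory_order_relaxed);
    }
    return f;
}
```

two threads racing in get_op() may both run select_op(), but they store
the same pointer, so no caller ever sees an inconsistent state.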

the set of safe operations from an ifunc resolver is still not clear
to me; this is my attempt to document some of the caveats (6 for now).

