This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Document use of IFUNC support outside of libc.


On 03/03/16 21:10, Carlos O'Donell wrote:
> I attempted to distill some of your notes here:
> https://sourceware.org/glibc/wiki/GNU_IFUNC
> 

thanks, i was meaning to write something about it on the wiki,
but it is a bit hard to separate the bugs from the features.

i identified some issues:

* the first point about bind now is not entirely correct,
lazy binding does not change that much.

the reloc processing order at load time is:

1) DT_REL(A) relocs
2) DT_REL(A) relocs that call ifunc resolvers
3) DT_JMPREL relocs (may call ifunc resolvers or delay them)
4) DT_JMPREL relocs that call ifunc resolvers

(for example 1) can be data access through GOT, 2) is ifunc
resolved function address access through GOT, 3) is extern
function call, 4) is ifunc resolved function call that binds
locally e.g. static function with _IRELATIVE reloc.)

the only difference between lazy binding and bind now is at
step 3): run time vs load time ifunc resolution.

of course the ordering in 3) can break resolvers with bind
now that work with lazy binding, but the real problem is 2):
a resolver called there must only depend on relocs in 1).

it is still possible to call extern functions from an ifunc
resolver, but only if it is forced to use relocs in 1) (e.g.
call through a volatile funcptr or -fno-plt).  i'm not sure
if glibc wants to document this to work, because the user
needs to know about relocations (which is compiler/linker
internals).  the nasty part is that the compiler is free to
add extern calls (into libc or compiler runtime) which can
break the resolver so it cannot be written in c or c++ in
principle :(
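
as a minimal sketch of the constraint above (not taken from glibc or gperftools), here is a resolver that touches only file-local functions and compile-time constants, so it does not depend on any reloc beyond step 1). all names (add_numbers, resolve_add, add_generic, add_wide) are hypothetical, and the sizeof-based selection only stands in for a real hardware check. it assumes a gcc or clang toolchain with the ifunc attribute and falls back to plain dispatch otherwise.

```c
/* sketch only: every name below is hypothetical */
#if defined(__has_attribute)
# if __has_attribute(ifunc)
#  define HAVE_IFUNC_ATTR 1
# endif
#endif

typedef int add_fn(int, int);

/* two file-local candidate implementations */
static int add_generic(int a, int b) { return a + b; }
static int add_wide(int a, int b) { return b + a; }

/* the resolver makes no extern calls and reads no extern data, so it
   does not rely on PLT/GOT entries that may still be unrelocated */
static add_fn *resolve_add(void)
{
    return sizeof(long) >= 8 ? add_wide : add_generic;
}

#ifdef HAVE_IFUNC_ATTR
/* the dynamic linker (or static startup code) runs resolve_add once */
int add_numbers(int, int) __attribute__((ifunc("resolve_add")));
#else
/* fallback without ifunc support: select the implementation per call */
int add_numbers(int a, int b) { return resolve_add()(a, b); }
#endif
```

callers just call add_numbers(2, 3) like any other function. note that even here the compiler is in principle free to emit extern calls inside resolve_add, which is exactly the problem described above.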

the dynamic linker could do the reloc ordering a bit better
(so e.g. 2) happens after 3) in case of lazy binding), but
i'm not sure how much that would help if potentially all
functions may be ifunc resolved in a module.


* an omission from that wiki page is static linking:
ifunc resolvers run very early then (so memcpy etc work
during libc initialization), and that breaks stack-protection
etc instrumentation: the thread pointer is not yet set up.

the vdso is not yet set up either, and the vsyscall mechanism
uses ifunc now, so the vdso does not work with static linking at
all (!): clock_gettime goes through a real syscall. (i think this
is a bug that can result in a surprising perf regression for users
who expect a speedup from static linking, so i opened BZ 19767.)

i suspect there might be other limitations on resolvers,
because pointer mangling is not set up either.

probably static linking can be fixed by having two sets of
ifunc resolvers: one that only the libc uses and runs early
and another set that runs after some c runtime init is done
similar to the dynamic linked case.

i actually would like to use vdso from ifunc resolvers
to do the ifunc dispatch based on information that is only
available in the kernel and cannot be easily communicated
through other means (e.g. sysfs stuff).


* yet another issue is that the ifunc resolver type
signature is different on different targets.
(and if the user-defined resolver takes no arguments but the
dynamic linker calls it with arguments, that is not strictly
correct in c, even if it happens to work for most call abis.
there were hardening proposals based on type signature checks
for indirect calls, which the dynamic linker would violate.)
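
to make the signature point concrete, here is a hedged sketch of a resolver whose parameter list is chosen per target. the parameter lists are an assumption based on my reading of glibc's per-target ifunc invocation code (no arguments on x86, a hwcap word on arm, aarch64 and powerpc) and should be checked against the target's dl-irel.h. the names count_bits, resolve_count_bits and count_bits_loop are hypothetical.

```c
#include <stdint.h>

#if defined(__has_attribute)
# if __has_attribute(ifunc)
#  define HAVE_IFUNC_ATTR 1
# endif
#endif

/* assumed per-target resolver parameters; verify before relying on them */
#if defined(__aarch64__)
# define RESOLVER_PARAMS      uint64_t hwcap
# define RESOLVER_CALL_ARGS   0
# define RESOLVER_IGNORE_ARGS (void)hwcap
#elif defined(__arm__) || defined(__powerpc__)
# define RESOLVER_PARAMS      unsigned long hwcap
# define RESOLVER_CALL_ARGS   0
# define RESOLVER_IGNORE_ARGS (void)hwcap
#else  /* x86, and (as an assumption) any target not listed above */
# define RESOLVER_PARAMS      void
# define RESOLVER_CALL_ARGS
# define RESOLVER_IGNORE_ARGS (void)0
#endif

typedef unsigned (*count_bits_fn)(unsigned);

/* portable implementation: clears the lowest set bit until none remain */
static unsigned count_bits_loop(unsigned x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* the prototype matches what the dynamic linker is assumed to pass */
static count_bits_fn resolve_count_bits(RESOLVER_PARAMS)
{
    RESOLVER_IGNORE_ARGS;   /* a real resolver would test hwcap bits here */
    return count_bits_loop;
}

#ifdef HAVE_IFUNC_ATTR
unsigned count_bits(unsigned) __attribute__((ifunc("resolve_count_bits")));
#else
/* fallback without ifunc support: call the resolver directly */
unsigned count_bits(unsigned x)
{
    return resolve_count_bits(RESOLVER_CALL_ARGS)(x);
}
#endif
```

the point of the macros is only that the declared resolver type agrees with the call the dynamic linker makes, which is what a type-checking indirect-call hardening scheme would require.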

> That way I can point users at this.
> 
> In gperftools tcmalloc added an IFUNC use [1] which
> violates some of the requirements under -Wl,-z,now,
> so I have a need to document this support and discuss
> with tcmalloc developers what we might do. Right now
> they call way too much code for this to work.
> 
> Cheers,
> Carlos.
> 
> [1] https://github.com/gperftools/gperftools/commit/6fdfc5a7f40ebcff3fdaada1a2994ff54be2f9c7
> 
+static bool sized_delete_enabled(void) {
+  if (tcmalloc_sized_delete_enabled != 0) {
+    return !!tcmalloc_sized_delete_enabled();
+  }

i think this call happens to work because the func address
check for the weak ref forces the reloc to happen at step 1).
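
the weak-ref pattern in isolation looks roughly like the sketch below. optional_hook and feature_enabled are hypothetical names, not the gperftools ones, and the default of 0 when no hook is linked in is an assumption. the relevant detail is that the symbol's address is used as data (the null comparison) before the call is made, which is the use that, by the reasoning above, pulls it into the step 1) relocs.

```c
/* hypothetical optional hook: some other object may define it, or none */
extern int optional_hook(void) __attribute__((weak));

int feature_enabled(void)
{
    /* an undefined weak symbol has a null address, so this comparison
       is a data-style use of the symbol rather than a plain call */
    if (optional_hook != 0) {
        return !!optional_hook();
    }
    return 0;   /* assumed default when no hook is present */
}
```

with no definition of optional_hook linked in, feature_enabled() takes the fallback path and never makes the call.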

+  const char *flag = TCMallocGetenvSafe("TCMALLOC_ENABLE_SIZED_DELETE");
+  return tcmalloc::commandlineflags::StringToBool(flag, false);

i think this will crash if the address of delete is used
(so the ifunc resolver runs at step 2) while PLTGOT entries are
still unrelocated), independently of lazy vs now binding.
with bind now it may crash even without taking the address
of delete.


i'll try to update the wiki, but will wait for some
feedback here for a while.

