ifunc resolving

Fangrui Song i@maskray.me
Mon Jan 18 22:04:03 GMT 2021


I have seen activity around ifunc relocations in glibc and ld recently.
https://sourceware.org/glibc/wiki/GNU_IFUNC is under-documented, some aspects
are not well known, and there are a lot of differences across the architectures
supporting ifunc, so I am sending this email hoping that these aspects can be
clarified, toolchain developers can get on the same page, and documentation can
be improved (if developers get confused at times, how could regular users
comfortably use them? :) )


1. An ifunc defined in the executable is called by a (link-time DT_NEEDED or
    runtime) shared object.

From ld https://sourceware.org/bugzilla/show_bug.cgi?id=23169 (x86 only), this looks desired.
My understanding (comment 8) is that

(1) The main executable is relocated last.
(2) By converting the main executable's STT_GNU_IFUNC symbol to STT_FUNC, when
processing relocations in a DSO, the ifunc resolver will not be called while the
main executable is still unrelocated.

ifunc calls from within the executable do not incur additional costs.
ifunc calls from DSOs go through the main exe PLT and are punished.
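
For readers less familiar with the feature, here is a minimal C sketch of
defining and calling an ifunc within a single module (the intra-executable
case). It assumes GCC or Clang targeting ELF with glibc; the names square,
square_generic, resolve_square and square_selfcheck are made up for
illustration, and a real resolver would usually inspect CPU features before
choosing an implementation.

```c
#include <assert.h>

/* Hypothetical implementation that the resolver selects. */
static int square_generic(int x) { return x * x; }

/* The resolver runs while the ifunc relocation is being processed, so it
   should not depend on other symbols that may still be unrelocated. */
static int (*resolve_square(void))(int) { return square_generic; }

/* square is emitted as an STT_GNU_IFUNC symbol whose value is the address
   of the resolver, not of an implementation. */
int square(int x) __attribute__((ifunc("resolve_square")));

/* Small self-check: the call ends up in square_generic. */
void square_selfcheck(void) { assert(square(7) == 49); }
```

Calling square from the same module behaves like an ordinary call once the
resolver has run; the interesting cases in this email start when another
module references the symbol.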

When processing an ifunc relocation in a DSO, if the ifunc resolver is defined
in another DSO, according to comment 9 this is reported as an error.

This adds an executable-vs-shared difference to non-preemptible ifunc, but so be it.


The above sounds reasonable. However, the top-of-tree ld does not make -no-pie
and -pie behaviors consistent (note: ld does not support -no-pie yet).


cat > ./a.s <<eof
resolver:
   nop

.globl ifunc, _start
.type ifunc, @gnu_indirect_function
.set ifunc, resolver

_start:
   movq ifunc@GOTPCREL(%rip), %rax
   call ifunc
   # bl ifunc
eof
echo 'call ifunc' > ./b.s
as a.s -o a.o
gcc -shared -fpic b.s -o b.so

~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o -o a && readelf -W -s a | grep ifunc
~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o ./b.so -o a && readelf -W -s a | grep ifunc
~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o -o a && readelf -W --dyn-syms a | grep ifunc
~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o ./b.so -o a && readelf -W --dyn-syms a | grep ifunc

~/Dev/binutils-gdb/Debug/ld/ld-new is a top-of-tree ld.

% ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o -o a && readelf -W -s a | grep ifunc
      7: 0000000000401008     0 IFUNC   GLOBAL DEFAULT    3 ifunc
% ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o ./b.so -o a && readelf -W -s a | grep ifunc
      5: 0000000000401010     0 FUNC    GLOBAL DEFAULT    7 ifunc
      8: 0000000000401010     0 FUNC    GLOBAL DEFAULT    7 ifunc

% ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o -o a && readelf -W --dyn-syms a | grep ifunc
      5: 0000000000001020     0 IFUNC   GLOBAL DEFAULT    8 ifunc
% ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o ./b.so -o a && readelf -W --dyn-syms a | grep ifunc
      5: 0000000000001020     0 IFUNC   GLOBAL DEFAULT    8 ifunc

Of the four combinations, only -no-pie a.o ./b.so does the conversion.

Once a resolution is agreed, it'd be good to make aarch64/ppc/x86/etc consistent.


2. When to convert STT_GNU_IFUNC to STT_FUNC?

(This is more a ld question.)

In LLD, for a non-GOT-generating-non-PLT-generating relocation referencing a
STT_GNU_IFUNC, a canonical PLT entry is created and the symbol type is changed
to STT_FUNC. (An absolute relocation with 0 addend in a SHF_WRITE section used
to not trigger a canonical PLT entry. https://reviews.llvm.org/D65995 dropped
that special case.) References from other modules will resolve to the PLT entry.

This approach has pros and cons:

* With a canonical PLT entry, the resolver of a symbol is called only once.
* If the relocation appears in a non-SHF_WRITE section, a text relocation can be avoided.
* Relocation types which are not valid dynamic relocation types are supported. GNU ld may instead report an error such as: relocation R_X86_64_PC32 against STT_GNU_IFUNC symbol `ifunc' isn't supported
* References will bind to the canonical PLT entry. A function call needs to jump to the PLT, load the value from the GOT, then do an indirect call.
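
To make the kind of reference concrete, here is a hedged C sketch in which the
address of an ifunc is stored in a statically initialized, writable pointer
rather than being called directly. With non-PIC code generation this typically
produces an absolute relocation in a SHF_WRITE section; how the linker then
handles it (canonical PLT entry, dynamic relocation, or an error for other
relocation types) depends on the linker and architecture. GCC or Clang on
ELF/glibc is assumed, and all names are hypothetical.

```c
#include <assert.h>

/* Hypothetical implementation and resolver. */
static int twice_generic(int x) { return 2 * x; }
static int (*resolve_twice(void))(int) { return twice_generic; }

int twice(int x) __attribute__((ifunc("resolve_twice")));

/* An address-taking reference from writable data, not a call. The linker
   has to decide what value this slot receives. */
int (*twice_ptr)(int) = twice;

/* Call the ifunc directly and through the stored pointer. */
int call_both_ways(int x) {
    int direct = twice(x);
    int indirect = twice_ptr(x);
    assert(direct == indirect);
    return indirect;
}
```

The sketch deliberately checks only that both paths reach the same
implementation; it does not assume that twice_ptr and a PLT or GOT based
address of twice compare equal.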

Last time I checked, the architectures of GNU ld behaved quite differently. This
is an area where arch consistency should be improved.


3. Prefer .rela.dyn over .rela.plt for R_*_IRELATIVE?

ld powerpc produces R_*_IRELATIVE in .rela.dyn.
glibc powerpc32/powerpc64 do not process R_*_IRELATIVE relocations that are in
[DT_JMPREL, DT_JMPREL+DT_PLTRELSZ).

This may be a good practice because R_*_IRELATIVE is by nature eagerly resolved.
The potentially lazy .rela.plt is not suitable.

I think at least aarch64 and x86 are still using .rela.plt.

In LLD I followed the .rela.dyn approach and it has been working well: https://reviews.llvm.org/D65651 .


4. When to define __rela_iplt_start and __rela_iplt_end?

Static pie and static no-pie relocation processing are very different in glibc.

* Static no-pie uses special code to process a magic array delimited by __rela_iplt_start/__rela_iplt_end.
* Static pie uses self-relocation to take care of R_*_IRELATIVE. The above magic array code is executed as well. If __rela_iplt_start/__rela_iplt_end are defined, we will get 0 < __rela_iplt_start < __rela_iplt_end in csu/libc-start.c. ARCH_SETUP_IREL will crash when resolving the first relocation, which has already been processed.
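
The "magic array" handling can be pictured with the following self-contained C
model. It is not glibc's code: model_rela is a simplified stand-in for
ElfW(Rela), a local array plays the role of the range that the linker would
delimit with __rela_iplt_start/__rela_iplt_end, and all names are invented. It
relies on converting integers to function pointers, which works on common ELF
platforms. The point is only that an empty range is a no-op, while a non-empty
range invokes each resolver and stores the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for ElfW(Rela); not the real glibc type. */
typedef struct {
    uintptr_t r_offset; /* address of the slot to fill */
    uintptr_t r_addend; /* address of the resolver */
} model_rela;

typedef uintptr_t (*model_resolver)(void);

static int model_impl(int x) { return x + 1; }
static uintptr_t model_resolve(void) { return (uintptr_t)model_impl; }

/* Walk [start, end), call each resolver and store what it returns.
   Returns the number of entries processed. */
static size_t model_apply_irel(const model_rela *start, const model_rela *end) {
    size_t n = 0;
    for (const model_rela *r = start; r < end; ++r, ++n) {
        model_resolver resolve = (model_resolver)r->r_addend;
        *(uintptr_t *)r->r_offset = resolve();
    }
    return n;
}

int model_demo(void) {
    uintptr_t slot = 0;
    model_rela table[1] = {{ (uintptr_t)&slot, (uintptr_t)model_resolve }};
    /* start == end: nothing is processed and the slot stays empty. */
    assert(model_apply_irel(table, table) == 0 && slot == 0);
    /* start < end: the slot now holds the chosen implementation. */
    assert(model_apply_irel(table, table + 1) == 1);
    return ((int (*)(int))slot)(41);
}
```

In this model the loop is harmless to run twice; the real static-pie situation
differs because the entries were already handled by self-relocation before the
array code runs.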

LLD defines __rela_iplt_start/__rela_iplt_end in -pie mode (GNU ld doesn't) so
static pie elf/ldconfig segfaults.  If we take the patch "Make
_dl_relocate_static_pie return an int indicating whether it applied relocs."
from https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/maskray/lld ,
LLD linked static-pie glibc programs will work well (with another cleanup from
an unrelated thing: https://sourceware.org/pipermail/libc-alpha/2020-December/121144.html).

My idea is that defining __rela_iplt_start/__rela_iplt_end in -pie is justified.

I do see that GNU ld may not want to change now (perhaps in a couple of years)
because it does not want to gratuitously break older glibc, but to me taking the
patch (probably with the description rewritten) is a clarification of the glibc code.

glibc maintainers can follow up on "[PATCH 0/3] Make glibc build with LLD"
if you accept that patch.

In a few years, when compatibility with older glibc can be dropped,
ld can define __rela_iplt_start in -pie mode to drop the unneeded difference
in diff -u =(ld.bfd --verbose) =(ld.bfd -pie --verbose) output.

