ifunc resolving

H.J. Lu hjl.tools@gmail.com
Mon Jan 18 22:53:53 GMT 2021


On Mon, Jan 18, 2021 at 2:04 PM Fangrui Song <i@maskray.me> wrote:
>
>
> I have seen ifunc relocation activities on glibc and ld recently.
> https://sourceware.org/glibc/wiki/GNU_IFUNC is under-documented, some aspects
> have not been well-known, and there are a lot of differences across architectures
> supporting ifunc, so I am sending this email hoping that these aspects can be
> clarified, toolchain developers can get on the same page, and documentation can
> be improved (if developers get confused at times, how could regular users
> comfortably use them? :) )
>
>
> 1. An ifunc defined in the executable is called by a (link-time DT_NEEDED or
>     runtime) shared object.
>
>  From ld https://sourceware.org/bugzilla/show_bug.cgi?id=23169 (x86 only) this looks desired.
> My understanding (comment 8) is that
>
> (1) The main executable is relocated last.
> (2) By converting the main executable STT_GNU_IFUNC symbol to STT_FUNC, when
> processing relocations in a DSO, the ifunc resolver will not be called while the
> main executable is unresolved.
>
> ifunc calls from within the executable do not incur additional costs.
> ifunc calls from DSOs go through the main exe PLT and are punished.
>
> When processing an ifunc relocation in a DSO, if the ifunc resolver is defined
> in another DSO, according to comment 9 it will be reported as an error.
>
> This adds an executable-vs-shared difference to non-preemptible ifunc, but so be it.
>
>
> The above sounds reasonable. However, the top-of-tree ld does not make -no-pie
> and -pie behaviors consistent (note: ld does not support -no-pie yet).
>
>
> cat > ./a.s <<eof
> resolver:
>    nop
>
> .globl ifunc, _start
> .type ifunc, @gnu_indirect_function
> .set ifunc, resolver
>
> _start:
>    movq ifunc@GOTPCREL(%rip), %rax
>    call ifunc
>    # bl ifunc
> eof
> echo 'call ifunc' > ./b.s
> as a.s -o a.o
> gcc -shared -fpic b.s -o b.so
>
> ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o -o a && readelf -W -s a | grep ifunc
> ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o ./b.so -o a && readelf -W -s a | grep ifunc
> ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o -o a && readelf -W --dyn-syms a | grep ifunc
> ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o ./b.so -o a && readelf -W --dyn-syms a | grep ifunc
>
> ~/Dev/binutils-gdb/Debug/ld/ld-new is a top-of-tree ld.
>
> % ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o -o a && readelf -W -s a | grep ifunc
>       7: 0000000000401008     0 IFUNC   GLOBAL DEFAULT    3 ifunc

.symtab is unused by ld.so.

> % ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o ./b.so -o a && readelf -W -s a | grep ifunc
>       5: 0000000000401010     0 FUNC    GLOBAL DEFAULT    7 ifunc
>       8: 0000000000401010     0 FUNC    GLOBAL DEFAULT    7 ifunc

The KEY is that the address of the PLT entry in PDE is known at
link-time.  No IRELATIVE is needed.

> % ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o -o a && readelf -W --dyn-syms a | grep ifunc
>       5: 0000000000001020     0 IFUNC   GLOBAL DEFAULT    8 ifunc
> % ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o ./b.so -o a && readelf -W --dyn-syms a | grep ifunc
>       5: 0000000000001020     0 IFUNC   GLOBAL DEFAULT    8 ifunc

The address of the PLT entry in PIE is unknown at link-time.

> In the four combinations, -no-pie a.o ./b.so does the conversion.
>
> Once a resolution is agreed, it'd be good to make aarch64/ppc/x86/etc consistent.
>
>
> 2. When to convert STT_GNU_IFUNC to STT_FUNC?

Only when the address of the PLT entry in executable is known at link-time.

> (This is more a ld question.)
>
> In LLD, for a non-GOT-generating-non-PLT-generating relocation referencing a
> STT_GNU_IFUNC, a canonical PLT entry is created and the symbol type is changed
> to STT_FUNC. (An absolute relocation with 0 addend in a SHF_WRITE section used
> to not trigger a canonical PLT entry. https://reviews.llvm.org/D65995 dropped the
> case.) References from other modules will resolve to the PLT entry.
>
> This approach has pros and cons:
>
> * With a canonical PLT entry, the resolver of a symbol is called only once.
> * If the relocation appears in a non-SHF_WRITE section, a text relocation can be avoided.
> * Relocation types which are not valid dynamic relocation types are supported. GNU ld may report the error: relocation R_X86_64_PC32 against STT_GNU_IFUNC symbol `ifunc' isn't supported
> * References will bind to the canonical PLT entry. A function call needs to jump to the PLT, load the value from the GOT, then do an indirect call.

This allows IFUNC in PDE with external references from DSO.
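
For readers who want a source-level counterpart to the a.s test above, here
is a minimal sketch of an ifunc written in C. It assumes GCC or Clang on an
ELF target whose libc supports STT_GNU_IFUNC (for example glibc on x86-64).
The names answer, resolve_answer, and answer_impl are made up for
illustration and do not come from this thread.

```c
/* Hypothetical names; assumes GCC/Clang on an ELF target with IFUNC support. */

/* The implementation that the resolver selects. */
static int answer_impl(void)
{
    return 42;
}

/* The resolver runs during relocation processing and returns the
   address of the implementation to bind the ifunc symbol to. */
static int (*resolve_answer(void))(void)
{
    return answer_impl;
}

/* Emits answer as an STT_GNU_IFUNC symbol whose resolver is resolve_answer. */
int answer(void) __attribute__((ifunc("resolve_answer")));
```

Linking such an object into a position-dependent executable together with a
DSO that references answer is the case where the symbol type conversion
discussed above comes into play; `readelf -W --dyn-syms` shows which type
the linker chose.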

> Last time I checked, the architectures of GNU ld behaved quite differently. This
> is an area that arch consistency should be improved.

Not all targets support this.

>
> 3. Prefer .rela.dyn over .rela.plt for R_*_IRELATIVE?
>
> ld powerpc produces R_*_IRELATIVE in .rela.dyn.
> glibc powerpc32/powerpc64 do not process R_*_IRELATIVE if they are not in
> [DT_JMPREL, DT_JMPREL+DT_PLTRELSZ).
>
> This may be a good practice because R_*_IRELATIVE is by nature eagerly resolved.
> The potentially lazy .rela.plt is not suitable.
>
> I think at least aarch64 and x86 are still using .rela.plt.
>
> In LLD I followed .rela.dyn and it has been working well https://reviews.llvm.org/D65651 .
>
>
> 4. When to define __rela_iplt_start and __rela_iplt_end?

I invented these.  __rela_iplt_start and __rela_iplt_end should be
defined ONLY when there are no dynamic tags.  Since all ET_DYN files
have dynamic tags, ET_DYN files shouldn't define them.
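
To illustrate what the delimiting symbols are for, here is a simplified
sketch of how startup code in a static executable can walk the range between
two such symbols and apply each IRELATIVE-style entry. It is not glibc's
actual code: struct sketch_rela is a hypothetical stand-in for the real
Elf Rela layout, the range is passed in as arguments instead of coming from
linker-defined symbols, and pick_impl and impl_generic are made-up names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an Elf Rela entry; only the fields used here. */
struct sketch_rela {
    uintptr_t r_offset;  /* address of the slot to patch */
    uintptr_t r_addend;  /* address of the resolver function */
};

typedef int (*impl_fn)(void);
typedef impl_fn (*resolver_fn)(void);

/* Made-up implementation and resolver used to exercise the loop. */
static int impl_generic(void) { return 1; }
static impl_fn pick_impl(void) { return impl_generic; }

/* Walk [start, end): call each resolver and store its result in the slot.
   Returns the number of entries applied. In a real static executable the
   bounds would be the linker-defined start and end symbols. */
static size_t apply_irel_range(const struct sketch_rela *start,
                               const struct sketch_rela *end)
{
    size_t applied = 0;
    for (const struct sketch_rela *r = start; r < end; ++r, ++applied) {
        resolver_fn resolve = (resolver_fn)r->r_addend;
        impl_fn *slot = (impl_fn *)r->r_offset;
        *slot = resolve();
    }
    return applied;
}
```

When start equals end the loop does nothing, which is the behaviour a file
that does not define a non-empty range gets.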

> Static pie and static no-pie relocation processing is very different in glibc.
>
> * Static no-pie uses special code to process a magic array delimited by __rela_iplt_start/__rela_iplt_end.
> * Static pie uses self-relocation to take care of R_*_IRELATIVE. The above magic array code is executed as well. If __rela_iplt_start/__rela_iplt_end are defined, we will get 0 < __rela_iplt_start < __rela_iplt_end in csu/libc-start.c. ARCH_SETUP_IREL will crash when resolving the first relocation which has been processed.
>
> LLD defines __rela_iplt_start/__rela_iplt_end in -pie mode (GNU ld doesn't) so

That is wrong.  ET_DYN file shouldn't define them.

> static pie elf/ldconfig segfaults.  If we take the patch "Make
> _dl_relocate_static_pie return an int indicating whether it applied relocs."
> from https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/maskray/lld ,
> LLD linked static-pie glibc programs will work well (with another cleanup from
> an unrelated thing: https://sourceware.org/pipermail/libc-alpha/2020-December/121144.html).
>
> My idea is that defining __rela_iplt_start/__rela_iplt_end in -pie is justified.

You are wrong.

> I do see that GNU ld may not want a change (probably in a couple of years)

Never.

> because it does not want to gratuitously break older glibc, but taking the patch
> (probably with description rewritten) is a clarification to glibc code to me.

I strongly object to such bogus change.

> glibc maintainers can follow up on "[PATCH 0/3] Make glibc build with LLD"
> if you accept that patch.
>
> In a few years, when compatibility with older glibc can be dropped,
> ld can define __rela_iplt_start in -pie mode to drop the unneeded difference
> in diff -u =(ld.bfd --verbose) =(ld.bfd -pie --verbose) output.

Not going to happen.

-- 
H.J.

