Bug 30976 - rtld: resolve ifunc relocations after JUMP_SLOT/GLOB_DAT/etc
Status: NEW
Alias: None
Product: glibc
Classification: Unclassified
Component: dynamic-link
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-10-17 04:39 UTC by Fangrui Song
Modified: 2023-10-17 04:39 UTC (History)
0 users

See Also:
Host:
Target:
Build:
Last reconfirmed:


Description Fangrui Song 2023-10-17 04:39:52 UTC
Related to PR ld/13302, but I think glibc rtld's ifunc resolving order can be improved.

You can check out https://maskray.me/blog/2021-01-18-gnu-indirect-function#relocation-resolving-order
for details, but I keep a simplified copy here:

Within a module, glibc rtld resolves relocations in order.
Assuming that both `DT_RELA` (`.rela.dyn`) and `DT_PLTREL` (`.rela.plt`) are present, the glibc logic looks like the following:
```c
// Simplified from elf/dynamic-link.h
ranges[0] = {DT_RELA, DT_RELASZ, 0};
ranges[1] = {DT_JMPREL, DT_PLTRELSZ, do_lazy};
if (!do_lazy && ranges[0].start + ranges[0].size == ranges[1].start) { // the equality operator is always satisfied in practice
  ranges[0].size += ranges[1].size;
  ranges[1] = {};
}
for (int ranges_index = 0; ranges_index < 2; ++ranges_index)
  elf_dynamic_do_Rela (... ranges[ranges_index]);
```
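
To make the merge condition concrete, here is a minimal, compilable sketch of the range planning step. `struct reloc_range` and `plan_ranges` are hypothetical names for illustration, not glibc identifiers, and the real code reads these values from the dynamic section rather than taking them as arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the {start, size, lazy} triples above. */
struct reloc_range {
  size_t start; /* address of the first relocation in the table */
  size_t size;  /* table size in bytes */
  bool lazy;    /* true if the table is processed lazily */
};

/* Fill ranges[0..1] the way the simplified snippet does and return
   the number of non-empty ranges left to process. */
int plan_ranges(struct reloc_range ranges[2],
                size_t rela, size_t relasz,
                size_t jmprel, size_t pltrelsz, bool do_lazy) {
  ranges[0] = (struct reloc_range){rela, relasz, false};
  ranges[1] = (struct reloc_range){jmprel, pltrelsz, do_lazy};
  /* With eager binding and adjacent tables, fold .rela.plt into
     .rela.dyn so that both are handled as one range. */
  if (!do_lazy && ranges[0].start + ranges[0].size == ranges[1].start) {
    ranges[0].size += ranges[1].size;
    ranges[1] = (struct reloc_range){0, 0, false};
  }
  assert(ranges[0].size >= relasz);
  return (ranges[0].size != 0) + (ranges[1].size != 0);
}
```

With eager binding and adjacent tables this yields one merged range; with lazy binding the two tables stay separate, which is the case the rest of this report is about.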

`elf_dynamic_do_Rela`:
```c
// Simplified from elf/dl-rel.h
// Handle RELATIVE relocations.
for (; relative < r; ++relative)
  DO_ELF_MACHINE_REL_RELATIVE (map, l_addr, relative);
// Handle other relocations that are not IRELATIVE.
for (; r < end; ++r) {
  if (r is R_*_IRELATIVE) {
    if (r2 == NULL) r2 = r;
    end2 = r;
    continue;
  }
  elf_machine_rel (... r);
}
// Then handle IRELATIVE relocations.
if (r2 != NULL)
  for (; r2 <= end2; ++r2)
    if (ELFW(R_TYPE) (r2->r_info) == ELF_MACHINE_IRELATIVE)
      elf_machine_rel (... r2);
```
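
The deferral within a single range can be modelled with a small standalone function. This is only a sketch: `enum kind` and `apply_order` are made-up names, and `RELATIVE` entries are treated like any other non-`IRELATIVE` relocation instead of being handled in a separate leading loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical relocation kinds, just enough for the model. */
enum kind { RELATIVE, GLOB_DAT, JUMP_SLOT, IRELATIVE };

/* Store in order[] the indices of relocs[0..n) in the sequence the
   simplified loop would apply them. Returns the count written. */
size_t apply_order(const enum kind *relocs, size_t n, size_t *order) {
  size_t out = 0;
  size_t first = n, last = 0; /* play the roles of r2 and end2 */
  /* First pass: apply everything except IRELATIVE, remembering the
     span that contains the skipped entries. */
  for (size_t i = 0; i < n; ++i) {
    if (relocs[i] == IRELATIVE) {
      if (first == n) first = i;
      last = i;
      continue;
    }
    order[out++] = i;
  }
  /* Second pass: revisit the remembered span, applying IRELATIVE only. */
  if (first != n)
    for (size_t i = first; i <= last; ++i)
      if (relocs[i] == IRELATIVE)
        order[out++] = i;
  assert(out == n);
  return out;
}
```

For the input `{RELATIVE, IRELATIVE, GLOB_DAT, IRELATIVE, JUMP_SLOT}` the resulting order is indices 0, 2, 4, 1, 3: every `IRELATIVE` runs last, but only relative to the other entries of the same range.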

An `R_*_IRELATIVE` relocation or a symbolic relocation (e.g. `R_X86_64_64`) referencing an ifunc symbol requires rtld to call an ifunc resolver.
The ifunc resolver may access variables or functions that require relocations (usually `R_*_JUMP_SLOT`, `R_*_GLOB_DAT`, or `R_*_RELATIVE`).
`R_*_GLOB_DAT` and `R_*_RELATIVE` relocations are resolved before `R_*_IRELATIVE`, so reading a variable or taking an address of a function is fine.

However, when lazy binding is enabled (neither `ld -z now` nor `LD_BIND_NOW=1` is in effect; `do_lazy == 1`), `R_*_IRELATIVE` relocations in `.rela.dyn` are resolved before `R_*_JUMP_SLOT` relocations in `.rela.plt`.
Therefore, calling a preemptible function in an ifunc resolver will crash due to accessing an unresolved GOTPLT entry.
This may work with GNU ld's x86-64 port, which places certain `R_X86_64_IRELATIVE` relocations in `.rela.plt`. The exact rules are complex and I am not interested in figuring them out.
In certain cases, GNU ld still produces an `R_X86_64_IRELATIVE` in `.rela.dyn`: in lazy PLT mode, glibc ld.so will call the ifunc resolver before the `R_X86_64_JUMP_SLOT` for `puts` is set up, and segfault.
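
The three layouts discussed above can be compared with a toy model. This is a sketch under simplifying assumptions: `struct table` and `resolver_may_run_before_jump_slot` are hypothetical names, each table is processed with the per-range deferral shown earlier, and a `JUMP_SLOT` counts as set up as soon as its table has been walked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum kind { RELATIVE, GLOB_DAT, JUMP_SLOT, IRELATIVE };

/* One relocation table, e.g. .rela.dyn or .rela.plt. */
struct table { const enum kind *relocs; size_t n; };

/* Return true if some IRELATIVE would be applied while a JUMP_SLOT
   in a later table has not been processed yet. */
bool resolver_may_run_before_jump_slot(const struct table *tables,
                                       size_t ntables) {
  size_t pending = 0; /* JUMP_SLOT entries not yet processed */
  for (size_t t = 0; t < ntables; ++t)
    for (size_t i = 0; i < tables[t].n; ++i)
      pending += tables[t].relocs[i] == JUMP_SLOT;

  for (size_t t = 0; t < ntables; ++t) {
    bool has_irelative = false;
    /* First pass over this table: everything except IRELATIVE. */
    for (size_t i = 0; i < tables[t].n; ++i) {
      if (tables[t].relocs[i] == IRELATIVE)
        has_irelative = true;
      else if (tables[t].relocs[i] == JUMP_SLOT)
        --pending;
    }
    /* Second pass: this table's IRELATIVE entries run now. */
    if (has_irelative && pending != 0)
      return true;
  }
  return false;
}
```

In this model, `.rela.dyn = {RELATIVE, IRELATIVE}` followed by `.rela.plt = {JUMP_SLOT}` is flagged, while a single merged range (eager binding) or an `IRELATIVE` placed in `.rela.plt` after the `JUMP_SLOT` entries is not.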

---

For example, with the following program GNU ld produces an `R_X86_64_IRELATIVE` in `.rela.dyn` instead of the usual `.rela.plt`. In lazy PLT mode, glibc ld.so then calls the ifunc resolver before the `R_X86_64_JUMP_SLOT` for `puts` is set up, and segfaults.

cat > a.c <<eof
  #include <stdio.h>

  int a_impl() { return 42; }
  void *a_resolver() {
    puts("a_resolver");
    return (void *)a_impl;
  }
  int a() __attribute__((ifunc("a_resolver")));

  // .rela.dyn.rel => R_X86_64_64 referencing STT_GNU_IFUNC in .rela.dyn
  int (*fptr_a)() = a;

  int main() { printf("%d\n", a()); }
eof

cc -fpie -c a.c
cc -fuse-ld=bfd -pie a.o -o a

---

To address the brittleness, I think the following loop in elf/dynamic-link.h

  for (int ranges_index = 0; ranges_index < 2; ++ranges_index)

should be refactored to resolve all non-ifunc relocations first, across both ranges, and only then the ifunc relocations.
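
One possible shape for such a refactoring, as a rough model rather than a patch: `struct reloc`, `struct table` and `two_phase_order` are hypothetical names, and `calls_resolver` is assumed to cover both `R_*_IRELATIVE` and symbolic relocations that reference an `STT_GNU_IFUNC` symbol. A real change would also have to account for how lazily processed `R_*_JUMP_SLOT` entries are handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one relocation. */
struct reloc {
  int id;              /* label used to report the order */
  bool calls_resolver; /* IRELATIVE, or symbolic reloc against an ifunc */
};

struct table { const struct reloc *relocs; size_t n; };

/* Phase 0 applies every relocation that does not need a resolver,
   across all tables; phase 1 then applies the ones that do.
   Writes the ids in application order and returns the count. */
size_t two_phase_order(const struct table *tables, size_t ntables,
                       int *order, size_t cap) {
  size_t out = 0;
  for (int phase = 0; phase < 2; ++phase)
    for (size_t t = 0; t < ntables; ++t)
      for (size_t i = 0; i < tables[t].n; ++i) {
        const struct reloc *r = &tables[t].relocs[i];
        if (r->calls_resolver == (phase == 1)) {
          assert(out < cap);
          order[out++] = r->id;
        }
      }
  return out;
}
```

With `.rela.dyn` holding ids 0 (plain), 1 and 2 (both resolver-calling) and `.rela.plt` holding id 3 (plain), the order becomes 0, 3, 1, 2, so the resolver-calling relocations run only after the `.rela.plt` entry has been processed.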