Summary: | i386: Emit R_386_PLT32 instead of R_386_PC32 for `call/jmp foo` | ||
---|---|---|---|
Product: | binutils | Reporter: | Fangrui Song <i> |
Component: | gas | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED WONTFIX | ||
Severity: | normal | CC: | hjl.tools, sam |
Priority: | P2 | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
See Also: | https://sourceware.org/bugzilla/show_bug.cgi?id=20515 https://sourceware.org/bugzilla/show_bug.cgi?id=22791 | ||
Host: | | Target: | i386-* |
Build: | | Last reconfirmed: | |
Attachments: | A patch to generate R_386_PLT32 | ||
Description
Fangrui Song
2021-01-10 00:54:00 UTC
Since i386 doesn't have IP-relative addressing, non-PIC PLT is different from PIC PLT. Using R_386_PLT32 for "call/jmp foo" isn't appropriate.

(In reply to H.J. Lu from comment #1)
> Since i386 doesn't have IP-relative addressing, non-PIC PLT is different
> from PIC PLT. Using R_386_PLT32 for "call/jmp foo" isn't appropriate.

I know that this is a convention using R_386_PC32 for non-PIC PLT and R_386_PLT32. It is artificial and assembler/linker/ld.so do not need this convention for interop.

On most other architectures branch relocation types are distinguishable from address taken relocation types (direct access).

(In reply to Fangrui Song from comment #2)
> (In reply to H.J. Lu from comment #1)
> > Since i386 doesn't have IP-relative addressing, non-PIC PLT is different
> > from PIC PLT. Using R_386_PLT32 for "call/jmp foo" isn't appropriate.
>
> I know that this is a convention using R_386_PC32 for non-PIC PLT and
> R_386_PLT32. It is artificial and assembler/linker/ld.so do not need this
> convention for interop.

R_386_PLT32 should be used with the EBX based PLT and "call foo" doesn't require setting up EBX for PLT.

> On most other architectures branch relocation types are distinguishable from
> address taken relocation types (direct access).

This can't be fixed with R_386_PLT32.

(In reply to H.J. Lu from comment #3)
> (In reply to Fangrui Song from comment #2)
> > (In reply to H.J. Lu from comment #1)
> > > Since i386 doesn't have IP-relative addressing, non-PIC PLT is different
> > > from PIC PLT. Using R_386_PLT32 for "call/jmp foo" isn't appropriate.
> >
> > I know that this is a convention using R_386_PC32 for non-PIC PLT and
> > R_386_PLT32. It is artificial and assembler/linker/ld.so do not need this
> > convention for interop.
>
> R_386_PLT32 should be used with the EBX based PLT and "call foo" doesn't
> require setting up EBX for PLT.
>
> > On most other architectures branch relocation types are distinguishable from
> > address taken relocation types (direct access).
>
> This can't be fixed with R_386_PLT32.

Does GNU ld use R_386_PC32/R_386_PLT32 to decide whether a non-PIC PLT or a PIC PLT should be used? It can use a non-PIC PLT in -no-pie mode and a PIC PLT in -pie/-shared mode. Then branch R_386_PC32 can be freely converted to PLT32.

# a.s
call foo

# b.s
call foo@plt

gcc -fno-pic a.s -shared -o a.so -fuse-ld=bfd
gcc -fno-pic b.s -shared -o b.so -fuse-ld=bfd

do not have instruction difference.

Sorry

# a.s
.globl main
main:
call puts

# b.s
.globl main
main:
call puts@plt

gcc -m32 -no-pie a.s -o a -fuse-ld=bfd
gcc -m32 -no-pie b.s -o b -fuse-ld=bfd

do not have instruction difference.

Created attachment 13109
A patch to generate R_386_PLT32

I tried this patch when I made:

commit bd7ab16b4537788ad53521c45469a1bdae84ad4a
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Feb 13 07:34:22 2018 -0800

x86-64: Generate branch with PLT32 relocation

Since there is no need to prepare for PLT branch on x86-64, generate R_X86_64_PLT32, instead of R_X86_64_PC32, if possible, which can be used as a marker for 32-bit PC-relative branches.

To compile Linux kernel, this patch:

From: "H.J. Lu" <hjl.tools@gmail.com>
Subject: [PATCH] x86: Treat R_X86_64_PLT32 as R_X86_64_PC32

On i386, there are 2 types of PLTs, PIC and non-PIC. PIE and shared objects must use PIC PLT. To use PIC PLT, you need to load _GLOBAL_OFFSET_TABLE_ into EBX first. There is no need for that on x86-64 since x86-64 uses PC-relative PLT.

and got

FAIL: visibility (hidden_normal) (non PIC)
FAIL: visibility (hidden_normal) (non PIC, load offset)
FAIL: visibility (hidden_normal) (PIC main, non PIC so)
FAIL: visibility (hidden_weak) (non PIC)
FAIL: visibility (hidden_weak) (non PIC, load offset)
FAIL: visibility (hidden_weak) (PIC main, non PIC so)
FAIL: visibility (protected) (non PIC)
FAIL: visibility (protected) (non PIC, load offset)
FAIL: visibility (protected) (PIC main, non PIC so)
FAIL: visibility (protected_undef_def) (non PIC)
FAIL: visibility (protected_undef_def) (non PIC, load offset)
FAIL: visibility (protected_undef_def) (PIC main, non PIC so)
FAIL: visibility (protected_weak) (non PIC)
FAIL: visibility (protected_weak) (non PIC, load offset)
FAIL: visibility (protected_weak) (PIC main, non PIC so)
FAIL: visibility (normal) (non PIC)
FAIL: visibility (normal) (non PIC, load offset)
FAIL: visibility (normal) (PIC main, non PIC so)
FAIL: ld-i386/pr19636-2a
FAIL: ld-i386/pr19636-2b
FAIL: ld-i386/pr19636-2c
FAIL: ld-i386/pr19636-2d
FAIL: ld-i386/pr19636-2e
FAIL: ld-i386/pr20515
FAIL: shared (non PIC)
FAIL: shared (non PIC, load offset)
FAIL: shared (PIC main, non PIC so)

in i386 linker tests.

Applied your R_386_PLT32 patch.

# of unexpected failures 6

make -C Debug check-ld RUNTESTFLAGS=ld-shared/shared.exp # passed for me.

ld/testsuite/ld-i386/pr20515.d is an expected failure due to no-longer-relevant diagnostic. It can be repaired by using

.reloc .-4, R_386_PC32, foo-4
#error: unsupported non-PIC call to IFUNC `foo'

For ld/testsuite/ld-i386/pr19636-2a.d (I think bcd are similar) both gold and LLD export undefined weak foo in .dynsym, regardless of R_386_PC32/R_386_PLT32. GNU ld exports undefined weak foo for R_386_PC32 but not for R_386_PLT32. I think there is some simplification which can be made.

Branch relocations to undefined weak symbols have varying behaviors across architectures (e.g.
some resolve it to the current pc, some resolve it to link-time 0, ppc32 might resolve it to the beginning of text segment/section (I forgot the details but it is strange)). In addition, I don't think users understand or use dynamic-undefined-weak in practice, so altering the 2017 changed behavior should not be a problem...

(In reply to Fangrui Song from comment #7)
> Applied your R_386_PLT32 patch.
>
> # of unexpected failures 6
>
> make -C Debug check-ld RUNTESTFLAGS=ld-shared/shared.exp # passed for me.
>

You need to build i386 native linker to see all failures. I used:

CC="/usr/gcc-10.1.1-32bit/bin/gcc -m32 -fcf-protection" CXX="/usr/gcc-10.1.1-32bit/bin/g++ -m32 -fcf-protection" /export/gnu/import/git/sources/binutils-gdb/configure \
\
i686-linux \
--enable-plugins --disable-gdb --disable-gdbserver --disable-libdecnumber --disable-readline --disable-sim --with-sysroot=/ --with-system-zlib \
--prefix=/usr/local \
--with-local-prefix=/usr/local

to configure binutils on Linux/x86-64. /usr/gcc-10.1.1-32bit/bin/gcc is 32bit GCC.

ld uses R_386_PC32 to tell if call site supports PIC PLT. Calling an IFUNC function in static PIE requires PLT. If call site doesn't support PIC PLT, linker will issue an error:

https://sourceware.org/bugzilla/show_bug.cgi?id=20515

(In reply to H.J. Lu from comment #9)
> ld uses R_386_PC32 to tell if call site supports PIC PLT. Calling an IFUNC
> function in static PIE requires PLT. If call site doesn't support PIC PLT,
> linker will issue an error:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=20515

Let me rephrase what PR20515 is about:

For a call to a hidden function declaration, the compiler produces an R_386_PC32 relocation. The relocation is an indicator that EBX may not be set up.

If the declaration refers to an ifunc definition, the linker will resolve the R_386_PC32 to an IPLT entry. For -pie and -shared links, the IPLT entry references EBX. If the call site does not set up EBX, the IPLT entry call will be incorrect.
The resolution to PR20515 has implemented the diagnostic. If we change the compiler/assembler to use R_386_PLT32 for non-default visibility function declarations, this diagnostic will be lost.

So unfortunately we cannot find a satisfactory relocation type for branches to undefined symbols:

* R_386_PC32: canonical PLT entries (similar to copy relocations) which may break -Bsymbolic or --dynamic-list usage.
* R_386_PLT32: lose a diagnostic for non-default ifunc in -pie/-shared modules.

I agree that the assembler needs a notation to differentiate R_386_PC32/R_386_PLT32 branches. So perhaps this should be implemented in GCC instead: for a default visibility function declaration, emit `call/jmp foo@plt` instead of `call/jmp foo`. This does not degrade the ld diagnostics for non-default visibility ifunc.

i386 is legacy. Let's leave it alone.

These native i386 tests:

FAIL: visibility (hidden_normal) (non PIC)
FAIL: visibility (hidden_normal) (non PIC, load offset)
FAIL: visibility (hidden_normal) (PIC main, non PIC so)
FAIL: visibility (hidden_weak) (non PIC)
FAIL: visibility (hidden_weak) (non PIC, load offset)
FAIL: visibility (hidden_weak) (PIC main, non PIC so)
FAIL: visibility (protected) (non PIC)
FAIL: visibility (protected) (non PIC, load offset)
FAIL: visibility (protected) (PIC main, non PIC so)
FAIL: visibility (protected_undef_def) (non PIC)
FAIL: visibility (protected_undef_def) (non PIC, load offset)
FAIL: visibility (protected_undef_def) (PIC main, non PIC so)
FAIL: visibility (protected_weak) (non PIC)
FAIL: visibility (protected_weak) (non PIC, load offset)
FAIL: visibility (protected_weak) (PIC main, non PIC so)
FAIL: visibility (normal) (non PIC)
FAIL: visibility (normal) (non PIC, load offset)
FAIL: visibility (normal) (PIC main, non PIC so)
FAIL: shared (non PIC)
FAIL: shared (non PIC, load offset)
FAIL: shared (PIC main, non PIC so)

track the R_386_PLT32 vs R_386_PC32 issue.

Just to be clear.
Some i386 shared libraries are compiled without -fPIC on purpose to improve performance. When ld sees R_386_PC32 of an undefined symbol in a shared library, it creates a dynamic R_386_PC32 relocation in the .text section. Replacing R_386_PC32 with R_386_PLT32 will break this.
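
For readers skimming the thread, the relocation-based decision discussed above can be sketched as a small toy model. This is only an illustration of the reasoning in the comments, not GNU ld's implementation: the names `LinkMode`, `LinkError`, `plt_kind` and `resolve_branch` are hypothetical, and the model assumes just two things stated in the thread, namely that -pie/-shared links use the EBX-based PIC PLT and that R_386_PC32 on a branch means EBX may not be set up.

```python
from enum import Enum


class LinkMode(Enum):
    NO_PIE = "-no-pie"
    PIE = "-pie"
    SHARED = "-shared"


class LinkError(Exception):
    """Stands in for a linker diagnostic in this toy model."""


def plt_kind(mode):
    # PIE and shared objects must use the PIC PLT, whose entries go through EBX.
    return "non-PIC" if mode is LinkMode.NO_PIE else "PIC"


def resolve_branch(reloc, mode, target_is_ifunc, symbol="foo"):
    """Return the PLT kind a branch would use, or raise LinkError."""
    if reloc not in ("R_386_PC32", "R_386_PLT32"):
        raise ValueError(f"not a branch relocation in this model: {reloc}")
    kind = plt_kind(mode)
    # R_386_PC32 is read as "EBX may not be set up at this call site".
    ebx_may_be_unset = reloc == "R_386_PC32"
    if target_is_ifunc and kind == "PIC" and ebx_may_be_unset:
        # Diagnostic text as quoted earlier in this thread (PR20515).
        raise LinkError(f"unsupported non-PIC call to IFUNC `{symbol}'")
    return kind
```

In this model, `resolve_branch("R_386_PC32", LinkMode.PIE, True)` raises the PR20515-style error, while `resolve_branch("R_386_PLT32", LinkMode.PIE, True)` returns the PIC PLT without complaint. If the assembler emitted R_386_PLT32 for a plain `call foo`, the second path would be taken even though the call site never set up EBX, which is the lost diagnostic described above.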