Bug 24784 - elf_x86_64_check_tls_transition should allow R_X86_64_GOTPCREL (-fno-plt)
Summary: elf_x86_64_check_tls_transition should allow R_X86_64_GOTPCREL (-fno-plt)
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld (show other bugs)
Version: 2.33
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-07-07 13:00 UTC by Fangrui Song
Modified: 2023-03-10 21:09 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Fangrui Song 2019-07-07 13:00:43 UTC
thread_local int a;
int main() { return a; }

% g++ -fno-plt -fpic a.cc -Wa,-mrelax-relocations=no -fuse-ld=bfd
/usr/bin/ld.bfd: /tmp/ccLggG8h.o: TLS transition from R_X86_64_TLSGD to R_X86_64_GOTTPOFF against `a' at 0x8 in section `.text' failed
/usr/bin/ld.bfd: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status

static thread_local int a;
int main() { return a; }

% g++ -fno-plt -fpic a.cc -Wa,-mrelax-relocations=no -fuse-ld=bfd
/usr/bin/ld.bfd: /tmp/ccO9KvlL.o: TLS transition from R_X86_64_TLSGD to R_X86_64_TPOFF32 against `_ZL1a' at 0x8 in section `.text' failed
/usr/bin/ld.bfd: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status

Due to -Wa,-mrelax-relocations=no, R_X86_64_GOTPCREL instead of R_X86_64_GOTPCRELX is generated.

      a9:       callq   *(%rip)
                00000000000000ad:  R_X86_64_GOTPCREL    __tls_get_addr-4

// bfd/elf-x86-64.c:elf_x86_64_check_tls_transition
...
    if (largepic)
      return r_type == R_X86_64_PLTOFF64;
    else if (indirect_call)
      return r_type == R_X86_64_GOTPCRELX; // It should allow R_X86_64_GOTPCREL
    else
      return (r_type == R_X86_64_PC32 || r_type == R_X86_64_PLT32);
Comment 1 H.J. Lu 2019-07-12 15:04:02 UTC
Since the new linker is required to support no-PLT TLS sequence,
R_X86_64_GOTPCRELX should be used here.  R_X86_64_GOTPCREL won't
work with the old linker anyway.
Comment 2 Fangrui Song 2019-07-13 01:08:11 UTC
(In reply to H.J. Lu from comment #1)
> Since the new linker is required to support no-PLT TLS sequence,
> R_X86_64_GOTPCRELX should be used here.  R_X86_64_GOTPCREL won't
> work with the old linker anyway.

I agree, but for the sake of completeness there is still value supporting GOTPCREL.

https://sourceware.org/bugzilla/show_bug.cgi?id=20216 GOTPCREL was added to gold in commit 9b3777ae068e67e08b27914abbab11297f1f8d1b.
Comment 3 Fangrui Song 2019-07-13 01:15:16 UTC
This was discovered by a Rust user. They use clang integrated assembler (not GNU as) with GNU ld. The integrated assembler isn't configured with -Wa,-mrelax-relocations=yes on by default. I worked around that in https://reviews.llvm.org/D64304 but it'd be nice if binutils 2.33 (or 2.34) will fix the issue.
Comment 4 H.J. Lu 2019-07-13 16:06:04 UTC
(In reply to Fangrui Song from comment #3)
> This was discovered by a Rust user. They use clang integrated assembler (not
> GNU as) with GNU ld. The integrated assembler isn't configured with
> -Wa,-mrelax-relocations=yes on by default. I worked around that in
> https://reviews.llvm.org/D64304 but it'd be nice if binutils 2.33 (or 2.34)
> will fix the issue.

The bug should be fixed in clang.
Comment 5 Fangrui Song 2019-07-14 00:39:48 UTC
I don't see how this bug is "invalid". I've already worked around the bug in clang, but I hope this hack can be deleted in a few years.

> g++ -fno-plt -fpic a.cc -Wa,-mrelax-relocations=no -fuse-ld=bfd

I mentioned the combination of -fno-plt and -Wa,-mrelax-relocations=no doesn't work. So we have 3 resolutions:

1. Ignore it.
  gcc -fno-plt -fpic -Wa,-mrelax-relocations=no  has the linker error
2. Document in Chapter 11 Alternate Code Sequences For Security of x86-64 psABI,
  call *__tls_get_addr@GOTPCREL(%rip)
  must use R_X86_64_GOTPCRELX.

  This kinda justifies 1., though it is not obvious why R_X86_64_GOTPCREL can't be used.
3. One line change to bfd/elf-x86-64.c:elf_x86_64_check_tls_transition
- return r_type == R_X86_64_GOTPCRELX; // It should allow R_X86_64_GOTPCREL
+ return r_type == R_X86_64_GOTPCREL || r_type == R_X86_64_GOTPCRELX;
  https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=9b3777ae068e67e08b27914abbab11297f1f8d1b

With 3, the `gcc -fno-plt -fpic -Wa,-mrelax-relocations=no` issue can be fixed.
Comment 6 H.J. Lu 2019-07-15 15:04:02 UTC
The psABI also says

For the occurrence of name@GOTPCREL in the following assembler instructions:

call *name@GOTPCREL(%rip)
...
the R_X86_64_GOTPCRELX relocation, or the R_X86_64_REX_GOTPCRELX relocation
if the REX prefix is present, should be generated.

R_X86_64_GOTPCRELX should be generated for:

call __tls_get_addr@GOTPCREL(%rip)
Comment 7 Fangrui Song 2019-07-16 06:02:09 UTC
> For the occurrence of name@GOTPCREL in the following assembler instructions:

> call *name@GOTPCREL(%rip)

I agree, yet it also says:

> (PIC) the base of GOT is within 2GB an indirect call to the GOT entry might be implemented like so:
> foo (); call *(foo@GOT); R_X86_64_GOTPCREL

:)
Comment 8 H.J. Lu 2019-07-16 15:59:10 UTC
(In reply to Fangrui Song from comment #7)
> > For the occurrence of name@GOTPCREL in the following assembler instructions:
> 
> > call *name@GOTPCREL(%rip)
> 
> I agree, yet it also says:
> 
> > (PIC) the base of GOT is within 2GB an indirect call to the GOT entry might be implemented like so:
> > foo (); call *(foo@GOT); R_X86_64_GOTPCREL
> 
> :)

There is a typo. Relocation should be R_X86_64_GOT32, not R_X86_64_GOTPCREL.
Comment 9 Fangrui Song 2023-01-06 02:43:00 UTC
rustc runs into this problem as well as it currently somehow defaults to disable R_X86_64_*GOTPCRELX/R_386_GOT32X.

Since the fix is so trivial, I sent

https://sourceware.org/pipermail/binutils/2023-January/125506.html
https://sourceware.org/pipermail/binutils/2023-January/125507.html
Comment 10 H.J. Lu 2023-01-06 17:05:22 UTC
Since the new TLS sequence was added after R_X86_64_GOTPCRELX was
required for call, R_X86_64_GOTPCREL is invalid in this TLS sequence.
Comment 11 Sourceware Commits 2023-03-10 19:53:35 UTC
The master branch has been updated by Fangrui Song <maskray@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d58854b6dd88e05dbf2a5d1c32c5acb7bd6ea274

commit d58854b6dd88e05dbf2a5d1c32c5acb7bd6ea274
Author: Fangrui Song <maskray@google.com>
Date:   Fri Mar 10 11:53:32 2023 -0800

    ld: Allow R_X86_64_GOTPCREL for call *__tls_get_addr@GOTPCREL(%rip)
    
    _Thread_local int a;
    int main() { return a; }
    
    % gcc -fno-plt -fpic a.c -fuse-ld=bfd -Wa,-mrelax-relocations=no
    /usr/bin/ld.bfd: /tmp/ccSSBgrg.o: TLS transition from R_X86_64_TLSGD to R_X86_64_GOTTPOFF against `a' at 0xd in section `.text' failed
    /usr/bin/ld.bfd: failed to set dynamic section sizes: bad value
    collect2: error: ld returned 1 exit status
    
    This commit fixes the issue.
    
    There is an argument that the -fno-plt TLS sequence was added after
    R_X86_64_GOTPCRELX was required for call, so R_X86_64_GOTPCREL was
    intended to be unsupported.
    
    Unfortunately this standpoint has caused interop difficulty: some
    projects specify -mrelax-relocations=no to build relocatable object
    files compatible with older linkers (e.g.
    https://github.com/IHaskell/IHaskell/issues/636) or do so by accident
    (e.g. https://github.com/rust-lang/rust/pull/106511 not addressed as of
    today).  Many uses have not been cleaned up in practice, and compiling
    with -fno-plt will lead to the `TLS transition from R_X86_64_TLSGD ...`
    error which is hard to reason about.
    
    There is another argument which may be weaker but relevant to the
    necessity of -mrelax-relocations=no: HWAddressSanitizer x86-64 will
    likely need some assembler support to disable relaxation.  Without the
    support and if the compiler needs to support many gas version, the
    simplest solution would be to use -Wa,-mrelax-relocations=no.
    
        PR ld/24784
        * bfd/elf64-x86-64.c (elf_x86_64_check_tls_transition): Allow
          R_X86_64_GOTPCREL.
Comment 12 Sourceware Commits 2023-03-10 19:55:12 UTC
The master branch has been updated by Fangrui Song <maskray@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=2cef48423031d23d319692ceced5ad0110625f69

commit 2cef48423031d23d319692ceced5ad0110625f69
Author: Fangrui Song <maskray@google.com>
Date:   Fri Mar 10 11:55:09 2023 -0800

    ld: Allow R_386_GOT32 for call *__tls_get_addr@GOT(%reg)
    
    Similar to d58854b6dd88e05dbf2a5d1c32c5acb7bd6ea274 for x86_64.
    
    _Thread_local int a;
    int main() { return a; }
    
    % gcc -m32 -fno-plt -fpic a.c -fuse-ld=bfd -Wa,-mrelax-relocations=no
    /usr/bin/ld.bfd: /tmp/ccR8Yexy.o: TLS transition from R_386_TLS_GD to R_386_TLS_IE_32 against `a' at 0x15 in section `.text' failed
    /usr/bin/ld.bfd: failed to set dynamic section sizes: bad value
    collect2: error: ld returned 1 exit status
    
    This commit fixes the issue.
    
    There is an argument that the -fno-plt TLS sequence was added after
    R_386_GOT32X was required for call *func@GOT(%ebx), so R_386_GOT32 was
    intended to be unsupported.
    
    Unfortunately this standpoint has caused interop difficulty: some
    projects specify -mrelax-relocations=no to build relocatable object
    files compatible with older linkers (e.g.
    https://github.com/IHaskell/IHaskell/issues/636) or do so by accident
    (e.g. https://github.com/rust-lang/rust/pull/106511 not addressed as of
    today).  Many uses have not been cleaned up in practice, and compiling
    with -fno-plt will lead to the `TLS transition from R_386_TLS_GD ...`
    error which is hard to reason about.
    
    It seems easier to apply this simple change to prevent the footgun.
    
        PR ld/24784
        * bfd/elf32-i386.c (elf_i386_check_tls_transition): Allow R_386_GOT32.
Comment 13 Fangrui Song 2023-03-10 21:09:22 UTC
Fixed for 2.41