Bug 25694

Summary: R_RISCV_TPREL_HI20 relocations cause riscv64 to add TEXTREL bit on executables
Product: binutils Reporter: Sergei Trofimovich <slyich>
Component: ldAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: alice, andrew, i, nelsonc1225, palmer, sam, wilson
Priority: P2    
Version: 2.34   
Target Milestone: ---   
See Also: https://sourceware.org/bugzilla/show_bug.cgi?id=22263
Host: x86_64-pc-linux-gnu Target: riscv64-unknown-linux-gnu
Build: x86_64-pc-linux-gnu Last reconfirmed:

Description Sergei Trofimovich 2020-03-18 22:28:27 UTC
Initial bug is reported as a https://bugs.gentoo.org/713082 where gdb-9.1 was observed to be complied with TEXTRELs. Here is the minimal reproducer:

// $ cat a.cc
static int a () { return 1; }

static thread_local int (*tlf) () = &a;

int main(void) { return tlf(); }
[sf] /tmp/z:cat a.cc
static int a () { return 1; }

static thread_local int (*tlf) () = &a;

int main(void) { return tlf(); }

$ riscv64-unknown-linux-gnu-g++ a.cc -o a -Wl,-z,text -pie -fPIE
/usr/libexec/gcc/riscv64-unknown-linux-gnu/ld: read-only segment has dynamic relocations
collect2: error: ld returned 1 exit status

$ riscv64-unknown-linux-gnu-g++ -S a.cc -o a.S -Wl,-z,text -pie -fPIE
$ cat a.S
...
main:
...
	lui	a5,%tprel_hi(_ZL3tlf)
	add	a5,a5,tp,%tprel_add(_ZL3tlf)
	ld	a5,%tprel_lo(_ZL3tlf)(a5)
	jalr	a5
...


I assume it's a binutils deficiency as I see no relocations in read-only sections:

$ riscv64-unknown-linux-gnu-g++ a.cc -o a -pie -fPIE
/usr/libexec/gcc/riscv64-unknown-linux-gnu/ld: warning: creating a DT_TEXTREL in object
$ LANG=C objdump -D -R a | egrep 'R_RISC|section ' | egrep -A2 'section '
Disassembly of section .interp:
Disassembly of section .note.ABI-tag:
Disassembly of section .gnu.hash:
Disassembly of section .dynsym:
Disassembly of section .dynstr:
Disassembly of section .gnu.version:
Disassembly of section .gnu.version_r:
Disassembly of section .rela.dyn:
Disassembly of section .rela.plt:
Disassembly of section .plt:
Disassembly of section .text:
Disassembly of section .rodata:
Disassembly of section .eh_frame_hdr:
Disassembly of section .eh_frame:
Disassembly of section .tdata:
			1da0: R_RISCV_RELATIVE	*ABS*+0x72a
Disassembly of section .preinit_array:
			1da8: R_RISCV_RELATIVE	*ABS*+0x68e
Disassembly of section .init_array:
			1db0: R_RISCV_RELATIVE	*ABS*+0x728
Disassembly of section .fini_array:
			1db8: R_RISCV_RELATIVE	*ABS*+0x6ee
Disassembly of section .dynamic:
Disassembly of section .data:
			2000: R_RISCV_RELATIVE	*ABS*+0x2000
Disassembly of section .got:
			2018: R_RISCV_JUMP_SLOT	__libc_start_main@GLIBC_2.27
			2020: R_RISCV_JUMP_SLOT	__stack_chk_fail@GLIBC_2.27
--
Disassembly of section .bss:
Disassembly of section .comment:
			0: R_RISCV_NONE	*ABS*


riscv_elf_check_relocs() is probably too conservative at attributing thread_local relocations in PIEs as TEXTRELs.
Comment 1 Jim Wilson 2020-03-19 17:01:58 UTC
I believe this is the same as bug 22263, which added a testcase, so we already have a ld testsuite failure for it.  The problem also shows up in openembedded builds.
Comment 2 Sergei Trofimovich 2020-03-19 21:55:44 UTC
(In reply to Jim Wilson from comment #1)
> I believe this is the same as bug 22263, which added a testcase, so we
> already have a ld testsuite failure for it.  The problem also shows up in
> openembedded builds.

Yeah, looks like it's exactly it!

Don't know if it should be duped or fixed in this bug. That bug accumulated a lot of fixes which are nice to look at how to fix things but make it hard to track what targets were not fixed yet.
Comment 3 Fangrui Song 2020-04-05 04:15:19 UTC
Instead of --warn-shared-textrel, we can make bfd_link_info::error_textrel default to 1 and delete warn_shared_textrel.

The bi-state approach is used by LLD: either -z notext or -z text. There is no state where DT_TEXTREL/DF_TEXTREL is added on demand. The diagnostic should remind the user that -z notext may be needed.
Comment 4 Nelson Chu 2023-05-03 10:57:24 UTC
Haven't test these changes for the riscv-gnu-toolchain regressions, and not yet test for other project.  However, at least TLS LE testcase mentioned here, and the pr22263-1 testcase looks good after applying the following changes.

For the R_RISCV_TPREL_HI20 in check_reloc, not really sure why we used to goto static_reloc when pie/pde.  Refer to spare and loongarch, it's just pass and won't reserve local dynamic reloc for TLS LE in pie/pde, and similar for other targets.  So we probably too conservative before.

As for the TLS GD/IE in riscv_elf_relocate_section, I refer to what Maciej did for pr22570, commits 9143e72c6d4d and 1cb83cac9a89, seems also the right way to do the same thing for risc-v.


diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 77a732b6a08..bbb5d0cd5ff 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -824,7 +824,7 @@ riscv_elf_check_relocs (bfd *abfd, struct bfd_link_info *info,
          break;

        case R_RISCV_TLS_GOT_HI20:
-         if (bfd_link_pic (info))
+         if (bfd_link_dll (info))
            info->flags |= DF_STATIC_TLS;
          if (!riscv_elf_record_got_reference (abfd, info, h, r_symndx)
              || !riscv_elf_record_tls_type (abfd, h, r_symndx, GOT_TLS_IE))
@@ -920,11 +920,12 @@ riscv_elf_check_relocs (bfd *abfd, struct bfd_link_info *info,
          goto static_reloc;

        case R_RISCV_TPREL_HI20:
+         /* This is not allowed in the pic, but okay in pie.  */
          if (!bfd_link_executable (info))
            return bad_static_reloc (abfd, r_type, h);
          if (h != NULL)
            riscv_elf_record_tls_type (abfd, h, r_symndx, GOT_TLS_LE);
-         goto static_reloc;
+         break;

        case R_RISCV_HI20:
          if (bfd_link_pic (info))
@@ -2797,24 +2798,20 @@ riscv_elf_relocate_section (bfd *output_bfd,
              if (htab->elf.srelgot == NULL)
                abort ();

-             if (h != NULL)
-               {
-                 bool dyn, pic;
-                 dyn = htab->elf.dynamic_sections_created;
-                 pic = bfd_link_pic (info);
-
-                 if (WILL_CALL_FINISH_DYNAMIC_SYMBOL (dyn, pic, h)
-                     && (!pic || !SYMBOL_REFERENCES_LOCAL (info, h)))
-                   indx = h->dynindx;
-               }
+             bool dyn = elf_hash_table (info)->dynamic_sections_created;
+             if (h != NULL
+                 && h->dynindx != -1
+                 && WILL_CALL_FINISH_DYNAMIC_SYMBOL (dyn, bfd_link_pic (info), h)
+                 && (bfd_link_dll (info) || !SYMBOL_REFERENCES_LOCAL (info, h)))
+               indx = h->dynindx;

              /* The GOT entries have not been initialized yet.  Do it
                 now, and emit any relocations.  */
-             if ((bfd_link_pic (info) || indx != 0)
+             if ((bfd_link_dll (info) || indx != 0)
                  && (h == NULL
                      || ELF_ST_VISIBILITY (h->other) == STV_DEFAULT
                      || h->root.type != bfd_link_hash_undefweak))
-                   need_relocs = true;
+               need_relocs = true;

              if (tls_type & GOT_TLS_GD)
Comment 5 Andreas Schwab 2023-05-23 11:59:01 UTC
This still generates a spurious R_RISCV_NONE relocation for the pr22263-1 test.
Comment 6 Sourceware Commits 2023-05-27 01:25:07 UTC
The master branch has been updated by Nelson Chu <nelsonc1225@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=225df051d3d4cf714d1791b9035966a6686b3f3d

commit 225df051d3d4cf714d1791b9035966a6686b3f3d
Author: Nelson Chu <nelson@nelson.ba.rivosinc.com>
Date:   Thu May 4 17:08:50 2023 +0800

    [PR ld/22263][PR ld/25694] RISC-V: Avoid dynamic TLS relocs in PIE.
    
    Lots of targets already fixed the TEXTREL problem for TLS in PIE.
    
    * For PR ld/25694,
    In the check_reloc, refer to spare and loongarch, they don't need to reserve
    any local dynamic reloc for TLS LE in pie/pde, and similar to other targets.
    So it seems like riscv was too conservative to estimate the TLS LE before.
    Just break and don't goto static_reloc for TLS LE in pie/pde can fix the
    TEXTREL problem.
    
    * For PR ld/22263,
    The risc-v code for TLS GD/IE in the relocate_section seems same as MIPS port.
    So similar to MIPS, pr22570, commits 9143e72c6d4d and 1cb83cac9a89, it seems
    also the right way to do the same thing for risc-v.
    
    On risc-v, fixes
    FAIL: Build pr22263-1
    
    RISC-V haven't supported the TLS transitions, so will need the same fix (use
    bfd_link_dll) in the future.
    
    bfd/
            PR ld/22263
            PR ld/25694
            * elfnn-riscv.c (riscv_elf_check_relocs): Replace bfd_link_pic with
            bfd_link_dll for TLS IE.  Don't need to reserve the local dynamic
            relocation for TLS LE in pie/pde, and report error in pic just like
            before.
            (riscv_elf_relocate_section): For TLS GD/IE, use bfd_link_dll rather
            than !bfd_link_pic in determining the dynamic symbol index.  Avoid
            the index of -1.
Comment 7 Nelson Chu 2023-05-27 01:53:03 UTC
Thanks for the information, Andreas.

There is a proposed solution as follows, which suggested by Alan,
https://sourceware.org/pipermail/binutils/2023-May/127653.html

The idea is that make sure using same condition check in the relocate_section and allocate_dynrelocs for TLS GD/IE, so won't reserve extra spaces for the dynamic relocations.  It seems like there is no longer any R_RISCV_NONE for pr22263-1 test after applying this fix.
Comment 8 Nelson Chu 2023-06-08 00:09:59 UTC
Since the redundant nops for TLS have been fixed in bug 24676, so marked resolved and fixed here.