Bug 25207 - ld: support --image-base= for elf (and -Ttext-segment -z separate-code strangeness)
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld
Version: unspecified
Importance: P2 normal
Target Milestone: 2.44
Assignee: Not yet assigned to anyone
Duplicates: 32461
Reported: 2019-11-19 23:26 UTC by Fangrui Song
Modified: 2024-12-19 19:53 UTC (History)
Description Fangrui Song 2019-11-19 23:26:29 UTC
% cat a.c
int main() {}

% gcc -fuse-ld=bfd a.c -Wl,-Ttext-segment,0x300000 -z noseparate-code -o a; readelf -Wl a
...
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000300040 0x0000000000300040 0x0001f8 0x0001f8 R   0x8
  INTERP         0x000238 0x0000000000300238 0x0000000000300238 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000300000 0x0000000000300000 0x0007a8 0x0007a8 R E 0x1000
  LOAD           0x000e18 0x0000000000301e18 0x0000000000301e18 0x000210 0x000218 RW  0x1000
...

When -z separate-code is specified, the output contains two read-only (R) PT_LOAD segments. Notably, -Ttext-segment sets the address of the first R segment rather than that of the text segment (RX).
Or we may argue that the traditional "text segment" includes both the first R and the RX...

% gcc -fuse-ld=bfd a.c -Wl,-Ttext-segment,0x300000 -z separate-code -o a; readelf -Wl a
...
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000300040 0x0000000000300040 0x000268 0x000268 R   0x8
  INTERP         0x0002a8 0x00000000003002a8 0x00000000003002a8 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000300000 0x0000000000300000 0x000530 0x000530 R   0x1000
  LOAD           0x001000 0x0000000000301000 0x0000000000301000 0x00019d 0x00019d R E 0x1000
  LOAD           0x002000 0x0000000000302000 0x0000000000302000 0x000148 0x000148 R   0x1000
  LOAD           0x002e18 0x0000000000303e18 0x0000000000303e18 0x000210 0x000218 RW  0x1000
...
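
The difference between the two addresses can be checked mechanically. The following is a minimal sketch, not part of binutils, that parses the LOAD rows of the -z separate-code dump above. The helper names (load_segments, image_base, first_executable) are hypothetical, and "image base" is assumed here to mean the lowest PT_LOAD VirtAddr.

```python
import re

# LOAD rows copied from the -z separate-code `readelf -Wl` output above.
SAMPLE = """\
  LOAD           0x000000 0x0000000000300000 0x0000000000300000 0x000530 0x000530 R   0x1000
  LOAD           0x001000 0x0000000000301000 0x0000000000301000 0x00019d 0x00019d R E 0x1000
  LOAD           0x002000 0x0000000000302000 0x0000000000302000 0x000148 0x000148 R   0x1000
  LOAD           0x002e18 0x0000000000303e18 0x0000000000303e18 0x000210 0x000218 RW  0x1000
"""

# Columns: Type, Offset, VirtAddr, PhysAddr, FileSiz, MemSiz,
# a three-character flags field, Align.
LOAD_ROW = re.compile(
    r"^\s*LOAD\s+0x[0-9a-f]+\s+0x([0-9a-f]+)\s+0x[0-9a-f]+"
    r"\s+0x[0-9a-f]+\s+0x[0-9a-f]+\s([RWE ]{3})\s+0x[0-9a-f]+\s*$"
)


def load_segments(readelf_output):
    """Return (vaddr, flags) for every PT_LOAD row, in file order."""
    segments = []
    for line in readelf_output.splitlines():
        m = LOAD_ROW.match(line)
        if m:
            flags = m.group(2).replace(" ", "")
            segments.append((int(m.group(1), 16), flags))
    return segments


def image_base(segments):
    """Assumption: the image base is the lowest PT_LOAD virtual address."""
    return min(vaddr for vaddr, _ in segments)


def first_executable(segments):
    """Virtual address of the first PT_LOAD that carries the E flag."""
    return next(vaddr for vaddr, flags in segments if "E" in flags)


if __name__ == "__main__":
    segs = load_segments(SAMPLE)
    # Prints: 0x300000 0x301000
    print(hex(image_base(segs)), hex(first_executable(segs)))
```

For this dump the address passed to -Ttext-segment (0x300000) equals the image base, while the RX segment starts one page later at 0x301000.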

As a better name for specifying the base address, we could introduce a new ELF option, --image-base=0x300000 (the option already exists for PE targets). The LLVM linker lld has supported --image-base since 2016-07-12 (https://reviews.llvm.org/D22116).
Comment 1 Hakan 2024-10-21 22:45:23 UTC
It has been five years since the initial bug report was filed, and I just reconfirmed with the latest linker that the reported -Ttext-segment behaviour when used with -z separate-code remains the same.

Effectively, -Ttext-segment historically and currently specifies the base address of the ELF file, not the address of the first executable segment. Changing the current behaviour of the option has a high chance of breaking existing projects that use this flag to specify the base address, so in my opinion it is wiser to preserve it.

The LLVM linker has dropped support for -Ttext-segment, and instead suggests specifying --image-base with the following:
~ > ld.lld code.o -Ttext-segment=0x80000
ld.lld: error: -Ttext-segment is not supported. Use --image-base if you intend to set the base address

All things considered, I suggest preserving the current behaviour of -Ttext-segment for maintaining compatibility with old projects and introducing a new option --image-base for compatibility with the LLVM linker.
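
For build scripts that have to drive either linker, the choice of flag can be sketched as follows. This is a hypothetical helper, not an existing API: the name base_address_flag and its parameters are made up for illustration, and it assumes that GNU ld accepts --image-base on ELF targets from 2.44 onward (the release targeted by this bug) while older GNU ld versions only understand -Ttext-segment.

```python
def base_address_flag(linker, address, gnu_ld_version=None):
    """Return the linker flag that sets the ELF base address.

    linker          -- "lld" or "bfd" (hypothetical labels for this sketch)
    address         -- base address as an integer
    gnu_ld_version  -- (major, minor) tuple for GNU ld, or None if unknown
    """
    addr = hex(address)
    if linker == "lld":
        # lld rejects -Ttext-segment and points users at --image-base.
        return f"--image-base={addr}"
    if linker == "bfd":
        # Assumption: --image-base is available for ELF from GNU ld 2.44.
        if gnu_ld_version is not None and gnu_ld_version >= (2, 44):
            return f"--image-base={addr}"
        # Older or unknown versions: fall back to the traditional spelling.
        return f"-Ttext-segment={addr}"
    raise ValueError(f"unknown linker: {linker!r}")


if __name__ == "__main__":
    print(base_address_flag("lld", 0x80000))
    print(base_address_flag("bfd", 0x80000, (2, 43)))
    print(base_address_flag("bfd", 0x80000, (2, 44)))
```

Falling back to -Ttext-segment when the GNU ld version is unknown is a conservative choice for this sketch, since that spelling is the one older releases accept.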

Here is the patch that introduces the new option --image-base as described. I also added a test case to ensure -Ttext-segment behaviour stays the same.

diff --git a/ld/NEWS b/ld/NEWS
index 1f14dd6bc77..e9ced61c1db 100644
--- a/ld/NEWS
+++ b/ld/NEWS
@@ -2,6 +2,9 @@
 
 Changes in 2.44:
 
+* Add --image-base=<ADDR> option to behave the same as -Ttext-segment
+  for compatibility with LLD.
+
 * Add a "--build-id=xx" option, if built with the xxhash library.  This
   produces a 128-bit hash, 2-4x faster than md5 or sha1.
 
diff --git a/ld/emultempl/elf.em b/ld/emultempl/elf.em
index 2e865728587..327f95d62e5 100644
--- a/ld/emultempl/elf.em
+++ b/ld/emultempl/elf.em
@@ -817,6 +817,7 @@ fragment <<EOF
     {"compress-debug-sections", required_argument, NULL, OPTION_COMPRESS_DEBUG},
     {"rosegment", no_argument, NULL, OPTION_ROSEGMENT},
     {"no-rosegment", no_argument, NULL, OPTION_NO_ROSEGMENT},
+    {"image-base", required_argument, NULL, OPTION_IMAGE_BASE},
 EOF
 if test x"$GENERATE_SHLIB_SCRIPT" = xyes; then
 fragment <<EOF
diff --git a/ld/ld.texi b/ld/ld.texi
index 90182c436ec..f51227b4b0f 100644
--- a/ld/ld.texi
+++ b/ld/ld.texi
@@ -2746,11 +2746,17 @@ Same as @option{--section-start}, with @code{.bss}, @code{.data} or
 @item -Ttext-segment=@var{org}
 @cindex text segment origin, cmd line
 When creating an ELF executable, it will set the address of the first
-byte of the text segment.  Note that when @option{-pie} is used with
+byte of the first segment.  Note that when @option{-pie} is used with
 @option{-Ttext-segment=@var{org}}, the output executable is marked
 ET_EXEC so that the address of the first byte of the text segment will
 be guaranteed to be @var{org} at run time.
 
+@kindex --image-base=@var{org}
+@item --image-base=@var{org}
+@cindex image base address, cmd line
+Same as @option{-Ttext-segment}, with both options effectively setting
+the base address of the ELF executable.
+
 @kindex -Trodata-segment=@var{org}
 @item -Trodata-segment=@var{org}
 @cindex rodata segment origin, cmd line
diff --git a/ld/lexsup.c b/ld/lexsup.c
index 8982073bc91..887bede2a79 100644
--- a/ld/lexsup.c
+++ b/ld/lexsup.c
@@ -1478,6 +1478,8 @@ parse_args (unsigned argc, char **argv)
 	case OPTION_TTEXT:
 	  set_segment_start (".text", optarg);
 	  break;
+	case OPTION_IMAGE_BASE:
+	  /* --image-base and -Ttext-segment behavior is the same */
 	case OPTION_TTEXT_SEGMENT:
 	  set_segment_start (".text-segment", optarg);
 	  break;
diff --git a/ld/testsuite/ld-elf/pr25207.d b/ld/testsuite/ld-elf/pr25207.d
new file mode 100644
index 00000000000..6b9965ab591
--- /dev/null
+++ b/ld/testsuite/ld-elf/pr25207.d
@@ -0,0 +1,11 @@
+#source: pr25207.s
+#ld: -z separate-code -Ttext-segment=0x120000
+#readelf: -l --wide
+#target: *-*-linux* *-*-gnu* arm*-*-uclinuxfdpiceabi
+# changing -Ttext-segment behaviour will break --image-base (pr25207)
+# -Ttext-segment=<ADDR> should set the first segment address,
+# not necessarily the first executable segment.
+
+#...
+  LOAD +0x0+ 0x0*120000 0x0*120000 0x0*[0-9a-f][0-9a-f][0-9a-f] 0x0*[0-9a-f][0-9a-f][0-9a-f] R   .*
+#pass
diff --git a/ld/testsuite/ld-elf/pr25207.s b/ld/testsuite/ld-elf/pr25207.s
new file mode 100644
index 00000000000..ffa11bbc550
--- /dev/null
+++ b/ld/testsuite/ld-elf/pr25207.s
@@ -0,0 +1,8 @@
+        .section .text, "ax"
+	.globl  _start
+_start:
+	.space 1
+
+	.section .rodata
+	.globl	foo
+foo:	.space 1
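
What the expected-output line in pr25207.d accepts can be illustrated outside the test harness. The sketch below copies the pattern from the patch and applies it with Python's re.fullmatch, which only approximates how the ld testsuite compares dump lines. The three readelf rows are hypothetical inputs written for this example, not output from a real link.

```python
import re

# Pattern line copied verbatim from pr25207.d in the patch above.
PATTERN = (
    r"  LOAD +0x0+ 0x0*120000 0x0*120000 "
    r"0x0*[0-9a-f][0-9a-f][0-9a-f] 0x0*[0-9a-f][0-9a-f][0-9a-f] R   .*"
)

# Hypothetical readelf rows (sizes and alignment are made up).
FIRST_R = (
    "  LOAD           0x000000 0x0000000000120000 0x0000000000120000"
    " 0x000190 0x000190 R   0x1000"
)
FIRST_RX = (
    "  LOAD           0x000000 0x0000000000120000 0x0000000000120000"
    " 0x000190 0x000190 R E 0x1000"
)
SHIFTED = (
    "  LOAD           0x000000 0x000000000011f000 0x000000000011f000"
    " 0x000190 0x000190 R   0x1000"
)


def matches(line):
    """True if the whole line matches the pr25207.d pattern."""
    return re.fullmatch(PATTERN, line) is not None


if __name__ == "__main__":
    for name, row in (("R at 0x120000", FIRST_R),
                      ("RX at 0x120000", FIRST_RX),
                      ("R at 0x11f000", SHIFTED)):
        print(name, matches(row))
```

Only the first row matches: the pattern requires a read-only segment at file offset 0 whose address is exactly 0x120000, so the test would fail if -Ttext-segment started to place the RX segment at that address or moved the first R segment below it.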
Comment 2 Hakan 2024-10-29 09:42:45 UTC
An updated patch has been pushed to the main branch.
Comment 3 Sourceware Commits 2024-10-30 21:08:03 UTC
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=afc1f137e18363138d5fa4c88a9a2816926a3d5c

commit afc1f137e18363138d5fa4c88a9a2816926a3d5c
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Thu Oct 31 05:04:15 2024 +0800

    ld-elf/pr25207.d: Pass --no-rosegment to ld
    
    Pass --no-rosegment to ld to support linker configured with
    --enable-rosegment,
    
            PR ld/25207
            * ld-elf/pr25207.d: Pass --no-rosegment to ld.
    
    Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Comment 4 Sam James 2024-12-18 20:10:49 UTC
commit f4e363cae297ec3e24dec0d95f3d422879f498a3
Author: Hakan Candar <hakancandar@protonmail.com>
Date:   Mon Oct 28 11:01:59 2024 +0000

    ld/ELF: Add --image-base command line option to the ELF linker
    
    LLD has dropped the option -Ttext-segment for specifying image base
    addresses, instead forcing the use of the --image-base option for both
    ELF and PE targets. As it stands, GNU LD and LLVM LLD are incompatible,
    having two different options for the same functionality.
    
    This patch enables the use of --image-base on ELF targets, advancing
    consistency and compatibility.
    
    See: https://reviews.llvm.org/D70468
         https://maskray.me/blog/2020-11-15-explain-gnu-linker-options#address-related
         https://sourceware.org/bugzilla/show_bug.cgi?id=25207
    
    Moreover, a new test has been added to ensure -z separate-code behaviour
    when used with -Ttext-segment stays the same. When this combination is
    used, -Ttext-segment sets the address of the first segment (R), not the
    text segment (RX), and like with -z noseparate-code, no segments lesser
    than the specified address are created. If this behaviour was to change,
    the first (R) segment of the ELF file would begin in a lesser address
    than the specified text (RX) segment, breaking traditional use of this
    option for specifying image base address.

All done for 2.44 I think. Please reopen if I'm mistaken.
Comment 5 H.J. Lu 2024-12-19 19:53:50 UTC
*** Bug 32461 has been marked as a duplicate of this bug. ***