Bug 33236 - ld riscv: Relocatable linking challenge with R_RISCV_ALIGN
Status: NEW
Alias: None
Product: binutils
Classification: Unclassified
Component: ld
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-07-31 07:06 UTC by Fangrui Song
Modified: 2026-06-05 06:30 UTC

See Also:
Host:
Target: riscv*
Build:
Last reconfirmed:


Description Fangrui Song 2025-07-31 07:06:11 UTC
I updated https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation to mention this issue:

A specific issue arises in relocatable linking when a section that does not use linker relaxation is preceded by a section that does.

Without linker relaxation enabled for a particular relocatable file or section (e.g., using .option norelax), the assembler will not generate R_RISCV_ALIGN relocations for alignment directives. This becomes problematic in a two-stage linking process:

cat > a.s <<e
.globl _start
_start:
  call foo

.section .text1,"ax"
.globl foo
foo:
e
cat > b.s <<e
.option push
.option norelax
# Assembler will not generate R_RISCV_ALIGN here
.balign 8
b0:
  .word 0x3a393837
.option pop
e
clang --target=riscv64 -mrelax -c a.s b.s

# Single-stage linking
ld.lld a.o b.o -o ab

# Two-stage linking
ld.lld -r a.o b.o -o ab.o
ld.lld ab.o -o ab.r

When ab.o is linked into an executable, the preceding relaxed section (a.o's content) might shrink. Since there's no R_RISCV_ALIGN relocation in b.o for the linker to act upon, the .word 0x3a393837 data in b.o may end up unaligned in the final executable. This produces an output that differs from a direct, single-stage link of ld.lld a.o b.o -o ab, which would correctly align the data.
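The shift can be worked through with plain arithmetic. The sketch below is only an illustration under stated assumptions: `call foo` occupies 8 bytes (auipc+jalr) in ab.o and relaxes to a 4-byte jal in the final link, nothing else in the output section changes size, and offsets are measured from an 8-byte-aligned section start. The helper names are made up for this example and do not come from any linker.

```cpp
#include <cstdint>

// Round `value` up to a multiple of `align` (a power of two).
constexpr uint64_t alignTo(uint64_t value, uint64_t align) {
  return (value + align - 1) & ~(align - 1);
}

// Offset of b0 inside ab.o's .text after `ld -r`: a.o's code comes first,
// followed by padding up to b.o's section alignment.
constexpr uint64_t offsetInRelocatable(uint64_t precedingSize, uint64_t align) {
  return alignTo(precedingSize, align);
}

// Offset of b0 after the final link. Without an R_RISCV_ALIGN relocation the
// linker only deletes bytes from the relaxed call; nothing re-pads b0.
constexpr uint64_t offsetAfterRelaxation(uint64_t offset, uint64_t deleted) {
  return offset - deleted;
}

// Assumed sizes (see the lead-in): 8-byte call, 4-byte relaxed jal.
constexpr uint64_t kCallSize = 8, kRelaxedCallSize = 4, kAlign = 8;
constexpr uint64_t kBefore = offsetInRelocatable(kCallSize, kAlign);
constexpr uint64_t kAfter =
    offsetAfterRelaxation(kBefore, kCallSize - kRelaxedCallSize);

static_assert(kBefore % kAlign == 0, "b0 is 8-byte aligned inside ab.o");
static_assert(kAfter % kAlign != 0, "b0 is misaligned after relaxation");
```

Under these assumptions b0 moves from offset 8 to offset 4, which matches the misalignment described above.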

This issue likely doesn't lead to significant problems in practice, primarily due to these factors:

* Infrequent use of relocatable linking (ld -r).
* Rarity of data within text sections. It's even less frequent for such data to reside at the beginning of a section, or so early in a section that there isn't any preceding linker-relaxable instruction.



To address the issue, I am modifying LLD to synthesize an `R_RISCV_ALIGN` relocation at the beginning of a text section with an explicit alignment requirement. This would provide the linker with the necessary handle to adjust that section's start address and potentially insert or remove padding to maintain the desired alignment.

See also this LLVM assembler change https://github.com/llvm/llvm-project/pull/150816
Comment 1 Nelson Chu 2025-07-31 13:41:39 UTC
It looks like lld forcibly adds an R_RISCV_ALIGN relocation and NOPs at the beginning of every input section that disables relaxation but is linked with other relaxable sections under -r? I'm not really sure how to do that on the GNU ld side. From roughly tracing the code, elf_backend_update_relocs may be the place to try if people want to do similar hacks.
Comment 2 Nelson Chu 2025-08-01 03:02:02 UTC
Oh, if NOPs need to be added for the ALIGN relocation, then the section content and size will change, so the section addresses probably need to be reassigned. In that case it should probably be done in or before the relaxation stage rather than in elf_backend_update_relocs.
Comment 3 Fangrui Song 2025-08-01 05:10:13 UTC
In LLD, a single address assignment pass is sufficient for -r links. Although multiple iterations could theoretically be supported if the linker script uses ADDR, I haven't seen such scripts.
(Additionally, all output section addresses are reset to zero for -r links.)

In my patch, I do this:

When called with an input section (`sec` is not null): If the section
alignment is >= 4, advance `dot` to insert NOPs and synthesize an ALIGN
relocation.

When called after all input sections are processed (`sec` is null): The
output relocation section is updated with all the newly synthesized ALIGN
relocations. When processing the .text output section, the patch inserts relocations in the first input section within .rela.text that contains RELAX.

I use $alignment-2 for simplicity, regardless of RVC. I think linkers handle $alignment-2, even without the C extension.
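To make the $alignment-2 convention concrete, here is a minimal sketch of how a linker could decide how many padding bytes to delete at an R_RISCV_ALIGN relocation. It assumes the psABI convention that the addend is the number of NOP bytes and that the target alignment is the next power of two at or above addend + 2. The function names are hypothetical and this is not the actual LLD or GNU ld code. It needs C++14 or later for the loop in a constexpr function.

```cpp
#include <cstdint>

// Smallest power of two that is >= x (x must be > 0).
constexpr uint64_t powerOf2Ceil(uint64_t x) {
  uint64_t p = 1;
  while (p < x)
    p <<= 1;
  return p;
}

// Number of padding bytes to delete at an R_RISCV_ALIGN relocation placed at
// address `loc` (assumed 2-byte aligned) with the given addend.
constexpr uint64_t bytesToRemove(uint64_t loc, uint64_t addend) {
  const uint64_t align = powerOf2Ceil(addend + 2);            // e.g. 6 -> 8
  const uint64_t nextLoc = loc + addend;                      // end of the NOP run
  const uint64_t aligned = (loc + align - 1) & ~(align - 1);  // first aligned address
  return nextLoc - aligned;  // everything past the boundary is surplus
}

// With addend 6 (8-byte alignment) the padding that survives depends on `loc`.
static_assert(bytesToRemove(0x1008, 6) == 6, "already aligned: drop all NOPs");
static_assert(bytesToRemove(0x1004, 6) == 2, "keep 4 bytes of padding");
static_assert(bytesToRemove(0x1002, 6) == 0, "keep all 6 bytes of padding");
```

Because alignment-2 bytes of padding cover the worst case for a 2-byte-aligned location, the same addend also suffices when every location is 4-byte aligned.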
Comment 4 mengqinggang 2026-04-30 03:26:44 UTC
Change b.s to:
# Assembler will not generate R_RISCV_ALIGN here
.balign 4
.option push
.option norelax
nop
.balign 8
b0:
  .word 0x3a393837
.option pop

The .word 0x3a393837 data in b.o may still end up unaligned in the final executable.

It may be caused by the following code:
    bool hasAlignRel = llvm::any_of(rels, [](const RelTy &rel) {
      return rel.r_offset == 0 && rel.getType(false) == R_RISCV_ALIGN;
    });
    if (!hasAlignRel) {
      synthesizedAligns.emplace_back(dot - baseSec->getVA(),
                                     sec->addralign - 2);
      dot += sec->addralign - 2;
      return true;
    }
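One possible reading of why this check matters is sketched below with hypothetical types; it is not LLD code. It assumes that, with the C extension enabled, the leading `.balign 4` produces an R_RISCV_ALIGN at offset 0 with addend 2, while the section itself has alignment 8. The check then sees an existing ALIGN relocation and skips synthesis, even though that relocation only implies 4-byte alignment.

```cpp
#include <cstdint>

constexpr uint32_t R_RISCV_ALIGN = 43;  // relocation number from the RISC-V psABI

// Hypothetical, simplified relocation record for this illustration.
struct Rel {
  uint64_t offset;
  uint32_t type;
  uint64_t addend;
};

// Mirrors the quoted predicate: any R_RISCV_ALIGN at offset 0 counts.
constexpr bool hasAlignAtStart(const Rel *rels, int n) {
  for (int i = 0; i < n; ++i)
    if (rels[i].offset == 0 && rels[i].type == R_RISCV_ALIGN)
      return true;
  return false;
}

// Alignment implied by an ALIGN addend: next power of two >= addend + 2.
constexpr uint64_t impliedAlignment(uint64_t addend) {
  uint64_t p = 1;
  while (p < addend + 2)
    p <<= 1;
  return p;
}

// Assumed relocations of the modified b.o and its assumed section alignment.
constexpr Rel kRels[] = {{0, R_RISCV_ALIGN, 2}};
constexpr uint64_t kSectionAlign = 8;

static_assert(hasAlignAtStart(kRels, 1), "synthesis is skipped");
static_assert(impliedAlignment(kRels[0].addend) < kSectionAlign,
              "existing ALIGN covers only 4 bytes, not 8");
```

Whether comparing the implied alignment against the section alignment would be enough to handle this case is not established here; the sketch only shows the mismatch.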
Comment 5 mengqinggang 2026-06-05 06:30:02 UTC
A patch for LoongArch:
https://sourceware.org/pipermail/binutils/2026-May/149462.html