Created attachment 12174 [details] Full TEST.ld with the 72 sections. Steps to reproduce: (1) Build binutils 2.33.1 (or a prior version, tested repro on 2.31 as well) with --target=riscv64-elf (2) Using gcc (also built with --target=riscv64-elf), build a simple dummy object file: $ cat dummy.c int main(void) { return 0; } $ gcc -c dummy.c -o dummy.o -nostdlib (3) Create the following .ld file: $ cat TEST.ld MEMORY { virt : ORIGIN = 0x100000, LENGTH = 0x20000 phys : ORIGIN = 0x10000, LENGTH = 0x20000 } __phys_ptr = ORIGIN(phys); SECTIONS { . = ORIGIN(virt); .fakeSection0 . : AT(__phys_ptr) { *(.text); . += 0x100; } __phys_ptr += SIZEOF(.fakeSection0); . += 0x100; .fakeSection1 . : AT(__phys_ptr) { . += 0x100; } __phys_ptr += SIZEOF(.fakeSection1); . += 0x100; // !!!!!!!!!!!!!!!!! (file cropped, please see attachment for full) !!!!!!!! .fakeSection70 . : AT(__phys_ptr) { . += 0x100; } __phys_ptr += SIZEOF(.fakeSection70); . += 0x100; .fakeSection71 . : AT(__phys_ptr) { . += 0x100; } } (4) Build a RISCV .elf: $ ./riscv64-elf/bin/ld dummy.o -o TEST.elf -T TEST.ld $ ./riscv64-elf/bin/readelf --all TEST.elf > TEST.readelf (5) Open TEST.readelf and look for the Program Headers. This is what it looks like for me: Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align LOAD 0x0000000000000000 0x00000000000ff000 0x000000000000f000 0x0000000000001110 0x0000000000001110 R E 0x1000 LOAD 0x0000000000001210 0x0000000000100210 0x0000000000010110 0x0000000000000000 0x0000000000000100 RW 0x1000 LOAD 0x0000000000001410 0x0000000000100410 0x0000000000010210 0x0000000000000000 0x0000000000000100 RW 0x1000 LOAD 0x0000000000001610 0x0000000000100610 0x0000000000010310 0x0000000000000000 0x0000000000000100 RW 0x1000 ... The first section doesn't make sense, PhysAddr should start at 0x10000 but it's 0xf000. The subsequent sections look fine. Note that there should be EXACTLY 72 LOAD-s in the Program Headers. If you add a section or remove a section, the problem goes away and the headers in TEST.readelf start looking like this: Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align LOAD 0x0000000000001000 0x0000000000100000 0x0000000000010000 0x0000000000000110 0x0000000000000110 R E 0x1000 LOAD 0x0000000000001210 0x0000000000100210 0x0000000000010110 0x0000000000000000 0x0000000000000100 RW 0x1000 LOAD 0x0000000000001410 0x0000000000100410 0x0000000000010210 0x0000000000000000 0x0000000000000100 RW 0x1000 LOAD 0x0000000000001610 0x0000000000100610 0x0000000000010310 0x0000000000000000 0x0000000000000100 RW 0x1000 ... Adding a section assumes adding something like this to the .ld: ... __phys_ptr += SIZEOF(.fakeSection71); . += 0x100; .fakeSection72 . : AT(__phys_ptr) { . += 0x100; } ... Note that Offset, FileSiz and MemSiz also become different, by the same absolute amount of 0x1000 (on a more complex repro the same happened with a different absolute amount instead of 0x1000). *(.text) being in the first section doesn't matter, you can put it anywhere else and the problem persists. But add or remove a section and the problem goes away. The issue only occurs when you have EXACTLY 72 Program Headers (and, in the case of this simple ld script, exactly 72 sections).
Reproducing this and debugging it, I find the problem is with code in elf.c in _bfd_elf_map_sections_to_segments that does this if (count != 0 && (((sections[0]->lma & addr_mask) & (maxpagesize - 1)) >= (phdr_size & (maxpagesize - 1)))) /* For compatibility with old scripts that may not be using SIZEOF_HEADERS, add headers when it looks like space has been left for them. */ phdr_in_segment = TRUE; For riscv64-*, if there are 72 segments, then phdr_size is 4096, which is exactly the page size, so the calculation (phdr_size & (maxpagesize - 1)) is 0, the test always succeeds regardless of how the first section is defined, and the linker decides that we must put the phdr in the first segment. That causes the first segment to be LOAD off 0x0000000000000000 vaddr 0x00000000000ff000 paddr 0x000000000000 f000 align 2**12 filesz 0x0000000000001110 memsz 0x0000000000001110 flags r-x when what the user wants is a load offset of 0x1000, a vaddr of 0x100000, a filesz of 0x110, and a memsize of 0x110. I see some obvious workarounds. 1) Don't have 72 segments. This is impractical, since in general you can't know the number of segments until after linking. 2) Turn off demand paging by adding the --nmagic option. This will disable placement of phdr into a segment. It also disables some other stuff like page alignment of sections/segments. It does appear to give the right result for this testcase. 3) Use the "-z separate-code" option. This forces the phdrs into its own segment which may not be what the user wants. This requires that the first segment contain code. And requires that the port has shared library support enaboled, which is true for riscv64-linux but not riscv64-elf (missing newlib support) so this doesn't actually work in this case. I had to use a riscv64-linux toolchain to see that it does work. 4) Increase the page size to avoid the conflict. Adding "-z max-page-size=0x2000" avoids the problem. This has the side effect of adding an extra 4KB to the executable file, but it doesn't change any of the segment addresses or sizes. Just the offset that they are found in the executable file. 5) Include ". += SIZEOF_HEADERS" in the linker script, right before the first section. The load address for the first segment is now correct, but its size is 4KB larger than desired because it contains the phdrs. This means your program will use 4KB more memory than desired. The problematic code was added by Alan Modra in 2018. commit 5d695627883b32cf33adb529c8fc7271b46dcf55 Author: Alan Modra <amodra@gmail.com> Date: Sat Oct 6 12:24:28 2018 +0930 Use p_vaddr_offset to set p_vaddr on segments without sections p_vaddr is currently set from the first section vma if a segment has sections, and to zero if a segment has no sections. This means we lose p_vaddr when objcopy'ing executables if a segment without sections has a non-zero p_vaddr. This patch saves p_vaddr to p_vaddr_offset, and to make the use of p_vaddr_offset consistent, inverts the sign. (It's now added to section vma to get segment vaddr, and added to zero when there are no sections.) * elf.c (assign_file_positions_for_load_sections): Set p_vaddr from m->p_vaddr_offset for segments without sections. Invert sign of p_vaddr_offset. (rewrite_elf_program_header, copy_elf_program_header): Save old segment p_vaddr to p_vaddr_offset. Invert sign of p_vaddr_offset.
Sorry, that is the wrong commit id, it is the one immediately after commit 64029e93683a266c38d19789e780f3748bd6a188 Author: Alan Modra <amodra@gmail.com> Date: Fri Oct 5 11:40:54 2018 +0930 Separate header PT_LOAD for -z separate-code This patch, along with previous patches in the series, supports putting the ELF file header and program headers in a PT_LOAD without sections. Logic governing whether headers a loaded has changed a little: The primary reason to include headers is now the presence of SIZEOF_HEADERS in a linker script. However, to support scripts that may have reserved space for headers by hand, we continue to add headers whenever the first section address is past the end of headers modulo page size. include/ * bfdlink.h (struct bfd_link_info): Add load_phdrs field. bfd/ * elf-nacl.c (nacl_modify_segment_map): Cope with header PT_LOAD lacking sections. * elf.c (_bfd_elf_map_sections_to_segments): Assume file and program headers are required when info->load_phdrs. Reorganize code handling program headers. Generate a mapping without sections just for file and program headers when -z separate-code would indicate they should be on a different page to the first section.