Created attachment 12971 [details]
bss_lma_adjust.s

Consider a linker script that allows .text, .data and .bss to reside in the same segment.

When an empty .bss section resides between non-empty .text and .data sections in the linked output file, running objcopy on this output file adjusts the sh_offset of the .bss section and gives the segment containing it a positive size.

The segment layout of the linked output file is:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000001000 0x0000000000000000 0x0000000000000000
                 0x000000000000003a 0x000000000000003a  RWE    0x1000
  LOAD           0x0000000000000038 0x0000000000000038 0x0000000000000038
                 0x0000000000000000 0x0000000000000000  RW     0x1000

 Section to Segment mapping:
  Segment Sections...
   00     .text .note.gnu.property .bss .data
   01     .bss

The two instances of the empty .bss are potentially a red flag in themselves.

After running objcopy, the MemSiz of the second segment increases from 0 to 2 bytes, along with the aforementioned change in sh_offset.

I've attached a linker script and assembly file that reproduce the bug.

$(AS) bss_lma_adjust.s
$(LD) a.out -o before.out -T bss_lma_adjust.ld
$(OBJCOPY) before.out after.out
> objcopy: after.out: section .bss lma 0x38 adjusted to 0x3a

Note that the assignment to '.' in .bss is required to reproduce the issue (any assignment to '.' will do).
Created attachment 12972 [details] bss_lma_adjust.ld
It's not just the objcopy going wrong here. ld shouldn't be creating two PT_LOAD headers where one would suffice.
(In reply to Alan Modra from comment #2)
> It's not just the objcopy going wrong here. ld shouldn't be creating two
> PT_LOAD headers where one would suffice.

Is it intended that empty .bss sections are considered part of a segment's contents in the first place? Other types of empty section are simply ignored and don't affect segment layout.
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=8d748d1dc56406228c2c76de2563859213364cbf

commit 8d748d1dc56406228c2c76de2563859213364cbf
Author: Alan Modra <amodra@gmail.com>
Date:   Sat Nov 28 08:45:02 2020 +1030

    PR26907, segment contains empty SHT_NOBITS section

    Section ordering is important for _bfd_elf_map_sections_to_segments
    and assign_file_positions_for_load_sections, which are only prepared
    to handle sections in increasing LMA order.  When zero size sections
    are involved it is possible to have multiple sections at the same LMA.
    In that case the zero size sections must sort before any non-zero size
    sections regardless of their types.

    bfd/
            PR 26907
            * elf.c (elf_sort_sections): Don't sort zero size !load sections
            after load sections.
    ld/
            * testsuite/ld-elf/pr26907.ld,
            * testsuite/ld-elf/pr26907.s,
            * testsuite/ld-elf/pr26907.d: New test.
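The tie-break rule described in the commit message can be illustrated with a small Python sketch. This is not the code of elf_sort_sections; the `Section` type, the `sort_key` helper and the section values are hypothetical and only model "increasing LMA, with zero size sections first at equal LMA, regardless of type".

```python
from typing import NamedTuple


class Section(NamedTuple):
    name: str
    lma: int
    size: int
    load: bool  # hypothetical flag: False for NOBITS-style sections without file contents


def sort_key(sec: Section):
    # Primary key: increasing LMA.
    # Tie-break: zero size sections (size != 0 is False) sort before
    # non-zero size ones, whatever their load type is.
    return (sec.lma, sec.size != 0)


def sort_sections(sections):
    return sorted(sections, key=sort_key)


# Illustrative values only: an empty .bss sharing its LMA with a non-empty .data.
sections = [
    Section(".text", 0x00, 0x38, True),
    Section(".data", 0x38, 0x02, True),
    Section(".bss", 0x38, 0x00, False),
]
ordered = [s.name for s in sort_sections(sections)]
print(ordered)
```

With these inputs the empty .bss is placed between .text and .data, so a consumer that walks the sections in order never sees the LMA go backwards.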
Fixed
Created attachment 13010 [details]
pr26907-2.ld

Thanks.

However, I'm getting a similar issue with the attached linker script - an empty SHT_NOBITS section is given its own segment.

It still looks to me like a problem that an empty output section with a wild rule containing ".bss" is marked with SHF_ALLOC, even when it has no contents. We can change that wild rule to anything else, including other "special sections" which get SHF_ALLOC set automatically when they have non-zero size, and the segment mapping comes out fine.

I was going to file a separate PR, but the crux of the problem is the same so I'll leave it here for now.

$ as-new pr26907-2.s -o pr26907-2.o
$ ld-new pr26907-2.o -T pr26907-2.ld
$ readelf -lS pr26907-2.out
>   [ 1] .text             PROGBITS         0000000000002000  00001000
>        0000000000000000  0000000000000000  AX       0     0     1
> ...
>   [ 3] .data             PROGBITS         0000000000001000  00002000
>        0000000000000002  0000000000000000  WA       0     0     1
>   [ 4] .empty_rom        PROGBITS         0000000000002032  00002004
>        0000000000000000  0000000000000000   W       0     0     1
>   [ 5] .empty_bss_in_ram NOBITS           0000000000001002  00001002
>        0000000000000000  0000000000000000  WA       0     0     1
>   [ 6] .data2            PROGBITS         0000000000001002  00002002
>        0000000000000002  0000000000000000  WA       0     0     1
> Program Headers:
>   Type           Offset             VirtAddr           PhysAddr
>                  FileSiz            MemSiz              Flags  Align
>   LOAD           0x0000000000000002 0x0000000000001002 0x0000000000001002
>                  0x0000000000000000 0x0000000000000000  RW     0x1000
>   LOAD           0x0000000000001000 0x0000000000002000 0x0000000000002000
>                  0x0000000000000000 0x0000000000000000  R E    0x1000
>   LOAD           0x0000000000001000 0x0000000000002000 0x0000000000002000
>                  0x0000000000000030 0x0000000000000030  R      0x1000
>   LOAD           0x0000000000002000 0x0000000000001000 0x0000000000002030
>                  0x0000000000000004 0x0000000000000004  RW     0x1000
> ...
>  Section to Segment mapping:
>   Segment Sections...
>    00     .empty_bss_in_ram
>    01     .text
>    02     .text .note.gnu.property
>    03     .data .empty_bss_in_ram .data2
>    04     .note.gnu.property
>    05     .text .note.gnu.property

$ objcopy pr26907-2.out
> objcopy: sthKY3wa: section `.data' can't be allocated in segment 1
> LOAD: .empty_bss_in_ram .data .data2
Created attachment 13011 [details] pr26907-2.out
Created attachment 13012 [details] pr26907-2.s
Yes, if you play games with lma and vma you can easily create layouts that _bfd_elf_map_sections_to_segments won't handle very well. A section that has a different lma to vma relationship to the previous section can't be put in the same load segment. That can easily lead to empty load segments, and indeed must if a section is empty but your script said it should be kept and its lma to vma relation doesn't match any non-empty section. (LMA is not specified in ELF section headers, only in load headers.) I'm not interested in trying to make every weird user script produce the minimum number of load headers, or even to work with objcopy.
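As a rough illustration of the constraint above, here is a Python sketch that starts a new load segment whenever a section's LMA-to-VMA offset differs from that of the previous section. It is a deliberately simplified model and not how _bfd_elf_map_sections_to_segments is implemented; the `Section` type, the `group_into_segments` helper and all addresses are hypothetical.

```python
from typing import NamedTuple


class Section(NamedTuple):
    name: str
    vma: int
    lma: int
    size: int


def group_into_segments(sections):
    """Split sections (already in layout order) into load segments.

    A section joins the current segment only if its lma - vma offset
    equals that of the previous section; otherwise a new segment starts.
    """
    segments = []
    prev_delta = None
    for sec in sections:
        delta = sec.lma - sec.vma
        if prev_delta is None or delta != prev_delta:
            segments.append([])
        segments[-1].append(sec.name)
        prev_delta = delta
    return segments


# Hypothetical layout: an empty section with LMA == VMA sits between
# two sections that are loaded at a different address than they run at.
layout = [
    Section(".data", vma=0x1000, lma=0x2030, size=2),
    Section(".empty_bss", vma=0x1002, lma=0x1002, size=0),
    Section(".data2", vma=0x1002, lma=0x2032, size=2),
]
segments = group_into_segments(layout)
print(segments)
```

In this model the empty section cannot share a segment with either neighbour, so it ends up alone in a segment of zero size, which is the kind of empty load segment described above.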
(In reply to Alan Modra from comment #9)
> Yes, if you play games with lma and vma you can easily create layouts that
> _bfd_elf_map_sections_to_segments won't handle very well. A section that
> has a different lma to vma relationship to the previous section can't be put
> in the same load segment. That can easily lead to empty load segments, and
> indeed must if a section is empty but your script said it should be kept and
> its lma to vma relation doesn't match any non-empty section. (LMA is not
> specified in ELF section headers, only in load headers.)
>
> I'm not interested in trying to make every weird user script produce the
> minimum number of load headers, or even to work with objcopy.

Fair enough, I agree the linker script is weird. A sensible layout that at least groups output sections by LMA region averts the objcopy issues and reduces the number of segments required.
(In reply to Jozef Lawrynowicz from comment #10)
> (In reply to Alan Modra from comment #9)
> > I'm not interested in trying to make every weird user script produce the
> > minimum number of load headers, or even to work with objcopy.
>
> Fair enough, I agree the linker script is weird. A sensible layout that at
> least groups output sections by LMA region averts the objcopy issues and
> reduces the number of segments required.

Correction: there can still be issues if you group by LMA region but intersperse LMA != VMA sections with LMA == VMA sections.

It seems the "most sensible" layout would follow a few principles:
- Primarily group sections with LMA != VMA, which have matching LMA and VMA regions
- Secondarily group sections by region. VMA region probably makes the most sense, when considering the execution view of the program, but I don't think it matters too much in relation to the segment layout.
- Order these groups of output sections by increasing origin address of the region they are grouped by.

Of course it appears these issues only manifest with empty .bss sections, so it is a real edge case.
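One possible reading of these principles, sketched in Python: sort output sections by the origin of their VMA region, then by the origin of their LMA region, so that sections sharing the same (VMA region, LMA region) pair end up adjacent. The region names, origins and the `OutSec` type are hypothetical, and this is only one interpretation of the ordering, not something ld does for you.

```python
from typing import NamedTuple


class OutSec(NamedTuple):
    name: str
    vma_region: str
    lma_region: str


# Hypothetical memory regions and their origin addresses.
ORIGIN = {"RAM": 0x1000, "ROM": 0x2000}


def layout_key(sec: OutSec):
    # Group by VMA region first, then by LMA region, each ordered by
    # increasing region origin. Sections with the same region pair
    # compare equal, so the stable sort keeps them together in their
    # original relative order.
    return (ORIGIN[sec.vma_region], ORIGIN[sec.lma_region])


def order_sections(sections):
    return sorted(sections, key=layout_key)


script_order = [
    OutSec(".text", "ROM", "ROM"),
    OutSec(".data", "RAM", "ROM"),   # LMA != VMA
    OutSec(".bss", "RAM", "RAM"),
    OutSec(".data2", "RAM", "ROM"),  # LMA != VMA
    OutSec(".rodata", "ROM", "ROM"),
]
suggested = [s.name for s in order_sections(script_order)]
print(suggested)
```

Here .data and .data2 become adjacent instead of being separated by .bss, which avoids interspersing LMA != VMA sections with LMA == VMA sections.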