Created attachment 13838 [details] A testcase [hjl@gnu-cfl-2 aligned-1]$ make gcc -pie -fPIE -Wl,-z,max-page-size=0x200000 -O2 -o pie main.c load.c gcc -no-pie -Wl,-z,max-page-size=0x200000 -O2 -o pde main.c load.c gcc -O2 -fPIC -c -o load.o load.c gcc -shared -Wl,-z,max-page-size=0x200000 -o libload.so load.o gcc -no-pie -Wl,-z,max-page-size=0x200000 -O2 -o dso main.c libload.so -Wl,-R,. ./pde foo: 0xe00000 ./pie foo: 0x55ea36a00000 ./dso foo: 0x7fafe6308000 make: *** [Makefile:8: all] Aborted (core dumped) [hjl@gnu-cfl-2 aligned-1]$ Kernel loader honors 2MB p_align in PIE. But ld.so doesn't honor 2MB p_align in DSO.
I also opened a kernel bug: https://bugzilla.kernel.org/show_bug.cgi?id=215275
Fixed by commit 718fdd87b1b98ef88e883a37d9c18867256fa5a4 Author: Rongwei Wang <rongwei.wang@linux.alibaba.com> Date: Fri Dec 10 20:39:10 2021 +0800 elf: Properly align PT_LOAD segments [BZ #28676] When PT_LOAD segment alignment > the page size, allocate enough space to ensure that the segment can be properly aligned. This change helps code segments use huge pages become simple and available. This fixes [BZ #28676]. Signed-off-by: Xu Yu <xuyu@linux.alibaba.com> Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
This is not a bug per se. My reasoning is at https://sourceware.org/pipermail/libc-alpha/2021-December/134121.html `__attribute__((aligned(0x200000))) = 1;` does not necessarily mean the large alignment needs to be satisfied on a system with a smaller page size. Now "elf: Properly align PT_LOAD segments [BZ #28676]" overaligns memory mappings to p_align to make the use case work. It's a new feature, not a bugfix. The 2020 kernel commit https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce81bb256a224259ab686742a6284930cbe4f1fa has no waste, so it is more justifiable. The emulation in ld.so, however, would incur significant overhead, so it needs more thoughts. At the very least, the max-page-size=0x200000 for -z noseparate-code on x86-64 may be problematic. It looks like we need more thoughts how tools like objcopy and ld can make the cost small, or whether such changes are justifiable.
Linker may set p_align of a PT_LOAD segment larger than p_align of the first PT_LOAD segment to satisfy a section alignment: Elf file type is DYN (Shared object file) Entry point 0x0 There are 10 program headers, starting at offset 64 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000834 0x0000000000000834 R E 0x1000 LOAD 0x0000000000000e00 0x0000000000001e00 0x0000000000001e00 0x0000000000000230 0x0000000000000230 RW 0x1000 LOAD 0x0000000000400000 0x0000000000400000 0x0000000000400000 0x0000000000000004 0x0000000000000008 RW 0x400000 ... Section to Segment mapping: Segment Sections... 00 .note.gnu.property .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.v ersion .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodat a .eh_frame_hdr .eh_frame 01 .init_array .fini_array .data.rel.ro .dynamic .got .got.plt 02 .data .bss We should align the first PT_LOAD segment to the maximum p_align of all PT_LOAD segments, similar to the kernel commit: commit ce81bb256a224259ab686742a6284930cbe4f1fa Author: Chris Kennelly <ckennelly@google.com> Date: Thu Oct 15 20:12:32 2020 -0700 fs/binfmt_elf: use PT_LOAD p_align values for suitable start address A patch with a testcase is posted at https://sourceware.org/pipermail/libc-alpha/2021-December/134219.html
The master branch has been updated by H.J. Lu <hjl@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=e22a4557eb39d7cba9a74d70f4582c13f1a7a83a commit e22a4557eb39d7cba9a74d70f4582c13f1a7a83a Author: H.J. Lu <hjl.tools@gmail.com> Date: Wed Dec 15 13:58:33 2021 -0800 elf: Properly align all PT_LOAD segments [BZ #28676] Linker may set p_align of a PT_LOAD segment larger than p_align of the first PT_LOAD segment to satisfy a section alignment: Elf file type is DYN (Shared object file) Entry point 0x0 There are 10 program headers, starting at offset 64 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000834 0x0000000000000834 R E 0x1000 LOAD 0x0000000000000e00 0x0000000000001e00 0x0000000000001e00 0x0000000000000230 0x0000000000000230 RW 0x1000 LOAD 0x0000000000400000 0x0000000000400000 0x0000000000400000 0x0000000000000004 0x0000000000000008 RW 0x400000 ... Section to Segment mapping: Segment Sections... 00 .note.gnu.property .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame 01 .init_array .fini_array .data.rel.ro .dynamic .got .got.plt 02 .data .bss We should align the first PT_LOAD segment to the maximum p_align of all PT_LOAD segments, similar to the kernel commit: commit ce81bb256a224259ab686742a6284930cbe4f1fa Author: Chris Kennelly <ckennelly@google.com> Date: Thu Oct 15 20:12:32 2020 -0700 fs/binfmt_elf: use PT_LOAD p_align values for suitable start address This fixes BZ #28676. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Fixed.