Creating a follow-up to what was discussed in https://sourceware.org/bugzilla/show_bug.cgi?id=28824 after its resolution -- it's probably easier to act on an open bz.

The end of the PT_GNU_RELRO segment was changed to be aligned to max-page-size instead of common-page-size, and this bloats binaries on e.g. arm64, where max-page-size is 64k: small programs that used to take a few KB now use up a full 64KB of disk space.

From the follow-up discussion there seems to be agreement that the on-disk size could be reduced by adding an additional LOAD program header, so that the "relro-boundary" page is mapped twice (once as the end of the relro segment, once as the start of the non-relro segment), as Rui Ueyama suggested; as far as I'm aware nobody is working on it. I honestly won't have time to contribute much, but we're still interested (coming from Alpine's issue tracking the size increase, https://gitlab.alpinelinux.org/alpine/aports/-/issues/14126 ), so I'm just making sure this is properly tracked. Sorry/thanks!
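To make the "mapped twice" part a bit more concrete (this is just my understanding of the proposal, not anything a linker or loader does today): the boundary page only has to exist once in the file, because the loader can map that single file page at two different virtual pages with different protections. A rough user-space sketch of the mechanism, with made-up names, using any non-empty file as a stand-in for the on-disk segment:

/* Sketch only: the same on-disk page backs two adjacent virtual pages,
 * one read-only (standing in for the tail of PT_GNU_RELRO after the
 * loader's mprotect) and one writable (standing in for the head of the
 * following RW segment), so no padding has to be written to the file. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <some-non-empty-file>\n", argv[0]);
        return 1;
    }
    long page = sysconf(_SC_PAGESIZE);
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* Reserve two adjacent pages of address space. */
    char *base = mmap(NULL, 2 * page, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) { perror("mmap reserve"); return 1; }

    /* Map file page 0 read-only at the first virtual page. */
    if (mmap(base, page, PROT_READ,
             MAP_PRIVATE | MAP_FIXED, fd, 0) == MAP_FAILED) {
        perror("mmap ro"); return 1;
    }

    /* Map the very same file page again, copy-on-write and writable,
     * at the second virtual page. */
    if (mmap(base + page, page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_FIXED, fd, 0) == MAP_FAILED) {
        perror("mmap rw"); return 1;
    }

    base[page] = 'X';   /* the writable copy can be modified... */
    printf("ro[0]=0x%02x rw[0]=0x%02x\n",     /* ...the RO copy is untouched */
           (unsigned char)base[0], (unsigned char)base[page]);
    return 0;
}

In the real layout, as I understand the proposal, the two mappings would come from two RW PT_LOADs whose file ranges overlap at the boundary page, and the loader's relro handling would then make the first one read-only; the point is just that the alignment no longer has to be paid for with zero padding in the file.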
Unfortunately, I believe this is nearly infeasible to fix as long as the alignment still follows max-page-size, because switching GNU ld to a two-RW-PT_LOAD layout is very difficult. I have a short summary of the different design choices: https://maskray.me/blog/2020-11-15-explain-gnu-linker-options#z-relro

> GNU ld uses one RW PT_LOAD program header with padding at the start. The first half of the PT_LOAD overlaps with PT_GNU_RELRO. The padding is added so that the end of PT_GNU_RELRO is aligned to max-page-size. (See ld.bfd --verbose output.) Prior to GNU ld 2.39, the end was aligned to common-page-size. GNU ld's one-RW-PT_LOAD layout makes the alignment increase the file size. max-page-size can be large, such as 65536 for many systems, causing wasted space.
>
> lld utilizes two RW PT_LOAD program headers: one for RELRO sections and the other for non-RELRO sections. Although this might appear unusual at first, it eliminates the need for the alignment padding seen in GNU ld's layout. I implemented the current layout in 2019 (https://reviews.llvm.org/D58892).
>
> The layout used by mold is similar to that of lld. In mold's case, the end of PT_GNU_RELRO is padded to max-page-size by appending a SHT_NOBITS .relro_padding section. This approach ensures that the last page of PT_GNU_RELRO is protected regardless of the system page size. However, when the system page size is less than max-page-size, the mapping from the first RW PT_LOAD is larger than needed.
>
> In my opinion, losing protection for the last page when the system page size is larger than common-page-size is not at all an issue. Protecting .got.plt is the main purpose of -z now. Protecting a small portion of .data.rel.ro doesn't really make the program more secure, given that .data and .bss are so huge and full of attack targets. If users are really anxious, they can set common-page-size to match their system page size.
>
> I am unsure whether lld's hidden assumption that common-page-size <= system page size is an issue.
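For anyone who wants to compare the layouts on their own binaries: readelf -l already shows everything needed, but here is a minimal self-contained sketch of mine (64-bit little-endian ELF only) that prints just the PT_LOAD and PT_GNU_RELRO entries, which should be enough to see whether a binary carries GNU ld's single RW PT_LOAD (with the alignment padding included in its file size) or lld's two RW PT_LOADs:

/* Tiny PHDR dumper; `readelf -l` gives the same information. */
#include <elf.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    if (argc != 2) { fprintf(stderr, "usage: %s <elf-file>\n", argv[0]); return 1; }
    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }

    Elf64_Ehdr eh;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64) {
        fprintf(stderr, "not a 64-bit ELF\n");
        return 1;
    }

    printf("type         offset     filesz     memsz      align    flags\n");
    for (int i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        if (fseek(f, (long)(eh.e_phoff + (unsigned long)i * eh.e_phentsize),
                  SEEK_SET) != 0 ||
            fread(&ph, sizeof ph, 1, f) != 1)
            break;
        if (ph.p_type != PT_LOAD && ph.p_type != PT_GNU_RELRO)
            continue;
        printf("%-12s 0x%08lx 0x%08lx 0x%08lx 0x%06lx %c%c%c\n",
               ph.p_type == PT_LOAD ? "LOAD" : "GNU_RELRO",
               (unsigned long)ph.p_offset, (unsigned long)ph.p_filesz,
               (unsigned long)ph.p_memsz, (unsigned long)ph.p_align,
               ph.p_flags & PF_R ? 'R' : '-',
               ph.p_flags & PF_W ? 'W' : '-',
               ph.p_flags & PF_X ? 'X' : '-');
    }
    fclose(f);
    return 0;
}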
Is there any hope of fixing this issue, which manifests as a significant size increase of binaries on arm64? The problem may look negligible for a single binary, but for a full rootfs I see increases of 10-15%. More background on this issue is in bug #28824.
Is this truly a practical problem? Virtually every space-constrained system uses compressed firmware images, or has ready access to that technology, so I wouldn't expect the accumulated binary size to actually consume significantly more physical storage.

I ask because the alleged "fix" [1] for this space issue introduced a regression [2] which effectively banned GNU/Linux from systems with larger page sizes -- which, contrary to the assumption in the commit, exist commercially and in large numbers.

[1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=1a26a53a0dee39106ba58fcb15496c5f13074652
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=32048
> Is this truly a practical problem? Virtually every space-constrained system uses compressed firmware images, or has ready access to that technology.

We're using plain ext4 for the root file system at $work, as it's not read-only. Since this change I've been pushing for btrfs with compression enabled, but that's a workaround at best. Afaik container image layers are also stored uncompressed, for example, so there's plenty of "at scale" effect too...

> I ask because the alleged "fix" [1] for this space issue introduced a regression [2] which effectively banned GNU/Linux from systems with larger page sizes

That's not a fix; it wasn't even considered for arm64, because larger pages are known to be used there. The fix was described in bug #28824 (make multiple mappings), but nobody's gone out of their way to implement it yet (for my part, mostly because the workaround is cheaper to implement and $work priorities aren't aligned with "doing the right thing").
(To be clear, I'm not dismissing the issue you pointed at for arm32, and I agree that commit is probably best reverted in the short term. I'm just saying that the size increases still cause all sorts of headaches for some people, and a "better" fix would still be appreciated, so this bz ought to be kept open in case someone finds time to figure it out. Perhaps if arm32 binaries grow as well, someone will be afforded time to look at it again...)
Your explanation makes perfect sense: this issue is about AArch64 binaries, and my argument about compressed filesystems is limited to native AArch32 platforms.