Bug 28824 - relro security issues
Summary: relro security issues
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld (show other bugs)
Version: 2.38
: P2 normal
Target Milestone: 2.39
Assignee: Alan Modra
URL:
Keywords:
: 29639 (view as bug list)
Depends on:
Blocks:
 
Reported: 2022-01-27 00:21 UTC by Alan Modra
Modified: 2023-10-07 22:46 UTC (History)
8 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Alan Modra 2022-01-27 00:21:10 UTC
powerpc64le-linux often runs with 64k memory pages, which is the default maxpagesize for this target. The end of the PT_GNU_RELRO segment is only aligned to commonpagesize (4k for powerpc64le-linux).  These two facts can result in up to 60k at the end of the relro segment not being write-protected.  Other targets behave similarly.

How did we get to this situation?

I believe the fundamental underlying issue is that the x86 linker treats maxpagesize differently to its original intent, which is to lay out a binary for hardware paging of up to maxpagesize.  On x86, hardware paging is always the commonpagesize of 4k.  So on x86 a maxpagesize larger than 4k is merely for memory optimisation, and aligning the end of the relro segment to commonpagesize makes sense.  Unfortunately, that's wrong for practically every other target.
Comment 1 Alan Modra 2022-01-27 00:25:04 UTC
I'll note that this issue has been raised before, and at that time I decided that fixing it was difficult.
Comment 2 Alan Modra 2022-02-07 05:30:33 UTC
Actually, the problem is on architectures other than ppc64.  On ppc64 I hacked around the relro alignment problem with ELF_RELROPAGESIZE.
Comment 3 Sourceware Commits 2022-02-13 03:35:30 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=9833b7757d246f22db4eb24b8e5db7eb5e05b6d9

commit 9833b7757d246f22db4eb24b8e5db7eb5e05b6d9
Author: Alan Modra <amodra@gmail.com>
Date:   Thu Jan 27 15:17:16 2022 +1030

    PR28824, relro security issues
    
    Background
    ==========
    There are constraints on layout of binaries to meet demand paging and
    memory protection requirements.  Demand paged binaries must have file
    offset mod pagesize equal to vma mod pagesize.  Memory protection
    (executable, read, write status) can only change at page boundaries.
    The linker's MAXPAGESIZE variable gives the page size for these layout
    constraints.
    
    In a typical basic executable with two memory segments, text (RE) and
    data (RW), the data segment must start on a different page to the
    last text segment page.  For example, with 64k pages and a small
    executable of 48k text and 1k data, the text segment might start at
    address 0x10000 and data at 0x20000 for a total of two 64k memory
    pages.  Demand paging would require the image on disk to be 64k+1k
    in size.  We can do better than that.  If the data segment instead
    starts at 0x2c000 (the end of the text segment plus one 64k page) then
    there are still only two memory pages, but the disk image is now
    smaller, 48k+1k in size.  This is why the linker normally starts the
    data segment at the end of the text segment plus one page.  That
    simple heuristic isn't ideal in all cases.  Changing our simple
    example to one with 64k-1 text size, following that heuristic would
    result in data starting at 0x2ffff.  Now we have two 64k memory data
    pages for a data segment of 1k!  If the data segment instead started
    at 0x30000 we'd get a single data segment page at the cost of 1 byte
    extra in the disk image, which is likely a good trade-off.  So the
    linker does adjust the simple heuristic.  Just how much disk image
    size increase is allowed is controlled by the linker's COMMONPAGESIZE
    variable.
    
    A PT_GNU_RELRO segment overlays the initial part of the data segment,
    saying that those pages should be made read-only after relocation by
    the dynamic loader.  Page granularity for memory protection means that
    the end of the relro segment must be at a page boundary.
    
    The problem
    ===========
    Unfortunately most targets currently only align the end of the relro
    segment to COMMONPAGESIZE.  That results in only partial relro
    protection if an executable is running with MAXPAGESIZE pages, since
    any part of the relro segment past the last MAXPAGESIZE boundary can't
    be made read-only without also affecting sections past the end of the
    relro segment.  I believe this problem arose because x86 always runs
    with 4k (COMMPAGESIZE) memory pages, and therefore using a larger
    MAXPAGESIZE on x86 is for reasons other than the demand paging and
    memory page protection boundary requirements.
    
    The solution
    ============
    Always end the relro segment on a MAXPAGESIZE boundary, except for
    x86.  Note that the relro segment, comprising of sections at the start
    of the data segment, is sized according to how those sections are laid
    out.  That means the start of the relro segment is fixed relative to
    its end.  Which also means the start of the data segment must be at a
    fixed address mod MAXPAGESIZE.  So for relro the linker can't play
    games with the start of the data segment to save disk space.  At
    least, not without introducing gaps between the relro sections.  In
    fact, because the linker was starting layout using its simple
    heuristic of starting the data segment at the end of the text segment
    plus one page, it was sometimes introducing page gaps for no reason.
    See pr28743.
    
            PR 28824
            PR 28734
            * ldexp.c (fold_segment_align): When relro, don't adjust up by
            offset within page.  Set relropagesize.
            (fold_segment_relro_end): Align to relropagesize.
            * ldexp.h (seg_align_type): Rename pagesize to commonpagesize.
            Add relropagesize.  Comment.
            * ldlang.c (lang_size_segment): Adjust to suit field renaming.
            (lang_size_relro_segment_1): Align relro_end using relropagesize.
Comment 4 Sourceware Commits 2022-02-13 03:35:36 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=31b4d3a16f200bf04db8439a63b72bba7af4e1be

commit 31b4d3a16f200bf04db8439a63b72bba7af4e1be
Author: Alan Modra <amodra@gmail.com>
Date:   Thu Feb 3 08:57:47 2022 +1030

    PR28824, relro security issues, x86 keep COMMONPAGESIZE relro
    
    x86 treats MAXPAGESIZE as a memory optimisation parameter, actual
    hardware paging is always COMMPAGESIZE of 4k.  Use COMMONPAGESIZE for
    the end of the relro segment alignment.
    
    The previous patch regresses pr18176, increasing the testcase file
    size from 322208 to 2099872 bytes.  Fixing this on x86 will require
    introducing a gap after the end of the relro segment (of up to
    relropagesize-1 bytes).
    
            PR 28824
            PR 18176
            * ld.h (ld_config_type): Add relro_use_commonpagesize field.
            * ldexp.c (fold_segment_align): Set relropagesize depending on
            relro_use_commonpagesize.
            * emultempl/elf-x86.em (elf_x86_create_output_section_statements):
            Set relro_use_commonpagesize.
            * testsuite/ld-x86-64/pr18176.d: xfail.
Comment 5 Alan Modra 2022-02-13 07:49:01 UTC
Fixed.
Comment 6 H.J. Lu 2022-02-13 15:28:54 UTC
Should COMMONPAGESIZE be changed to RELROPAGSIZE to indicate its purpose?
Comment 7 Alan Modra 2022-02-13 23:53:07 UTC
No, the purpose of COMMONPAGESIZE is as I described in comment #3, adjusting the linker layout strategy for non-relro binaries.
Comment 8 Fangrui Song 2022-05-11 20:19:59 UTC
x86 has switched the end of PT_GNU_RELRO to max-page-size as well by H.J.  in https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=3c4c0a18c8f0af039e65458da5f53811e9e43754
Comment 9 Rui Ueyama 2022-10-01 10:57:00 UTC
*** Bug 29639 has been marked as a duplicate of this bug. ***
Comment 10 Dominique Martinet 2023-01-21 01:34:16 UTC
Hello,

I understand making sure the pages are separated is important for security, but would it make sense to disable demand paging if the whole code is < maxpagesize, and keep the previous binary layout to have the linker map it correctly at runtime? (I assume the root of the problem is we cannot mmap at non-multiples of page sizes, so this cannot be lazy paged...)
This could also be advantageous for memory usage on systems with smaller page sizes, e.g. our system has 4K page size and does not need this much padding if loaded at runtime, as runtime knows the page size actually in use.
(I guess it is wasteful if the memory is never used though...)

In particular, I'm concerned that this change caused binaries in alpine linux to grow quite noticeably from 3.16 (binutils 2.38) to 3.17 (binutils 2.39), for example the iptables package which has ~112 small .so (previously 8-12KB) now all are ~68KB big, and the package jumped up from 1.5 to 7.5MB. This is quite annoying for small embedded systems, our rootfs jumped up from ~160 to ~200MB and we do not have much more space to spare.

If it could be possible to preserve small binary sizes that'd be really appreciated, my idea might have been stupid so happy to hear other thoughts and possibilities.

Related alpine issue: https://gitlab.alpinelinux.org/alpine/aports/-/issues/14126



(please say if I should open a new bug for this instead, I just wanted to leave a trace for people invovled in the original patch/problem first)
Comment 11 Alan Modra 2023-01-21 03:44:34 UTC
See commit 1a26a53a0dee
Comment 12 Rui Ueyama 2023-01-21 04:20:10 UTC
In the mold linker, we are dealing with the issue by mapping the page that is at the boundary of relro and non-relro twice as the last relro page and the first non-relro page. Here is an example of the mold-generated output for ARM64.

I believe GNU ld can do the same to not waste disk space for the NULL bytes for relro. Or am I missing something?

Note that we inserted a dummy section `.relro_padding` at the end of relro so that there's no bytes in the executable that are not covered by sections. The runtime doesn't care about that, but without it, the strip command would just truncate the relro segment, destroying the file.


$ ls -lh out/test/elf/aarch64/hello-dynamic/exe
-rwxrwxr-x 1 ruiu ruiu 8.6K Jan 21 12:10 out/test/elf/aarch64/hello-dynamic/exe

$ readelf -S --segments out/test/elf/aarch64/hello-dynamic/exe
There are 31 section headers, starting at offset 0x1a98:
Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        00000000002002a8 0002a8 00001b 00   A  0   0  1
  [ 2] .note.gnu.build-id NOTE            00000000002002c4 0002c4 000024 00   A  0   0  4
  [ 3] .note.ABI-tag     NOTE            00000000002002e8 0002e8 000020 00   A  0   0  4
  [ 4] .gnu.hash         GNU_HASH        0000000000200308 000308 00001c 00   A  5   0  8
  [ 5] .dynsym           DYNSYM          0000000000200328 000328 000060 18   A  6   1  8
  [ 6] .dynstr           STRTAB          0000000000200388 000388 00003e 00   A  0   0  1
  [ 7] .gnu.version      VERSYM          00000000002003c6 0003c6 000008 02   A  5   0  2
  [ 8] .gnu.version_r    VERNEED         00000000002003d0 0003d0 000030 00   A  6   1  8
  [ 9] .rela.plt         RELA            0000000000200400 000400 000048 18   A  5  23  8
  [10] .eh_frame         PROGBITS        0000000000200448 000448 0000c8 00   A  0   0  8
  [11] .eh_frame_hdr     PROGBITS        0000000000200510 000510 000044 00   A  0   0  4
  [12] .rodata           PROGBITS        0000000000200558 000558 00000c 00   A  0   0  8
  [13] .rodata.cst       PROGBITS        0000000000200564 000564 000004 00  AM  0   0  4
  [14] .plt              PROGBITS        0000000000210570 000570 000050 00  AX  0   0 16
  [15] .fini             PROGBITS        00000000002105c0 0005c0 000014 00  AX  0   0  4
  [16] .init             PROGBITS        00000000002105d4 0005d4 000024 00  AX  0   0  4
  [17] .text             PROGBITS        0000000000210600 000600 000154 00  AX  0   0 64
  [18] .dynamic          DYNAMIC         0000000000220758 000758 0001a0 10  WA  6   0  8
  [19] .fini_array       FINI_ARRAY      00000000002208f8 0008f8 000008 00  WA  0   0  8
  [20] .init_array       INIT_ARRAY      0000000000220900 000900 000008 00  WA  0   0  8
  [21] .got              PROGBITS        0000000000220908 000908 000020 00  WA  0   0  8
  [22] .relro_padding    NOBITS          0000000000220928 000000 00f6d8 00  WA  0   0  1
  [23] .got.plt          PROGBITS        0000000000230928 000928 000030 00  WA  0   0  8
  [24] .data             PROGBITS        0000000000230958 000958 000010 00  WA  0   0  8
  [25] .tm_clone_table   PROGBITS        0000000000230968 000968 000000 00  WA  0   0  8
  [26] .bss              NOBITS          0000000000230968 000000 000001 00  WA  0   0  1
  [27] .strtab           STRTAB          0000000000000000 000968 000447 00      0   0  1
  [28] .symtab           SYMTAB          0000000000000000 000db0 000b40 18     27 117  8
  [29] .shstrtab         STRTAB          0000000000000000 0018f0 000129 00      0   0  1
  [30] .comment          PROGBITS        0000000000000000 001a19 00007b 00  MS  0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), p (processor specific)

Elf file type is EXEC (Executable file)
Entry point 0x210600
There are 11 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x000268 0x000268 R   0x8
  INTERP         0x0002a8 0x00000000002002a8 0x00000000002002a8 0x00001b 0x00001b R   0x1
      [Requesting program interpreter: /lib/ld-linux-aarch64.so.1]
  NOTE           0x0002c4 0x00000000002002c4 0x00000000002002c4 0x000044 0x000044 R   0x4
  LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x000568 0x000568 R   0x10000
  LOAD           0x000570 0x0000000000210570 0x0000000000210570 0x0001e4 0x0001e4 R E 0x10000
  LOAD           0x000758 0x0000000000220758 0x0000000000220758 0x0001d0 0x00f8a8 RW  0x10000
  LOAD           0x000928 0x0000000000230928 0x0000000000230928 0x000040 0x000041 RW  0x10000
  DYNAMIC        0x000758 0x0000000000220758 0x0000000000220758 0x0001a0 0x0001a0 RW  0x8
  GNU_EH_FRAME   0x000510 0x0000000000200510 0x0000000000200510 0x000044 0x000044 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x1
  GNU_RELRO      0x000758 0x0000000000220758 0x0000000000220758 0x0001d0 0x00f8a8 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .note.gnu.build-id .note.ABI-tag
   03     .interp .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.plt .eh_frame .eh_frame_hdr .rodata .rodata.cst
   04     .plt .fini .init .text
   05     .dynamic .fini_array .init_array .got .relro_padding
   06     .got.plt .data .bss
   07     .dynamic
   08     .eh_frame_hdr
   09
   10     .dynamic .fini_array .init_array .got .relro_padding
Comment 13 Dominique Martinet 2023-01-21 04:26:40 UTC
> See commit 1a26a53a0dee

That commit is about arm32, which apparently had the same problem, but aarch64 is in a similar place except that larger page sizes are actually used (I use 4K, but asahi linux is configured with 16K, and HPC people do use 64K from what I've heard) so the same "fix" cannot be applied there.

I've confirmed at least master (yesterday's, b863a2687319cc8deae279f3cc7861ff506a0575 - 2023/01/20) still exhibit the problem with aarch64 (there's probably a better way to run this, but it seems to work):
```
$ echo 'int main() { return 0; }' | aarch64-linux-gnu-gcc -xc - -c -o foo.o
$ /opt/binutils/bin/aarch64-linux-gnu-ld.bfd --as-needed -dynamic-linker /lib/ld-linux-aarch64.so.1 -X -EL -maarch64linux --fix-cortex-a53-843419 -pie -o foo /usr/lib/gcc-cross/aarch64-linux-gnu/12/../../../../aarch64-linux-gnu/lib/../lib/Scrt1.o /usr/lib/gcc-cross/aarch64-linux-gnu/12/../../../../aarch64-linux-gnu/lib/../lib/crti.o /usr/lib/gcc-cross/aarch64-linux-gnu/12/crtbeginS.o -L/usr/lib/gcc-cross/aarch64-linux-gnu/12 -L/usr/lib/gcc-cross/aarch64-linux-gnu/12/../../../../aarch64-linux-gnu/lib/../lib -L/lib/aarch64-linux-gnu -L/lib/../lib -L/usr/lib/aarch64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc-cross/aarch64-linux-gnu/12/../../../../aarch64-linux-gnu/lib foo.o -lgcc --push-state --as-needed -lgcc_s --pop-state -lc -lgcc --push-state --as-needed -lgcc_s --pop-state /usr/lib/gcc-cross/aarch64-linux-gnu/12/crtendS.o /usr/lib/gcc-cross/aarch64-linux-gnu/12/../../../../aarch64-linux-gnu/lib/../lib/crtn.o
$ ls -l foo
-rwxr-xr-x 1 user user  69K Jan 21 12:56 foo*
```

> In the mold linker, we are dealing with the issue by mapping the page that is
at the boundary of relro and non-relro twice as the last relro page and the
first non-relro page

I like this, thanks! (psykose/alice confirmed lld does not have the problem on alpine, but I am not sure if they do the correct thing™ here security-wise -- it's good to have a concrete idea here)
Comment 14 Rui Ueyama 2023-01-21 04:30:57 UTC
> I like this, thanks! (psykose/alice confirmed lld does not have the problem on alpine, but I am not sure if they do the correct thing™ here security-wise -- it's good to have a concrete idea here)

Yup, you can copy it if you like. I'd appreciate if you guys cite it in the release note or something.
Comment 15 Hans-Peter Nilsson 2023-01-21 20:26:23 UTC
(In reply to Rui Ueyama from comment #12)
> In the mold linker, we are dealing with the issue by mapping the page that
> is at the boundary of relro and non-relro twice as the last relro page and
> the first non-relro page.

IOW, an additional LOAD program-header.  I too have thought of that.  That that may be the most pragmatic solution for aarch64.
Comment 16 Hans-Peter Nilsson 2023-01-21 20:29:33 UTC
(In reply to Hans-Peter Nilsson from comment #15)
> that may be the most pragmatic solution for aarch64.

Correction: for *all* architectures that need to support large-enough page-sizes.
Perhaps a per-target hook, that returns whether to add another program header, or do it by padding.
Comment 17 Rui Ueyama 2023-01-21 21:41:46 UTC
> IOW, an additional LOAD program-header.  I too have thought of that.

It doesn’t need an additional program header. Relro sections are just at the end of the RW segment as you can see in my example.
Comment 18 Rui Ueyama 2023-01-22 03:10:02 UTC
> > IOW, an additional LOAD program-header.  I too have thought of that.
>
> It doesn’t need an additional program header. Relro sections are just at the end of the RW segment as you can see in my example.

That was not correct. It indeed needs an additional LOAD program header.
Comment 19 Alan Modra 2023-01-24 04:28:17 UTC
The extra PT_LOAD header makes a lot of sense, as it allows another break in the layout where padding can be hidden (as far as disk is concerned).  It might be even better to put the SHT_NOBITS padding on the header before the relro segment.  That way you keep the .got section parts together, which is better on some targets.
Comment 20 Alan Modra 2023-01-25 00:48:56 UTC
(In reply to Alan Modra from comment #19)
> It might be even better to put the SHT_NOBITS padding on the header before the
> relro segment.
No, that can't work.  Since the end of the relro segment must end on a maxpagesize boundary that fixes the start of the on-disk relro segment.  The end of the previous segment is also fixed, so padding there doesn't help fit the two segments together on disk.  Padding at the end works, because it simply extends the end of the relro segment to a maxpagesize boundary.  However there is a small problem with padding at the end, because the padding separates parts of the GOT.  For example, on x86_64-linux without -z now, 24 bytes at the start of .got.plt is supposed to be relro.  We lose that protection when padding.
Comment 21 Rui Ueyama 2023-01-25 00:56:46 UTC
How about splitting the .got section into two, one is for the first three words and the other for the rest of .got and then place the first one in a relro segment? It would be much easier to do than carefully placing .got at a relro boundary.
Comment 22 Fangrui Song 2023-01-25 03:15:37 UTC
> [...] (psykose/alice confirmed lld does not have the problem on alpine, but I am not sure if they do the correct thing™ here security-wise -- it's good to have a concrete idea here)

lld does the correct thing. I changed lld to adopt the two-RW-PT_LOAD approach in 2019.
I have some notes about different linkers' behaviors: https://maskray.me/blog/2020-11-15-explain-gnu-linker-options#z-noseparate-code
and https://maskray.me/blog/2021-08-22-freebsd-src-browsing-on-linux-and-my-rtld-contribution#p_memsz-of-pt_gnu_relro (where I fixed FreeBSD rtld to do similarly to glibc/musl . Without this, I'd be very careful changing lld's common-page-size padding behavior).

lld still pads p_memsz of PT_GNU_RELRO (the first RW PT_LOAD) to a common-page-size boundary instead of a max-page-size boundary.
If GNU ld now uses max-page-size boundary for all ports but x86, I think https://sourceware.org/binutils/docs/ld/Builtin-Functions.html 
"DATA_SEGMENT_ALIGN(maxpagesize, commonpagesize)" needs a clarification: it seels that commonpagesize is ignored for most ports?
Comment 23 Alan Modra 2023-01-25 04:41:52 UTC
(In reply to Rui Ueyama from comment #21)
> How about splitting the .got section into two
The .got.plt actually (which on x86_64 GNU ld is the section having 3 reserved entries then the R_X86_64_JUMP_SLOT entries).  I'm so rusty on the x86 ABIs that I can't give an answer on whether this proposal would break the ABI.

(In reply to Fangrui Song from comment #22)
> If GNU ld now uses max-page-size boundary for all ports but x86, I think
> https://sourceware.org/binutils/docs/ld/Builtin-Functions.html 
> "DATA_SEGMENT_ALIGN(maxpagesize, commonpagesize)" needs a clarification: it
> seels that commonpagesize is ignored for most ports?
GNU ld uses maxpagesize for the end of the relro segment for all targets, including x86.  And yes, DATA_SEGMENT_ALIGN behaves differently with -z relro.
Comment 24 Dominique Martinet 2023-07-04 00:25:46 UTC
Hi all -- it's been a while and this bz tracks the original patch, not the problem I reported, in hindsight I should have opened a new bz sorry.
I've done that just now: https://sourceware.org/bugzilla/show_bug.cgi?id=30612

Thanks for all the ideas and discussions that have happened I think we can see a way forward, it's been of great help.