Bug 22829

Summary: objcopy/strip removes PT_GNU_RELRO from lld binaries
Product: binutils Reporter: NGG <ngg>
Component: binutilsAssignee: Alan Modra <amodra>
Status: RESOLVED FIXED    
Severity: normal CC: hjl.tools, i, jeremip11, ngg
Priority: P2    
Version: 2.31   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed: 2018-02-10 00:00:00
Attachments: sample binary created with lld
prospective patch
a better fix

Description NGG 2018-02-10 13:03:11 UTC
The assign_file_positions_for_non_load_sections() function in bfd/elf.c removes a PT_GNU_RELRO segment if there is no corresponding PT_LOAD segment. This check is made by comparing the two segments' p_paddr fields for equality.

The lld linker can assign a single PT_LOAD for multiple sections even if only some of those will be marked as relro. In this case the p_paddr fields do not match and strip/objcopy wrongfully removes the PT_GNU_RELRO segment.

Unfortunately it's not that easy to fix because if the equality check is fixed to recognize an overlapping PT_LOAD segment then that segment's parameters (vaddr, paddr, offset, ...) override the PT_GNU_RELRO segment's parameters which is not what we would want.
Comment 1 H.J. Lu 2018-02-10 13:53:36 UTC
Please provide a small binary generated by lld.
Comment 2 NGG 2018-02-10 15:45:58 UTC
Created attachment 10802 [details]
sample binary created with lld

This sample was generated with clang and lld (version 5.0.1) like this:

clang -fuse-ld=lld -Wl,-z,relro -o sample sample.c

Both "strip sample" and "objcopy sample sample.copied" remove the PT_GNU_RELRO segment.

sample.c:

#include <stdlib.h>

int main(void)
{
	malloc(1);
	return 0;
}
Comment 3 Alan Modra 2018-02-11 04:39:07 UTC
Reducing importance to an enhancement request.
Comment 4 H.J. Lu 2018-02-11 12:45:25 UTC
binutils expects that PT_GNU_RELRO segment is placed right before
a read-only section so that it can be merged with previous read-only
PT_LOAD segment at run-time.  lld generates:

There are 30 section headers, starting at offset 0x36b0:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        0000000000200270 000270 00001c 00   A  0   0  1
  [ 2] .note.ABI-tag     NOTE            000000000020028c 00028c 000020 00   A  0   0  4
  [ 3] .rodata           PROGBITS        00000000002002ac 0002ac 000004 00  AM  0   0  4
  [ 4] .dynsym           DYNSYM          00000000002002b0 0002b0 000048 18   A  8   1  8
  [ 5] .gnu.version      VERSYM          00000000002002f8 0002f8 000006 02   A  4   0  2
  [ 6] .gnu.version_r    VERNEED         0000000000200300 000300 000020 00   A  8   1  4
  [ 7] .hash             HASH            0000000000200320 000320 000020 04   A  4   0  4
  [ 8] .dynstr           STRTAB          0000000000200340 000340 000030 00   A  0   0  1
  [ 9] .rela.dyn         RELA            0000000000200370 000370 000018 18   A  4   0  8
  [10] .rela.plt         RELA            0000000000200388 000388 000018 18   A  4   0  8
  [11] .eh_frame_hdr     PROGBITS        00000000002003a0 0003a0 00002c 00   A  0   0  1
  [12] .eh_frame         PROGBITS        00000000002003d0 0003d0 0000c8 00   A  0   0  8
  [13] .text             PROGBITS        0000000000201000 001000 000182 00  AX  0   0 16
  [14] .init             PROGBITS        0000000000201184 001184 000017 00  AX  0   0  4
  [15] .fini             PROGBITS        000000000020119c 00119c 000009 00  AX  0   0  4
  [16] .plt              PROGBITS        00000000002011b0 0011b0 000020 00  AX  0   0 16
  [17] .data             PROGBITS        0000000000202000 002000 000010 00  WA  0   0  8
  [18] .tm_clone_table   PROGBITS        0000000000202010 002010 000000 00  WA  0   0  8
  [19] .got.plt          PROGBITS        0000000000202010 002010 000020 00  WA  0   0  8
  [20] .fini_array       FINI_ARRAY      0000000000203000 003000 000008 08  WA  0   0  8
  [21] .init_array       INIT_ARRAY      0000000000203008 003008 000008 08  WA  0   0  8
  [22] .dynamic          DYNAMIC         0000000000203010 003010 000180 10  WA  8   0  8
  [23] .got              PROGBITS        0000000000203190 003190 000010 00  WA  0   0  8
  [24] .bss              NOBITS          0000000000204000 0031a0 000001 00  WA  0   0  1
  [25] .comment          PROGBITS        0000000000000000 0031a0 00007e 00  MS  0   0  1
  [26] .gnu_debuglink    PROGBITS        0000000000000000 003220 000068 00      0   0  4
  [27] .symtab           SYMTAB          0000000000000000 003288 0001f8 18     29   7  8
  [28] .shstrtab         STRTAB          0000000000000000 003480 000110 00      0   0  1
  [29] .strtab           STRTAB          0000000000000000 003590 00011a 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

Elf file type is EXEC (Executable file)
Entry point 0x201000
There are 10 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x000230 0x000230 R   0x8
  INTERP         0x000270 0x0000000000200270 0x0000000000200270 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x000498 0x000498 R   0x1000
  LOAD           0x001000 0x0000000000201000 0x0000000000201000 0x0001d0 0x0001d0 R E 0x1000
  LOAD           0x002000 0x0000000000202000 0x0000000000202000 0x0011a0 0x002001 RW  0x1000
  DYNAMIC        0x003010 0x0000000000203010 0x0000000000203010 0x000180 0x000180 RW  0x8
  GNU_RELRO      0x003000 0x0000000000203000 0x0000000000203000 0x0001a0 0x001000 R   0x1
  GNU_EH_FRAME   0x0003a0 0x00000000002003a0 0x00000000002003a0 0x00002c 0x00002c R   0x1
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
  NOTE           0x00028c 0x000000000020028c 0x000000000020028c 0x000020 0x000020 R   0x4

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .interp .note.ABI-tag .rodata .dynsym .gnu.version .gnu.version_r .hash .dynstr .rela.dyn .rela.plt .eh_frame_hdr .eh_frame 
   03     .text .init .fini .plt 
   04     .data .tm_clone_table .got.plt .fini_array .init_array .dynamic .got .bss 
   05     .dynamic 
   06     .fini_array .init_array .dynamic .got 
   07     .eh_frame_hdr 
   08     
   09     .note.ABI-tag 

.tm_clone_table section before PT_GNU_RELRO isn't read-only.

$ gcc x.c -Wl,-z,separate-code -Wl,-z,max-page-size=0x1000

generates:

There are 29 section headers, starting at offset 0x38d8:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        00000000004002a8 0002a8 00001c 00   A  0   0  1
  [ 2] .note.ABI-tag     NOTE            00000000004002c4 0002c4 000020 00   A  0   0  4
  [ 3] .note.gnu.build-id NOTE            00000000004002e4 0002e4 000024 00   A  0   0  4
  [ 4] .gnu.hash         GNU_HASH        0000000000400308 000308 00001c 00   A  5   0  8
  [ 5] .dynsym           DYNSYM          0000000000400328 000328 000060 18   A  6   1  8
  [ 6] .dynstr           STRTAB          0000000000400388 000388 00003f 00   A  0   0  1
  [ 7] .gnu.version      VERSYM          00000000004003c8 0003c8 000008 02   A  5   0  2
  [ 8] .gnu.version_r    VERNEED         00000000004003d0 0003d0 000020 00   A  6   1  8
  [ 9] .rela.dyn         RELA            00000000004003f0 0003f0 000030 18   A  5   0  8
  [10] .rela.plt         RELA            0000000000400420 000420 000018 18  AI  5  22  8
  [11] .init             PROGBITS        0000000000401000 001000 000017 00  AX  0   0  4
  [12] .plt              PROGBITS        0000000000401020 001020 000020 10  AX  0   0 16
  [13] .text             PROGBITS        0000000000401040 001040 000162 00  AX  0   0 16
  [14] .fini             PROGBITS        00000000004011a4 0011a4 000009 00  AX  0   0  4
  [15] .rodata           PROGBITS        0000000000402000 002000 000010 00   A  0   0  8
  [16] .eh_frame_hdr     PROGBITS        0000000000402010 002010 000034 00   A  0   0  4
  [17] .eh_frame         PROGBITS        0000000000402048 002048 0000f0 00   A  0   0  8
  [18] .init_array       INIT_ARRAY      0000000000403e10 002e10 000008 08  WA  0   0  8
  [19] .fini_array       FINI_ARRAY      0000000000403e18 002e18 000008 08  WA  0   0  8
  [20] .dynamic          DYNAMIC         0000000000403e20 002e20 0001d0 10  WA  6   0  8
  [21] .got              PROGBITS        0000000000403ff0 002ff0 000010 08  WA  0   0  8
  [22] .got.plt          PROGBITS        0000000000404000 003000 000020 08  WA  0   0  8
  [23] .data             PROGBITS        0000000000404020 003020 000004 00  WA  0   0  1
  [24] .bss              NOBITS          0000000000404024 003024 000004 00  WA  0   0  1
  [25] .comment          PROGBITS        0000000000000000 003024 000058 01  MS  0   0  1
  [26] .symtab           SYMTAB          0000000000000000 003080 0005a0 18     27  43  8
  [27] .strtab           STRTAB          0000000000000000 003620 0001af 00      0   0  1
  [28] .shstrtab         STRTAB          0000000000000000 0037cf 000103 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

Elf file type is EXEC (Executable file)
Entry point 0x401040
There are 11 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x000268 0x000268 R   0x8
  INTERP         0x0002a8 0x00000000004002a8 0x00000000004002a8 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x000438 0x000438 R   0x1000
  LOAD           0x001000 0x0000000000401000 0x0000000000401000 0x0001ad 0x0001ad R E 0x1000
  LOAD           0x002000 0x0000000000402000 0x0000000000402000 0x000138 0x000138 R   0x1000
  LOAD           0x002e10 0x0000000000403e10 0x0000000000403e10 0x000214 0x000218 RW  0x1000
  DYNAMIC        0x002e20 0x0000000000403e20 0x0000000000403e20 0x0001d0 0x0001d0 RW  0x8
  NOTE           0x0002c4 0x00000000004002c4 0x00000000004002c4 0x000044 0x000044 R   0x4
  GNU_EH_FRAME   0x002010 0x0000000000402010 0x0000000000402010 0x000034 0x000034 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x002e10 0x0000000000403e10 0x0000000000403e10 0x0001f0 0x0001f0 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt 
   03     .init .plt .text .fini 
   04     .rodata .eh_frame_hdr .eh_frame 
   05     .init_array .fini_array .dynamic .got .got.plt .data .bss 
   06     .dynamic 
   07     .note.ABI-tag .note.gnu.build-id 
   08     .eh_frame_hdr 
   09     
   10     .init_array .fini_array .dynamic .got 

.eh_frame section before PT_GNU_RELRO segment is read-only.
Comment 5 NGG 2018-02-11 19:51:58 UTC
This is not an enhancement request, it is clearly a bug.
The ELF header is perfectly valid the way lld creates it, and strip reduces its security.

This is why Chromium on Android switched to elfutils strip (see https://chromium-review.googlesource.com/c/chromium/src/+/644908)
It also affects my project where I want to separate debug symbols after linking and I also want to use lld's thin-lto optimizations.
Also embedded and source-based Linux distros will not be able to support lld without either fixing this or switching to an alternative.
Comment 6 Alan Modra 2018-02-12 03:12:02 UTC
Created attachment 10808 [details]
prospective patch

The importance fields are for use by binutils maintainers.  They are not for users to flag how important the bug is to them.  You selected "critical" which is quite obviously wrong from the point of view of the binutils project, and perhaps I overreacted by marking your bug all the way down to "enhancement".

What you don't know is that I'd already investigated the problem to the point of writing a fix, which would have worked but runs foul of what looks to be a lld bug.  Why is lld putting p_memsz of the relro header larger than p_filsz?  So I looked and found https://reviews.llvm.org/D28267.  Apparently lld doesn't try to align the end of the relro segment.  That has the unfortunate effect of wasting space where it matters for targets with limited addressing, whereas wasting space at the begining of the relro segment doesn't matter for such targets (you have to waste space somewhere), reinforcing my opinion that lld is a toy linker.
Comment 7 NGG 2018-02-12 06:51:23 UTC
Thank you for the quick response and patch proposal.

Sorry for overreacting. I assigned it to critical due to its possible security implications and because of the wide range of products it might affect. I should have either written this at my initial description or leave it at normal and let you decide its severity.

Your patch seems to work in my case, I've tried it with a few examples with all of ld.bfd, ld.gold, ld.lld.

Although it is strange that for example on my sample input the
Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
GNU_RELRO      0x003000 0x0000000000203000 0x0000000000203000 0x0001a0 0x001000 R   0x1
becomes (after stripping)
GNU_RELRO      0x003000 0x0000000000203000 0x0000000000203000 0x001000 0x001000 R   0x1

The whole file is smaller than 0x4000 so p_filesz cannot be correct.
It's working because the glibc dynamic linker only checks p_vaddr, p_memsz (https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/dl-load.c;hb=7bb087bd7bfe3616c4c0974a3f7352b593353ea5#l1095)
Comment 8 Alan Modra 2018-02-12 13:42:07 UTC
Created attachment 10810 [details]
a better fix
Comment 9 NGG 2018-02-12 20:55:34 UTC
The second patch works in all the cases I've tried, thanks!
Comment 10 Sourceware Commits 2018-02-13 09:09:03 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=f2731e0c374e5323ce4cdae2bcc7b7fe22da1a6f

commit f2731e0c374e5323ce4cdae2bcc7b7fe22da1a6f
Author: Alan Modra <amodra@gmail.com>
Date:   Mon Feb 12 13:06:07 2018 +1030

    PR22829, objcopy/strip removes PT_GNU_RELRO from lld binaries
    
    lld lays out the relro segment differently to GNU ld, not bothering to
    include the first few bytes of .got.plt and padding out to a page at
    the end of the segment.  This patch teaches binutils to recognize the
    different (and somewhat inferior) layout as valid.
    
    bfd/
    	PR 22829
    	* elf.c (assign_file_positions_for_non_load_sections): Rewrite
    	PT_GNU_RELRO setup.
    ld/
    	* testsuite/ld-x86-64/pr14207.d: Adjust relro p_filesz.
Comment 11 Alan Modra 2018-02-13 09:11:15 UTC
Patch applied.
Comment 12 Fangrui Song 2020-02-24 03:33:03 UTC
The lld issue was introduced by https://reviews.llvm.org/rL291523 
and was fixed by https://reviews.llvm.org/D56828 .
It affected lld [4.0,8.0). Before 8.0, lld was not very stable.

I appreciate that you mitigated the lld problem, but I think we don't need to try hard to mitigate lld<8.0 problems. We can revert the commit f2731e0c374e5323ce4cdae2bcc7b7fe22da1a6f to simplify assign_file_positions_for_non_load_sections.
Comment 13 NGG 2020-02-24 11:48:18 UTC
(In reply to Fangrui Song from comment #12)
> The lld issue was introduced by https://reviews.llvm.org/rL291523 
> and was fixed by https://reviews.llvm.org/D56828 .
> It affected lld [4.0,8.0). Before 8.0, lld was not very stable.

The fix you mentioned is not in the 8.0 lld release, but only in 9.0
(https://github.com/llvm-mirror/lld/blob/release_90/ELF/Writer.cpp contains the fix, but https://github.com/llvm-mirror/lld/blob/release_80/ELF/Writer.cpp does not)