PT_GNU_RELRO is somewhat broken

Fangrui Song maskray@google.com
Wed May 11 20:50:16 GMT 2022


On 2022-05-11, H.J. Lu wrote:
>On Wed, May 11, 2022 at 12:43 PM Fangrui Song <maskray@google.com> wrote:
>>
>>
>> On 2022-05-11, H.J. Lu wrote:
>> >On Wed, May 11, 2022 at 11:17 AM Fangrui Song <maskray@google.com> wrote:
>> >>
>> >> On 2022-05-11, H.J. Lu via Libc-alpha wrote:
>> >> >On Wed, May 11, 2022 at 9:59 AM Florian Weimer via Libc-alpha
>> >> ><libc-alpha@sourceware.org> wrote:
>> >> >>
>> >> >> PT_GNU_RELRO is supposed to identify a region in the process image which
>> >> >> has to be flipped to PROT_READ (only) permission after relocation
>> >> >> (“Read-Only after RELocation”).
>> >> >>
>> >> >> glibc has this code in the dynamic loader in elf/dl-reloc.c:
>> >> >>
>> >> >> | void
>> >> >> | _dl_protect_relro (struct link_map *l)
>> >> >> | {
>> >> >> |   ElfW(Addr) start = ALIGN_DOWN((l->l_addr
>> >> >> |                                  + l->l_relro_addr),
>> >> >> |                                 GLRO(dl_pagesize));
>> >> >> |   ElfW(Addr) end = ALIGN_DOWN((l->l_addr
>> >> >> |                                + l->l_relro_addr
>> >> >> |                                + l->l_relro_size),
>> >> >> |                               GLRO(dl_pagesize));
>> >> >> |   if (start != end
>> >> >> |       && __mprotect ((void *) start, end - start, PROT_READ) < 0)
>> >> >> |     {
>> >> >> |       static const char errstring[] = N_("\
>> >> >> | cannot apply additional memory protection after relocation");
>> >> >> |       _dl_signal_error (errno, l->l_name, NULL, errstring);
>> >> >> |     }
>> >> >> | }
>> >> >>
>> >> >> I assume the intent is to conservatively apply the largest possible
>> >> >> RELRO region given GLRO(dl_pagesize), the run-time page size reported by
>> >> >> the kernel.  If the binary is built to a smaller page size (to save disk
>> >> >> space), glibc can still load it, but apply only some RELRO protection.
>> >> >> But _dl_protect_relro has a bug: to be conservative, it would have to
>> >> >> use ALIGN_UP for the start (lower) address of the range.
>> >> >>
>> >> >> But it turns out we can't make this change without incurring a loss of
>> >> >> hardening: BFD ld does not align the start address to a page boundary.
>> >> >> For example, /bin/true in Fedora 35 x86-64 has this:
>> >> >>
>> >> >> | $ readelf -l /bin/true
>> >> >> |
>> >> >> | Elf file type is DYN (Position-Independent Executable file)
>> >> >> | Entry point 0x1960
>> >> >> | There are 13 program headers, starting at offset 64
>> >> >> |
>> >> >> | Program Headers:
>> >> >> |   Type           Offset             VirtAddr           PhysAddr
>> >> >> |                  FileSiz            MemSiz              Flags  Align
>> >> >> |   PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
>> >> >> |                  0x00000000000002d8 0x00000000000002d8  R      0x8
>> >> >> |   INTERP         0x0000000000000318 0x0000000000000318 0x0000000000000318
>> >> >> |                  0x000000000000001c 0x000000000000001c  R      0x1
>> >> >> |       [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
>> >> >> |   LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
>> >> >> |                  0x0000000000000ff8 0x0000000000000ff8  R      0x1000
>> >> >> |   LOAD           0x0000000000001000 0x0000000000001000 0x0000000000001000
>> >> >> |                  0x00000000000029a1 0x00000000000029a1  R E    0x1000
>> >> >> |   LOAD           0x0000000000004000 0x0000000000004000 0x0000000000004000
>> >> >> |                  0x0000000000000d38 0x0000000000000d38  R      0x1000
>> >> >> |   LOAD           0x0000000000005c78 0x0000000000006c78 0x0000000000006c78
>> >> >> |                  0x0000000000000390 0x00000000000003a0  RW     0x1000
>> >> >> |   DYNAMIC        0x0000000000005c90 0x0000000000006c90 0x0000000000006c90
>> >> >> |                  0x00000000000001f0 0x00000000000001f0  RW     0x8
>> >> >> |   NOTE           0x0000000000000338 0x0000000000000338 0x0000000000000338
>> >> >> |                  0x0000000000000050 0x0000000000000050  R      0x8
>> >> >> |   NOTE           0x0000000000000388 0x0000000000000388 0x0000000000000388
>> >> >> |                  0x0000000000000044 0x0000000000000044  R      0x4
>> >> >> |   GNU_PROPERTY   0x0000000000000338 0x0000000000000338 0x0000000000000338
>> >> >> |                  0x0000000000000050 0x0000000000000050  R      0x8
>> >> >> |   GNU_EH_FRAME   0x00000000000049c4 0x00000000000049c4 0x00000000000049c4
>> >> >> |                  0x000000000000007c 0x000000000000007c  R      0x4
>> >> >> |   GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
>> >> >> |                  0x0000000000000000 0x0000000000000000  RW     0x10
>> >> >> |   GNU_RELRO      0x0000000000005c78 0x0000000000006c78 0x0000000000006c78
>> >> >> |                  0x0000000000000388 0x0000000000000388  R      0x1
>> >> >> | […]
>> >> >>
>> >> >> The virtual address for PT_GNU_RELRO is 0x6c78, which is definitely not
>> >> >> aligned to a 4K page.  (0x6c78 + 0x388 == 0x7000, so at least the end
>> >> >> address is aligned.)  In practice, this seems to work because the RELRO
>> >> >> area seems to be at the start of the RW LOAD segment, so we can safely
>> >> >> flip the slack space at the start of the page to RO.  It still looks
>> >> >> like a major wart to me, though.
>> >> >
>> >> >After relocation, we change the end of the RO segment (aligned down from
>> >> >the beginning of the RELRO area) to the end of the RELRO segment to RO.
>> >> >Since the end of the RELRO segment must be aligned to the page size,
>> >> >ALIGN_DOWN on the end of the RELRO segment doesn't lose any protection.
>> >> >
>> >> >> Any suggestions for what we should do to fix this properly, mainly for
>> >> >> targets that have varying page size in practice?
>> >> >
>> >> >The end of the RELRO segment should be aligned to the maximum page
>> >> >size.
>> >> >
>> >>
>> >> PT_GNU_RELRO is designed/implemented this way:
>> >>
>> >> * there can be at most one PT_GNU_RELRO
>> >> * p_vaddr(PT_GNU_RELRO) = p_vaddr(first RW PT_LOAD); https://sourceware.org/binutils/docs/ld/Builtin-Functions.html DATA_SEGMENT_RELRO_END is designed this way
>> >> * p_vaddr(PT_GNU_RELRO) + p_memsz(PT_GNU_RELRO) is aligned to common-page-size. common-page-size was probably chosen because it wastes less space
>> >
>> >ld aligns DATA_SEGMENT_RELRO_END to the maximum page size.
>>
>> Is p_vaddr(PT_GNU_RELRO) + p_memsz(PT_GNU_RELRO) aligned to max-page-size for non-x86 ports?
>> I know some changes have been made in binutils in recent months, but
>> don't know the exact state.
>> If so, the security looks good to me.
>>
>> With ld 2.38's x86-64 port, `-z max-page-size=2097152 -z separate-code`
>> aligns the end of PT_GNU_RELRO to common-page-size for an executable
>> (0xaa82000, not a multiple of 2097152.)
>
>It is fixed by:
>
>https://sourceware.org/bugzilla/show_bug.cgi?id=28824
>
>-- 
>H.J.

Thanks. I see that your
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=3c4c0a18c8f0af039e65458da5f53811e9e43754
(milestone: binutils 2.39) ported the
"align the end of PT_GNU_RELRO to max-page-size" change to x86-64. I added a
comment to https://sourceware.org/bugzilla/show_bug.cgi?id=28824

Then it looks like there is no action item on the glibc side.
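
For reference, here is a small C sketch of the page-rounding arithmetic
discussed in this thread. It is not glibc or ld code: the helper names
(align_down, align_up, relro_round_down, relro_start_up, is_aligned) are
made up for illustration, and the alignments passed in are assumed to be
powers of two.

```c
#include <stdint.h>

/* Hypothetical helpers, not taken from glibc or binutils. */
struct relro_range { uint64_t start, end; };

/* Round x down to a multiple of a (a must be a power of two). */
static uint64_t align_down(uint64_t x, uint64_t a)
{
    return x & ~(a - 1);
}

/* Round x up to a multiple of a (a must be a power of two). */
static uint64_t align_up(uint64_t x, uint64_t a)
{
    return align_down(x + a - 1, a);
}

/* Same rounding as the quoted _dl_protect_relro: both ends rounded down.
   start == end means nothing would be passed to mprotect. */
static struct relro_range relro_round_down(uint64_t vaddr, uint64_t memsz,
                                           uint64_t pagesize)
{
    struct relro_range r;
    r.start = align_down(vaddr, pagesize);
    r.end = align_down(vaddr + memsz, pagesize);
    return r;
}

/* The "conservative" variant from the thread: start rounded up instead. */
static struct relro_range relro_start_up(uint64_t vaddr, uint64_t memsz,
                                         uint64_t pagesize)
{
    struct relro_range r;
    r.start = align_up(vaddr, pagesize);
    r.end = align_down(vaddr + memsz, pagesize);
    return r;
}

/* Nonzero if addr is a multiple of align (align must be a power of two). */
static int is_aligned(uint64_t addr, uint64_t align)
{
    return (addr & (align - 1)) == 0;
}
```

With the /bin/true values quoted above (p_vaddr 0x6c78, p_memsz 0x388) and
a 4K page, relro_round_down gives [0x6000, 0x7000), while relro_start_up
gives start == end == 0x7000, i.e. no protection at all. With a 64K
run-time page size, relro_round_down also collapses to start == end == 0.
For the ld 2.38 example, is_aligned(0xaa82000, 4096) holds (assuming the
usual x86-64 common-page-size of 4096) but is_aligned(0xaa82000, 2097152)
does not.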

