Re: Reducing code size of Position Independent Executables (PIE) by shrinking the size of dynamic relocations section
- To: "H.J. Lu" <hjl.tools@gmail.com>
- Subject: Re: Reducing code size of Position Independent Executables (PIE) by shrinking the size of dynamic relocations section
- From: Sriraman Tallam <tmsriram@google.com>
- Date: Tue, 25 Apr 2017 11:30:45 -0700
- Cc: gnu-gabi@sourceware.org, binutils <binutils@sourceware.org>, Xinliang David Li <davidxl@google.com>, Cary Coutant <ccoutant@gmail.com>, Sterling Augustine <saugustine@google.com>, Paul Pluzhnikov <ppluzhnikov@google.com>, Ian Lance Taylor <iant@google.com>, Rahul Chaudhry <rahulchaudhry@google.com>, Luis Lozano <llozano@google.com>, Rafael Espíndola <rafael.espindola@gmail.com>, Peter Collingbourne <pcc@google.com>, Rui Ueyama <ruiu@google.com>
- In-reply-to: <CAMe9rOp793g3wD3kYktcHxk4_qQRtmxhqnS0pT_YWWGJJWqW=w@mail.gmail.com>
- References: <CAAs8HmyKSjqo2GKD0TQy8R80sVcXB3mNORMpXZ_a6sDdmWQOdg@mail.gmail.com> <CAMe9rOp793g3wD3kYktcHxk4_qQRtmxhqnS0pT_YWWGJJWqW=w@mail.gmail.com>
On Tue, Apr 25, 2017 at 11:02 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Apr 25, 2017 at 10:12 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>> We identified a problem with PIE executables: they show more than 5%
>> code size bloat compared to non-PIE executables. We have a few
>> proposals to reduce the bloat. Please take a look and let us know
>> what you think.
>>
>> * What is the problem?
>>
>> PIE is a security hardening feature that enables ASLR (Address Space
>> Layout Randomization), which lets the executable be loaded at a
>> random virtual address on every execution. On average, a binary built
>> as PIE is 5% to 9% larger than the same binary built without PIE, as
>> measured on a suite of benchmarks used at Google where the average
>> text size is ~100MB. This is independent of the target architecture;
>> we found it to be true for x86_64, arm64 and power. The primary
>> cause of this code size bloat is the extra dynamic relocations that
>> are generated to make the binary position independent. This proposal
>> introduces new ways to represent these dynamic relocations that can
>> reduce the code size bloat to just a few percent.
>>
>> As an example of the bloat in code size, here is the data from one of
>> our larger binaries.
>>
>> Without PIE, the ‘size’ command reports these sizes in bytes:
>>
>>      text      data      bss        dec
>> 504663285  16242884  9130248  530036417
>>
>> With PIE, the ‘size’ command reports these sizes in bytes:
>>
>>      text      data      bss        dec
>> 539781977  16242900  9130248  565155125
>>
>> The text size of the binary grew by 7% and the total size by 6.6%.
>> Our experiments have shown that binary sizes grow anywhere from 5%
>> to 9% with PIE on almost all benchmarks we looked at. Notice that
>> almost all the code bloat comes from the “text” segment of the binary,
>> which contains the executable code of the application and any
>> read-only data. We looked into this segment to see why this is
>> happening and found that the size of the section that contains the
>> dynamic relocations explodes with PIE. For instance, without PIE the
>> dynamic relocation section of the above binary contains 46 entries,
>> whereas with PIE the same section contains 1463325 entries. Each
>> entry takes 24 bytes, that is, 3 integer values of 8 bytes each. So
>> the dynamic relocations alone need an extra (1463325 - 46) * 24
>> bytes, which is about 35 million bytes and accounts for almost all
>> of the bloat incurred!
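
To make the arithmetic above easy to re-check, here is a minimal C sketch that recomputes it from the entry counts and the 24-byte entry size quoted above. The function and constant names are made up for illustration and do not come from any tool.

```c
#include <assert.h>
#include <stdint.h>

/* One RELA entry on a 64-bit target: three 8-byte fields. */
enum { SKETCH_RELA_ENTRY_BYTES = 3 * 8 };

/* Extra bytes spent on dynamic relocations when going from non-PIE to PIE. */
static uint64_t sketch_extra_reloc_bytes(uint64_t pie_entries,
                                         uint64_t non_pie_entries)
{
    assert(pie_entries >= non_pie_entries);
    return (pie_entries - non_pie_entries) * SKETCH_RELA_ENTRY_BYTES;
}

/* Growth of the "text" column reported by size(1). */
static uint64_t sketch_text_growth(uint64_t pie_text, uint64_t non_pie_text)
{
    assert(pie_text >= non_pie_text);
    return pie_text - non_pie_text;
}
```

With the numbers from this message, sketch_extra_reloc_bytes(1463325, 46) gives 35118696 and sketch_text_growth(539781977, 504663285) gives 35118692, so the extra relocation entries match the text growth to within a few bytes.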
>>
>> * What are these extra dynamic relocations that are created for PIE executables?
>>
>> We noticed that these extra relocations for PIE binaries follow a
>> common pattern and are needed because it is not known until run-time
>> where the binary will be loaded. All of these extra dynamic
>> relocations are of the ELF type R_X86_64_RELATIVE. Let us show what
>> these relocations do, using a program that stores the address of a
>> global:
>>
>> #include <stdio.h>
>>
>> const int a = 10;
>>
>> const int *b = &a;
>>
>> int main(void) {
>>   /* %p expects a void pointer, hence the cast. */
>>   printf("b = %p\n", (void *)b);
>>   return 0;
>> }
>>
>> First, let us look at the binary built without PIE. Let’s look at the
>> data section where ‘b’ and ‘a’ are allocated.
>>
>> 00000000004007d0 <a>:
>> 4007d0: 0a 00
>>
>>
>> 0000000000401b10 <b>:
>> 401b10: d0 07
>> 401b12: 40 00 00
>>
>> Variable ‘a’ is allocated at address 0x4007d0, which matches the output
>> when running the binary. ‘b’ is allocated at address 0x401b10 and its
>> contents, in little-endian byte order, are the address of ‘a’.
>>
>> Now, let us examine the contents of the PIE binary:
>>
>> 00000000000008d8 <a>:
>> 8d8: 0a 00
>>
>> 0000000000001c50 <b>:
>> 1c50: d8 08
>> 1c50: R_X86_64_RELATIVE *ABS*+0x8d8
>> 1c52: 00 00
>> 1c54: 00 00
>>
>>
>> Notice there is a dynamic relocation here which tells the dynamic
>> linker that this value needs to be fixed at run-time. This is needed
>> because ASLR can load this binary anywhere in the address space and
>> this relocation fixes the address after it is loaded.
>>
>>
>> * More details about R_X86_64_RELATIVE relocations
>>
>> Each such relocation entry occupies 24 bytes and has three fields:
>>
>> Offset
>>
>> Type - here it is R_X86_64_RELATIVE
>>
>> Addend (what extra value needs to be added)
>>
>> The offset field of this relocation is the address offset, from the
>> start of the binary, of the location where this relocation applies.
>> The type field indicates the type of the dynamic relocation; we are
>> particularly interested in one type, R_X86_64_RELATIVE, because in
>> the motivating example presented above all the extra dynamic
>> relocations were of this type!
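
As a rough illustration of what the dynamic linker does with such an entry, here is a minimal C sketch, not the code of any real loader. The struct mirrors the layout of Elf64_Rela, and 8 is the numeric value of R_X86_64_RELATIVE in the x86-64 psABI; the sketch_ names, the flat `image` buffer and the bounds check are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_R_X86_64_RELATIVE 8u  /* R_X86_64_RELATIVE in the x86-64 psABI */

/* Same layout as Elf64_Rela: three 8-byte fields, 24 bytes in total. */
struct sketch_rela {
    uint64_t r_offset;  /* location to patch, relative to the load base */
    uint64_t r_info;    /* symbol index (high 32 bits), type (low 32 bits) */
    int64_t  r_addend;  /* link-time address the slot should refer to */
};

/* Apply all RELATIVE entries to an image loaded at `base`.
   `image` is a writable copy of the loaded bytes (an assumption of this
   sketch). Returns the number of entries applied. */
static size_t sketch_apply_relative(unsigned char *image, size_t image_size,
                                    uint64_t base,
                                    const struct sketch_rela *rela,
                                    size_t count)
{
    size_t applied = 0;
    for (size_t i = 0; i < count; i++) {
        uint64_t value;
        if ((uint32_t)rela[i].r_info != SKETCH_R_X86_64_RELATIVE)
            continue;  /* other types need symbol lookup; not shown here */
        assert(rela[i].r_offset <= image_size &&
               image_size - rela[i].r_offset >= sizeof value);
        value = base + (uint64_t)rela[i].r_addend;  /* run-time address */
        memcpy(image + rela[i].r_offset, &value, sizeof value);
        applied++;
    }
    return applied;
}
```

For the example above, the entry has offset 0x1c50 and addend 0x8d8, so a binary loaded at base B ends up with B + 0x8d8 stored in ‘b’. No symbol is involved, which is why the offset and the addend are the only per-entry information that matters for this type.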
>>
>>
>> * We have these proposals to reduce the size of the dynamic relocations section:
>>
>
> There are 3 pieces of run-time relocation information:
>
> 1. Type and symbol. 4 or 8 bytes
> 2. Offset. 4 or 8 bytes
> 3. Addend. 4 or 8 bytes
>
> If we use REL instead of RELA, the addend can be implicit and stored in-place.
> If we limit the type to relative relocations, we only need the offset.
> This is for PIC, not just for PIE. And we can use a special encoding
> scheme for the offset table, which can be placed in DT_GNU_RELATIVE_REL
> with DT_GNU_RELATIVE_RELSZ.
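
To check my understanding of the offsets-only idea, here is a minimal C sketch of how such a table could be applied. It assumes the table is a plain array of 8-byte offsets; the special encoding mentioned above is not specified in this thread, so it is not modeled. The function name and the flat `image` buffer are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Apply relative relocations that are described only by their offsets.
   Each slot already holds its link-time address (the implicit REL
   addend), so the loader only has to add the load base in place. */
static void sketch_apply_relative_offsets(unsigned char *image,
                                          size_t image_size,
                                          uint64_t base,
                                          const uint64_t *offsets,
                                          size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint64_t slot;
        assert(offsets[i] <= image_size &&
               image_size - offsets[i] >= sizeof slot);
        memcpy(&slot, image + offsets[i], sizeof slot);  /* implicit addend */
        slot += base;                                    /* relocate */
        memcpy(image + offsets[i], &slot, sizeof slot);
    }
}
```

Even without any further encoding, this stores 8 bytes per relocation instead of 24. It also matches the PIE dump above, where the bytes at 0x1c50 already hold 0x8d8.
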
I have not made an intrusive change like this before, so I am
wondering which tools and components need to be modified.
Pointers on how to go about this would be really helpful. I can think
of these:
* Linker - gold, lld, GNU ld
* Dynamic Linker
* readelf
* objdump
* ABI changes - what is involved here?
Thanks
Sri
>
> --
> H.J.