[PATCH 00/19] CTF linking support
Nick Alcock
nick.alcock@oracle.com
Tue Jul 16 18:05:00 GMT 2019
This is the beginning of support for linking CTF sections. The type
deduplicator is not very good, and creation of "subsections" for types and
variables with conflicting definitions in distinct TUs hardly ever happens even
when it should. mingw might well be in trouble too, because I am using tmpfile()
and am not pulling in gnulib: I haven't figured out how to do that in an
Automake-using binutils project yet :) However, we only use tmpfile() when we
encounter conflicting definitions, so I figure we can fix that problem in the
next series. It may well also be broken on non-ELF platforms again: I'm not
even sure what an example of such a platform is, let alone how to target one
without piles of system headers etc. that I don't have. (See patch 19 for
questions on this.)
This is *not* ready to go upstream yet -- it is broken on non-ELF. I mean to
fix that soon, but I thought I should provide some indication of the sort of
thing I'm doing, regardless. (The code changes to work on non-ELF will probably
be quite small, moving a bit of code out of ldlang.c into the elf32 emulation
via yet another callback.)
Also this is currently much too slow (10s to link ld itself, for instance), but
you only take the speed hit when CTF sections are present, and the delay is
entirely down to the thing I'm using as a deduplicator right now, which was
really not designed for it and will be rewritten.
Most of the job of linking is done by code in libctf itself, which I am the
maintainer of: I'm happy to have people look at it but I'm fairly confident
about what I'm doing in there. But the last patch... the last patch ties it
into the BFD and ld linking machinery. I really *need* review of that one,
and probably of the one before it as well since that too touches bfd. It
works where none of the dozens of other approaches I tried came close to
working, but I have no idea if it is the right way to do things at all.
Extensive questions for reviewers are in the commit log for that patch.
But despite the caveats in patch 19, in conjunction with a CTF-capable GCC,
it does work, without warnings or leaks, and merges the type sections down
quite satisfactorily:
oranix@loom 551 % size -A /tmp/gcc/bin/ld
/tmp/gcc/bin/ld :
section                 size     addr
.interp                   28  4194984
.note.gnu.build-id        36  4195012
.note.ABI-tag             32  4195048
.gnu.hash                172  4195080
.dynsym                 3264  4195256
.dynstr                 1116  4198520
.gnu.version             272  4199636
.gnu.version_r           144  4199912
.rela.dyn                216  4200056
.rela.plt               2808  4200272
.init                     23  4206592
.plt                    1888  4206624
.text                 949825  4208512
.fini                      9  5158340
.rodata              1447296  5160960
.eh_frame_hdr          19172  6608256
.eh_frame             122760  6627432
.init_array                8  6757888
.fini_array                8  6757896
.dynamic                 480  6757904
.got                      16  6758384
.got.plt                 960  6758400
.data                  25136  6759360
.bss                   22976  6784512
.comment                  68        0
.debug_aranges          6320        0
.debug_info          4323894        0
.debug_abbrev         144836        0
.debug_line           750267        0
.debug_str            231568        0
.debug_loc           2069864        0
.debug_ranges         179760        0
.ctf                  212598  6815680
Total               10517820
oranix@loom 552 % PATH=/tmp/gcc/bin:$PATH objdump --ctf=.ctf /tmp/gcc/bin/ld
/tmp/gcc/bin/ld: file format elf64-x86-64
Contents of CTF section .ctf:
Header:
Magic number: dff2
Version: 4 (CTF_VERSION_3)
Flags: 0x1 (CTF_F_COMPRESS)
Variable section: 0x0 -- 0xedf (0xee0 bytes)
Type section: 0xee0 -- 0x133db3 (0x132ed4 bytes)
String section: 0x133db4 -- 0x14cbfc (0x18e49 bytes)
Labels:
Data objects:
Function objects:
Variables:
_xexit_cleanup -> a7e: void (*)() (size 0x8) -> a7d: void () (size 0x0)
bfd_x86_64_arch -> 53ee: const struct bfd_arch_info (size 0x50) -> 238: struct bfd_arch_info (size 0x50)
iamcu_elf32_vec -> afe9: const struct bfd_target (size 0x370) -> 286: struct bfd_target (size 0x370)
bfd_last_cache -> c9b6: struct bfd * (size 0x8) -> 1f4: struct bfd (size 0x6)
_CTF_NULLSTR -> 39bf: const char [0] (size 0x0)
[...]
Types:
1: long int (size 0x8)
[0x0] (ID 0x1) (kind 1) long int (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
2: ptrdiff_t (size 0x8) -> 1: long int (size 0x8)
[0x0] (ID 0x2) (kind 10) ptrdiff_t (aligned at 0x8)
3: long unsigned int (size 0x8)
[0x0] (ID 0x3) (kind 1) long unsigned int (aligned at 0x8, format 0x0, offset:bits 0x0:0x40)
4: size_t (size 0x8) -> 3: long unsigned int (size 0x8)
[0x0] (ID 0x4) (kind 10) size_t (aligned at 0x8)
5: int (size 0x4)
[0x0] (ID 0x5) (kind 1) int (aligned at 0x4, format 0x1, offset:bits 0x0:0x20)
6: wchar_t (size 0x4) -> 5: int (size 0x4)
[0x0] (ID 0x6) (kind 10) wchar_t (aligned at 0x4)
7: struct (size 0x20)
[0x0] (ID 0x7) (kind 6) struct (aligned at 0x8)
[0x0] (ID 0x8) (kind 1) long long int __max_align_ll (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
[0x80] (ID 0x9) (kind 2) long double __max_align_ld (aligned at 0x10, format 0x6, offset:bits 0x0:0x80)
8: long long int (size 0x8)
[0x0] (ID 0x8) (kind 1) long long int (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
9: long double (size 0x10)
[0x0] (ID 0x9) (kind 2) long double (aligned at 0x10, format 0x6, offset:bits 0x0:0x80)
a: struct (size 0x20)
[0x0] (ID 0xa) (kind 6) struct (aligned at 0x8)
[0x0] (ID 0x8) (kind 1) long long int __max_align_ll (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
[0x80] (ID 0x9) (kind 2) long double __max_align_ld (aligned at 0x10, format 0x6, offset:bits 0x0:0x80)
[...]
668: struct elf_internal_rela * (size 0x8) -> 632: struct elf_internal_rela (size 0x18)
[0x0] (ID 0x668) (kind 3) struct elf_internal_rela * (aligned at 0x8)
Strings:
0:
1: A
3: AOUTHDR
b: AOUTHDR64
15: AddressOfEntryPoint
29: Age
2d: B
(Yes, that's right: 6.4MiB of debuginfo, a meg of rodata, a meg of text, and
only 200KiB of CTF: 129 bytes per type after compression even if you include the
strtab, even counting all the structure member names! And this is with a *bad*
deduplicator that emits piles of dups for no especially good reason, and a type
table layout I already know how to shrink quite a lot more, and probably we
shouldn't be emitting variable entries for static variables either: those will
probably(?) go away when I do the function and variable info section linking
work. And I'm hoping to add lzma as an option as well.)
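As a rough cross-check of the figures quoted above, here is a minimal sketch in C. The helper names (range_len, bytes_per_type, print_ctf_stats) are made up for illustration and are not part of libctf. It also assumes that the highest type ID visible in the dump (0x668) is the total number of types, which the truncated output does not confirm.

```c
#include <assert.h>
#include <stdio.h>

/* Figures copied from the size -A and objdump --ctf output above.  */
#define CTF_SECTION_BYTES 212598UL  /* size of .ctf on disk (compressed) */
#define LAST_TYPE_ID      0x668UL   /* highest type ID shown; ASSUMED to be the type count */

/* Length of an inclusive [start, end] byte range, the form in which
   the CTF header dump prints its sections.  */
static unsigned long
range_len (unsigned long start, unsigned long end)
{
  assert (end >= start);
  return end - start + 1;
}

/* Whole bytes of CTF per type (integer division, rounds down).  */
static unsigned long
bytes_per_type (unsigned long section_bytes, unsigned long ntypes)
{
  assert (ntypes > 0);
  return section_bytes / ntypes;
}

/* Print the two derived figures; hypothetical helper for illustration.  */
static void
print_ctf_stats (void)
{
  printf ("type section: %#lx bytes uncompressed\n",
          range_len (0xee0UL, 0x133db3UL));
  printf ("bytes per type: %lu\n",
          bytes_per_type (CTF_SECTION_BYTES, LAST_TYPE_ID));
}
```

Calling print_ctf_stats() from a main() reproduces the type section length from the header dump and the per-type figure quoted in the text. If the real type count differs from 0x668, the per-type figure shifts accordingly.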
Hans-Peter Nilsson (1):
  libctf: make it compile for old glibc

Nick Alcock (18):
  libctf, include: ChangeLog format fixes
  libctf: allow the header to change between versions
  libctf, binutils: dump the CTF header
  libctf, bfd: fix ctf_bfdopen_ctfsect opening symbol and string
    sections
  libctf: add the object index and function index sections
  binutils: readelf: when dumping CTF, load strtab and symtab
    automatically
  binutils: objdump does not take --ctf-symbols or --ctf-strings options
  libctf: Add iteration over non-root types
  libctf: support getting strings from the ELF strtab
  libctf: write CTF files to memory, and CTF archives to fds
  libctf: fix memory leak on ctf_compress_write error path
  libctf: dump: support non-root type dumping
  libctf: dump: check the right error values when dumping functions
  libctf: add the ctf_link machinery
  libctf: map from old to corresponding newly-added types in
    ctf_add_type
  libctf: add linking of the variable section
  libctf: get rid of a disruptive public include of <sys/param.h>
  bfd: new functions for getting strings out of a strtab
  bfd, ld: add CTF section linking
--
2.22.0.238.g049a27acdc