[PATCH 00/19] CTF linking support

Nick Alcock nick.alcock@oracle.com
Tue Jul 16 18:05:00 GMT 2019

This is the beginning of support for linking CTF sections.  The type
deduplicator is not very good, and creation of "subsections" for types and
variables with conflicting definitions in distinct TUs hardly ever happens even
when it should.  mingw might well be in trouble too, because I am using
tmpfile() and am not pulling in gnulib: I haven't figured out how to do that in
an Automake-using binutils project yet :)  However, we only use tmpfile() when
we encounter conflicting definitions, so I figure we can fix that problem in the
next series.  It may well also be broken on non-ELF platforms again: I'm not
even sure what an example of such a platform is, let alone how to target one
without piles of system headers etc. that I don't have.  (See patch 19 for
questions on this.)

This is *not* ready to go upstream yet -- it is broken on non-ELF.  I mean to
fix that soon, but I thought I should provide some indication of the sort of
thing I'm doing, regardless.  (The code changes to work on non-ELF will
probably be quite small, moving a bit of code out of ldlang.c into the elf32
emulation via yet another callback.)

Also this is currently much too slow (10s to link ld itself, for instance), but
you only take the speed hit when CTF sections are present, and the delay is
entirely down to the thing I'm using as a deduplicator right now, which was
really not designed for it and will be rewritten.

Most of the job of linking is done by code in libctf itself, which I am the
maintainer of: I'm happy to have people look at it but I'm fairly confident
about what I'm doing in there. But the last patch... the last patch ties it
into the BFD and ld linking machinery. I really *need* review of that one,
and probably of the one before it as well since that too touches bfd. It
works where none of the dozens of other approaches I tried came close to
working, but I have no idea if it is the right way to do things at all.
Extensive questions for reviewers are in the commit log for that patch.

But despite the caveats in patch 19, in conjunction with a CTF-capable GCC,
it does work, without warnings or leaks, and merges the type sections down
quite satisfactorily:

oranix@loom 551 % size -A /tmp/gcc/bin/ld
/tmp/gcc/bin/ld  :
section                  size      addr
.interp                    28   4194984
.note.gnu.build-id         36   4195012
.note.ABI-tag              32   4195048
.gnu.hash                 172   4195080
.dynsym                  3264   4195256
.dynstr                  1116   4198520
.gnu.version              272   4199636
.gnu.version_r            144   4199912
.rela.dyn                 216   4200056
.rela.plt                2808   4200272
.init                      23   4206592
.plt                     1888   4206624
.text                  949825   4208512
.fini                       9   5158340
.rodata               1447296   5160960
.eh_frame_hdr           19172   6608256
.eh_frame              122760   6627432
.init_array                 8   6757888
.fini_array                 8   6757896
.dynamic                  480   6757904
.got                       16   6758384
.got.plt                  960   6758400
.data                   25136   6759360
.bss                    22976   6784512
.comment                   68         0
.debug_aranges           6320         0
.debug_info           4323894         0
.debug_abbrev          144836         0
.debug_line            750267         0
.debug_str             231568         0
.debug_loc            2069864         0
.debug_ranges          179760         0
.ctf                   212598   6815680
Total                10517820

oranix@loom 552 % PATH=/tmp/gcc/bin:$PATH objdump --ctf=.ctf /tmp/gcc/bin/ld

/tmp/gcc/bin/ld:     file format elf64-x86-64

Contents of CTF section .ctf:

    Magic number: dff2
    Version: 4 (CTF_VERSION_3)
    Flags: 0x1 (CTF_F_COMPRESS)
    Variable section:   0x0 -- 0xedf (0xee0 bytes)
    Type section:       0xee0 -- 0x133db3 (0x132ed4 bytes)
    String section:     0x133db4 -- 0x14cbfc (0x18e49 bytes)


  Data objects:

  Function objects:

    _xexit_cleanup ->  a7e: void (*)() (size 0x8) -> a7d: void () (size 0x0)
    bfd_x86_64_arch ->  53ee: const struct bfd_arch_info (size 0x50) -> 238: struct bfd_arch_info (size 0x50)
    iamcu_elf32_vec ->  afe9: const struct bfd_target (size 0x370) -> 286: struct bfd_target (size 0x370)
    bfd_last_cache ->  c9b6: struct bfd * (size 0x8) -> 1f4: struct bfd (size 0x6)
    _CTF_NULLSTR ->  39bf: const char [0] (size 0x0)

     1: long int (size 0x8)
        [0x0] (ID 0x1) (kind 1) long int  (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
     2: ptrdiff_t (size 0x8) -> 1: long int (size 0x8)
        [0x0] (ID 0x2) (kind 10) ptrdiff_t  (aligned at 0x8)
     3: long unsigned int (size 0x8)
        [0x0] (ID 0x3) (kind 1) long unsigned int  (aligned at 0x8, format 0x0, offset:bits 0x0:0x40)
     4: size_t (size 0x8) -> 3: long unsigned int (size 0x8)
        [0x0] (ID 0x4) (kind 10) size_t  (aligned at 0x8)
     5: int (size 0x4)
        [0x0] (ID 0x5) (kind 1) int  (aligned at 0x4, format 0x1, offset:bits 0x0:0x20)
     6: wchar_t (size 0x4) -> 5: int (size 0x4)
        [0x0] (ID 0x6) (kind 10) wchar_t  (aligned at 0x4)
     7: struct  (size 0x20)
        [0x0] (ID 0x7) (kind 6) struct   (aligned at 0x8)
            [0x0] (ID 0x8) (kind 1) long long int __max_align_ll (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
            [0x80] (ID 0x9) (kind 2) long double __max_align_ld (aligned at 0x10, format 0x6, offset:bits 0x0:0x80)
     8: long long int (size 0x8)
        [0x0] (ID 0x8) (kind 1) long long int  (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
     9: long double (size 0x10)
        [0x0] (ID 0x9) (kind 2) long double  (aligned at 0x10, format 0x6, offset:bits 0x0:0x80)
     a: struct  (size 0x20)
        [0x0] (ID 0xa) (kind 6) struct   (aligned at 0x8)
            [0x0] (ID 0x8) (kind 1) long long int __max_align_ll (aligned at 0x8, format 0x1, offset:bits 0x0:0x40)
            [0x80] (ID 0x9) (kind 2) long double __max_align_ld (aligned at 0x10, format 0x6, offset:bits 0x0:0x80)
     668: struct elf_internal_rela * (size 0x8) -> 632: struct elf_internal_rela (size 0x18)
        [0x0] (ID 0x668) (kind 3) struct elf_internal_rela *  (aligned at 0x8)

    1: A
    3: AOUTHDR
    b: AOUTHDR64
    15: AddressOfEntryPoint
    29: Age
    2d: B

(Yes, that's right: 7.3MiB of debuginfo, a meg of rodata, a meg of text, and
only 200KiB of CTF: about 4 bytes per type after compression even if you include the
strtab, even counting all the structure member names! And this is with a *bad*
deduplicator that emits piles of dups for no especially good reason, and a type
table layout I already know how to shrink quite a lot more, and probably we
shouldn't be emitting variable entries for static variables either: those will
probably(?) go away when I do the function and variable info section linking
work. And I'm hoping to add lzma as an option as well.)

Hans-Peter Nilsson (1):
  libctf: make it compile for old glibc

Nick Alcock (18):
  libctf, include: ChangeLog format fixes
  libctf: allow the header to change between versions
  libctf, binutils: dump the CTF header
  libctf, bfd: fix ctf_bfdopen_ctfsect opening symbol and string
  libctf: add the object index and function index sections
  binutils: readelf: when dumping CTF, load strtab and symtab
  binutils: objdump does not take --ctf-symbols or --ctf-strings options
  libctf: Add iteration over non-root types
  libctf: support getting strings from the ELF strtab
  libctf: write CTF files to memory, and CTF archives to fds
  libctf: fix memory leak on ctf_compress_write error path
  libctf: dump: support non-root type dumping
  libctf: dump: check the right error values when dumping functions
  libctf: add the ctf_link machinery
  libctf: map from old to corresponding newly-added types in
  libctf: add linking of the variable section
  libctf: get rid of a disruptive public include of <sys/param.h>
  bfd: new functions for getting strings out of a strtab
  bfd, ld: add CTF section linking

