[RFC] Symbol meta-information ELF extension

Jozef Lawrynowicz jozef.l@mittosystems.com
Tue Feb 18 10:25:00 GMT 2020


Hi,

I've been working with Texas Instruments to develop an ELF extension which
allows additional information about symbols ("symbol meta-information") to
be stored in ELF files, in a section called .symtab_meta.

The aim of symbol meta-information is to provide an extensible format for
propagating additional information about symbols from the source code through to
the link stage.

I hope I can get some feedback on this proposal, and its current implementation,
from the upstream community so I can make any adjustments required for the
eventual upstreaming of this feature.

TI spec'd out this functionality and are implementing it in their
soon-to-be-released ARM Clang/LLVM toolchain fork. They will also be looking to
upstream it to Clang/LLVM, so it is not GNU-specific.

So far, the behaviour of two new C/C++ attributes, "retain" and "location", and
their mapping to symbol meta-information have been spec'd out:
* "retain" - a symbol meta-information entry for the symbol informs the linker
  not to garbage collect the section containing the symbol, even if it appears
  unused.
* "location" - a symbol meta-information entry describes the VMA the
  linker should try to place the corresponding symbol at (similar to the "at"
  attribute in the ARM compiler).
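
To illustrate the intended effect of "retain", here is a minimal Python sketch
of section garbage collection in which retained symbols act as extra roots. It
is a toy model and not the LD implementation: the Section class, the
collect_live_sections function, and all section and symbol names are
hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Section:
    name: str
    symbols: List[str]  # symbols defined in this section
    references: List[str] = field(default_factory=list)  # sections it refers to


def collect_live_sections(sections: List[Section], entry_section: str,
                          retained_symbols: Set[str]) -> Set[str]:
    """Return the names of the sections that survive garbage collection."""
    by_name = {s.name: s for s in sections}
    # Roots: the entry section, plus every section that defines a symbol
    # with a "retain" meta-information entry (assumed behaviour).
    worklist = [entry_section]
    worklist += [s.name for s in sections
                 if any(sym in retained_symbols for sym in s.symbols)]
    live: Set[str] = set()
    while worklist:
        name = worklist.pop()
        if name in live:
            continue
        live.add(name)
        # Everything a live section refers to is live as well.
        worklist.extend(by_name[name].references)
    return live


sections = [
    Section(".text.main", ["main"], [".text.helper"]),
    Section(".text.helper", ["helper"]),
    Section(".text.unused", ["unused_fn"]),
    Section(".data.cal_table", ["cal_table"]),  # nothing refers to this
]

without_retain = collect_live_sections(sections, ".text.main", set())
with_retain = collect_live_sections(sections, ".text.main", {"cal_table"})
print(sorted(without_retain))
print(sorted(with_retain))
```

In this model .data.cal_table is discarded unless cal_table is marked as
retained, while .text.unused is discarded in both cases.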

For now, the attributes that make use of symbol meta-information are mainly
targeted at embedded software developers, who sometimes need to save objects
from garbage collection or place objects at specific addresses. Being able to
describe this behaviour in the source code without having to modify linker
scripts arguably improves the developer experience.

In the patches attached to this RFC, the general mechanism for the symbol
meta-information functionality has been implemented in the BFD library and
plumbed into much of Binutils:
* GAS supports the ".sym_meta_info" directive which describes a symbol
  meta-information entry. It consolidates symbol meta-information entries into a
  table which is placed in the .symtab_meta section of the output object file.
* LD supports the consolidation of symbol meta-information from input object
  files into a single table, and will save symbols with an accompanying
  SMK_RETAIN meta-information entry from garbage collection.
* objdump supports dumping the symbol meta-information table with the
  --symtab-meta option.
* objcopy maintains the integrity of symbol meta-information as the input BFD is
  modified.
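
The consolidation step performed by LD can be pictured with a small Python
sketch. Only the kind names SMK_RETAIN and SMK_LOCATION come from this RFC; the
MetaEntry layout, the consolidate function, the duplicate-handling policy and
the example symbols are assumptions made for illustration, and the real
.symtab_meta format is the one documented in the BFD manual.

```python
from enum import Enum
from typing import Dict, List, NamedTuple, Optional, Tuple


class Kind(Enum):
    # The kind names appear in the RFC; their numeric encodings do not,
    # so placeholder string values are used here.
    SMK_RETAIN = "retain"
    SMK_LOCATION = "location"


class MetaEntry(NamedTuple):
    symbol: str           # hypothetical: keyed by name, so global symbols only
    kind: Kind
    value: Optional[int]  # e.g. requested VMA for SMK_LOCATION, else None


def consolidate(tables: List[List[MetaEntry]]) -> List[MetaEntry]:
    """Merge per-object tables into one, dropping exact duplicates."""
    merged: Dict[Tuple[str, Kind], MetaEntry] = {}
    for table in tables:
        for entry in table:
            key = (entry.symbol, entry.kind)
            previous = merged.get(key)
            # Assumed policy: identical entries collapse, conflicts are errors.
            if previous is not None and previous.value != entry.value:
                raise ValueError(
                    f"conflicting {entry.kind.name} entries for {entry.symbol}")
            merged[key] = entry
    return list(merged.values())


a_o = [MetaEntry("isr_table", Kind.SMK_RETAIN, None),
       MetaEntry("boot_flag", Kind.SMK_LOCATION, 0x2000)]
b_o = [MetaEntry("isr_table", Kind.SMK_RETAIN, None),
       MetaEntry("cal_data", Kind.SMK_RETAIN, None)]

merged = consolidate([a_o, b_o])
for entry in merged:
    print(entry.symbol, entry.kind.name, entry.value)
```

The sketch identifies symbols by name, which is only meaningful for global
symbols; local symbols with the same name in different object files would need
to be tracked per input file.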

The following functionality is not yet implemented in this RFC, but I intend to
add it before submitting the patches for inclusion in Binutils:
* Support the placement of symbols at a VMA specified with an SMK_LOCATION
  meta-information entry in LD.
* Dump the symbol meta-information table in readelf.
* Support output formats other than ELF when linking ELF input object files with
  symbol meta-information.
* Clarify expected behaviour of objcopy/strip with "retained" symbols. Might
  need a new option to enable/disable the removal of these symbols.

I've added documentation to the individual tools as appropriate. A generic
description of the format of .symtab_meta and the entries within it is in
the ELF backends section of the BFD manual.

I've benchmarked the performance of LD when linking the Linux kernel
with/without symbol meta-information. There are ~120k symbols in the linked
Linux kernel. After adding 120k new, randomly named symbols, there is no
observable difference in link time between the case where these new symbols
have meta-information and the case where they do not.
I tested this in 3 combinations:
- 60k local symbols, 60k global symbols
- 120k local symbols
- 120k global symbols
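
The RFC does not say how the extra symbols were generated. One possible
approach, sketched below in Python, is to emit a C source file of randomly
named variables, optionally marked with __attribute__((retain)) so that the
attached GCC patch would emit a meta-information entry for each one. The
function names, the naming scheme and the use of int variables are assumptions.
Whether the compiler keeps unused static variables in the baseline build
depends on the compiler and its flags, so the resulting symbol table would need
to be checked.

```python
import random
import string


def random_name(rng: random.Random, length: int = 16) -> str:
    """Return a random C identifier with a fixed prefix."""
    return "sym_" + "".join(rng.choice(string.ascii_lowercase)
                            for _ in range(length))


def generate_source(n_local: int, n_global: int,
                    with_meta: bool = True, seed: int = 0) -> str:
    """Return C source defining n_local static and n_global global ints."""
    rng = random.Random(seed)
    names = set()
    while len(names) < n_local + n_global:  # retry on the rare collision
        names.add(random_name(rng))
    attribute = " __attribute__((retain))" if with_meta else ""
    lines = []
    # Sort so the output is reproducible for a given seed.
    for index, name in enumerate(sorted(names)):
        storage = "static " if index < n_local else ""
        lines.append(f"{storage}int {name}{attribute} = {index};")
    return "\n".join(lines) + "\n"


# The three combinations listed above, as (local, global) counts.
COMBINATIONS = [(60_000, 60_000), (120_000, 0), (0, 120_000)]

# Print a tiny sample rather than the full 120k-symbol file.
sample = generate_source(2, 2)
print(sample)
```

Calling generate_source with each pair in COMBINATIONS, once with
with_meta=True and once with with_meta=False, would give the inputs for a
with/without comparison of link times.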

I have some specific questions that I hope someone can offer suggestions on:
* Does this functionality need to be gated with a configure flag?
  I guess that partly depends on whether a maintainer decides it should be on
  or off by default.
* I've positioned the .symtab_meta section to be emitted immediately after
  .symtab. I couldn't find anything in the ELF spec that enforces the positions
  of .symtab relative to .shstrtab and .strtab, but I recognize that these
  sections' relative positions have been the same for a long time. Are there any
  possible issues relating to this?

Also, any general feedback on the proposal and/or implementation is welcome.

I've additionally attached a rough GCC patch which emits .sym_meta_info
directives when __attribute__((retain)) is used on declarations of functions or
data.

I've built the attached patches with --enable-targets=all and confirmed there
are no issues there.
I've regtested and confirmed the new tests work for msp430-elf, arm-eabi,
x86_64-pc-linux-gnu, and also built and regtested the native configuration on an
i686-pc-linux-gnu host.
I additionally regtested for i386-pe, to check a non-ELF target.

Thanks,
Jozef
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-BFD-Support-symbol-meta-information-ELF-extension.patch
Type: text/x-patch
Size: 51819 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/binutils/attachments/20200218/dba4ceaf/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-TESTSUITE-Add-symbol-meta-information-tests.patch
Type: text/x-patch
Size: 58599 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/binutils/attachments/20200218/dba4ceaf/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gcc-support-retain-attribute-meta-info.patch
Type: text/x-patch
Size: 4262 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/binutils/attachments/20200218/dba4ceaf/attachment-0002.bin>

