RFC: Program Properties
Nick Clifton
nickc@redhat.com
Fri Jan 1 00:00:00 GMT 2016
Hi Guys,
It looks like we are all working towards a common goal, which is good.
H.J. - I like your program property scheme, especially the idea of having
a lightweight, allocatable section that can be quickly parsed by the loader.
I assume that the NT_GNU_PROPERTY_TYPE_0 note type also serves as a version
indicator? I.e. future versions of the specification would use a different
value to indicate newer features?
One thing that I would like to propose is an extension to the scheme to add
a second, larger note section that is not allocatable, but which instead
contains information to be parsed by static tools. In particular this
section would contain information about the tools used to build the binary
and the security features enabled (or disabled). Also this section would
be able to discriminate this information on a per-symbol basis if necessary,
so that multiple, conflicting properties can be recorded for a single file.
I have a preliminary proposal to implement this second section (see below)
and I would be very interested in any thoughts that you might have.
Cheers
Nick
The purpose of this non-allocatable note section is to provide a way for
package maintainers and distributions to answer questions about the binaries
in their distribution, especially security-related questions. Here is the
current, preliminary specification:
* The information is stored in a new section in the file using the ELF
NOTE format. Creator tools (compilers, assemblers etc) place the notes
into the binary files. Consumer tools (none written yet, but readelf
and/or objdump could be enhanced for this purpose) read the notes and
answer questions about the binaries concerned. Static linkers need
special care to handle merging of the notes.
* The information is stored in a section called .gnu.build.attributes.
(The name can be changed - it is basically irrelevant anyway, it is
the new section flag (defined below) that matters). The section has
the SHT_NOTE type and a new section flag set: SHF_GNU_BUILD_ATTRIBUTES.
(Suggested value: 0x00100000). This flag indicates that the notes need
special handling when they are merged (see below).
The sh_link field should be set to contain the index of the symbol table
section. If this field is 0 then the consumer should assume that
the first section of type SHT_SYMTAB in the section headers is
the symbol table being used.
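As an illustration of how a consumer might locate the section and its
symbol table, here is a minimal Python sketch. It assumes the section
headers have already been parsed into dictionaries keyed by the usual ELF
field names; that representation and both function names are hypothetical
and not part of the proposal. SHT_SYMTAB (2) and SHT_NOTE (7) are the
standard ELF values, and the flag is the suggested value given above.

```python
SHT_SYMTAB = 2   # standard ELF section type: symbol table
SHT_NOTE = 7     # standard ELF section type: notes
SHF_GNU_BUILD_ATTRIBUTES = 0x00100000  # suggested value from this proposal


def is_build_attribute_section(section):
    """True if a parsed section header looks like a build attribute section."""
    return (section["sh_type"] == SHT_NOTE
            and bool(section["sh_flags"] & SHF_GNU_BUILD_ATTRIBUTES))


def symtab_index(note_section, sections):
    """Return the section index of the symbol table used by the notes.

    A non-zero sh_link is used directly. Otherwise fall back to the first
    SHT_SYMTAB section, or None if the file has no symbol table.
    """
    if note_section["sh_link"] != 0:
        return note_section["sh_link"]
    for index, section in enumerate(sections):
        if section["sh_type"] == SHT_SYMTAB:
            return index
    return None
```

The check deliberately keys on the flag rather than the section name, since
the name is stated to be irrelevant.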
* The specification breaks the name/description convention of ELF
notes to instead use a key/value/applies-to list. (This is not a
problem as we are only breaking a convention not a requirement of
the ELF NOTE specification). The type of the note is the key. The
name of the note is the value and the description field is the
applies-to list.
By default the description field contains the filename of the source
file that was used to produce the binary. (FIXME: Absolute pathname
? Relative pathname ? Just the filename with no path ?) This
indicates that the key/value pair applies to all symbols in the
file. The length of this string must *not* be a multiple of 4 (with
the terminating NUL byte included). If necessary the filename
should be padded with an extra NUL byte. (Note - this padding byte
is separate from the padding bytes used to align the description
field to its normal boundary).
This restriction is so that a description containing symbol names
(see below) can be distinguished from a description containing a
file name.
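The length rule can be sketched in Python as follows. Both function names
are hypothetical, and the UTF-8 encoding of the filename is an assumption
(the proposal does not specify one). The classifier also covers the empty
description, which this specification treats as a separate special case.

```python
def encode_filename_description(filename):
    """Encode a filename so that its length is never a multiple of 4."""
    data = filename.encode("utf-8") + b"\0"   # terminating NUL is counted
    if len(data) % 4 == 0:
        data += b"\0"                         # the extra NUL required by the rule
    return data


def classify_description(description):
    """Classify a raw description field by its length alone."""
    if len(description) == 0:
        return "inherit"     # empty: reuse the filename of a preceding note
    if len(description) % 4 != 0:
        return "filename"    # applies to every symbol in the file
    return "symbols"         # a list of 4-byte or 8-byte symbol indices
```

The length tested here is the description size recorded in the note, before
the usual alignment padding is added.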
If a key/value pair applies to just some of the symbols in a file,
then the description instead contains a list of 4-byte or 8-byte wide
numbers. These are indices into the symbol table, (pointed to by
the sh_link field of the section header). Notes:
+ In unrelocated files the stored number should instead be zero, with a
relocation present to set the actual value once the file is
linked. FIXME: Unable to implement at the moment. Instead
the relocation generated by the assembler evaluates to the *value*
of the symbol, not its index in the ELF symbol table section.
May have to change this spec if I cannot find a way around
this.
+ The numbers are stored in the same endian format as that
specified in the EI_DATA field of the ELF header of the file
containing the note.
+ The symbol table is indexed rather than the string table because
consumers are most likely to be interested in the symbol as a
whole, not just its name. (FIXME: Is this true ?)
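A consumer could decode such a description roughly as shown below. This is
only a sketch: the function name is hypothetical, and because the proposal
does not say how a consumer learns whether the numbers are 4 or 8 bytes
wide, the width is simply passed in by the caller (deriving it from the ELF
class would be one guess). ELFDATA2LSB (1) and ELFDATA2MSB (2) are the
standard EI_DATA values.

```python
import struct

ELFDATA2LSB = 1  # standard EI_DATA value: little endian
ELFDATA2MSB = 2  # standard EI_DATA value: big endian


def decode_symbol_indices(description, ei_data, width):
    """Decode a description field into a list of symbol table indices."""
    if width not in (4, 8):
        raise ValueError("index width must be 4 or 8 bytes")
    if len(description) % width != 0:
        raise ValueError("description is not a whole number of indices")
    if ei_data == ELFDATA2LSB:
        order = "<"
    elif ei_data == ELFDATA2MSB:
        order = ">"
    else:
        raise ValueError("unknown EI_DATA value")
    code = "I" if width == 4 else "Q"   # unsigned 32-bit or 64-bit
    count = len(description) // width
    return list(struct.unpack(f"{order}{count}{code}", description))
```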
An empty description field is a special case. It should be treated
as if it had the same filename as the nearest preceding version
note. (See NT_GNU_BUILD_ATTRIBUTE_VERSION below). FIXME: This
assumes that a linker will preserve the order of notes when
linking. Does this actually happen ?
Multiple notes of the same key can exist, providing that they have
different values and that their applies-to lists do not intersect.
(FIXME: is this restriction necessary ? Perhaps there are times when
a symbol can have multiple values for the same key).
Where notes for the same key exist in both symbol index form and
filename form, the symbol index form takes precedence. Any symbol
in the given file not explicitly indexed by one of the notes will
take its value from the note using the filename form.
At most one note for a given key can exist containing a filename
rather than symbol indices. If this rule is broken then this
indicates that the file has been created by a linker that has not
been enhanced to support this specification. In such cases all
notes containing symbol indices should be ignored.
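The precedence rules above might be applied by a consumer along these
lines. This is a sketch under assumptions: notes are modelled as
(key, value, applies_to) tuples where applies_to is either a filename
string or a set of symbol indices, and the function name and the option
strings in the usage example are purely illustrative. Returning None when
no unique file-level note exists is a choice made here, not something the
proposal requires.

```python
def value_for_symbol(notes, key, symbol_index):
    """Return the value of `key` that applies to one symbol, or None."""
    relevant = [note for note in notes if note[0] == key]
    file_notes = [n for n in relevant if isinstance(n[2], str)]
    symbol_notes = [n for n in relevant if not isinstance(n[2], str)]

    if len(file_notes) > 1:
        # More than one filename-form note for the key: the file came from
        # an unenhanced linker, so symbol indices cannot be trusted.
        symbol_notes = []

    # The symbol index form takes precedence over the filename form.
    for _key, value, indices in symbol_notes:
        if symbol_index in indices:
            return value

    if len(file_notes) == 1:
        return file_notes[0][1]
    return None   # no note, or an ambiguous set of file-level notes


# Usage: symbols 4 and 9 override the file-wide value.
example_notes = [
    (0x102, "-fstack-protector", "foo.c"),
    (0x102, "-fno-stack-protector", frozenset({4, 9})),
]
print(value_for_symbol(example_notes, 0x102, 4))
print(value_for_symbol(example_notes, 0x102, 3))
```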
* When the linker merges two or more files containing these notes it
should ensure that the above rules are maintained, and that the
notes are merged appropriately.
The linker will create a new version note (see the definition of
NT_GNU_BUILD_ATTRIBUTE_VERSION below), with the output filename as
its description, and the name set to any version of this
specification that it chooses. Any input version notes that match
this version are discarded. Other version notes are preserved and
included in the output file.
When notes are merged the following rules apply:
1. If all input notes of a given type just contain filenames and
they all have the same value string then a single output note is
created with this type/value and the output filename as its
description. Otherwise:
2. If rule 1 would match except for one or more symbol containing
notes then rule 1 is executed, but the symbol containing notes
are also preserved and copied to the output. If this is a
relocatable link then the relocations associated with the symbol
indices should also be updated. Otherwise:
3. [This rule triggers if there are filename containing notes with
different value strings]. The linker chooses one of the input
value strings to be the default for the output and creates an
output note using this value. (Presumably the linker will
choose the value with the most matching input files). Input
notes containing filenames but with a value that does not match
this output value must be converted into symbol containing notes
listing *all* of the symbols in the input file. Failure to do
this breaks the requirement that there only be one filename
containing output note for the given key.
If this is a final link, then relocations on the notes should of
course be resolved.
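To make the three merge rules concrete, here is a rough Python sketch for
the notes of a single key. It makes several assumptions that the proposal
does not state: each input file contributes exactly one filename-form value
for the key, all symbol indices have already been remapped to the output
symbol table, and the dictionary layout and function name are hypothetical.
The default is chosen as the most common input value, as presumed in rule 3.

```python
from collections import Counter


def merge_notes_for_key(inputs, output_filename):
    """Merge the notes of one key from several input files.

    Each input is a dict with:
      "file_value":   value of the filename-form note in that input
      "symbol_notes": list of (value, set of symbol indices)
      "all_symbols":  set of every symbol index from that input
    Returns a list of (value, applies_to) output notes.
    """
    if not inputs:
        return []
    counts = Counter(item["file_value"] for item in inputs)
    default_value = counts.most_common(1)[0][0]

    # Rules 1 and 3: exactly one filename-form note in the output.
    merged = [(default_value, output_filename)]

    for item in inputs:
        covered = set()
        for _value, indices in item["symbol_notes"]:
            covered |= set(indices)
        if item["file_value"] != default_value:
            # Rule 3: convert the non-matching file note into a symbol note
            # listing the input's symbols. Symbols that already have their
            # own note are left out so applies-to lists do not intersect.
            remaining = set(item["all_symbols"]) - covered
            if remaining:
                merged.append((item["file_value"], remaining))
        # Rule 2: symbol-containing notes are preserved.
        for value, indices in item["symbol_notes"]:
            merged.append((value, set(indices)))
    return merged
```

When every input has the same file value and no symbol notes, the result
collapses to the single output note described by rule 1.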
The linker is also able to create and insert its own notes, e.g. to
indicate that -z relro is enabled.
Linkers that have not been enhanced to support this proposal will
simply concatenate the notes. (They may also eliminate duplicate
notes, although this is not guaranteed. They may also sort the
notes which would break the use of empty description fields, as
mentioned above). In this case the output file is likely to contain
multiple notes with the same key/value pair. Consumers can detect
this situation by noticing that there is no
NT_GNU_BUILD_ATTRIBUTE_VERSION note with the output file name, and hence
deduce that any notes containing symbol indices are broken. (The
linker will not have updated the indices when merging the notes).
Although they then only support file-level granularity, these
notes may still prove useful.
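The detection step could look something like this in Python. Notes are
again modelled as hypothetical (key, value, applies_to) tuples, and the
filename comparison is an exact string match, which is an assumption given
that the form of the recorded pathname is still an open FIXME.

```python
NT_GNU_BUILD_ATTRIBUTE_VERSION = 0x100  # value proposed in this specification


def symbol_indices_are_trustworthy(notes, output_filename):
    """True if an enhanced linker appears to have merged these notes.

    The test is the presence of a version note whose description names
    the output file. Without one, notes containing symbol indices should
    be treated as broken and only file-level information used.
    """
    return any(
        key == NT_GNU_BUILD_ATTRIBUTE_VERSION and applies_to == output_filename
        for key, _value, applies_to in notes
    )
```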
* Three new note types are defined (so far):
Type: NT_GNU_BUILD_ATTRIBUTE_VERSION (0x100)
Name: A string identifying the version of this specification
that is implemented in the accompanying notes. Currently set
to "1.0".
Type: NT_GNU_BUILD_ATTRIBUTE_CREATOR (0x101)
Name: A string identifying the tool that created the symbols and
their associated code, including its name, version and date, e.g.:
"gcc (GCC) 6.2.1 20160916 (Red Hat 6.2.1-2)"
Type: NT_GNU_BUILD_ATTRIBUTE_OPTIONS (0x102)
Name: A string identifying the *significant* compile time options
affecting the specified symbols, i.e. those that affect ABI,
security, etc.
Note: the selection of *significant* compile time options may be
subject to debate, but the actual choice can vary over time without
affecting the current proposal.
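For reference, a creator tool might serialise one of these notes as in the
sketch below. The note type values are the ones proposed above; the helper
names are hypothetical. The layout follows the usual ELF note record (three
4-byte words for name size, description size and type, then the name and
description each padded to a 4-byte boundary), with the name carrying the
value and the description carrying the applies-to data as this proposal
suggests. The 4-byte alignment is an assumption taken from common GNU note
practice.

```python
import struct

NT_GNU_BUILD_ATTRIBUTE_VERSION = 0x100
NT_GNU_BUILD_ATTRIBUTE_CREATOR = 0x101
NT_GNU_BUILD_ATTRIBUTE_OPTIONS = 0x102


def pad4(data):
    """Pad a byte string with NULs up to the next 4-byte boundary."""
    return data + b"\0" * (-len(data) % 4)


def pack_note(note_type, value, description, little_endian=True):
    """Serialise one build attribute note.

    note_type   -> the key (ELF note type)
    value       -> stored in the ELF note name field, NUL terminated
    description -> raw applies-to bytes (filename form or symbol indices)
    """
    name = value.encode("utf-8") + b"\0"
    order = "<" if little_endian else ">"
    # The recorded sizes exclude the alignment padding.
    header = struct.pack(order + "III", len(name), len(description), note_type)
    return header + pad4(name) + pad4(description)


# Usage: a version note applying to the whole of "a.c". The description is
# already padded with an extra NUL so its length is not a multiple of 4.
version_note = pack_note(NT_GNU_BUILD_ATTRIBUTE_VERSION, "1.0", b"a.c\0\0")
```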