RFC: Program Properties

Nick Clifton nickc@redhat.com
Fri Jan 1 00:00:00 GMT 2016


Hi Guys,

  It looks like we are all working towards a common goal, which is good.

  H.J. - I like your program property scheme, especially the idea of having
  a lightweight, allocatable section that can be quickly parsed by the loader.

  I assume that the NT_GNU_PROPERTY_TYPE_0 note type also serves as a version
  indicator ?  That is, future versions of the specification would use a
  different value to indicate newer features ?

  One thing that I would like to propose is an extension to the scheme to add
  a second, larger note section that is not allocatable, but which instead
  contains information to be parsed by static tools.  In particular this
  section would contain information about the tools used to build the binary
  and the security features enabled (or disabled).  Also this section would
  be able to discriminate this information on a per-symbol basis if necessary,
  so that multiple, conflicting properties can be recorded for a single file.

  I have a preliminary proposal to implement this second section (see below)
  and I would be very interested in any thoughts that you might have.

Cheers
  Nick

The purpose of this non-allocatable note section is to provide a way for
package maintainers and distributions to answer questions about the binaries
in their distribution, especially security related questions.  Here is the
current, preliminary, specification:

* The information is stored in a new section in the file using the ELF
  NOTE format.  Creator tools (compilers, assemblers etc) place the notes
  into the binary files.  Consumer tools (none written yet, but readelf 
  and/or objdump could be enhanced for this purpose) read the notes and
  answer questions about the binaries concerned.  Static linkers need
  special care to handle merging of the notes.

* The information is stored in a section called .gnu.build.attributes.
  (The name can be changed - it is basically irrelevant anyway, it is
  the new section flag (defined below) that matters).  The section has
  the SHT_NOTE type and a new section flag set: SHF_GNU_BUILD_ATTRIBUTES.
  (Suggested value: 0x00100000).  This flag indicates that the notes
  need special handling when they are merged (see below).

  The sh_link field should be set to contain the index of the symbol
  table section.  If this field is 0 then the consumer should assume
  that the first section of type SHT_SYMTAB in the section headers is
  the symbol table being used.
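
  A minimal sketch of the sh_link fallback described above, in Python.
  The function name symtab_index and the idea of passing the section
  types as a plain list are illustrative assumptions; only SHT_SYMTAB
  (value 2) comes from the ELF specification.

```python
SHT_SYMTAB = 2  # standard ELF section type for a symbol table


def symtab_index(note_sh_link, section_types):
    """Return the index of the symbol table section a consumer should use.

    note_sh_link  -- sh_link of the build attributes section
    section_types -- sh_type of every section header, in header order
    """
    if note_sh_link != 0:
        # The creator recorded the symbol table explicitly.
        return note_sh_link
    # Fallback: the first SHT_SYMTAB section in the section headers.
    for index, sh_type in enumerate(section_types):
        if sh_type == SHT_SYMTAB:
            return index
    return None  # no symbol table at all (hypothetical handling)
```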

* The specification breaks the name/description convention of ELF
  notes to instead use a key/value/applies-to list.  (This is not a
  problem as we are only breaking a convention not a requirement of
  the ELF NOTE specification).  The type of the note is the key.  The
  name of the note is the value and the description field is the
  applies-to list.
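
  To illustrate the key/value/applies-to mapping, here is a rough
  Python sketch that walks the raw bytes of a note section.  It assumes
  the customary ELF note layout (three 4-byte words for namesz, descsz
  and type, with name and description each padded to a 4-byte
  boundary) and ASCII value strings.  The function name parse_notes is
  hypothetical.

```python
import struct


def parse_notes(data, little_endian=True):
    """Split a note section into (key, value, applies_to) tuples.

    key        -- the note type
    value      -- the note name, decoded with its NUL terminator removed
    applies_to -- the raw description bytes (descsz bytes, no padding)
    """
    order = "<" if little_endian else ">"
    notes = []
    offset = 0
    while offset + 12 <= len(data):
        namesz, descsz, key = struct.unpack_from(order + "III", data, offset)
        offset += 12
        name = data[offset:offset + namesz]
        offset += (namesz + 3) & ~3      # skip name plus alignment padding
        desc = data[offset:offset + descsz]
        offset += (descsz + 3) & ~3      # skip description plus padding
        notes.append((key, name.rstrip(b"\0").decode("ascii"), desc))
    return notes
```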

  By default the description field contains the filename of the source
  file that was used to produce the binary.  (FIXME: Absolute pathname
  ?  Relative pathname ?  Just the filename with no path ?)  This
  indicates that the key/value pair applies to all symbols in the
  file.  The length of this string must *not* be a multiple of 4 (with
  the terminating NUL byte included).  If necessary the filename
  should be padded with an extra NUL byte.  (Note - this padding byte
  is separate from the padding bytes used to align the description
  field to its normal boundary).

  This restriction is so that a description containing symbol names
  (see below) can be distinguished from a description containing a
  file name.
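
  The length rule can be sketched as follows.  This is only an
  illustration: the function names are made up, UTF-8 encoding of the
  filename is an assumption, and desc is taken to be exactly descsz
  bytes (that is, without the normal alignment padding).

```python
def encode_filename(filename):
    """Encode a filename so that it cannot be mistaken for a symbol list."""
    data = filename.encode("utf-8") + b"\0"   # terminating NUL
    if len(data) % 4 == 0:
        data += b"\0"                         # the extra NUL padding byte
    return data


def classify_description(desc):
    """Decide which form a description field is in."""
    if len(desc) == 0:
        return "inherit"    # empty: special case, see the text
    if len(desc) % 4 != 0:
        return "filename"
    return "symbols"        # a list of 4-byte or 8-byte numbers
```

  For example "abc" plus its NUL is 4 bytes long, so it gains one extra
  NUL and is stored as 5 bytes.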

  If a key/value pair applies to just some of the symbols in a file,
  then the description instead contains a list of 4-byte or 8-byte wide 
  numbers.  These are indices into the symbol table, (pointed to by 
  the sh_link field of the section header).  Notes:
  
    + In unrelocated files the stored number should instead be zero,
      with a relocation present to set the actual value once the file
      is linked.  FIXME: Unable to implement at the moment.  Instead
      the relocation generated by the assembler evaluates to the
      *value* of the symbol, not its index in the ELF symbol table
      section.  May have to change this spec if I cannot find a way
      around this.

    + The numbers are stored in the same endian format as that
      specified in the EI_DATA field of the ELF header of the file
      containing the note.

    + The symbol table is indexed rather than the string table because
      consumers are most likely to be interested in the symbol as a
      whole, not just its name.  (FIXME: Is this true ?)
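
  A possible way to decode such a description, sketched in Python.  The
  ELFDATA2LSB and ELFDATA2MSB values are the standard EI_DATA
  encodings.  The text above does not say how a consumer chooses
  between 4-byte and 8-byte numbers, so the width is simply a
  parameter here, and the function name is hypothetical.

```python
import struct

ELFDATA2LSB = 1  # EI_DATA: little endian
ELFDATA2MSB = 2  # EI_DATA: big endian


def decode_symbol_indices(desc, ei_data, width):
    """Decode a description holding a list of symbol table indices."""
    if width not in (4, 8):
        raise ValueError("indices are 4 or 8 bytes wide")
    if ei_data not in (ELFDATA2LSB, ELFDATA2MSB):
        raise ValueError("unknown EI_DATA value")
    if len(desc) % width != 0:
        raise ValueError("description is not a whole number of indices")
    order = "<" if ei_data == ELFDATA2LSB else ">"
    code = "I" if width == 4 else "Q"
    count = len(desc) // width
    return list(struct.unpack(f"{order}{count}{code}", desc))
```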

  An empty description field is a special case.  It should be treated
  as if it had the same filename as the nearest preceding version
  note.  (See NT_GNU_BUILD_ATTRIBUTE_VERSION below).  FIXME: This
  assumes that a linker will preserve the order of notes when
  linking.  Does this actually happen ?
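
  Assuming the note order is preserved, a consumer could resolve empty
  descriptions roughly like this.  Notes are modelled as
  (key, value, description) tuples, which is purely an illustrative
  choice, as is the function name.

```python
NT_GNU_BUILD_ATTRIBUTE_VERSION = 0x100


def fill_empty_descriptions(notes):
    """Give each empty description the filename of the nearest
    preceding version note.  Relies on the notes being in file order."""
    current = b""
    resolved = []
    for key, value, desc in notes:
        if key == NT_GNU_BUILD_ATTRIBUTE_VERSION and desc:
            current = desc          # remember the most recent filename
        elif not desc:
            desc = current          # inherit it
        resolved.append((key, value, desc))
    return resolved
```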

  Multiple notes of the same key can exist, providing that they have
  different values and that their applies-to lists do not intersect.
  (FIXME: is this restriction necessary ?  Perhaps there are times when
  a symbol can have multiple values for the same key).

  Where notes for the same key exist in both symbol index form and
  filename form, the symbol index form takes precedence.  Any symbol
  in the given file not explicitly indexed by one of the notes will
  take its value from the note using the filename form.

  At most one note for a given key can exist containing a filename
  rather than symbol indices.  If this rule is broken then this
  indicates that the file has been created by a linker that has not
  been enhanced to support this specification.  In such cases all
  notes containing symbol indices should be ignored.
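
  The lookup rules above might be implemented along these lines.  In
  this sketch a note is a (key, value, applies_to) tuple where
  applies_to is either a filename string or a list of symbol indices.
  That representation and the function name are assumptions.  The
  text does not say which value applies when several filename notes
  exist for one key, so the sketch just returns all of them as
  candidates.

```python
def candidate_values(notes, key, symbol_index):
    """Return the possible values of `key` for one symbol."""
    matching = [n for n in notes if n[0] == key]
    file_notes = [n for n in matching if isinstance(n[2], str)]
    symbol_notes = [n for n in matching if not isinstance(n[2], str)]

    if len(file_notes) > 1:
        # Produced by an unenhanced linker: ignore all symbol index notes.
        return [value for _, value, _ in file_notes]

    # The symbol index form takes precedence over the filename form.
    for _, value, indices in symbol_notes:
        if symbol_index in indices:
            return [value]
    return [value for _, value, _ in file_notes]
```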


* When the linker merges two or more files containing these notes it
  should ensure that the above rules are maintained, and that the
  notes are merged appropriately.

  The linker will create a new version note (see the definition of
  NT_GNU_BUILD_ATTRIBUTE_VERSION below), with the output filename as
  its description, and the name set to any version of this
  specification that it chooses.  Any input version notes that match
  this version are discarded.  Other version notes are preserved and
  included in the output file.

  When notes are merged the following rules apply:

   1. If all input notes of a given type just contain filenames and
      they all have the same value string then a single output note is
      created with this type/value and the output filename as its
      description.  Otherwise:

   2. If rule 1 would match except for one or more symbol containing
      notes then rule 1 is executed, but the symbol containing notes
      are also preserved and copied to the output.  If this is a
      relocatable link then the relocations associated with the symbol
      indices should also be updated.  Otherwise:

   3. [This rule triggers if there are filename containing notes with
      different value strings].  The linker chooses one of the input
      value strings to be the default for the output and creates an
      output note using this value.  (Presumably the linker will
      choose the value with the most matching input files).  Input
      notes containing filenames but with a value that does not match
      this output value must be converted into symbol containing notes
      listing *all* of the symbols in the input file.  Failure to do
      this breaks the requirement that there only be one filename
      containing output note for the given key.

  If this is a final link, then relocations on the notes should of
  course be resolved.
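
  A rough Python sketch of the three merge rules for a single key.  It
  makes several assumptions that are not part of the text above: each
  input file is described by a dict, all symbol indices have already
  been remapped to the output symbol table, the default in rule 3 is
  the most common value, and symbols already covered by an input's own
  symbol notes are left out of the converted note so that applies-to
  lists do not intersect.

```python
from collections import Counter


def merge_key(inputs, output_name):
    """Merge the notes of one key; returns (value, applies_to) tuples.

    Each input is a dict with:
      file_value   -- value of its filename note, or None
      symbol_notes -- list of (value, [symbol indices])
      all_symbols  -- every symbol index belonging to that input
    """
    output = []
    file_values = [i["file_value"] for i in inputs
                   if i["file_value"] is not None]
    default = None
    if file_values:
        # Rules 1 and 2: one shared value.  Rule 3: pick the majority.
        default = Counter(file_values).most_common(1)[0][0]
        output.append((default, output_name))
    for item in inputs:
        # Rule 2: symbol containing notes are preserved.
        output.extend((v, list(idx)) for v, idx in item["symbol_notes"])
        # Rule 3: a non-default filename note becomes a symbol note.
        if item["file_value"] is not None and item["file_value"] != default:
            covered = {s for _, idx in item["symbol_notes"] for s in idx}
            rest = [s for s in item["all_symbols"] if s not in covered]
            if rest:
                output.append((item["file_value"], rest))
    return output
```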

  The linker is also able to create and insert its own notes, for
  example to indicate that -z relro is enabled.

  Linkers that have not been enhanced to support this proposal will
  simply concatenate the notes.  (They may also eliminate duplicate
  notes, although this is not guaranteed.  They may also sort the
  notes which would break the use of empty description fields, as
  mentioned above).  In this case the output file is likely to contain
  multiple notes with the same key/value pair.  Consumers can detect
  this situation by noticing that there is no
  NT_GNU_BUILD_ATTRIBUTE_VERSION note with the output file name, and hence
  deduce that any notes containing symbol indices are broken.  (The
  linker will not have updated the indices when merging the notes).
  Despite only supporting a file level granularity however, these
  notes may still prove useful.
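
  One way a consumer might perform this check is sketched below.
  Since the form of the recorded filename is still an open question,
  the sketch compares base names only, which is an assumption, as are
  the function name and the (key, value, description) tuple layout.
  A file that was renamed after linking would defeat this check.

```python
import posixpath

NT_GNU_BUILD_ATTRIBUTE_VERSION = 0x100


def linked_by_enhanced_linker(notes, output_path):
    """True if some version note names the file being examined."""
    wanted = posixpath.basename(output_path)
    for key, _value, desc in notes:
        if key != NT_GNU_BUILD_ATTRIBUTE_VERSION:
            continue
        recorded = desc.rstrip(b"\0").decode("utf-8", "replace")
        if posixpath.basename(recorded) == wanted:
            return True
    # No match: treat notes containing symbol indices as broken.
    return False
```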


* Three new note types are defined (so far):

  Type: NT_GNU_BUILD_ATTRIBUTE_VERSION  (0x100)
  Name: A string identifying the version of this specification
        that is implemented in the accompanying notes.  Currently set
        to "1.0".

  Type: NT_GNU_BUILD_ATTRIBUTE_CREATOR  (0x101)
  Name: A string identifying the tool that created the symbols and
        their associated code, including its name, version and date, eg:
        "gcc (GCC) 6.2.1 20160916 (Red Hat 6.2.1-2)"

  Type: NT_GNU_BUILD_ATTRIBUTE_OPTIONS  (0x102)
  Name: A string identifying the *significant* compile time options
        affecting the specified symbols.  Ie those that affect ABI,
        security, etc. 

        Note: selection of *significant* compile time options may be
        subject to debate.  The actual choice can vary over time,
        however, and this does not affect the current proposal.
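
  For illustration, a creator tool could emit these notes along the
  following lines.  The three type values are the ones proposed above.
  Everything else (the function names, 4-byte alignment of the name
  and description, ASCII value strings) is an assumption of this
  sketch.

```python
import struct

NT_GNU_BUILD_ATTRIBUTE_VERSION = 0x100
NT_GNU_BUILD_ATTRIBUTE_CREATOR = 0x101
NT_GNU_BUILD_ATTRIBUTE_OPTIONS = 0x102


def pad4(data):
    """Pad with NUL bytes up to the next 4-byte boundary."""
    return data + b"\0" * (-len(data) % 4)


def pack_note(note_type, value, description, little_endian=True):
    """Build one note: type is the key, name is the value and the
    description is the (already encoded) applies-to list."""
    name = value.encode("ascii") + b"\0"
    order = "<" if little_endian else ">"
    header = struct.pack(order + "III", len(name), len(description),
                         note_type)
    return header + pad4(name) + pad4(description)


version_note = pack_note(NT_GNU_BUILD_ATTRIBUTE_VERSION, "1.0", b"foo.c\0")
```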


