1 What is Binary Annotation?

Binary Annotation is a method for recording information about an application inside the application itself. It is an implementation of the Watermark specification defined here: https://fedoraproject.org/wiki/Toolchain/Watermark

Although mainly focused on recording security information, the system can be used to record any kind of data, even data not related to the application. One of the main goals of the system, however, is the ability to specify the address range over which a given piece of information is valid. For example, it is possible to specify that all of a program was compiled with the -O2 option except for one special function, which was compiled with -O0 instead.
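The idea of range-scoped information can be illustrated with a small Python model. This is only a sketch: the RangedNote class, the lookup function, the addresses and the rule that the narrowest enclosing range wins are all assumptions made for illustration, not the format or the semantics defined by the Watermark specification.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class RangedNote:
    """One piece of information plus the address range it covers (hypothetical model)."""
    start: int  # first address covered (inclusive)
    end: int    # one past the last address covered (exclusive)
    name: str   # what is being recorded, e.g. "optimization"
    value: str  # the recorded value, e.g. "-O2"


def lookup(notes: Iterable[RangedNote], name: str, address: int) -> Optional[str]:
    """Return the value recorded for `name` at `address`.

    Assumption of this sketch: when several ranges contain the address,
    the narrowest one is treated as the most specific and wins.
    """
    matches = [n for n in notes if n.name == name and n.start <= address < n.end]
    if not matches:
        return None
    return min(matches, key=lambda n: n.end - n.start).value


# The whole program is built with -O2, except one function built with -O0.
notes = [
    RangedNote(0x1000, 0x5000, "optimization", "-O2"),
    RangedNote(0x2400, 0x2480, "optimization", "-O0"),
]
```

With these made-up addresses, a lookup inside the special function reports -O0, a lookup elsewhere in the program reports -O2, and a lookup outside every recorded range reports nothing.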

The range information is useful because it allows third parties to examine the binary and find out whether its construction was consistent; that is, that there are no gaps in the recorded information and no special cases where a required feature was not active.
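A consistency check of this kind amounts to looking for addresses that no recorded range covers. The following Python sketch shows one way to do that, assuming the ranges have already been extracted as (start, end) pairs with an exclusive end; the find_gaps name and the sample addresses are hypothetical and not part of annobin.

```python
from typing import Iterable, List, Tuple

Range = Tuple[int, int]  # (start, end) with end exclusive


def find_gaps(region_start: int, region_end: int, covered: Iterable[Range]) -> List[Range]:
    """Return the parts of [region_start, region_end) that no range in `covered` spans."""
    gaps: List[Range] = []
    cursor = region_start  # lowest address not yet known to be covered
    for start, end in sorted(covered):
        if end <= cursor:
            continue  # range lies entirely below the cursor
        if start >= region_end:
            break  # remaining ranges lie beyond the region
        if start > cursor:
            gaps.append((cursor, start))  # uncovered hole before this range
        cursor = max(cursor, end)
    if cursor < region_end:
        gaps.append((cursor, region_end))  # uncovered tail of the region
    return gaps


# Code occupies 0x1000-0x5000; the recorded ranges leave a hole at 0x3000-0x3400.
recorded = [(0x1000, 0x2000), (0x1800, 0x3000), (0x3400, 0x5000)]
gaps = find_gaps(0x1000, 0x5000, recorded)
```

An empty result means every address in the region is covered by at least one piece of recorded information. Overlapping ranges are tolerated, since a narrower range nested inside a wider one is the expected case.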

The system works by adding special sections to the application containing individual pieces of information along with an address range for which the information is valid. (Some effort has gone into storing this information in a reasonably compact format.)

The information is generated by a plugin that is attached to the compiler. The plugin extracts information from the internals of the compiler and records it in the object file(s) being produced.

Note - the plugin method is just one way of generating the information. Any interested party can create and add information to the object file, providing that they follow the Watermark specification.

The information can be extracted from files via the use of tools like readelf and objdump. The annobin package itself includes a program called annocheck which can also examine this information. Details on this program can be found elsewhere in this documentation.
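As a rough illustration of extracting the information with readelf, the Python sketch below wraps the tool's --notes and --wide options. It assumes that the recorded information is stored as ELF notes, which is what --notes displays; the helper names readelf_notes_command and dump_notes are made up for this example and are not part of annobin.

```python
import shutil
import subprocess
from typing import List, Optional


def readelf_notes_command(path: str) -> List[str]:
    """Build the readelf command line that lists the ELF notes found in `path`."""
    # --notes prints note sections; --wide avoids truncating long lines.
    return ["readelf", "--notes", "--wide", path]


def dump_notes(path: str) -> Optional[str]:
    """Run readelf on `path` and return its text output.

    Returns None when readelf is not installed on this machine.
    """
    if shutil.which("readelf") is None:
        return None
    result = subprocess.run(
        readelf_notes_command(path),
        capture_output=True,
        text=True,
        check=False,  # leave error handling to the caller
    )
    return result.stdout
```

Calling dump_notes on a compiled program returns whatever readelf prints for its notes; interpreting that output is left to the reader or to annocheck.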

Experience has shown, however, that storing the range information along with the data tends to significantly increase the size of programs. So the system also provides an alternative implementation which uses a more compact format, at the cost of dropping the range data.