Infinity Note Format

This document is OUT OF DATE and retained for historical value only.

See https://infinitynotes.org/wiki/Note_format for the current version.

Basics

  1. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

  2. Please familiarize yourself with LEB128.

  3. Note providers are executables or shared libraries containing Infinity notes.

  4. Note consumers are tools that access Infinity notes from note providers.

  5. This document specifies three reasons to reject notes. Note consumers SHOULD differentiate between these in error messages, etc.
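Since most fields in the note format are unsigned LEB128 values (item 2 above), a minimal decoder sketch may help. The function name `read_uleb128` and its `(value, next_pos)` return convention are illustrative choices, not part of the format.

```python
def read_uleb128(buf, pos=0):
    """Decode one unsigned LEB128 value from buf, starting at pos.

    Returns (value, next_pos). Raises ValueError if buf ends mid-value.
    """
    value = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated ULEB128 value")
        byte = buf[pos]
        pos += 1
        # The low 7 bits carry data, least significant group first.
        value |= (byte & 0x7F) << shift
        # A clear high bit marks the final byte of the value.
        if not byte & 0x80:
            return value, pos
        shift += 7
```

Returning the next position lets a caller read consecutive fields without tracking lengths separately.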

Note Rejection Reasons

Outer Format

Infinity notes are embedded into executables or shared libraries, so part of the note format depends on the format of the containing file. The only currently supported containing file format is ELF. If more formats are supported they should be added here.

For ELF files

Each Infinity note is contained within an ELF PT_NOTE with a note name of "GNU\0" and a note type of NT_GNU_INFINITY. The contents of the desc field are as described in Inner Format below. Each ELF note with a note name of "GNU\0" and a note type of NT_GNU_INFINITY MUST contain exactly one Infinity note.

NT_GNU_INFINITY is currently defined as 5, though this may change before Infinity becomes final.

Some information on ELF notes may be found here: http://www.netbsd.org/docs/kernel/elf-notes.html
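As a rough illustration of locating Infinity notes, the sketch below walks the raw bytes of one PT_NOTE segment. It assumes the conventional ELF note header (three 32-bit words: namesz, descsz, type) with name and desc each padded to 4 bytes; that layout comes from general ELF practice rather than from this document, and some files use other alignments. The function name `iter_infinity_notes` and the `byteorder` parameter are hypothetical.

```python
import struct

NT_GNU_INFINITY = 5  # current value per this document; may change


def iter_infinity_notes(segment, byteorder="<"):
    """Yield the desc field of each Infinity note in a PT_NOTE segment.

    byteorder is a struct prefix: "<" for little-endian ELF files,
    ">" for big-endian ones. Assumes 4-byte note alignment.
    """
    header = struct.Struct(byteorder + "III")
    pos = 0
    while pos + header.size <= len(segment):
        namesz, descsz, note_type = header.unpack_from(segment, pos)
        pos += header.size
        name = segment[pos:pos + namesz]
        pos += (namesz + 3) & ~3  # skip name plus padding
        desc = segment[pos:pos + descsz]
        pos += (descsz + 3) & ~3  # skip desc plus padding
        if name == b"GNU\0" and note_type == NT_GNU_INFINITY:
            yield desc
```

Each yielded desc would then be handed to an Inner Format parser.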

Inner Format

Each Infinity note is built up from chunks. The format of each chunk is as follows:

uleb128    type_id
uleb128    version
uleb128    size
byte[size] data
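The chunk layout above can be walked with a short loop. This is a sketch only: it assumes chunks are packed back to back until the end of the note, which the text implies but does not state, and the names `iter_chunks` and `read_uleb128` are illustrative.

```python
def read_uleb128(buf, pos):
    """Decode one unsigned LEB128 value; return (value, next_pos)."""
    value = shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated ULEB128 value")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def iter_chunks(note):
    """Yield (type_id, version, data) for each chunk in a note."""
    pos = 0
    while pos < len(note):
        type_id, pos = read_uleb128(note, pos)
        version, pos = read_uleb128(note, pos)
        size, pos = read_uleb128(note, pos)
        end = pos + size
        if end > len(note):
            raise ValueError("chunk data overruns the note")
        yield type_id, version, note[pos:end]
        pos = end
```

Because every chunk carries its own size, a consumer can skip chunk types it does not recognise.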

Chunks

Info Chunks

Each note SHOULD have exactly one info chunk. Info chunks have a type_id of I8_CHUNK_INFO == 1. Info chunks are variable-length, so if the chunk ends before a field then the note does not have that field. Currently defined fields are as follows:

uleb128 provider_offset_in_stringtable
uleb128 name_offset_in_stringtable
uleb128 encoded_paramtypes_offset_in_stringtable
uleb128 encoded_returntypes_offset_in_stringtable
uleb128 max_stack

The four string table offsets reference strings in the note's string table and are used to construct the function's signature. Provider names starting with the string "i8" are reserved and are invalid here. max_stack is the maximum stack depth this function's code can generate. Info chunks SHOULD contain every field up to and including max_stack.
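One way to handle the variable-length rule is to stop reading as soon as the chunk data runs out, as in this sketch. The dictionary keys are shortened, hypothetical names for the fields listed above, and `parse_info_chunk` is not an established API.

```python
def read_uleb128(buf, pos):
    """Decode one unsigned LEB128 value; return (value, next_pos)."""
    value = shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated ULEB128 value")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


# Field order as listed in the info chunk layout.
INFO_FIELDS = ("provider", "name", "paramtypes", "returntypes", "max_stack")


def parse_info_chunk(data):
    """Return a dict of the fields present in an info chunk's data."""
    fields = {}
    pos = 0
    for field in INFO_FIELDS:
        if pos >= len(data):
            break  # chunk ended early: remaining fields are absent
        fields[field], pos = read_uleb128(data, pos)
    return fields
```

A consumer can then test for a field with `"max_stack" in fields` rather than assuming it is present.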

Code Chunks

Each note SHOULD have at most one code chunk. Code chunks have a type_id of I8_CHUNK_CODE == 2. The format of a code chunk is as follows:

2byte               byte_order_mark
byte[rest_of_chunk] bytecode

The byte order mark is the number 26936 (0x6938), encoded in the same byte order as non-LEB128 multi-byte values in the bytecode. The bytecode is a serialized DWARF expression as described in InfinityBytecode.
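The byte order mark check can be sketched as follows. The bytes 0x69 and 0x38 are ASCII "i" and "8", so a big-endian code chunk starts with "i8" and a little-endian one with "8i". The function name `split_code_chunk` is illustrative.

```python
import struct

I8_BYTE_ORDER_MARK = 0x6938  # 26936


def split_code_chunk(data):
    """Return (byte_order, bytecode) for a code chunk's data.

    byte_order is "little" or "big". Raises ValueError if the
    mark is missing or unrecognised.
    """
    if len(data) < 2:
        raise ValueError("code chunk too short for byte order mark")
    for fmt, order in (("<H", "little"), (">H", "big")):
        if struct.unpack_from(fmt, data)[0] == I8_BYTE_ORDER_MARK:
            return order, data[2:]
    raise ValueError("unrecognised byte order mark")
```

The detected order would then be used when decoding non-LEB128 multi-byte operands in the bytecode.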

Externals Table Chunks

Each note SHOULD have at most one externals table chunk. Externals table chunks have a type_id of I8_CHUNK_ETAB == 3. An externals table contains one or more externals table entries concatenated together.

Function Reference Externals Table Entries

The format of a function reference external is as follows:

byte    'f'
uleb128 provider_offset_in_stringtable
uleb128 name_offset_in_stringtable
uleb128 encoded_paramtypes_offset_in_stringtable
uleb128 encoded_returntypes_offset_in_stringtable

The four fields have the same meaning as the fields of the same name in the info chunk and SHOULD be processed in the same way with the exception that providers starting with "i8" are allowed here.

Unrelocated Address Externals Table Entries

The format of an unrelocated address external is as follows:

byte    'x'
uleb128 address

The meaning of the address field is dependent on the format of the containing file.
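The two entry formats above could be read in a single pass like this. The tuple shapes returned by the hypothetical `parse_externals` are an arbitrary choice for illustration; only the 'f' and 'x' tags and their field order come from the text.

```python
def read_uleb128(buf, pos):
    """Decode one unsigned LEB128 value; return (value, next_pos)."""
    value = shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated ULEB128 value")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def parse_externals(data):
    """Return a list of externals table entries.

    Function references become ("f", (provider, name, params, returns))
    with string table offsets; addresses become ("x", address).
    """
    entries = []
    pos = 0
    while pos < len(data):
        kind = data[pos:pos + 1]
        pos += 1
        if kind == b"f":
            offsets = []
            for _ in range(4):
                value, pos = read_uleb128(data, pos)
                offsets.append(value)
            entries.append(("f", tuple(offsets)))
        elif kind == b"x":
            address, pos = read_uleb128(data, pos)
            entries.append(("x", address))
        else:
            raise ValueError("unknown externals table entry %r" % kind)
    return entries
```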

For ELF files

The value stored in address MUST satisfy sh_addr <= address < sh_addr + sh_size for exactly one section of the containing ELF file. The value stored in address will be relocated in the same way as its containing section.
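The exactly-one-section requirement can be checked as in the sketch below. It assumes the caller has already read the section headers into `(name, sh_addr, sh_size)` tuples; that input shape and the function name `containing_section` are hypothetical.

```python
def containing_section(address, sections):
    """Return the name of the single section containing address.

    sections is an iterable of (name, sh_addr, sh_size) tuples.
    Raises ValueError unless exactly one section matches.
    """
    matches = [
        name
        for name, sh_addr, sh_size in sections
        if sh_addr <= address < sh_addr + sh_size
    ]
    if len(matches) != 1:
        raise ValueError(
            "address %#x lies in %d sections, expected exactly 1"
            % (address, len(matches))
        )
    return matches[0]
```

Knowing the containing section is what lets a consumer relocate the address the same way as that section.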

String Table Chunks

Each note SHOULD have exactly one string table chunk. String table chunks have a type_id of I8_CHUNK_STAB == 4. A string table contains one or more NUL-terminated strings concatenated together. Strings are encoded in Modified UTF-8 format to allow embedded NULs. Fields in other chunks reference strings by their offset from the start of the string table. Note that any offset into the table that yields a NUL-terminated Modified UTF-8 string is permitted, so in an I8_CHUNK_STAB chunk whose data field contains:

example\0string\xC0\x80table\0

the obvious strings are "example" at offset 0 and "string\0table" at offset 8, but note that an offset of 16 will yield the string "table" and offsets of 7 or 21 will both yield the empty string. (The embedded NUL is encoded as the two bytes \xC0\x80, which occupy offsets 14 and 15.)

Note that no current use of strings in Infinity allows characters outside of A-Za-z0-9()_ (i.e. nobody needs to write a Modified UTF-8 decoder just yet!)
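A string table lookup might look like the following sketch. It handles only the one Modified UTF-8 feature the text relies on, the two-byte encoding of an embedded NUL; surrogate-pair encoding of supplementary characters is not handled. The function name `get_string` is illustrative.

```python
def get_string(table, offset):
    """Return the string starting at byte offset in a string table."""
    if offset < 0 or offset >= len(table):
        raise ValueError("offset outside string table")
    end = table.find(b"\0", offset)
    if end < 0:
        raise ValueError("string is not NUL-terminated")
    raw = table[offset:end]
    # Modified UTF-8 encodes an embedded NUL as the pair C0 80.
    # Offsets that land mid-character fail to decode (ValueError).
    return raw.replace(b"\xC0\x80", b"\0").decode("utf-8")
```

Offsets count bytes of the encoded table, so the two-byte embedded NUL in the example above occupies two offsets.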

Encoded Type Lists

Lists of types (e.g. parameter types, return types) are encoded as strings. The basic types int, ptr and opaque are encoded as "i", "p" and "o" respectively. Function types are encoded as:

"F" + encoded return types + "(" + encoded parameter types + ")"

Examples:

  1. A function that accepts one ptr parameter and returns two int values has an encoded type of "Fii(p)".

  2. A function that accepts two parameters, 1) a function with an opaque parameter followed by an int parameter that returns an int and a ptr, and 2) an opaque parameter, that returns two ptr values has an encoded type of "Fpp(Fip(oi)o)".

  3. A function that accepts one int parameter and returns a function that accepts one ptr parameter and returns two int values has an encoded type of "FFii(p)(i)".
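The encoding rule and examples above can be checked with a small recursive parser. This is a sketch: the `("func", params, returns)` tuple representation and the names `parse_type_list` and `parse_encoded_types` are hypothetical.

```python
BASIC_TYPES = {"i": "int", "p": "ptr", "o": "opaque"}


def parse_type_list(s, pos=0, stop=""):
    """Parse types from s[pos:] up to the stop character.

    stop="" means end of string. Returns (types, pos_of_stop).
    Function types are returned as ("func", params, returns).
    """
    types = []
    while s[pos:pos + 1] != stop:
        ch = s[pos:pos + 1]
        if ch in BASIC_TYPES:
            types.append(BASIC_TYPES[ch])
            pos += 1
        elif ch == "F":
            # "F" + return types + "(" + parameter types + ")"
            returns, pos = parse_type_list(s, pos + 1, "(")
            params, pos = parse_type_list(s, pos + 1, ")")
            types.append(("func", params, returns))
            pos += 1  # step past the closing ")"
        else:
            raise ValueError("bad type list %r at offset %d" % (s, pos))
    return types, pos


def parse_encoded_types(s):
    """Parse a complete encoded type list string."""
    return parse_type_list(s)[0]
```

For instance, `parse_encoded_types("Fii(p)")` gives a single function type with one ptr parameter and two int return values, matching example 1.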

Processing:

Function Signatures

Infinity functions are referenced by their signature. A function's signature is constructed from its provider, its name, and its encoded parameter and return types lists as follows:

provider + "::" + name + "(" + encoded_paramtypes + ")" + encoded_returntypes

A function called "a_function" with provider "example_provider" that accepts one ptr parameter and returns two int values has a signature of "example_provider::a_function(p)ii".
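The construction rule is a plain concatenation, sketched below. The `allow_reserved` flag is a hypothetical way to model the difference between info chunks (where "i8" providers are invalid) and externals table entries (where they are allowed).

```python
def make_signature(provider, name, encoded_paramtypes,
                   encoded_returntypes, allow_reserved=False):
    """Build a function signature from its four component strings."""
    if provider.startswith("i8") and not allow_reserved:
        raise ValueError("provider names starting with 'i8' are reserved")
    return "%s::%s(%s)%s" % (
        provider, name, encoded_paramtypes, encoded_returntypes)
```

With the values from the example above, `make_signature("example_provider", "a_function", "p", "ii")` produces the signature quoted there.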

None: NoteFormat (last edited 2016-06-06 12:18:23 by GaryBenson)

All content (C) 2008 Free Software Foundation. For terms of use, redistribution, and modification, please see the WikiLicense page.