ABI document

Florian Weimer fweimer@redhat.com
Mon Aug 31 12:54:04 GMT 2020


* Vivek Das Mohapatra:

> diff --git a/program-loading-and-dynamic-linking.txt b/program-loading-and-dynamic-linking.txt
> new file mode 100644
> index 0000000..751eaca
> --- /dev/null
> +++ b/program-loading-and-dynamic-linking.txt
> @@ -0,0 +1,272 @@
> +Program Headers
> +===============

Thanks for doing this, it's very much needed.

I can't tell whether the formatting is okay, and whether we need to add
markup (like ``) for program text.

> +These are GNU extended program header type values: They are typically
> +found in ElfW(Phdr).p_type.
> +
> +PT_GNU_EH_FRAME  0x6474e550
> +PT_SUNW_EH_FRAME 0x6474e550
> +
> +  Segment contains the EH_FRAME_HDR section (stack frame unwind information)
> +
> +  PT_SUNW_EH_FRAME is used by a non-GNU implementation for the same purpose,
> +  and has the same value (although this does not imply compatible contents).

Can you dig out a link to a document that describes the format of
the GNU_EH_FRAME segment?

I think this should also say that the virtual address range must be
covered by an earlier PT_LOAD segment.
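
As a rough sketch of what that requirement means (covered_by_earlier_load
is a hypothetical helper, not an existing API, and "covered" is assumed to
mean "lies entirely within p_vaddr .. p_vaddr + p_memsz of the PT_LOAD"):

```c
#include <assert.h>
#include <elf.h>
#include <link.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of any existing API.  Returns true if
   the segment phdr[idx] lies entirely within the memory image of some
   PT_LOAD entry that appears before it in the program header table.  */
static bool
covered_by_earlier_load (const ElfW(Phdr) *phdr, size_t idx)
{
  assert (phdr != NULL);
  const ElfW(Phdr) *seg = &phdr[idx];
  for (size_t i = 0; i < idx; ++i)
    {
      const ElfW(Phdr) *load = &phdr[i];
      if (load->p_type != PT_LOAD || seg->p_vaddr < load->p_vaddr)
        continue;
      /* Written with subtractions so that the sums cannot wrap around.  */
      ElfW(Addr) skip = seg->p_vaddr - load->p_vaddr;
      if (skip <= load->p_memsz && seg->p_memsz <= load->p_memsz - skip)
        return true;
    }
  return false;
}
```

The same check would presumably apply to PT_GNU_RELRO.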

> +PT_GNU_STACK     0x6474e551
> +
> +  The p_flags member of this ElfW(Phdr) structure apply to the stack.
> +
> +  If present AND p_flags DOES NOT contain PF_X (0x1) then the stack
> +  should _not_ be executable.
> +
> +  Otherwise the stack is executable (the default).

The default depends on the architecture, I think.

I think implementations differ with regard to the size of the
segment.  glibc ignores it, other implementations may use it to set the
stack size.
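
A minimal sketch of the rule as I read it, with the architecture default
passed in explicitly (stack_is_executable and its arch_default parameter
are made up for illustration; the segment size is ignored here, which is
only one of the possible behaviors):

```c
#include <assert.h>
#include <elf.h>
#include <link.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper.  If a PT_GNU_STACK entry is present, its PF_X
   flag decides whether the stack is executable; otherwise fall back to
   the caller-supplied architecture default.  p_memsz is not examined.  */
static bool
stack_is_executable (const ElfW(Phdr) *phdr, size_t phnum,
                     bool arch_default)
{
  assert (phdr != NULL || phnum == 0);
  for (size_t i = 0; i < phnum; ++i)
    if (phdr[i].p_type == PT_GNU_STACK)
      return (phdr[i].p_flags & PF_X) != 0;
  return arch_default;
}
```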

> +PT_GNU_RELRO     0x6474e552
> +
> +  The specified segment should be made read-only once run-time linking
> +  has completed.

I think it's relocation of this object, not the entire linking
operation.

Also we should try to sketch the interaction with PT_LOAD.

> +PT_GNU_PROPERTY  0x6474e553
> +
> +  The Linux kernel uses this program header to locate the
> +  .note.gnu.property section.
> +
> +  If there is a program property that requires the kernel to perform
> +  some action before loading and ELF file (eg AArch64 BTI or intel CET)
> +  then this header MUST be present.

“Intel”

The requirement could be worded better.  The header only needs to be
present if these features are to be enabled.

> +  The contents are laid out as follows:
> +
> +  Field      | Length   | Contents
> +  n_namsz    | 4        | 4
> +  n_descsz   | 4        | Size of n_desc (4 byte int, processor format)
> +  n_type     | 4        | NT_GNU_PROPERTY_TYPE_0 (0x5)
> +  n_name     | 4        | GNU\0
> +  n_desc     | n_descsz | property array
> +
> +  Each element of n_desc, in turn is:
> +
> +  typedef struct {
> +    Elf_Word pr_type;
> +    Elf_Word pr_datasz;
> +    unsigned char pr_data[PR_DATASZ];
> +    unsigned char pr_padding[PR_PADDING];
> +  } Elf_Prop;
> +
> +  Properties are sorted in ascending order of pr_type;
> +
> +  pr_data is aligned to 4 bytes in 32-bit objects and 8 bytes in 64-bit ones.

What's the overall alignment of the segment?  8 bytes on 64-bit?

This also has to say where the padding is inserted: before pr_data?
After Elf_Prop?  I think it's the latter, and that Elf_Prop is aligned
even if the pr_data member is absent.

This means that we should have Elf32_Prop and Elf64_Prop with different
alignment.  (We can avoid mentioning the type name in the ABI document,
I guess.)
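
To make the padding question concrete, here is a sketch of a walker over
n_desc under the assumption that the padding follows pr_data and rounds
each element up to the alignment (4 or 8).  count_properties and align_up
are hypothetical names, and the fields are read in the host byte order:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round v up to a multiple of align (a power of two).  */
static size_t
align_up (size_t v, size_t align)
{
  return (v + align - 1) & ~(align - 1);
}

/* Hypothetical helper.  Counts the properties in a n_desc buffer,
   assuming each element is pr_type, pr_datasz, pr_data, then padding up
   to the next multiple of align.  Returns -1 if the data is truncated.  */
static int
count_properties (const unsigned char *desc, size_t descsz, size_t align)
{
  assert (align == 4 || align == 8);
  size_t off = 0;
  int count = 0;
  while (off < descsz)
    {
      uint32_t pr_datasz;
      if (descsz - off < 8)
        return -1;              /* No room for pr_type and pr_datasz.  */
      memcpy (&pr_datasz, desc + off + 4, sizeof pr_datasz);
      if (pr_datasz > descsz - off - 8)
        return -1;              /* pr_data runs past the end.  */
      off = align_up (off + 8 + pr_datasz, align);
      ++count;
    }
  return count;
}
```

With this layout, an element with pr_datasz 0 still occupies 8 bytes, and
an element with 4 bytes of data occupies 12 bytes at alignment 4 but 16
bytes at alignment 8, so a 64-bit note cannot be parsed with the 32-bit
rule.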

> +  Defined properties are:
> +
> +  GNU_PROPERTY_STACK_SIZE  0x1
> +
> +  A native format & size integer specifying the minimum stack size.
> +  The linker should pick the highest instance of this from all relocatable
> +  objects in the link chain and ensure the stack is at least this big.

So this is an Elf*_Addr?

> +  GNU_PROPERTY_NO_COPY_ON_PROTECTED 0x2
> +
> +  The linker should treat protected data symbol as defined locally at
> +  run-time and copy this property to the output share object.
> +
> +  The linker should add this property to the output share object if
> +  any protected symbol is expected to be defined locally at run-time.
> +
> +  The run-time loader should disallow copy relocations against protected
> +  data symbols defined such objects.
> +
> +  This type has a PR_DATASZ of 0.

“pr_datasz field”?

> +DT_GNU_PRELINKED  0x6ffffdf5
> +
> +  The d_val field contains a time_t value giving the UTC time at which the
> +  object was (pre)linked.

Woah, I didn't know we had this.  Is this really the time when prelink
was run?  So running it multiple times does not always produce the same
results?  It seems it's this way indeed:

  if (! verify)
    info->ent->timestamp = (GElf_Word) time (NULL);
  dso->info_DT_GNU_PRELINKED = info->ent->timestamp;

That's not good for reproducibility (but then prelink results depend 

> +DT_GNU_CONFLICTSZ 0x6ffffdf6
> +
> +  Used in prelinked objects.
> +  d_val contains the size of the conflict segment.
> +
> +DT_GNU_LIBLISTSZ  0x6ffffdf7
> +
> +  Used in prelinked objects.
> +  d_val contains the size of the library list.

It would be nice to add a link there to the prelink documentation.
(There's a prelink.tex file in the sources.)

> +DT_GNU_HASH       0x6ffffef5
> +
> +  The d_ptr value gives the location of the GNU style symbol hash table.

Do we have format documentation for this?

> +DT_GNU_CONFLICT   0x6ffffef8
> +
> +  Used in prelinked objects.
> +  The d_ptr value gives the location of the conflict segment.
> +  This will contain an array of ElfW(Rela) structs.
> +
> +  If DT_GNU_LIBLIST matches the library searchlist after loading
> +  then these relocation records are replayed immediately after
> +  run-time loading.
> +
> +DT_GNU_LIBLIST    0x6ffffef9
> +
> +  Used in prelinked objects.
> +  The d_ptr value gives the location of the ElfW(Lib) array giving the
> +  SONAME, checksum and timestamp or each library encountered at prelink time.
> +
> +  This is used to check that all required prelinked libraries are still
> +  present, loaded, and have the correct checksums at runtime.

Maybe group this with the earlier prelink items?

> +Section Headers
> +===============

> +SHT_GNU_verdef             0x6ffffffd
> +
> +SHT_GNU_verneed            0x6ffffffe
> +
> +SHT_GNU_versym             0x6fffffff

I think the canonical reference for these is:

  <https://www.akkadia.org/drepper/symbol-versioning>

> +Note section descriptors (SHT_NOTE extensions)
> +==============================================

> +NT_GNU_BUILD_ID        3
> +
> +  descsz bytes of build-id data.
> +  Typically presented as a hex string.

But stored in binary?

Maybe reference the ld documentation here, and say that the actual
computation mechanism is unspecified?
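
Assuming the descriptor holds raw bytes, the hex presentation is just a
conversion done by tools.  A sketch (build_id_to_hex is a hypothetical
helper, not something from ld or glibc):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper.  Writes the descsz raw build-id bytes in desc as
   a lowercase, NUL-terminated hex string into out.  Returns 0 on
   success and -1 if out (of size outsz) is too small.  */
static int
build_id_to_hex (const unsigned char *desc, size_t descsz,
                 char *out, size_t outsz)
{
  static const char digits[] = "0123456789abcdef";
  assert (desc != NULL && out != NULL);
  if (outsz == 0 || descsz > (outsz - 1) / 2)
    return -1;
  for (size_t i = 0; i < descsz; ++i)
    {
      out[2 * i] = digits[desc[i] >> 4];
      out[2 * i + 1] = digits[desc[i] & 0xf];
    }
  out[2 * descsz] = '\0';
  return 0;
}
```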

Thanks,
Florian


