RFC: syntax for a section ordering file

Fangrui Song i@maskray.me
Fri Apr 26 17:04:13 GMT 2024


On 2024-04-26, Nick Clifton wrote:
>Hi Fangrui,
>
>>* Apple ld -order_file. lld's MachO port ld64.lld has ported the option.
>>   The feature is like a superset of ld.lld --section-ordering-file= with
>>   filename support. The syntax also supports an "x86_64:" prefix, but this
>>   design seems quite unusual in linker features.
>>
>>   This option is used by iOS mobile applications.
>>
>>   example:
>>   https://github.com/llvm/llvm-project/blob/main/lld/test/MachO/order-file.s
>
>Hmm, the order files in that example appear to be specifying the order of symbols
>relative to each other, rather than sections.  Presumably the code locates the
>input section containing the symbol and places it before/after the input section
>containing the other symbol.  Assuming that both sections are going to be mapped
>onto the same output section.
>
>I am not sure about the usefulness of the architecture specifiers.  I would have
>thought that if necessary the build system could have per-architecture ordering
>files.
>
>
>>* gold --section-ordering-file=: which might be most similar to this patch.
>>   I believe this option is effectively unused in the wild.
>
>I have had reports from customers saying that one of the reasons they do not
>want to switch from gold to lld or ld.bfd is that they use gold's section
>ordering file option.

Interesting. I wonder whether they are fine with changing

.text.a
.text.b
.text.c
.data.x
.data.y

to

a
b
c
x
y
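
A rough sketch of that rewrite, for illustration only: the helper name
sections_to_symbols and the prefix list are assumptions made up for this
example, not part of gold, lld, or ld.bfd. It only works when every
section is named <prefix><symbol>, so an unsuffixed section such as plain
.text cannot be converted.

```python
# Hypothetical helper: turn section names such as ".text.a" into the
# symbol names ("a") that a symbol ordering file would list instead.
PREFIXES = (".text.", ".data.", ".rodata.", ".bss.")  # assumed prefixes


def sections_to_symbols(section_names):
    symbols = []
    for name in section_names:
        for prefix in PREFIXES:
            if name.startswith(prefix) and len(name) > len(prefix):
                symbols.append(name[len(prefix):])
                break
        else:
            # No usable suffix (e.g. plain ".text"): the symbol cannot be
            # recovered from the section name alone.
            raise ValueError(f"cannot derive a symbol from {name!r}")
    return symbols


order = sections_to_symbols(
    [".text.a", ".text.b", ".text.c", ".data.x", ".data.y"]
)
print("\n".join(order))
```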

>>   People find the section-based naming approach too inconvenient.
>>   This is incompatible with sections that are not suffixed and clang
>>   -fno-unique-section-names.
>
>I did not know about that option.  Thanks for pointing it out.
>
>Maybe an approach based upon symbol names would be better.  Harder
>to implement, but better from a user's point of view...  Hmm.
>

Yes...
The symbol order approach avoids the issue raised in the discussion on
".t/.text" at https://sourceware.org/pipermail/binutils/2024-April/133879.html .
There is a potentially more efficient implementation. See below.

>>* ld.lld --symbol-ordering-file=:
>>
>>   This option is used by Android and regular Linux folks focusing on server performance.
>>
>>   example:
>>   https://github.com/llvm/llvm-project/blob/main/lld/test/ELF/symbol-ordering-file.s
>>
>>   I have some notes at
>>   https://maskray.me/blog/2020-11-15-explain-gnu-linker-options#symbol-ordering-filefile
>
>Very helpful.  I wish that I had read this before starting to adapt H.J.'s code...
>

Thanks!

>>> To my mind if the section ordering file contains the following:
>>>
>>>    # A comment
>>>   .text.hot .text.cold,.text.warm
>>>   .data.big
>>>   .data.small
>>>   .text.foo*
>>
>>> Then this should be roughly equivalent to:
>>>
>>> SECTIONS
>>> {
>>>   .text : {
>>>     *(.text.hot)
>>>     *(.text.cold)
>>>     *(.text.warm)
>>>     *(.text.foo*)
>>>     *(.text)
>>>     }
>>>   .data : {
>>>     *(.data.big)
>>>     *(.data.small)
>>>     *(.data)
>>>     }
>>> }
>>>
>>> So all of the .text.<something> entries in the section ordering
>>> file are placed at the start of the output .text section (even
>>> if some of them occur after entries for other output sections)
>>> and all of the .data.<something> entries are placed at the start
>>> of the .data section.
>>>
>>> This will require co-operation from the linker script to have
>>> the "INCLUDE config.section-ordering-file" statements at the
>>> correct places, but I think that it could work.
>>
>>Hmm.  I am curious why the first INCLUDE (in .text) does not append
>>.data.big/.data.small (as requested).
>
>Because the entries in the ordering file are matched to the output
>section name.  So an entry that starts with .text will be matched
>to the .text output section and an entry that starts with .data
>will be matched to the .data output section.
>
>In the updated patch now posted to the binutils list, I have also
>implemented an explicit section name matching feature.  So if the
>entry in the ordering file looks like this:
>
>   .text(.data)
>
>then it will be matched to the .text output section and will place
>all input sections called .data at that point in the output.
>
>Cheers
>  Nick
>

It seems that section selection is followed by sorting (--sort-section/SORT* keywords).

In essence, the section order implementation combines the linker script
and the section order to create the final section selection. However,
this approach relies on heuristics (which can be somewhat fragile) to
correctly associate sections like .data.x and .data.y with the output
section .data instead of something like .text. (It is even more ambiguous
whether .text.a should be placed into .t or .text.)
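
To make the ambiguity concrete, here is a minimal sketch of a name-based
association. The longest-prefix rule and both function names are
assumptions for illustration; this is not the logic of the posted patch.

```python
def candidate_output_sections(entry, output_sections):
    """Return every output section whose name is a prefix of the entry."""
    return [out for out in output_sections if entry.startswith(out)]


def pick_output_section(entry, output_sections):
    """Assumed heuristic: prefer the longest matching output section name."""
    matches = candidate_output_sections(entry, output_sections)
    return max(matches, key=len) if matches else None


outputs = [".t", ".text", ".data"]
# ".text.a" matches both ".t" and ".text"; only the tie-break rule decides.
print(candidate_output_sections(".text.a", outputs))
print(pick_output_section(".text.a", outputs))
print(pick_output_section(".data.x", outputs))
```

Any such rule is a guess about the user's intent, because the linker
script is free to place .text.a in either output section.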

A symbol order implementation iterates over the symbol list and
assigns a priority to the associated section for each defined symbol.
Unmatched sections receive a priority of 0. Here's an example:

     a.o:.text.a     -5 (highest priority)
     b.o:.text.b     -4
     a.o:.text.c     -3
     b.o:.data.x     -2
     b.o:.data.y     -1 (lowest priority)
     ...
     other sections   0 (no priority)
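
The assignment can be sketched as follows. The function name and the
symbol_to_section mapping (symbol name to an (object file, section name)
pair) are hypothetical stand-ins for the linker's symbol table.

```python
def assign_priorities(symbol_order, symbol_to_section):
    """Give each section that defines a listed symbol a negative priority."""
    priorities = {}
    n = len(symbol_order)
    for index, symbol in enumerate(symbol_order):
        section = symbol_to_section.get(symbol)
        if section is None:
            continue  # symbol is not defined in any input section
        # The first listed symbol gets -n (highest priority), the last -1.
        # setdefault keeps the earlier priority if a section defines
        # more than one listed symbol.
        priorities.setdefault(section, index - n)
    return priorities


symbol_to_section = {
    "a": ("a.o", ".text.a"),
    "b": ("b.o", ".text.b"),
    "c": ("a.o", ".text.c"),
    "x": ("b.o", ".data.x"),
    "y": ("b.o", ".data.y"),
}
priorities = assign_priorities(["a", "b", "c", "x", "y"], symbol_to_section)
# Sections missing from the dict are treated as priority 0.
print(priorities.get(("a.o", ".text.a"), 0), priorities.get(("a.o", ".text"), 0))
```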

After --sort-section or a SORT* keyword has sorted the sections matched
by an input section description, the result is sorted again (with a
stable sort) by the assigned priorities.
For example, with an input description of *(.text .text.*), the priorities will
guide the sorting of the matched sections.
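
A minimal sketch of that second pass, assuming the matched sections
arrive as (object file, section name) pairs in the order left by the
earlier sorting step; the names here are illustrative only.

```python
def order_sections(matched_sections, priorities):
    """Stable sort by priority; sections without one count as 0."""
    # sorted() is stable, so sections with equal priority keep the
    # relative order produced by the earlier --sort-section/SORT* step.
    return sorted(matched_sections, key=lambda s: priorities.get(s, 0))


# Sections matched by *(.text .text.*), in their pre-existing order.
matched = [
    ("a.o", ".text"),
    ("a.o", ".text.c"),
    ("b.o", ".text.b"),
    ("a.o", ".text.a"),
    ("b.o", ".text.z"),
]
text_priorities = {
    ("a.o", ".text.a"): -5,
    ("b.o", ".text.b"): -4,
    ("a.o", ".text.c"): -3,
}
ordered = order_sections(matched, text_priorities)
for obj, sec in ordered:
    print(f"{obj}:{sec}")
```

The prioritized sections move to the front in symbol order, and the
unlisted ones (.text and .text.z here) keep their original relative order.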

In his presentation "Speeding up the BFD linker", Michael optimized
section selection.  He might have insights into where symbol ordering
could be implemented effectively.

