[RFC] Allow linker scripts to specify multiple output regions for an output section?

Wed Mar 1 07:12:00 GMT 2017

On 28.02.17 12:11, Tejas Belagod wrote:
> On 28/02/17 05:51, Erik Christiansen wrote:
> > 
> > Would GAP use ALIGNMENT, or introduce a new parameter?
> 
> I wouldn't want to overload ALIGNMENT here - what if it's needed
> simultaneously with ALIGNMENT? Can we not leave this space unassigned? More
> often than not, if one is filling a memory region automatically, would they
> really care what goes into the gaps (if security is not a concern)?

My reason for raising the issue of ALIGNMENT was concern about splitting
instructions at the edge of a hole during flowing. I see below that the
proposed method avoids that problem entirely.

> OTOH, if security is a concern, we can explore introducing a new
> syntax with a default behavior of zero-filling the gaps.

We already have FILL to cover that, so I wouldn't worry.

> > How would the target-specific relocations required to break code across
> > the hole be handled by ld? E.g. break a small AVR code loop (with 6-bit
> > relative addressing range) and you'll need a LJMP to bridge the hole,
> > and another with reversed loop conditionality to close the loop.
> > Multiply that task by all the possible relocs, and again by all the
> > possible CPU targets, and it's never-ending work for a software team for
> > life.
> > 
> 
> As I understand it, compilers generate references to objects within a section in the form
> 
>  .<input_section_name> + offset_within_section
> 
> Now when a section that spans 2 or more regions inserts holes/padding to
> prevent an object from straddling 2 regions, the offsets within the section
> to other objects will change. This means all the compiler-generated "section
> + offset" of all objects that come after the padding will need to be fixed
> up. It's really difficult to know which ones to fix up - the relocations are
> only on the section label, not the object in the section. So, what I'm
> proposing here will not split the input sections - input sections will move
> as a block.

Aha! That is a commendably inexpensive way to avoid a great deal of pain.
A little bit of SIZEOF and LENGTH arithmetic in ld easily predicts
whether the current input section will fit in the current region, and
the start address of the next region becomes the new base for offsets,
without the need for additional arithmetic. Very neat. (So long as we
size our input sections modestly.)
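
To make that arithmetic concrete, here is a rough Python model of the
flow. It is only a sketch under my own assumptions: the region names,
section names and sizes are invented, and it says nothing about how ld
would actually implement this. Each input section is placed whole; if it
does not fit in what is left of the current region, the location counter
jumps to the origin of the next region.

```python
# Sketch only: models "input sections move as a block" across regions.
# All names and sizes below are made up for illustration.

def flow_sections(regions, sections):
    """regions:  list of (name, origin, length), in fill order.
    sections: list of (name, size), in link order.
    Returns {section_name: (region_name, address)}.
    Raises MemoryError when a section fits in no remaining region."""
    placement = {}
    idx = 0                                   # index of the current region
    dot = regions[0][1] if regions else 0     # location counter
    for sec_name, size in sections:
        while True:
            if idx >= len(regions):
                raise MemoryError(f"{sec_name} does not fit in any remaining region")
            reg_name, origin, length = regions[idx]
            if dot + size <= origin + length:
                break                         # fits in the current region
            idx += 1                          # skip the hole
            if idx < len(regions):
                dot = regions[idx][1]         # new base: next region's origin
        placement[sec_name] = (reg_name, dot)
        dot += size
    return placement


regions = [("RAM1", 0x1000, 0x100), ("RAM2", 0x4000, 0x200)]
sections = [
    ("a.o(.text)", 0x80),
    ("b.o(.text)", 0x60),
    ("c.o(.text)", 0x40),   # does not fit in the 0x20 bytes left in RAM1
    ("d.o(.text)", 0x100),
]
placement = flow_sections(regions, sections)
```

In this example the last 0x20 bytes of RAM1 stay unused, which is the
price of never splitting an input section, and the reason for keeping
input sections modestly sized.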

...
> > If LMA can also be flowed around a hole, then runtime init code must be
> > able to handle not only non-contiguous delivery, but gapped pick-up. Has
> > the complexity of simultaneously handling different gaps in both been
> > considered?
> > 
> 
> I haven't thought about that. Can it be worked on the principle that when
> one specifies an LMA and there is user-written init code to copy blocks, the
> init code programmer knows the LMA gap layout and can handle the gaps
> accordingly?

I was playing devil's advocate there - the likelihood of gapped LMA
seems low in practice, as flash would mostly be larger than fast RAM.
It's just the worst case.

On many projects we used either a commercial or FOSS RTOS, and in each
case the init code was auto-generated. (Really nothing more than picking
up start/end addresses for read/write from the linker script, to use in
a single provided copy loop.) I have written my own init code less than
half the time, and there may be embedded developers out there who have
never done "bare metal" development. For them, once start and end labels,
including gap edges, are provided in the linker script, a small example
in the ld info manual would be the minimum needed.
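
As an illustration of what such an example might convey, here is a
table-driven copy modelled in Python rather than in target C. It is a
sketch under my own assumptions: the table layout (load start, load end,
run start) is hypothetical, and on a real target the entries would come
from whatever labels the linker script ends up providing. One entry per
contiguous piece covers gaps on both the load and the run side.

```python
# Sketch only: a Python model of gap-aware init copying.
# Table entries are (load_start, load_end, run_start); this layout is
# an assumption for illustration, not an existing ld feature.

def copy_segments(flash, ram, table):
    """Copy each [load_start, load_end) slice of flash to ram at run_start."""
    for load_start, load_end, run_start in table:
        size = load_end - load_start
        # Refuse entries that would run off either memory.
        if size < 0 or load_end > len(flash) or run_start + size > len(ram):
            raise ValueError("copy table entry out of range")
        ram[run_start:run_start + size] = flash[load_start:load_end]


flash = bytearray(64)
flash[0:8] = b"AAAAAAAA"      # first piece of the load image
flash[32:36] = b"BBBB"        # second piece, after a hole in the LMA
ram = bytearray(32)

# Two pieces: gapped pick-up (0 and 32) and gapped delivery (0 and 16).
table = [(0, 8, 0), (32, 36, 16)]
copy_segments(flash, ram, table)
```

The copy loop itself stays as simple as the single-block version; only
the table grows with the number of gaps.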

> It could already be the case that code from different
> non-contiguous ROMs is copied into RAM during startup. This, IMHO,
> is always specific to the particular embedded system being deployed.

OK, I'd thought that rare these days, as ROMs are so much bigger than
in my youth, but you did mention the case of overlays. It is easy to
imagine a separate ROM for one or several RAM-sized overlays. Then
overlay handling is as easy as manually handling gapped LMA, just done
in an overlay handler, rather than init.

With granularity equal to input sections, the proposal seems eminently
feasible, and an interesting project. I don't know what relocs might
ensue from bumping the ld location counter to the other side of a hole,
as when two input sections from one compile unit are separated to straddle
it, or whether ld would handle that without intervention. I'd be more
confident where the input sections are from separate compile units, and
connected only by globals.

I hope it goes well!

Erik