On 22.02.17 15:28, Thomas Preudhomme wrote:
There has been some interest in the past in having syntactic support for
specifying mapping of an output section to multiple memory regions in the
GNU LD scripting language (eg.
https://sourceware.org/bugzilla/show_bug.cgi?id=14299). I would like to
propose a scheme here and welcome any feedback.
TL;DR: Detailed response begins after 6 paragraphs.
OK, in the absence of prior discussion, I'll just think aloud as I
correlate the proposal with my experience in three decades developing
embedded systems. Unfortunately, the one time an MMU was involved, that
was done by the time I became involved, but memory holes are all black.
The closest scenario I recall is one with disparate physical
memories, both on and off chip. I simply added a MEMORY region for each
such block, e.g. Flash, 16-bit SRAM, 8-bit SRAM, a couple of small ones
for specific memory-mapped system chips with bunches of config
registers, and maybe an FPGA in the mix. Add comments for device names
and the waitstate generator values, and the script serves as central
documentation too.
With that one-to-one region mapping, there was never any conflict over
where stuff should be located, and none were interchangeable. It is as
described by "some on-chip memory and some off-chip memory, but at
non-contiguous addresses" in the above link. And where we had both 8- and
16-bit SRAMs, it was most definitely consistent with "a region of
on-chip SRAM which performs better for code, and the remainder performs
better for data", except that using the wrong one was fatal rather than
merely inferior.
One issue I've encountered is detecting region overflow when multiple
output sections contribute to its content, but existing syntax supports
that, e.g.:
MEMORY
{
  flash  (rx)   : ORIGIN = 0,        LENGTH = 32K
  ram    (rw!x) : ORIGIN = 0x800060, LENGTH = 2K
  eeprom (rw!x) : ORIGIN = 0x810000, LENGTH = 1K
}
. = ASSERT (_etext + SIZEOF (.data) <= LENGTH (flash), "Error: .text + .data collectively overflow the flash memory.");
But the need to flow across memory holes never eventuated in practice,
as a modest chunk of on-chip RAM could always be used for e.g. sdata,
leaving no need for flowing. All other regions were always incompatible,
making flowing impossible.
...
If LMA is specified, the image (startup code etc.) most likely handles
the copying from load address to output section VMA.
Yes, it does. And in the generic init code I've encountered, it has just
been a single copy loop for e.g. bss, performing a contiguous block copy.
(And when I've written it, that was true too.)
Multiple segment spec means the output section can be part of more
than one segment and ‘fillexp’ simply fills the output section loaded
with the fill value.
Trans-hole flowing would also require a runtime copy loop for each
non-contiguous block, or a table-driven multi-block copier, with the
run-time table somehow initialised from the linker script. (I can
imagine using variables defined in the linker script, and the .RPT
assembler directive - maybe.)
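
To make the table-driven idea concrete, here is a toy model in Python rather than real startup code (which would be C or assembly). It assumes a flat address space held in a bytearray and a hypothetical table of (LMA, VMA, size) entries; in practice that table would have to be built from symbols defined in the linker script. The names CopyEntry and copy_blocks and all addresses are made up for illustration.

```python
from typing import NamedTuple


class CopyEntry(NamedTuple):
    lma: int   # load address of the block (where the image stores it)
    vma: int   # run-time address the block must be copied to
    size: int  # block length in bytes


def copy_blocks(memory, table):
    """Copy each table entry from its LMA to its VMA, one block at a time."""
    for entry in table:
        src = memory[entry.lma:entry.lma + entry.size]
        memory[entry.vma:entry.vma + entry.size] = src


# Toy flat address space (hypothetical layout): load image at 0..11,
# two RAM blocks at 32 and 48, separated by a hole at 40..47.
memory = bytearray(64)
memory[0:12] = b"ABCDEFGHIJKL"

table = [
    CopyEntry(lma=0, vma=32, size=8),  # first part fills the lower block
    CopyEntry(lma=8, vma=48, size=4),  # remainder continues past the hole
]
copy_blocks(memory, table)
```

The point of the sketch is only that one contiguous load image needs one table entry per non-contiguous destination block, where the usual init code gets away with a single loop.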
Now, this does not have a method to specify output section spanning multiple
memory regions. For example, if there are 2 RAM regions RAML and RAMU and
the user wants an output section to first fill RAML and then when RAML is
full, i.e. when the remaining space in RAML cannot accommodate a full input
section, start filling RAMU, the user has to split the sections into
multiple output sections. If we extend this syntax to specify multiple
output regions, we can make the linker map the output section to multiple
regions by filling the output region with input sections in the order
specified in the ‘output-section-command’ and when it's full (meaning when
the remaining gap in a region cannot accommodate one full input section), it
starts from the next output region.
This seems to be the alternate view of the problem of asking ld to flow
code around holes in a region, something it still can't do, IIRC. I
state it that way because two non-contiguous memory regions over which
code (or data) may be interchangeably flowed are equivalent to a single
region with a hole.
The proposal does seem to be a way to think about addressing that issue:
Eg.
MEMORY
{
RAML (rwx) : ORIGIN = 0x1FFF0000, LENGTH = 0x00010000
RAMU (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00040000
RAMZ (rwx) : ORIGIN = 0x20040000, LENGTH = 0x00040000
}
SECTIONS
{
.text 0x1000 : { *(.text) _etext = . ; }
.mdata :
AT ( ADDR (.text) + SIZEOF (.text) )
{ _data = . ; *(.data) *(.data.*); _edata = . ; } > RAML, RAMU, RAMZ
}
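
As a rough model of the fill rule described in the proposal, the Python sketch below places input sections strictly in order and abandons a region as soon as the next section no longer fits in what is left of it. It is not how ld works internally. The assumptions are mine: earlier regions are never revisited, alignment is ignored, and address ranges are half-open. The region values are taken from the MEMORY block above and the section sizes from the proposal's illustration; the names Region and place are hypothetical.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    origin: int
    length: int


def place(regions, sections):
    """Return (region, section, start, end) tuples; end is exclusive."""
    placed = []
    idx, used = 0, 0  # current region index and bytes consumed in it
    for name, size in sections:
        # Move on while the section does not fit in the space left;
        # assumption: a region, once left, is never filled further.
        while idx < len(regions) and used + size > regions[idx].length:
            idx, used = idx + 1, 0
        if idx == len(regions):
            raise ValueError(f"{name} does not fit in any remaining region")
        start = regions[idx].origin + used
        placed.append((regions[idx].name, name, start, start + size))
        used += size
    return placed


REGIONS = [
    Region("RAML", 0x1FFF0000, 0x00010000),
    Region("RAMU", 0x20000000, 0x00040000),
    Region("RAMZ", 0x20040000, 0x00040000),
]
SECTIONS = [
    (".data", 0x0000FFF0),
    (".data.a", 0x000000F0),
    (".data.b", 0x00003000),
    (".data.c", 0x00000200),
]

layout = place(REGIONS, SECTIONS)
for region, section, start, end in layout:
    print(f"{region}: 0x{start:08X} - 0x{end:08X}: {section}")
```

Under these assumptions .data leaves 0x10 bytes free at the top of RAML, too little for .data.a, so everything after it lands in RAMU. A real implementation would also have to decide whether a later, smaller section may backfill that gap.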
Without the need for new syntax or complex init code generators,
having gcc flow code across up to 5 pages of flash plus .lowtext and a
floating .hightext was compatible with the linker script and tests shown
here:
http://lists.nongnu.org/archive/html/avr-gcc-list/2012-12/msg00044.html
While details have faded from wet RAM, ISTR that holes were
manufacturable by not populating any of the 5 pages, which gcc sees as
named spaces. The gcc stuff was done in the AVR back end, IIRC, while an
implementation in ld would be generic.
Illustration:
Consider an example where we have the following input .data sections:
  .data   : size 0x0000FFF0
  .data.a : size 0x000000F0
  .data.b : size 0x00003000
  .data.c : size 0x00000200
With the above scheme, this will be mapped in the following way to RAML, RAMU
and RAMZ:
RAML : (0x1FFF0000 - 0x1FFFFFF0): .data
       (0x1FFFFFF0 - 0x1FFFFFFF): *** GAP ***
Would GAP use ALIGNMENT, or introduce a new parameter?