Help porting newlib to a new CPU architecture (sorta)

Wed Jul 7 20:23:46 GMT 2021

Greetings,

On 7/7/21 2:43 PM, Hans-Bernhard Bröker wrote:
> On 06.07.2021 at 22:46, Orlando Arias wrote:
> 
>> Consider the AVR architecture, where program and data spaces have
>> distinct address spaces. We have a pointer to a string literal that
>> resides in program memory.
> 
> You're already mixing stuff up again.  The memory C string literals are
> in is, by definition, _not_ program memory.  It's read-only data memory.
>  That distinction is crucial.
> 
> Small-ish embedded CPUs do not usually implement the strict Harvard
> architecture principle, precisely because that does not support constant
> data.  A strict Harvard would have all data, including the
> const-qualified parts, in RAM, and initialize it all by running a very
> boring piece of program that just writes it all using immediate
> operands.  const data would thus consume normal RAM, without any
> write-protection by the hardware at all.

At the risk of further derailing the initial conversation, I feel like
there is some misunderstanding here about the AVR architecture. Address 0
in program memory contains the code that executes as part of the reset
vector. Address 0 in data memory is a mirror of r0. There are two
physically different address spaces in that architecture. This is very
explicitly stated in the datasheet for any megaAVR or tinyAVR
microcontroller. The C compiler (gcc) treats (void*)0 as address 0 in
data memory. To initialize the .data section, the C runtime has to copy
data from one address space to a different address space. This is where
the lpm instruction comes into play: it allows you to load data across
physical address spaces. There is [unfortunately, we can debate] no
remapping/mirroring that takes place.
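
To make the startup copy concrete, here is a minimal host-side model,
not real AVR startup code. Two separate arrays stand in for the two
address spaces; the names flash, sram, lpm_model and copy_data_section,
the sizes, and the byte values are all made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side model only: two arrays stand in for the two physically
 * distinct address spaces. Sizes and contents are hypothetical. */
#define FLASH_SIZE 64u
#define SRAM_SIZE  64u

static const uint8_t flash[FLASH_SIZE] = {
    0x0c, 0x94,        /* addresses 0..1: stand-in for reset vector code */
    'h', 'i', '\0'     /* addresses 2..4: load image of .data */
};
static uint8_t sram[SRAM_SIZE];  /* data space, also indexed from 0 */

/* Models lpm: reads one byte out of the program address space. */
uint8_t lpm_model(uint16_t flash_addr)
{
    return flash[flash_addr];
}

/* Models what the C runtime does for .data: read each byte from the
 * program address space and store it into the data address space. */
void copy_data_section(uint16_t load_addr, uint16_t run_addr, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        sram[run_addr + i] = lpm_model((uint16_t)(load_addr + i));
    }
}
```

Calling copy_data_section(2, 8, 3) leaves the bytes 'h', 'i', 0 at data
addresses 8..10, while data address 2 is untouched: the same number
names a different byte in each space.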

Now, in AVR, as you mention, C string literals are expected to be in data
memory, so they need to be copied over. Because of limited SRAM,
however, the compiler provides an extension to the C language to keep
string literals in the program memory address space. String literals
stored this way are not copied over to SRAM by the runtime. Declaring
the literal as:

const char* m = "hello, world!\n";

is not enough to keep it in program memory. You have to utilize the
PROGMEM macro, applied to the array itself rather than to a pointer
[otherwise only the pointer ends up in program memory]:

const char m[] PROGMEM = "hello, world!\n";

which actually expands to __attribute__((__progmem__)) in avr-libc. To
access it, you need to utilize very specific macros/functions
since the load has to be done with the lpm instruction. It may be
confusing looking at a flat dump of the binary, since gcc still treats
the end result as a "flat single address space" but in reality, that is
not how the hardware operates. There are two physically distinct address
spaces, and addresses between them share nothing in common.
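
As a rough sketch of what such an access looks like, the following
reads a string out of program memory byte by byte with avr-libc's
pgm_read_byte(). The names greeting and copy_from_progmem are
hypothetical, and the #else branch is only a host fallback I added so
the sketch compiles off-target; it does not model lpm. The array form
is used so that the characters themselves land in program memory.

```c
#include <stddef.h>

#ifdef __AVR__
#include <avr/pgmspace.h>   /* real PROGMEM and pgm_read_byte() */
#else
/* Host fallback (assumption for illustration): a single address space,
 * so PROGMEM is a no-op and a plain load stands in for lpm. */
#define PROGMEM
#define pgm_read_byte(p) (*(const unsigned char *)(p))
#endif

/* Array form: the characters themselves are placed in program memory. */
const char greeting[] PROGMEM = "hello, world!\n";

/* Copies a NUL-terminated string from program memory into an SRAM
 * buffer and returns the number of characters copied, excluding the
 * terminator. Truncates if the buffer is too small; dst_size must be
 * at least 1. */
size_t copy_from_progmem(char *dst, size_t dst_size, const char *src)
{
    size_t n = 0;
    while (n + 1 < dst_size) {
        char c = (char)pgm_read_byte(src + n);
        if (c == '\0') {
            break;
        }
        dst[n++] = c;
    }
    dst[n] = '\0';
    return n;
}
```

Once the bytes are in an SRAM buffer, the ordinary string functions can
be used on them. avr-libc also ships _P variants such as strcpy_P() and
strlen_P() that do this kind of access for you.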

This is in contrast to something like a Cortex-M based core, where
address 0 contains the initial value for the main stack pointer. The C
runtime still has to initialize .data using information from a read only
memory [usually flash]. However, this read only memory shares the same
address space as RAM. Yes, the Cortex-M core has multiple AMBA bus ports
[AHB-Lite on the M3/M4, AXI on the M7] to connect into a bus matrix, but
the memory system is still unified.
Both program memory [flash/ROM/FeRAM...] and data memory [SRAM...] are
in the same address space.

You can declare something like:

const char* m = "hello, world!\n";

and the compiler is smart enough to keep that data in the read-only
portions of memory [namely flash/ROM/FeRAM...]. It will not be copied
over to SRAM by the C runtime. Accesses and references will be performed
using the ldr* family of instructions. In fact, the C compiler will
embed large integer literals in program code, and load them directly
from read-only memory into registers. This is because there is a limit
on how large an integer literal can be encoded in a mov
instruction. This is also how things like jump tables are implemented by
gcc on Arm architectures [both A and M profiles; I cannot say for R
profiles since I have not used them, but I imagine it is the same].
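
For illustration, the two cases above in plain standard C. Nothing here
is target-specific; message_length and wide_constant are hypothetical
names, and the instructions mentioned in the comments are what a
compiler may emit, not a guarantee.

```c
#include <stdint.h>
#include <string.h>

/* The characters of the literal can stay in read-only memory. The
 * pointer variable m itself is ordinary initialized data. */
const char *m = "hello, world!\n";

/* Standard library routines work on the literal directly, because it
 * lives in the same address space as everything else. */
size_t message_length(void)
{
    return strlen(m);
}

/* A constant too wide for a single mov immediate. On a Cortex-M target
 * the compiler may load it PC-relative from a literal pool or build it
 * with movw/movt; which one depends on compiler, target, and flags. */
uint32_t wide_constant(void)
{
    return 0xDEADBEEFu;
}
```

Compiling this with an Arm cross compiler and inspecting the assembly
output is a quick way to see which form your toolchain picks.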

> Micro controller designers have pulled different kinds of tricks to get
> around the need to have constants directly in ROM, ranging from the
> simple loop-hole instruction that does read from program memory anyway
> (like the 8051's MOVC), to various kinds of mirroring schemes that just
> map ROM into data space, essentially breaking the Harvard architecture
> rather fundamentally.

I have seen the mirroring scheme at work before. The STM32F4
microcontrollers [Cortex-M4F cores], for example, map internal flash to
both address 0 and address 0x08000000. SRAM begins at 0x20000000 as
the Armv7-M architecture mandates, followed by a bit banding region for
SRAM. From the perspective of the Cortex core, this is all in a single,
unified address space. Yes, both flash and SRAM are different memory
types, with different characteristics and power, clock, and access
requirements, but they all lie in the same address space. This is unlike
AVR, where no such schemes are available.
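
Since all of this is just address arithmetic within one space, it can
be sketched in a few lines. The 0x22000000 alias base and the
32-bytes-per-byte formula are the usual Cortex-M3/M4 bit-band mapping,
which is not spelled out above; the function names are hypothetical,
and whether flash actually appears at address 0 depends on the boot
configuration of the part.

```c
#include <stdint.h>

#define FLASH_ALIAS_BASE   0x00000000u  /* flash as seen at address 0 */
#define FLASH_BASE         0x08000000u  /* flash at its native address */
#define SRAM_BASE          0x20000000u
#define SRAM_BITBAND_BASE  0x22000000u  /* assumption: usual M3/M4 base */

/* Word address in the bit-band alias region that maps to one bit of
 * SRAM: every SRAM byte expands to 32 alias bytes, every bit to one
 * 32-bit word. */
uint32_t sram_bitband_alias(uint32_t byte_addr, unsigned bit)
{
    uint32_t byte_offset = byte_addr - SRAM_BASE;
    return SRAM_BITBAND_BASE + byte_offset * 32u + bit * 4u;
}

/* The address at which the same flash byte is visible through the
 * mirror at address 0. */
uint32_t flash_boot_alias(uint32_t flash_addr)
{
    return flash_addr - FLASH_BASE + FLASH_ALIAS_BASE;
}
```

For example, bit 3 of the byte at 0x20000001 maps to the alias word at
0x2200002C, and the flash byte at 0x08000100 is also visible at
0x00000100. All four addresses are reachable with the same load and
store instructions.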

This also has the side effect that you cannot really do code injection
in AVR. You can copy as much shellcode as you want to SRAM, but you will
not be able to execute it. The only way around this is for code already
executing from the bootloader section of program memory to copy the
payload into program memory and then jump to it [and this requires a
rather convoluted process]. In Arm Cortex-M cores, however, you can have
code execute from SRAM as if it were executing from a read-only memory
[MPU permissions notwithstanding]. OpenOCD does this a lot, actually,
when dealing with Arm-based microcontrollers. In order to load a program
into flash, it injects code into SRAM which configures the [memory
mapped] flash controller for the core you are working with to allow for
writes, then has the flash controller store the program. Because of how
the address map in an Armv7-M [and Armv8-M for that matter] core is
structured, the end result is the program code available starting at
address 0 [with the initial vector table at that location].

> But that's ultimately a problem for the implementer of the C compiler
> and run-time library to address, if they decide to try doing that on
> such small architectures.
> 
>> The problem with this code is that we are treating a as a pointer in
>> data memory. Declaring a to be PROGMEM does not help. We actually need
>> to rewrite the code to force the compiler to use the proper instruction:
> That's what you get for throwing Standard C compatibility out the window
> by declaring that string constant using a compiler extension like PROGMEM.
> 
> Generally the compiler would be required by the Standard to implement
> "generic pointers" that can reach _all_ kinds of data defined without
> use of non-standard means.  If it doesn't do that, it is by definition
> not a C compiler.  Which can be fine, e.g. if the architecture just
> cannot have a correct C implementation otherwise, or only a horribly
> inefficient one.
> 
> But porting a generic standard C library like newlib or glibc onto a
> platform that needs non-standard compiler extensions just to emulate
> strcmp() may quickly turn into a lost cause.
> 

Except that you need to do this, because it is how the architecture
works. If you do not care about conserving SRAM in AVR, you can declare
your literals as const. The compiler will do its thing and assume
constness for optimization purposes, but the runtime will happily copy
them over to SRAM at startup and you can use your standard C library
functions. Now, if you want to be more conscious about your SRAM usage,
you need to use the non-standard means I mentioned. The fact that there
are two physically distinct address spaces requires that.

Cheers,
Orlando.
