Help porting newlib to a new CPU architecture (sorta)

Hans-Bernhard Bröker HBBroeker@t-online.de
Wed Jul 7 18:43:36 GMT 2021


On 06.07.2021 at 22:46, Orlando Arias wrote:

 > Consider the AVR architecture, where program and data spaces have
 > distinct address spaces. We have a pointer to a string literal that
 > resides in program memory.

You're already mixing things up again.  The memory that C string
literals live in is, by definition, _not_ program memory.  It's
read-only data memory.  That distinction is crucial.

Small-ish embedded CPUs do not usually implement the strict Harvard
architecture principle, precisely because that does not support constant
data.  A strict Harvard machine would have all data, including the
const-qualified parts, in RAM, and would initialize it all by running a
very boring piece of program that just writes it all using immediate
operands.  const data would thus consume normal RAM, without any
write-protection by the hardware at all.
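
To make that "very boring piece of program" concrete, here is a minimal
sketch in portable C.  The names greeting and init_const_data are
hypothetical, invented for this illustration; real startup code would be
emitted by the toolchain, not written by hand.

```c
#include <assert.h>

/* On a strict Harvard machine this "constant" cannot live in ROM.
   It is ordinary, writable RAM (hypothetical name, illustration only). */
static char greeting[6];

/* Startup routine: every store below stands for a "load immediate,
   store to RAM" instruction pair.  The characters exist only as
   immediate operands inside the instruction stream. */
static void init_const_data(void)
{
    greeting[0] = 'h';
    greeting[1] = 'e';
    greeting[2] = 'l';
    greeting[3] = 'l';
    greeting[4] = 'o';
    greeting[5] = '\0';
    /* Host-side sanity check that the string is terminated. */
    assert(greeting[5] == '\0');
}

/* Nothing in the hardware stops a stray write afterwards. */
static void stray_write(void)
{
    greeting[0] = 'j';
}
```

Calling init_const_data() leaves "hello" in RAM, and a later
stray_write() silently turns it into "jello", which is the missing
write protection described above.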

Microcontroller designers have pulled different kinds of tricks to
satisfy the need to have constants directly in ROM, ranging from a
simple loophole instruction that does read from program memory anyway
(like the 8051's MOVC), to various kinds of mirroring schemes that just
map ROM into data space, essentially breaking the Harvard architecture
rather fundamentally.

But that's ultimately a problem for the implementer of the C compiler 
and run-time library to address, if they decide to try doing that on 
such small architectures.

 > The problem with this code is that we are treating a as a pointer in
 > data memory. Declaring a to be PROGMEM does not help. We actually need
 > to rewrite the code to force the compiler to use the proper instruction:

That's what you get for throwing Standard C compatibility out the window
by declaring that string constant using a compiler extension like PROGMEM.

Generally the compiler would be required by the Standard to implement 
"generic pointers" that can reach _all_ kinds of data defined without 
use of non-standard means.  If it doesn't do that, it is by definition 
not a C compiler.  Which can be fine, e.g. if the architecture just 
cannot have a correct C implementation otherwise, or only a horribly 
inefficient one.
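
As a rough illustration of what such a "generic pointer" could look
like, here is a host-side model in plain C.  Everything in it is an
assumption made for the sketch: the two arrays stand in for ROM and RAM,
and the names generic_ptr, gp_read and gp_strcmp are hypothetical.  It
is not how any particular compiler implements this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a Harvard machine: two separate memories,
   each with its own addresses starting at 0. */
enum mem_space { SPACE_DATA, SPACE_CODE };

#define MEM_SIZE 16

static const uint8_t code_mem[MEM_SIZE] = "abc"; /* stands in for ROM */
static uint8_t data_mem[MEM_SIZE] = "abd";       /* stands in for RAM */

/* A generic pointer: an address plus a tag naming the memory it
   refers to.  The tag is what lets one pointer type reach both. */
struct generic_ptr {
    enum mem_space space;
    uint16_t addr;
};

/* Every dereference dispatches on the tag.  On real hardware the two
   branches would be two different load instructions. */
static uint8_t gp_read(struct generic_ptr p)
{
    assert(p.addr < MEM_SIZE);
    return p.space == SPACE_CODE ? code_mem[p.addr] : data_mem[p.addr];
}

/* strcmp() written once, working for any mix of memory spaces. */
static int gp_strcmp(struct generic_ptr a, struct generic_ptr b)
{
    for (;; a.addr++, b.addr++) {
        uint8_t ca = gp_read(a);
        uint8_t cb = gp_read(b);
        if (ca != cb)
            return ca < cb ? -1 : 1;
        if (ca == 0)
            return 0;
    }
}
```

Comparing { SPACE_CODE, 0 } with { SPACE_DATA, 0 } compares "abc" in the
ROM model against "abd" in the RAM model, even though both pointers
carry the same numeric address.  The price is a tag check on every
access, which is the kind of inefficiency mentioned above.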

But porting a generic standard C library like newlib or glibc onto a 
platform that needs non-standard compiler extensions just to emulate 
strcmp() may quickly turn into a lost cause.

 > Now, I believe that doing something like (char*)fn_ptr
 > in C is either undefined behavior

It quite explicitly is.

