Reduce stack usage of _vfiprintf_r()

Joel Sherrill joel.sherrill@oarcorp.com
Wed Oct 10 19:33:00 GMT 2012


On 10/10/2012 12:05 PM, Freddie Chopin wrote:
> On 2012-10-10 18:25, Corinna Vinschen wrote:
>>> In case of my code I
>>> don't expect it to be ever called, because I don't use unbuffered
>>> streams.
>> Now I'm puzzled.  If you're not using unbuffered IO, how come that
>> you notice a difference, given that the code in question is only
>> called for unbuffered IO?!?
>    From looking at the code you may come to the conclusion that it does not
> get called if your stream is buffered, but - as I wrote in my first
> message - the __sbprintf() gets inlined (it's static and not used
> anywhere else, it will always be inlined with optimization enabled) in
> _vfiprintf_r(), thus the 1024 byte buffer on stack is allocated on EVERY
> entry to _vfiprintf_r() - no matter what the stream is.
>
> With my change, the dynamic allocation is performed only when the
> execution actually reaches the code path.
>
> Another solution could be to make __sbprintf() not-static, so that it
> would not be inlined.
>
> I would never complain if the allocation happened only for
> unbuffered streams, but it doesn't...
>
>> Keep in mind that we have to serve targets with size constraints as well
>> as targets which go for speed.  If you have a lot of output to an
>> unbuffered stream like stderr, calling malloc may slow down output
>> noticeably.
> You can't expect unbuffered I/O to be fast... As far as I'm concerned it
> can be as slow as 1bps if it does not allocate 1kB on the stack (; On
> Cortex-M3 dynamic allocation takes a few hundred cycles, so I guess that
> it's not a significant problem. By "overhead" I actually meant that
> allocation of 1024 bytes with malloc() can take up more memory, I was
> not referring to speed, as that's not the problem here.
>
> Bottom line: newlib _IS_ too big for ARM microcontrollers, that's why
> most people from the embedded world don't use anything more than the math
> library. That's why commercial IDE/toolchains using GCC for such devices
> have their own - smaller - version of libc, not newlib (just to name
> CrossWorks and CodeRed). That's why AVR microcontrollers have their own
> avr-libc, which would probably better suit ARM microcontrollers if it
> was not targeted specifically at the AVR architecture.
One thing to keep in mind is that many of these other libc
implementations are far from complete. They implement a
small subset of capabilities.

Newlib aims for high compatibility with standards while still
being suitable for use in embedded systems.

As you note, avr-libc focuses heavily on AVRs with little (no?)
concern for other CPU architectures. My recollection is that
it also is a libc subset. Different project goal.

Corinna's suggestions to lower the buffer size or move the routine
so it isn't inlined would both be acceptable on a first-order pass.
It may make sense to do both.
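
To make the not-inlined option concrete, here is a rough sketch. It is not
the newlib code: the names unbuffered_path(), print_sketch() and
SKETCH_BUFSIZ are made up for illustration, and it assumes a compiler that
accepts GCC's noinline function attribute. The point is only that the large
buffer stays in the helper's own stack frame, so the common entry point does
not pay for it on every call.

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_BUFSIZ 1024  /* hypothetical; stands in for the 1024-byte buffer */

/* Hypothetical stand-in for the unbuffered-stream helper.
 * noinline (a GCC attribute) keeps buf[] in this function's own frame,
 * so it is only allocated when this path is actually taken. */
static __attribute__((noinline)) int
unbuffered_path(FILE *fp, const char *msg)
{
    char buf[SKETCH_BUFSIZ];
    size_t len = strlen(msg);

    if (len >= sizeof buf)
        len = sizeof buf - 1;   /* truncate to fit the local buffer */
    memcpy(buf, msg, len);
    buf[len] = '\0';

    return fputs(buf, fp) < 0 ? -1 : (int)len;
}

/* Hypothetical stand-in for the common entry point. Its frame stays
 * small because the helper above cannot be folded into it. */
int
print_sketch(FILE *fp, const char *msg, int is_unbuffered)
{
    if (is_unbuffered)
        return unbuffered_path(fp, msg);

    return fputs(msg, fp) < 0 ? -1 : (int)strlen(msg);
}
```

Building with GCC's -fstack-usage option is one way to check the per-function
frame sizes before and after such a change.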

Another design consideration which sometimes comes into
play is to limit or forbid malloc() calls after the program completes
initialization.  The malloc() solution would push against this
rule and require analysis to ensure that all paths free the
memory. [1]
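
For illustration only, the kind of structure that makes the "all paths free
the memory" analysis easy is a single exit point. The sketch below is not
taken from the patch; emit_with_heap_buffer() and SKETCH_BUFSIZ are
hypothetical names, and the body just copies a string where the real routine
would format output.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BUFSIZ 1024  /* hypothetical size of the temporary buffer */

/* Hypothetical helper: allocates the temporary buffer on demand and
 * releases it through one exit label, so every path after a successful
 * malloc() reaches free(). */
int
emit_with_heap_buffer(FILE *fp, const char *msg)
{
    int ret = -1;
    size_t len;
    char *buf = malloc(SKETCH_BUFSIZ);

    if (buf == NULL)
        return -1;              /* nothing was allocated, nothing to free */

    len = strlen(msg);
    if (len >= SKETCH_BUFSIZ)
        goto out;               /* error path still goes through free() */

    memcpy(buf, msg, len + 1);  /* copy including the terminating NUL */
    if (fputs(buf, fp) < 0)
        goto out;

    ret = (int)len;

out:
    free(buf);
    return ret;
}
```

A system that forbids malloc() after initialization would still reject this,
which is why the stack-based options may fit such targets better.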

Sorry for the ramble.

[1] Disclaimer. I didn't review the patch in detail. I am
commenting more on how picking different top level
design rules and goals can influence the appropriateness
of a potential solution.


> Regards,
> FCh


