[PATCH] SPU allocator

Patrick Mansfield patmans@us.ibm.com
Thu Feb 21 16:00:00 GMT 2008


On Tue, Feb 19, 2008 at 06:17:09PM -0500, Jeff Johnston wrote:
> Patrick Mansfield wrote:
>> On Tue, Feb 19, 2008 at 10:46:20AM -0800, Patrick Mansfield wrote:
>>   
>>> On Mon, Feb 18, 2008 at 01:01:31PM -0800, Patrick Mansfield wrote:
>>>     
>>>> Hi Jeff -
>>>>
>>>>       
> 1. Have you rigorously tested this?  The malloc code in place has been
>    around for a while.  Have you looked at general performance in addition
>    to just the number of mallocs possible?  The current code is meant to
>    be a good compromise of

Yes, well tested functionally. I found the SPU sbrk bug (allocating data
that was currently in use on the stack) while testing this allocator, as
well as an alignment issue on one of the internal data structures.
This allocator is also the malloc supplied by the "SDK 3.0" release.

I should have posted a 16-byte allocation comparison too, since that is
where we see the biggest delta.

Performance was not a major goal. The main goal was to keep the size
down: going by your snippet from mallocr.c, we want a more
space-conserving malloc implementation.

I will try to write up some code and compare performance of the two
allocators.
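
A comparison along these lines could be sketched as follows. This is only
a rough outline, not the test that was actually run: the names
count_allocs and time_allocs are hypothetical, and the cap on the number
of blocks is there so the loop also terminates on a host with lots of
memory. Each block is linked through its own first bytes so that no
separate pointer table eats into the space being measured.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: allocate blocks of block_size bytes until malloc
   fails or max_blocks is reached, then free them all.  Returns the number
   of successful allocations.  Blocks are chained through their own first
   bytes, so block_size must be able to hold a pointer. */
static size_t count_allocs(size_t block_size, size_t max_blocks)
{
    void *head = NULL;
    size_t n = 0;

    if (block_size < sizeof(void *))
        return 0;

    while (n < max_blocks) {
        void *p = malloc(block_size);
        if (p == NULL)
            break;
        *(void **)p = head;     /* store link to the previous block */
        head = p;
        n++;
    }

    while (head != NULL) {      /* walk the chain and release everything */
        void *next = *(void **)head;
        free(head);
        head = next;
    }
    return n;
}

/* Hypothetical helper: CPU time in seconds, as reported by clock(), for
   `rounds` repetitions of the allocate-and-free cycle above. */
static double time_allocs(size_t block_size, size_t max_blocks, int rounds)
{
    clock_t start = clock();
    for (int r = 0; r < rounds; r++)
        (void)count_allocs(block_size, max_blocks);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Building this once against each allocator and calling, say,
count_allocs(16, ...) and time_allocs(16, ...) would give both the number
of 16-byte allocations that fit and a rough timing figure.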

Note: I will be out for the next two weeks, so it might be a while!

> everything:
>
>    ------- from mallocr.c -------
>    * Why use this malloc?
>
>    This is not the fastest, most space-conserving, most portable, or
>   most tunable malloc ever written. However it is among the fastest
>   while also being among the most space-conserving, portable and tunable.
>   Consistent balance across these factors results in a good general-purpose
>   allocator. For a high-level description, see
>       http://g.oswego.edu/dl/html/malloc.html
>   ------- end of snippet ---------
>
> 2. The malloc.h changes would need cleaning up.  I want us to move away
>    from all the special clauses.  You can either replace malloc.h for spu
>    entirely or else you can use machine/malloc.h to define the macros in
>    addition to some flags that malloc.h can test for.  For now, you will
>    have to leave in the Cygwin tests which I can remove later.  Note the
>    functions are also declared in stdlib.h.  If you are looking to
>    short-circuit internal calls, you need to consider this.

OK ... I will look into cleaner malloc.h changes.

> 3. You don't have thread locking macros.  This isn't a problem with
>    single threaded only, but it would make sense to think about it early
>    if future multi-threaded support is desired.

We (IBM as well as some Sony developers) have no plans for a
multi-threaded SPU, and I don't see it ever happening. There are
schedulers/kernels that run on the SPU, like SPURS and some other open
source code in progress, but they do not use any threading.

> 4. You don't have other mem object files such as vallocr replaced.  This
>    will cause you grief because libc/stdlib will build them based on the
>    original mallocr.c.

This is not a major problem (vallocr can't call stdlib malloc code) since
we build with -DMALLOC_PROVIDED. Note there are extraneous "ifdef
MALLOC_PROVIDED" checks in the current stdlib/mallocr.c, as nearly the
entire file is already wrapped by "ifdef MALLOC_PROVIDED".

I also checked the symbols in the corresponding libc.a: no vallocr is
defined. I only see:

[...]

lib_a-pvallocr.o:
00000000 D _dummy_mallocr

[...]
lib_a-valloc.o:

lib_a-vallocr.o:
00000000 D _dummy_mallocr
[...]
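
The reason only a placeholder symbol shows up can be illustrated with a
small sketch. It assumes the file is laid out roughly as described above,
with the real code on one side of a MALLOC_PROVIDED test and a dummy
variable on the other; the sketch_ names are hypothetical stand-ins (the
variable plays the role of _dummy_mallocr), and the macro is defined in
the file only so the example is self-contained rather than relying on
-DMALLOC_PROVIDED.

```c
#include <stddef.h>
#include <stdlib.h>

/* Normally supplied on the command line as -DMALLOC_PROVIDED; defined
   here only so the sketch builds on its own. */
#ifndef MALLOC_PROVIDED
#define MALLOC_PROVIDED 1
#endif

#ifdef MALLOC_PROVIDED
/* Placeholder so the object file is not empty.  Hypothetical stand-in for
   the _dummy_mallocr symbol seen in the listing. */
int sketch_dummy_mallocr = 1;
#else
/* Hypothetical stand-in for the allocator-dependent code, which is
   compiled out when the port provides its own malloc. */
void *sketch_vallocr(size_t nbytes)
{
    return malloc(nbytes);
}
#endif

/* Reports which branch was compiled: 1 means placeholder only. */
int sketch_malloc_is_provided(void)
{
#ifdef MALLOC_PROVIDED
    return sketch_dummy_mallocr;
#else
    return 0;
#endif
}
```

With the macro defined, the object contains the placeholder data symbol
and no sketch_vallocr, which matches the shape of the listing above.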

But I will add a valloc, and I guess pvalloc too, unless anyone objects :-/
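
For reference, the usual way to express these two on top of memalign
looks roughly like the sketch below. It is an illustration under stated
assumptions, not the code that will go into the patch: the sketch_ names
and SKETCH_PAGE_SIZE are hypothetical, the 4096-byte "page" is just a
placeholder value, and memalign is assumed to be declared in malloc.h (as
it is in newlib and glibc).

```c
#include <malloc.h>   /* memalign(); assumed available as in newlib/glibc */
#include <stddef.h>

/* Hypothetical page size; must be a power of two for the mask below. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Round nbytes up to the next multiple of the page size.  Overflow for
   sizes near SIZE_MAX is not handled in this sketch. */
static size_t sketch_round_up_to_page(size_t nbytes)
{
    return (nbytes + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* valloc-like: page-aligned block of at least nbytes. */
void *sketch_valloc(size_t nbytes)
{
    return memalign(SKETCH_PAGE_SIZE, nbytes);
}

/* pvalloc-like: page-aligned block whose size is also rounded up to a
   whole number of pages. */
void *sketch_pvalloc(size_t nbytes)
{
    return memalign(SKETCH_PAGE_SIZE, sketch_round_up_to_page(nbytes));
}
```

The only difference between the two is the size rounding; both return
memory that can be released with free().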

-- Patrick Mansfield


