[ECOS] Scheduler lock in Doug Lea's malloc implementation - be real careful

bob.koninckx@o-3s.com bob.koninckx@o-3s.com
Mon Sep 11 13:56:00 GMT 2006

>    Bob> Obviously, you need protection there. What I do not
>    Bob> understand is why it is implemented using scheduler
>    Bob> lock/unlock instead of using mutexes or semaphores. Also the
>    Bob> proposed use of memory pools is not really clear to me (and
>    Bob> it certainly raises portability issues when moving from one
>    Bob> OS to another). As a general rule, I _always_ consider
>    Bob> dynamic memory allocation an undeterministic process, so I
>    Bob> _never_ use it in time critical code. It nevertheless seems
>    Bob> odd to me that you shouldn't be allowed to use dynamic memory
>    Bob> allocation in low priority threads with no real-time
>    Bob> requirements at all, just because of timing requirements in
>    Bob> higher priority threads. I see no reason why scanning the
>    Bob> heap could not be interrupted (put on hold for a while) to
>    Bob> give a real-time thread time to finish its job. This is not
>    Bob> possible if you protect things with a big kernel lock, but
>    Bob> would be if searching for memory would be protected by a
>    Bob> mutex.
>    Bob> Any comments ? Am I making a fundamental reasoning error here
>    Bob> ? If not, I am going to try to change the implementation
>    Bob> and see if it solves our problem and propose a patch.
>You seem to have missed my point. Let me try again.

I don't think so. I think we are thinking very much along the same lines. Let me also try again :-)

>The current code behaves as follows:
>    lock scheduler
>    alloc() or free() or realloc()
>    unlock scheduler

Which locks the scheduler for an _undeterministic_ amount of time, depending on the size of the heap, its fragmentation etc...

>Let T_alloc be the worst case time spent in the alloc(), free() or
>realloc() routines, noting that the realloc() implementations in the
>heap will fail if the chunk cannot be resized and the alternative
>alloc()/memcpy()/free() implementation happens elsewhere. T_alloc will
>depend on factors like heap fragmentation so ideally would be measured
>in the context of a real application, but some of testcases in the
>memalloc package probably come close enough.
>What you are proposing is:
>    lock_mutex
>      lock scheduler
>      mutex lock code
>      unlock scheduler
>    alloc() or free() or realloc()
>    unlock_mutex
>      lock scheduler
>      mutex unlock code
>      unlock scheduler


>Let T_lock be the time spent in the mutex lock code and T_unlock the
>time spent in the mutex unlock code. These should probably be measured
>for the case where the mutex is contested, priority inversion
>protection triggers, and there are threads blocked on the mutex, not
>for the simple uncontested case. The other important number is T_max -
>the maximum time the system will spend with the scheduler locked
>because of other code. This is rather tricky to measure and will
>depend in part on the application and its I/O behaviour.
>If T_alloc is comparable to max(T_lock, T_unlock) then there is
>nothing to be gained by using a mutex rather than the scheduler lock.
>You are adding cpu cycle overhead to the memory allocation routines
>for no improvement in real-time performance.


>If T_alloc < T_max, then switching from a scheduler lock to a mutex
>lock/unlock will not change the maximum dispatch latency and will not
>affect the worst case real-time characteristics. It may still affect
>the average dispatch latency.


>If T_alloc == T_max, i.e. the worst case dispatch latency is caused by
>the memory allocation code, then we have a problem that needs


>I believe the author of the original code assumed that T_alloc was
>comparable to max(T_lock, T_unlock), or at least that (T_alloc < T_max).
>Hence using a scheduler lock saved some cpu cycles for every memory
>allocation operation without affecting real-time behaviour. I suspect
>that assumption is false but do not have any numbers to prove it.

>From what we see happening in our application, I am almost sure that it is false.

>There are several memory allocators, but the ones we are interested in
>are the fixed block, variable block and dlmalloc ones. If for all
>three we have T_alloc >> MAX(T_lock, T_unlock) then we should probably
>switch to using a mutex by default on the grounds of improving average
>dispatch latency. If for all three we have T_alloc == T_max then we
>must switch to using a mutex by default to reduce the maximum dispatch
>latency. Unfortunately this requires measuring T_max.

Anything we can do to provide you with the numbers you need ?

>If on the other hand we find that for say the fixed memory allocator
>T_alloc is much the same as MAX(T_lock, T_unlock) but dlmalloc behaves
>very differently then we would need to introduce new templates, with
>the old ones still using the scheduler lock and the new ones using a
>I do not have time right now to experiment and get some real numbers.
>However I would very much like to see such numbers before or
>accompanying any patch that changes the behaviour. Also it would be
>important to run not just the memory allocation package's own tests
>but also the uITRON ones.

I don't have that much time myself. I'd like to hear your ideas on a suitable test for measuring thing such that we can make a sound decision.

Thanks for your input



Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss

More information about the Ecos-discuss mailing list