[ECOS] Why to signal condvar with mutex held?

Sergei Organov osv@topconrd.ru
Wed Dec 15 16:38:00 GMT 2004

sandeep <shimple0@yahoo.com> writes:
> > And I said I do understand what Nick said. I just don't believe it is
> > significant enough to be an argument against wait morphing.
> Nick> I considered this during the design phase and decided against it. This
> Nick> would be the only instance where such a thing was needed, and putting
> Nick> intimate details of the threads state into the condvar code would have
> Nick> made some configuration options more difficult to handle.
> i am trying to understand what could be reasons/pitfalls that might
> have been considered during that decision making. as such it is a
> Configurable os :) i guess, whenever he has some free time at hand, he
> can elaborate the pitfalls objectively. but surely wait morphing can
> be provided as a configurable option and kept as experimental feature
> for couple of months. if there are problems due to it, those should
> surface in large enough user base of eCos, otherwise it could be taken
> in as regular feature.
> since you have done some code modifications for that, maintainers shouldn't
> have  big issues with accepting your patch(es) for the same.

Well, I've got a response from the eCos kernel architect (Nick) saying
that wait morphing was considered and abandoned. That obviously doesn't
suggest that a patch implementing wait morphing would be welcome.

> > Do you know when the thread gets preempted by higher priority thread in
> > the Nick's case? There should be some external event for the high
> > priority thread to become ready to preempt those medium priority thread.
> > My case is just slightly different in that external event awakes medium
> > priority thread, not the high priority one.
> As i understood his original scenario. i am trying to put that here with an
> example.
> T2 (medium priority) is executing and moving towards acquiring the mutex. A
> high priority thread T1 becomes active due to some event and preempts T2. In
> course of it's execution, T1 signals T3 (low priority). later T1 takes some
> action(s) that causes it to go out of runqueue. at this point only T2 and T3
> are runnable.
> In existing scenario, mutex is in unlocked state and in fight b/w T2
> and T3, T2 will get it.
> in wait morphing condition, T3 is passed the ownership, so though T2 will run
> because of higher priority, it won't be able to get the mutex and get out of
> runqueue.
> is my understanding of things correct?

Not exactly in the last part, I'm afraid. See below for explanations.

> something that i don't quite get about the implementation of
> waitmorphing. can you please post your patch for waitmorphing?

Yes, I can, but it would take time to carefully separate it, as my
current source base has unrelated changes as well. It is simpler to
explain. In the wait morphing implementation, signal() does the
following:

1. Dequeue first thread from the condvar wait queue.
2. If mutex is locked, put this thread into the mutex wait queue.
3. Otherwise wakeup the thread and lock the mutex for this dequeued
   thread (make the dequeued thread own the mutex).

In fact, step (3) could be replaced by just waking the thread, though my
measurements show that the approach used is slightly faster.
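
The three steps can be sketched as a toy model. This is not the eCos
patch: the names Mutex, Condvar, signal, unlock and the hand_over flag
are hypothetical, threads are plain strings, and unlock() is assumed to
wake the first mutex waiter without giving it ownership.

```python
from collections import deque

class Mutex:
    def __init__(self):
        self.owner = None        # name of the owning thread, or None
        self.waiters = deque()   # threads blocked on the mutex

class Condvar:
    def __init__(self, mutex):
        self.mutex = mutex
        self.waiters = deque()   # threads blocked in wait()

def signal(cv, ready, hand_over=True):
    """Wait-morphing signal; returns the dequeued thread, or None."""
    if not cv.waiters:
        return None
    t = cv.waiters.popleft()            # step 1: dequeue first waiter
    if cv.mutex.owner is not None:
        cv.mutex.waiters.append(t)      # step 2: morph, thread stays asleep
    else:
        ready.append(t)                 # step 3: wake the thread ...
        if hand_over:
            cv.mutex.owner = t          # ... and make it own the mutex
    return t

def unlock(mutex, ready):
    """Release the mutex and wake the first waiter to compete for it."""
    mutex.owner = None
    if mutex.waiters:
        ready.append(mutex.waiters.popleft())
```

Calling signal() with hand_over=False models the variant of step (3)
that only wakes the thread.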

And now, after writing down the above, I suddenly realize that Nick's
scenario can't occur if signalling is performed with the mutex locked!
Indeed, if the mutex is locked by T1 when signalling, T3 is in the mutex
queue after signalling due to wait morphing; then T1 unlocks the mutex,
waking T3; then T1 eventually goes to sleep and we end up with an
unlocked mutex and two ready threads, T2 and T3. T2 runs due to its
higher priority and locks the mutex.

Please also notice that if step (3) doesn't lock the mutex for the
thread, then Nick's scenario won't happen even if signalling is
performed with the mutex unlocked.
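
The cases discussed here can be replayed in a small simulation. It is
only a sketch: run_scenario and the PRIORITY table are hypothetical
names, a lower number is assumed to mean higher priority, and unlock is
assumed to wake the first mutex waiter without handing it the mutex.

```python
PRIORITY = {"T1": 1, "T2": 2, "T3": 3}   # assumed: lower number runs first

def run_scenario(signal_with_mutex_held, hand_over):
    """Return the thread that owns the mutex once T2 has tried to lock it."""
    owner = "T1" if signal_with_mutex_held else None
    cond_queue = ["T3"]   # T3 waits on the condvar
    mutex_queue = []
    ready = ["T2"]        # T2 was preempted by T1 just before locking

    # T1 signals the condvar (steps 1-3).
    t = cond_queue.pop(0)
    if owner is not None:
        mutex_queue.append(t)          # step 2: morph onto the mutex queue
    else:
        ready.append(t)                # step 3: wake T3 ...
        if hand_over:
            owner = t                  # ... and optionally give it the mutex

    # T1 unlocks if it holds the mutex; the first waiter is only woken.
    if owner == "T1":
        owner = None
        if mutex_queue:
            ready.append(mutex_queue.pop(0))

    # T1 goes to sleep; the highest-priority ready thread tries to lock.
    runner = min(ready, key=PRIORITY.get)
    if owner is None:
        owner = runner
    return owner

for held, hand_over in [(True, True), (False, True), (False, False)]:
    print(held, hand_over, run_scenario(held, hand_over))
```

In this model T2 gets the mutex when T1 signals with the mutex held, and
also when the mutex is unlocked but step (3) only wakes T3. T2 finds the
mutex already owned by T3 only when the mutex is unlocked at signal time
and step (3) hands it over.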

> > I speak about guaranteed maximum delay for the thread to get from point
> > A in the code (before mutex lock) to point B (after mutex lock). And
> quite right.
> > this guaranteed maximum delay doesn't seem to be affected by wait
> > morphing, as even without wait morphing I can find a scenario that
> > results in the same delay as in Nick's scenario with wait morphing.
> please give that scenario. it would help us understand the things even
> better, irrespective of whether eCos gets waitmorphed or not. :)

I thought I'd already given a scenario with the same delay. In fact,
any scenario where T2 tries to lock the mutex while it is owned by
T3 will do. Anyway, as Nick's scenario doesn't in fact happen (see
above), there is no need to discuss it further.

> > heard really bad things may happen when broadcasting a condvar without
> > wait morphing.
> it will be good learning experience, if you can mention these bad things or
> provide link(s) on net that mention these.
> > Just imagine a bunch of threads simultaneously awake, run
> > on multiple processors and try to acquire the same locked mutex. Hard
> > stall on the bus I foresee.
> if mutex is locked, all will go to wait state again, put in wait
> queue, as they would in uniprocessor case.

Well, maybe "stall" is the wrong term here, but how much contention will
there be on the SMP bus due to all these threads trying to access the
mutex and its wait queue simultaneously? Can you predict how much time
it will take to finish putting them all back to sleep in the mutex
queue? Don't forget about cache synchronization, as every CPU must have
up-to-date values for the mutex lock and the mutex queue state.

On the other hand, with wait morphing a single CPU (the one that
happened to be running T1 when it signalled the condvar) does all the
work of moving threads from one queue to the other.

> > Why not? Wait morphing doesn't awake any threads so its behavior should
> surely, i don't understand waitmorphing business wrt broadcast then.

Instead of waking all the threads waiting in the condvar queue, wait
morphing moves all the threads from the condvar queue directly into the
mutex queue. Very simple.
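
A minimal sketch of the difference, assuming the broadcasting thread
holds the mutex. The function names are hypothetical and this is not
eCos code. Each function returns how many threads it made runnable.

```python
from collections import deque

def broadcast_wake_all(cond_queue, ready):
    """Conventional broadcast: every waiter is woken and must re-contend."""
    woken = 0
    while cond_queue:
        ready.append(cond_queue.popleft())
        woken += 1
    return woken

def broadcast_morph(cond_queue, mutex_queue):
    """Wait-morphing broadcast with the mutex held: nobody is woken."""
    while cond_queue:
        mutex_queue.append(cond_queue.popleft())
    return 0

waiters = ["A", "B", "C"]
ready, mutex_queue = [], deque()
print(broadcast_wake_all(deque(waiters), ready), ready)
print(broadcast_morph(deque(waiters), mutex_queue), list(mutex_queue))
```

With morphing, the waiters then leave the mutex queue one at a time as
the mutex is unlocked, instead of all becoming runnable at once.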

> > be the same on SMP. Wait morphing will put the threads into the mutex
> > queue in corresponding order no matter SMP or non-SMP system is in
> > use.
> how does it get the original wait queue in priority order

It doesn't. If the queues are FIFO, then it takes threads in FIFO order;
if the queues are priority-ordered, then it takes threads in priority
order. That's one of the good things about it that I tried to explain:
the order of wakeups matches the order you've selected for wait queues
at configuration time.
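
A short illustration of that point, with hypothetical names (dequeue,
morph_all, the "fifo"/"priority" policy strings) and the assumption that
a lower number means higher priority:

```python
def dequeue(queue, policy, priority):
    """Remove and return the next thread according to the queue policy."""
    if policy == "fifo":
        return queue.pop(0)
    # Priority policy: min() returns the earliest entry among equal priorities.
    best = min(queue, key=priority.get)
    queue.remove(best)
    return best

def morph_all(cond_queue, policy, priority):
    """Order in which waiters leave the condvar queue for the mutex queue."""
    order = []
    while cond_queue:
        order.append(dequeue(cond_queue, policy, priority))
    return order

prio = {"high": 1, "mid": 2, "low": 3}      # assumed: lower number = higher
arrival = ["low", "high", "mid"]            # order in which threads waited
print(morph_all(list(arrival), "fifo", prio))
print(morph_all(list(arrival), "priority", prio))
```

The first call keeps the arrival order and the second yields high, mid,
low, so the order always follows the configured queue policy.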

> (you had mentioned something about that in your earlier mail, but that
> didn't clear the things).

No, I believe I didn't.

