This is the mail archive of the ecos-devel@sourceware.org mailing list for the eCos project.



Cyg_Mutex::check_this() fails


...Sending this from my private mail account because the eCos mailserver
blocks mail containing confidentiality disclaimers, which I can't prevent
when sending from my work mail account... :( :(

--------------------oOo--------------------

Hello folks,

I have a condition variable, c, tied to a mutex, m, and used like this with
a FIFO, f:

void producer(void)
{
    cyg_mutex_lock(&m);
    copy_to_fifo(&f, some_data);
    cyg_cond_signal(&c);
    cyg_mutex_unlock(&m);
}

void consumer_thread(cyg_addrword_t data) {
    cyg_mutex_lock(&m);
    while (1) {
        while (fifo_empty(&f)) {
            cyg_cond_wait(&c);
        }

        // m is locked here.

        // Empty FIFO.
        while (!fifo_empty(&f)) {
            copy_from_fifo(&f, &some_data);
            cyg_mutex_unlock(&m);

            // Do something with some_data
            ...

            cyg_mutex_lock(&m);
        }
    }
}

The following description refers to line numbers of rev. 1.15 of mutex.cxx.

When the system is under heavy interrupt load, threads get scheduled in
and out more frequently than usual. Under these circumstances, I
sometimes get an assertion failure (cyg_assert_msg()) originating from line
197 of mutex.cxx, which is in Cyg_Mutex::check_this(). Placing a breakpoint
in this function reveals that it happens when the consumer is about to wake
up, that is, at line 651, in the second half of
Cyg_Condition_Variable::wait_inner( Cyg_Mutex *mx ).

A closer look at wait_inner() shows that when CYG_ASSERTCLASS( mx, "Corrupt
mutex") is invoked, the scheduler is not locked, which in turn means that
Cyg_Mutex::check_this() line 197 is tested non-atomically. Line 197
contains:
    if ( locked && owner == NULL ) return false;

So if the preemptive scheduler schedules the caller of cyg_cond_wait() out
in between the test of "locked" and the test of "owner == NULL", and the
mutex state changes while the caller is scheduled out, we have a problem.
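
To make the interleaving concrete, here is a minimal host-side sketch. It is
not eCos code: MockMutex, check_atomic(), check_torn() and unlock_by_owner()
are hypothetical stand-ins that only model the two fields and the test
discussed above, and the "preemption" is a plain function call placed between
the two reads.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the two Cyg_Mutex fields discussed above.
struct MockMutex {
    bool locked;
    const void *owner;
};

// The consistency test evaluated in one piece, as if the scheduler
// were locked around it.
static bool check_atomic(const MockMutex &mx) {
    if (mx.locked && mx.owner == NULL) return false;
    if (!mx.locked && mx.owner != NULL) return false;
    return true;
}

// What the owning thread does when it releases the mutex (assumed to
// happen atomically from the point of view of other threads).
static void unlock_by_owner(MockMutex &mx) {
    assert(mx.locked);
    mx.locked = false;
    mx.owner = NULL;
}

// The same test with the two reads separated. The preempt callback
// plays the role of another thread running between them.
template <typename Preempt>
static bool check_torn(MockMutex &mx, Preempt preempt) {
    bool locked_snapshot = mx.locked;         // first read
    preempt(mx);                              // caller is scheduled out here
    bool owner_is_null = (mx.owner == NULL);  // second read
    if (locked_snapshot && owner_is_null) return false;
    return true;
}
```

Starting from a mutex that is locked and owned, check_torn(mx,
unlock_by_owner) returns false even though check_atomic(mx) returns true
both before and after the call, so the mutex was never actually corrupt.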

As I see it, CYG_ASSERTCLASS(some_obj, "some message") serves two purposes:
  1) check that some_obj is non-NULL and
  2) check that some_obj->check_this() returns TRUE.

IMO, only the first check needs to be made by wait_inner(), because line
#677 attempts to reacquire the mutex (mx->lock()), which itself performs the
check_this() check - and with the scheduler locked.

There are other places in mutex.cxx where CYG_ASSERTCLASS(Cyg_Mutex,
"message") is invoked without the scheduler locked, but I can't judge
whether these are OK or not.

The bottom line is that I suggest changing line #651 of mutex.cxx from
  CYG_ASSERTCLASS( mx, "Corrupt mutex"); to
  CYG_ASSERT(mx, "Invalid mutex pointer"); /* Or some other message */
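
The difference between the two forms can be sketched as follows. This is
again a host-side illustration under my own assumptions, not the real eCos
macros: SketchMutex, pointer_only_ok(), class_check_ok() and the two
SKETCH_* macros are hypothetical names that only mirror the two purposes
listed above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a class that offers a check_this() method.
struct SketchMutex {
    bool locked;
    const void *owner;

    bool check_this() const {
        if (locked && owner == NULL) return false;
        if (!locked && owner != NULL) return false;
        return true;
    }
};

// Purpose 1 only: the pointer must be non-NULL.
static bool pointer_only_ok(const SketchMutex *mx) {
    return mx != NULL;
}

// Purposes 1 and 2: non-NULL pointer and a passing check_this().
static bool class_check_ok(const SketchMutex *mx) {
    return mx != NULL && mx->check_this();
}

// Assertion wrappers in the spirit of the two forms above.
#define SKETCH_ASSERT(ptr, msg)      assert(pointer_only_ok(ptr) && (msg))
#define SKETCH_ASSERTCLASS(ptr, msg) assert(class_check_ok(ptr) && (msg))
```

For a value pair such as { true, NULL }, which is what a torn read could
observe, class_check_ok() fails while pointer_only_ok() still passes. A NULL
pointer is rejected by both.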

Or is there anything fundamental I have missed here?
Comments are appreciated.

Thanks in advance,
René Schipp von Branitz Nielsen
Vitesse Semiconductor Corporation


