This is the mail archive of the ecos-devel@sources.redhat.com mailing list for the eCos project.



serious bug in synchronisation primitives


While going through the execution logs of an assert-enabled
build, I found messages complaining about 'Locking
mutex I already own' and 'Unlock mutex I do not own'. On
further analysis, the source of the problem turned out to be
situations like the following (explained with respect to
Cyg_Mutex::lock):

self()/get_current_thread() get their value from
current_thread[CYG_KERNEL_CPU_THIS()].
If the thread gets switched out in the middle of this
indexing, the result is a mess.

Consider a thread that is executing on processor 0 when it
makes the mutex lock call. It has just read the current
CPU number (the index into the array) when its timeslice
expires and it gets switched out. The next time it gets a
chance to run, it is on processor 1, and it continues from
where it left off. The id it gets for self is not its own,
but that of the thread now running on processor 0.

I hope the repercussions of this are clear without detailed
explanation. On noticing this, I did a quick scan of
mutex.cxx and the other sources in kernel/current/src/sync
and found a couple more synchronisation primitives affected
by the same bug.

The crux of the problem (as far as I can see into it
at the moment) is that accesses to arrays indexed by
CYG_KERNEL_CPU_THIS() should be done with the
scheduler lock held.
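
To make the sequence concrete, here is a minimal single-threaded simulation of the race and of the same lookup done with the scheduler lock held. It is only a sketch, not eCos code: Thread, this_cpu, sched_lock, timeslice_expires(), self_racy() and self_locked() are hypothetical stand-ins, and the "timeslice expiry" is simply a function call placed between the two steps of the lookup.

```cpp
#include <cassert>

// Hypothetical stand-ins for kernel state; none of this is eCos code.
struct Thread { int id; };

Thread thread_a{1};          // the thread calling lock()
Thread thread_b{2};          // some other runnable thread

Thread *current_thread[2];   // thread running on each CPU
int this_cpu;                // CPU the caller (thread_a) is running on
int sched_lock;              // > 0: thread switches are deferred

// Put thread_a on CPU 0 and thread_b on CPU 1, scheduler unlocked.
void reset() {
    current_thread[0] = &thread_a;
    current_thread[1] = &thread_b;
    this_cpu = 0;
    sched_lock = 0;
}

// Simulated timeslice expiry: the two threads swap CPUs, so the caller
// resumes on the other processor. Does nothing while the lock is held.
void timeslice_expires() {
    if (sched_lock > 0) return;
    Thread *tmp = current_thread[0];
    current_thread[0] = current_thread[1];
    current_thread[1] = tmp;
    this_cpu = 1 - this_cpu;
}

// Racy lookup: the CPU index is sampled, the thread migrates, and the
// stale index is then used to read the array.
Thread *self_racy() {
    int cpu = this_cpu;           // stands in for CYG_KERNEL_CPU_THIS()
    timeslice_expires();          // switched out between the two steps
    return current_thread[cpu];   // reads the old CPU's slot
}

// Same lookup with the (simulated) scheduler lock held across both steps.
Thread *self_locked() {
    ++sched_lock;
    int cpu = this_cpu;
    timeslice_expires();          // deferred: the lock is held
    Thread *self = current_thread[cpu];
    --sched_lock;
    return self;
}

// Runs both lookups from the same starting state.
void demo() {
    reset();
    assert(self_racy() == &thread_b);    // wrong: another thread's identity
    reset();
    assert(self_locked() == &thread_a);  // right: the caller's own identity
}
```

Calling demo() from a main() exercises both paths: the racy lookup hands thread_a the identity of thread_b, which is the kind of confusion behind the 'Locking mutex I already own' style asserts, while the locked lookup returns thread_a.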

Doing it under the scheduler lock might be costly in
some places (I still need to check this); perhaps
disabling interrupts could be used for the moment in
those situations.

Various asserts, tests and normal code need to be checked
for direct or indirect accesses to current_thread,
need_reschedule and thread_switches (the variables that I
can see directly in sched.hxx) outside the scheduler lock.

I hope that, with help from the list, a thorough scan can
be run to find instances of the problem (the earlier
thorough scan for direct/indirect use of get_sched_lock
is still pending).


mutex.cxx
---------
cyg_bool Cyg_Mutex::lock(void)
{
    CYG_REPORT_FUNCTYPE("returning %d");

    cyg_bool result = true;
    // self() is evaluated here, before the scheduler lock is taken
    Cyg_Thread *self = Cyg_Thread::self();

    // Prevent preemption
    Cyg_Scheduler::lock();
...
}

The same situation also appears in:
cyg_bool Cyg_Condition_Variable::wait_inner( Cyg_Mutex *mx )
cyg_bool Cyg_Condition_Variable::wait_inner( Cyg_Mutex *mx, cyg_tick_count timeout )

cnt_sem2.cxx
------------
cyg_bool Cyg_Counting_Semaphore2::wait()
cyg_bool Cyg_Counting_Semaphore2::wait( cyg_tick_count abs_timeout )

cnt_sem.cxx
-----------
cyg_bool Cyg_Counting_Semaphore::wait()
cyg_bool Cyg_Counting_Semaphore::wait( cyg_tick_count timeout )

bin_sem.cxx
-----------
cyg_bool Cyg_Binary_Semaphore::wait()
cyg_bool Cyg_Binary_Semaphore::wait( cyg_tick_count timeout )



		