This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



[PATCH] correction for PPC __compare_and_swap



This patch corrects an error in PPC __compare_and_swap.

An isync is necessary after acquisition of a lock to discard all prefetched
instructions.  Page 335 of The PowerPC Architecture:  A Specification For A
New Family Of RISC Processors states the following:  The "sync" instruction
is execution synchronizing.  It is not context synchronizing, and therefore
need not discard prefetched instructions.  End of quote.  For context
synchronization, see page 371, which lists rfi, sc and isync as the
instructions that can be used.

What can happen is that the processor speculatively loads values into
registers while it is still acquiring the lock.  Those values may be stale,
because another processor still owns the lock and is modifying the data
that the lock protects.  When the acquiring processor finally succeeds in
taking the lock, it continues on with the data it had already loaded.  The
sync at the end does not cause the prefetched data to be discarded.  The
isync causes all the speculative execution to be thrown away and
re-executed.
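As a rough illustration of the ordering described above, here is a minimal
sketch in portable C11 atomics rather than PowerPC assembly.  C11 atomics
postdate this patch, and the name cas_acquire is hypothetical.  The point is
only that the swap carries acquire ordering on success, so loads that follow
it in program order cannot be satisfied with values read before the lock
word was taken.  Compilers targeting PowerPC commonly implement this with a
lwarx/stwcx. loop followed by isync, but that mapping is an assumption here,
not something the sketch guarantees.

```c
#include <stdatomic.h>

/* Hypothetical portable analogue of __compare_and_swap with acquire
   semantics.  Returns nonzero if *p held oldval and was set to newval. */
static int cas_acquire(atomic_int *p, int oldval, int newval)
{
    int expected = oldval;

    /* Acquire ordering on success: later loads may not be satisfied
       before the swap.  Relaxed on failure: nothing was acquired, so
       no ordering is needed. */
    return atomic_compare_exchange_strong_explicit(
        p, &expected, newval,
        memory_order_acquire, memory_order_relaxed);
}
```

A caller would only read the lock-protected data after cas_acquire returns
nonzero.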

The pthread_lock and pthread_unlock routines really should be written to
not use compare_and_swap.  Compare_and_swap requires a sync at the
beginning and an isync at the end.  This costs two syncs and two isyncs
for every lock/unlock pair, which hurts performance and thus scalability
with threads.  The pthread_lock routine does not need a sync before the
acquisition, only an isync after it.  The unlock requires a sync before
release of the lock and no isync at the end.  Therefore, if they were
written as separate routines, there would be only one sync and one isync
per lock/unlock pair, which would give better performance and thus better
scalability.
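The split into separate routines could look roughly like the sketch below,
again in C11 atomics as a stand-in for the PowerPC instructions.  The names
sketch_lock_t, sketch_lock and sketch_unlock are hypothetical and are not
the linuxthreads routines.  Lock uses a single acquire barrier after the
swap (the role isync plays above), and unlock uses a single release barrier
before the store (the role sync plays above).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
typedef atomic_int sketch_lock_t;

/* Acquire: no barrier before the swap, acquire ordering after it. */
static void sketch_lock(sketch_lock_t *lock)
{
    int expected = 0;

    while (!atomic_compare_exchange_weak_explicit(
               lock, &expected, 1,
               memory_order_acquire, memory_order_relaxed))
        expected = 0;   /* failed CAS stored the observed value; retry */
}

/* Release: earlier stores become visible before the lock word clears;
   no barrier is needed after the store. */
static void sketch_unlock(sketch_lock_t *lock)
{
    assert(atomic_load_explicit(lock, memory_order_relaxed) == 1);
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

With this shape each lock/unlock pair pays for one acquire and one release
barrier instead of a full barrier pair on both sides.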

2001-05-03  Brian McCorkle  <brianmc1@us.ibm.com>

	* powerpc/pt-machine.h (__compare_and_swap): Change exit sync to
	isync to flush completed speculative loads.

--- glibc-2.2/linuxthreads/sysdeps/powerpc/pt-machine.h.org  Thu Apr 19 14:38:54 2001
+++ glibc-2.2/linuxthreads/sysdeps/powerpc/pt-machine.h      Mon Apr 23 14:10:15 2001
@@ -51,7 +51,6 @@
 {
   int ret;

-  MEMORY_BARRIER ();
   __asm__ __volatile__ (
        "0:    lwarx %0,0,%1 ;"
        "      xor. %0,%3,%0;"
@@ -59,9 +58,9 @@
        "      stwcx. %2,0,%1;"
        "      bne- 0b;"
        "1:    "
+       "   isync;"
     : "=&r"(ret)
     : "r"(p), "r"(newval), "r"(oldval)
     : "cr0", "memory");
-  MEMORY_BARRIER ();
   return ret == 0;
 }



