This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v2] PowerPC: Fix a race condition when eliding a lock
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: Torvald Riegel <triegel at redhat dot com>, Carlos O'Donell <carlos at redhat dot com>
- Cc: Tulio Magno Quites Machado Filho <tuliom at linux dot vnet dot ibm dot com>, libc-alpha at sourceware dot org, munroesj at linux dot vnet dot ibm dot com
- Date: Tue, 25 Aug 2015 13:47:32 -0300
- Subject: Re: [PATCH v2] PowerPC: Fix a race condition when eliding a lock
- Authentication-results: sourceware.org; auth=none
- References: <55BAEE77 dot 9000501 at redhat dot com> <1440084054-32243-1-git-send-email-tuliom at linux dot vnet dot ibm dot com> <55D628C2 dot 3080305 at redhat dot com> <55D62FD3 dot 8090109 at linaro dot org> <1440150735 dot 6240 dot 74 dot camel at localhost dot localdomain> <55DBE899 dot 2080908 at redhat dot com> <1440500727 dot 27492 dot 128 dot camel at localhost dot localdomain> <55DC786C dot 8080108 at redhat dot com> <1440520639 dot 27492 dot 190 dot camel at localhost dot localdomain>
On 25-08-2015 13:37, Torvald Riegel wrote:
> On Tue, 2015-08-25 at 10:15 -0400, Carlos O'Donell wrote:
>> As a concrete example the structure element that is accessed atomically is
>> rwlock->__data.__lock. We access it atomically in lll_lock and also
>> in the txnal region (is region the right word?).
>
> Txnal region is used by some, so I think this works. Just transaction
> would work as well I think.
>
>> The access is part of:
>>
>> nptl/pthread_rwlock_wrlock.c
>>
>> 100 if (ELIDE_LOCK (rwlock->__data.__rwelision,
>> 101 rwlock->__data.__lock == 0
>> 102 && rwlock->__data.__writer == 0
>> 103 && rwlock->__data.__nr_readers == 0))
>> 104 return 0;
>>
>> With the intent being for the expression to be evaluted inside of the
>> transaction and thus abort if another thread has touched any of those
>> fields.
>>
>> That entire expression expands into the usage of is_lock_free. Therefore
>> shouldn't the change be at the caller site?
>
> That would be an option, or we should pass in a function or something.
>
>> e.g.
>>
>> 100 if (ELIDE_LOCK (rwlock->__data.__rwelision,
>> 101 atomic_load_relaxed (rwlock->__data.__lock) == 0
>> 102 && rwlock->__data.__writer == 0
>> 103 && rwlock->__data.__nr_readers == 0))
>> 104 return 0;
>>
>> Since the powerpc elision backend doesn't know anything about the types
>> that went into the evaluation of is_lock_free?
>>
>> If anything I think *we* have the onus to fix these cases of missing
>> atomic_load_relaxed not IBM?
>
> Somebody should do it :) I hadn't thought too much about for whom it
> would be most convenient to do. As I wrote in the review for Tulio's
> patch, IMO passing in an expression into a macro that has to be
> evaluated in there is pretty ugly and seems to be at least related to
> the bug Tulio is fixing. Maybe we can pass in a function that is
> inlined by the compiler.
> I do agree that IBM just used what the Intel implementation did, and
> thus didn't create that in the first place.
>
Indeed Intel was the first arch that enable Lock Elision and other
arches just followed the implementation. And I also agree that the
macro is not best approach and on initial iterations on this I proposed
to instead of passing an function we can instead use the arguments
address instead:
diff --git a/nptl/pthread_rwlock_wrlock.c b/nptl/pthread_rwlock_wrlock.c
index 60fa909..0161876 100644
--- a/nptl/pthread_rwlock_wrlock.c
+++ b/nptl/pthread_rwlock_wrlock.c
@@ -98,9 +98,9 @@ __pthread_rwlock_wrlock (pthread_rwlock_t *rwlock)
LIBC_PROBE (wrlock_entry, 1, rwlock);
if (ELIDE_LOCK (rwlock->__data.__rwelision,
- rwlock->__data.__lock == 0
- && rwlock->__data.__writer == 0
- && rwlock->__data.__nr_readers == 0))
+ rwlock->__data.__lock,
+ rwlock->__data.__writer,
+ rwlock->__data.__nr_readers))
diff --git a/sysdeps/generic/elide.h b/sysdeps/generic/elide.h
index 80a3a03..64d9cce 100644
--- a/sysdeps/generic/elide.h
+++ b/sysdeps/generic/elide.h
@@ -15,11 +15,12 @@
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
+
#ifndef ELIDE_H
#define ELIDE_H 1
-#define ELIDE_LOCK(adapt_count, is_lock_free) 0
-#define ELIDE_TRYLOCK(adapt_count, is_lock_free, write) 0
-#define ELIDE_UNLOCK(is_lock_free) 0
+#define ELIDE_LOCK(adapt_count, lock, writer, readers) 0
+#define ELIDE_TRYLOCK(adapt_count, lock, writer, readers, write) 0
+#define ELIDE_UNLOCK(writer, readers) 0
#endif
diff --git a/sysdeps/powerpc/nptl/elide.h b/sysdeps/powerpc/nptl/elide.h
index 389f5a5..b3cc50a 100644
--- a/sysdeps/powerpc/nptl/elide.h
+++ b/sysdeps/powerpc/nptl/elide.h
@@ -27,7 +27,7 @@
ADAPT_COUNT is a pointer to per-lock state variable. */
static inline bool
-__elide_lock (uint8_t *adapt_count, int is_lock_free)
+__elide_lock (uint8_t *adapt_count, int *lock, int *writer, unsigned int *readers)
{
if (*adapt_count > 0)
{
@@ -39,7 +39,10 @@ __elide_lock (uint8_t *adapt_count, int is_lock_free)
{
if (__builtin_tbegin (0))
{
- if (is_lock_free)
+ /* The compiler barrier is required because some GCC version might
+ reorder the lock read before the transaction init builtin. */
+ asm volatile("" ::: "memory");
+ if ((*lock == 0) && (*writer == 0) && (*readers == 0))
return true;
/* Lock was busy. */
__builtin_tabort (_ABORT_LOCK_BUSY);
@@ -66,30 +69,31 @@ __elide_lock (uint8_t *adapt_count, int is_lock_free)
return false;
}
I do not know which is better, since it will tie the ELIDE_LOCK implementation
with current internal pthread definitions.