This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][BZ 18743] PowerPC: Fix a race condition when eliding a lock


Adhemerval Zanella <adhemerval.zanella@linaro.org> writes:

> On 30-07-2015 23:06, Tulio Magno Quites Machado Filho wrote:
>> Using pthread_rwlock_rdlock as an example, is_lock_free would be a preprocessor
>> token with the following contents:
>> 
>>  128   if (ELIDE_LOCK (rwlock->__data.__rwelision,
>>  129                   rwlock->__data.__lock == 0
>>  130                   && rwlock->__data.__writer == 0
>>  131                   && rwlock->__data.__nr_readers == 0))
>> 
>> And this is where the important memory access happens.
>> If one of these fields are changed by another thread after this point, but
>> before __builtin_tbegin, we're able to reproduce the first problem I mentioned
>> previously.
>
> My understanding is this fields will be changed only in the lock taken path. Let's
> say a thread t1 grab the lock (either by a transaction or by using a default lock)
> and modifies any 'rwlock->__data' fields after the ELIDE_LOCK check and before
> the transaction begin on a second thread t2.  My understanding of the issue you
> are seeing is the transaction will initiate and since is_lock_free is 1 it won't
> abort and continue as the a lock taken in both threads.

I agree with you.  But this won't reproduce the bug.  You have to change the
order of events.

Let's say t1 is the first one to read rwlock->__data fields and they're all 0.
t1 will set is_lock_free to 0.
Then t2 modifies rwlock->__data before t1 starts the transaction (as in: get a
lock, instead of a transaction).
After that, t1 starts the transaction.  Here comes the problem: the memory
barrier is created with rwlock->__data saying that someone took the lock,
but is_lock_free is set to 0.  In this scenario, t1 will proceed with the
transaction.

> However my understanding is the transaction won't be able to commit since it
> will have a conflict in the 'rwlock->__data.lock' or in any memory being access
> in the critical region.

Yes, but with the order of events I mentioned, when t1 calls
pthread_rwlock_unlock, it will run the following code:

  38   if (ELIDE_UNLOCK (rwlock->__data.__writer == 0
  39                     && rwlock->__data.__nr_readers == 0))
  40     return 0;

Where ELIDE_UNLOCK is:

  89 static inline bool
  90 __elide_unlock (int is_lock_free)
  91 {
  92   if (is_lock_free)
  93     {
  94       __builtin_tend (0);
  95       return true;
  96     }
  97   return false;
  98 }
  99 
 100 # define ELIDE_UNLOCK(is_lock_free) \
 101   __elide_unlock (is_lock_free)

Remember that t1 started the transaction even if rwlock->__data said someone
was holding the lock.
So now, in __elide_unlock, is_lock_free will be 1.  __elide_unlock will
return false and t1 will unlock a lock acquired by t2.

> Are you saying that the transaction is not being aborted
> in the critical region?

No.  I'm saying the memory access to rwlock->__data is critical and should
happen inside the transaction.  Without this, we can't say the whole process
is atomic.

> If compiler is doing memory reordering around HTM builtin it is *clearly* a bug
> since transaction begin/end acts a full memory barrier (sync).  You should also
> add a compiler barrier in pthread elision code as well.

Agreed.  But I guess that's out of scope for this particular bug.
I could prepare a separate patch with these changes if you agree with that.

-- 
Tulio Magno


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]