This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v2] PowerPC: Fix a race condition when eliding a lock


On 08/25/2015 07:05 AM, Torvald Riegel wrote:
> On Tue, 2015-08-25 at 00:01 -0400, Carlos O'Donell wrote:
>> On 08/21/2015 05:52 AM, Torvald Riegel wrote:
>>> On Thu, 2015-08-20 at 16:51 -0300, Adhemerval Zanella wrote:
>>>>
>>>>
>>>>>> +
>>>>>> +/* Returns 0 if the lock defined by is_lock_free was elided.
>>>>>> +   ADAPT_COUNT is a per-lock state variable.  */
>>>>>> +# define ELIDE_LOCK(adapt_count, is_lock_free)				\
>>>>>> +  ({									\
>>>>>> +    int ret = 0;							\
>>>>>> +    if (adapt_count > 0)						\
>>>>>> +      (adapt_count)--;							\
>>>>>> +    else								\
>>>>>> +      for (int i = __elision_aconf.try_tbegin; i > 0; i--)		\
>>>>>> +	{								\
>>>>>> +	  if (__builtin_tbegin (0))					\
>>>>>> +	    {								\
>>>>>> +	      if (is_lock_free)						\
>>>>>
>>>>> This should use a relaxed MO atomic access to highlight that this data *is*
>>>>> shared between threads, but that there are no ordering guarantees. It should
>>>>> result in a normal load, but it makes it clear in the code that some other
>>>>> thread is also accessing this data, and that the transaction as a barrier
>>>>> is all that is protecting this from being a data-race (undefined behaviour).
>>>>
>>>> Should we really highlight it using atomic operations? The idea is exactly
>>>> that any access within a transaction will not result in undefined behavior,
>>>> so no need to atomic for synchronization.
>>>
>>> I think it's better to highlight it.  If the data would only be accessed
>>> from within transaction, the highlighting wouldn't be necessary.  But it
>>> is mixed transactional/nontransactional synchronization in this case, so
>>> we need the annotation.
>>
>> I'll let you discuss this with Tulio. Unless there is a correctness issue
>> or a violation of our rule to be data race free, then there should be some
>> flexibility granted to IBM to do what they wish, and particularly that which
>> is optimal. While the atomic_load_relaxed doesn't appear to do anything that
>> would incur a serious performance penality, it does add an asm with inputs
>> that force the value to a register and this may limit the compiler in some
>> future way.
> 
> Note that the atomic will be using a compiler built-in (eventually), so
> the compiler isn't at a disadvantage.

Good point.

>> From my perspective a code comment on concurrency seems sufficient?
>>
>> If thread T1 and a thread T2 read or write memory that is used to evaluate
>> the value of is_lock_free, then the transaction would be aborted and thus
>> is_lock_free needs no atomic access?
>>
>>> This also makes moving to proper atomic types easier should we ever
>>> decide to do that.
>>
>> Could you please elaborate a bit more on this?
> 
> If we would be using C11 or C++11, then we would need to use an atomic
> type for is_lock_free because we need atomic accesses outside of
> transactions.  The atomic types do not offer a way to access the data
> nonatomically (this is being worked on in C++ SG1, but for other
> reasons).

As a concrete example the structure element that is accessed atomically is
rwlock->__data.__lock. We access it atomically in lll_lock and also
in the txnal region (is region the right word?).

The access is part of:

nptl/pthread_rwlock_wrlock.c 

100   if (ELIDE_LOCK (rwlock->__data.__rwelision,
101                   rwlock->__data.__lock == 0
102                   && rwlock->__data.__writer == 0
103                   && rwlock->__data.__nr_readers == 0))
104     return 0;

With the intent being for the expression to be evaluted inside of the
transaction and thus abort if another thread has touched any of those
fields.

That entire expression expands into the usage of is_lock_free. Therefore
shouldn't the change be at the caller site?

e.g.

100   if (ELIDE_LOCK (rwlock->__data.__rwelision,
101                   atomic_load_relaxed (rwlock->__data.__lock) == 0
102                   && rwlock->__data.__writer == 0
103                   && rwlock->__data.__nr_readers == 0))
104     return 0;

Since the powerpc elision backend doesn't know anything about the types
that went into the evaluation of is_lock_free?

If anything I think *we* have the onus to fix these cases of missing
atomic_load_relaxed not IBM?

Thoughts?

> Thus, if we'd use real atomic types at some point, then we would not
> want people to access them nonatomically, accidentally.  (And we
> wouldn't want to have seq-cst default accesses either.)
> Therefore, to make this possible in the future, we should stick to the
> rule that data accessed atomically somewhere is considered to be of
> atomic type (even though we actually use nonatomic types in the
> declaration), and all accesses to it are also using atomics.  IMO, this
> is just cleaner and there's no reason to make an exception for the few
> variables that we use in mixed nontxnal/txnal synchronization.  (In
> contrast, using nonatomic accesses for initialization can provide
> benefits such as using memset etc.)

I can agree with that.

c.
 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]