This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCHv3] PowerPC: Fix a race condition when eliding a lock


Torvald Riegel <triegel@redhat.com> writes:

> On Mon, 2015-08-24 at 15:11 -0300, Tulio Magno Quites Machado Filho
> wrote:
>> Changes since v2:
>>  - Moved part of the source code comments to the commit message.
>>  - Added O'Donnel's suggestion to the source code comments.
>> 
>> Changes since v1:
>>  - Improved commit message.
>>  - Added new comments in the source code alerting to the concurrency details
>>    intrinsic to this code.
>>  - Removed compiler barriers from this patch, which will be treated in another
>>    patch and will be synchronized with the GCC implementation.
>> 
>> 8<----------
>> 
>> The previous code used to evaluate the preprocessor token is_lock_free to
>> a variable before starting a transaction.  This behavior can cause an
>> error if another thread got the lock (without using a transaction)
>> between the evaluation of the token and the beginning of the transaction.
>
> I find the use of the preprocessor to capture an evaluation really ugly.
> And it is error-prone, as the bug fixed by this patch shows.
> I would appreciate a follow-up patch that replaces is_lock_free with either a
> function that is called (relying on the compiler to inline it if the
> function pointer is constant, which it should be), or an address to a
> value that is compared to some caller-provided constant (which would
> limit flexibility for the condition somewhat, but is cleaner).

I completely agree with you

>> +/* CONCURRENCY NOTES:
>> +
>> +   The value of is_lock_free is read and possibly written to by multiple
>
> As you write above, is_lock_free is not a value, it is an expression.
> The expression will read from a memory location, that is then modified
> by other threads.  Please clarify that so this is clear to readers (this
> misunderstanding seems to be related to the bug you're fixing...).

Ack.  I'll change it to: "The macro expression is_lock_free is..."

>
>> +   threads, either during lock acquisition, or elision.  In order to avoid
>> +   this evaluation from becoming a data race the access of is_lock_free
>
> It could be a data race because you're not using atomics there, but
> that's not the whole point.  (We use the term "data race" specifically
> to refer to the C11/C++11 memory model concept of the same name.)
> You want to ensure the lock elision synchronization scheme, and thus are
> moving it inside the txn.

Are you complaining about the usage of the term "data race"?
If so, what about "race condition"?

>> +   is placed *inside* the transaction.  Within the transaction is_lock_free
>> +   can be accessed with relaxed ordering.  We are assured by the transaction
>
> Please use "memory order" to be specific, or abbreviate with MO as
> mentioned on the Concurrency wiki page.

Ack.

>> +   that the value of is_lock_free is consistent with the state of the lock,
>> +   otherwise the hardware will abort the transaction.  */
>
> That's not the really why this lock elisions synchronization scheme
> works :)  (At least you're not giving sufficient reason why it is.)

I do agree with you that comment needs some adjustments but its intention is
only to describe the concurrency details of the next macro.  It has no intention
of describing how lock elision or this macro work.
It was added per request [1] [2].

What do you think about the following?

   Within the transaction we are assured that all memory accesses are atomic
   and is_lock_free can be evaluated with relaxed memory order.  That way, the
   value of is_lock_free is consistent with the state of the lock until the
   end of the transaction.

[1] https://sourceware.org/ml/libc-alpha/2015-07/msg00987.html
[2] https://sourceware.org/ml/libc-alpha/2015-08/msg00870.html

>> +/* Returns 0 if the lock defined by is_lock_free was elided.
>> +   ADAPT_COUNT is a per-lock state variable.  */
>> +# define ELIDE_LOCK(adapt_count, is_lock_free)				\
>> +  ({									\
>> +    int ret = 0;							\
>> +    if (adapt_count > 0)						\
>> +      (adapt_count)--;							\
>> +    else								\
>> +      for (int i = __elision_aconf.try_tbegin; i > 0; i--)		\
>> +	{								\
>> +	  if (__builtin_tbegin (0))					\
>> +	    {								\
>> +	      if (is_lock_free)						\
>
> This evaluation should use relaxed MO atomic accesses, IMO.  But that
> can be changed by a follow-up patch.

Ack.

>> +		{							\
>> +		  ret = 1;						\
>> +		  break;						\
>> +		}							\
>> +	      __builtin_tabort (_ABORT_LOCK_BUSY);			\
>> +	    }								\
>> +	  else								\
>> +	    if (!__get_new_count(&adapt_count))				\
>> +	      break;							\
>> +	}								\
>> +    ret;								\
>> +  })
>>  
>>  # define ELIDE_TRYLOCK(adapt_count, is_lock_free, write)	\
>> -  __elide_trylock (&(adapt_count), is_lock_free, write)
>> +  ({								\
>> +    int ret = 0;						\
>> +    if (__elision_aconf.try_tbegin > 0)				\
>> +      {								\
>
> I'd prefer a comment here why you abort if write is true.  Or a
> reference to another source file / function where this is explained.
> Can be in a follow-up patch.

Let's leave this to another patch.  I want to unblock this bugfix.  ;-)

Anyway, this code is not new.  This patch is just moving it from a function to
a macro.
AFAICS, x86 and powerpc are the only architectures to implement this macro.
They both do the same thing and they don't provide comments.
I'll have to investigate why it was implemented this way.

Thanks!

-- 
Tulio Magno


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]