This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] [BZ #21778] Fix oversight in robust mutex lock acquisition.


On 07/28/2017 08:41 AM, Siddhesh Poyarekar wrote:
> On Wednesday 26 July 2017 03:02 AM, Torvald Riegel wrote:
>> 65810f0ef05e8c9e333f17a44e77808b163ca298 fixed a robust mutex bug but
>> introduced BZ 21778: if the CAS used to try to acquire a lock fails, the
>> expected value is not updated, which breaks other cases in the lock
>> acquisition loop.  The fix is to simply update the expected value with
>> the value returned by the CAS, which ensures that behavior is as if the
>> first case with the CAS never happened (if the CAS fails).
>>
>> This is a regression introduced in the last release, so it would be good
>> to get this included in this release.  I'll likely be AFK on Thursday,
>> so please just commit this once it has been approved.  Tested on
>> x86_64-linux.
>>
>>
>> 	[BZ 21778]
>> 	* nptl/pthread_mutex_timedlock.c (__pthread_mutex_timedlock): Update
>> 	oldval if the CAS fails.
>> 	* nptl/pthread_mutex_lock.c (__pthread_mutex_lock_full): Likewise.
>> 	* nptl/tst-mutex7.c (ROBUST, DELAY_NSEC, ROUNDS, N): New.
>> 	(tf, do_test): Use them.
>> 	* nptl/tst-mutex7robust.c: New file.
>> 	* nptl/Makefile (tests): Add new test.
>>
> 
> Looks good to me, but I think Carlos is also reviewing this, so please
> wait for his confirmation before you commit.

So far I've finished confirming this fix on:
- x86_64
- i686
- ppc64

With the following targets still lagging:
- s390x
- aarch64
- ppc64le
- armv7hl

I'd like to see those 4 last targets complete their test cycle before
I commit this change.

I've got a v2 which basically just adds some more comments to the testsuite.

The bug is 100% reproducible and easily with the small critical section and
the lock/unlock sequence of 100 threads, we hit the CAS very quickly, fail
to set oldval and everything deadlocks. It's always easier to validate when
the failure is in a high contention path, because creating contention is easy :-)

Would you be opposed if I commit this slightly *after* the midnight UTC hard
freeze? That way I can confirm the other four arches are OK? I expect that with
the coverage I have here it won't break anyone else (and the patch is clearly
correct, but I would like to really double check).

Cheers,
Carlos.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]