This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: Possible fix for bug #13165
- From: Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>
- To: libc-alpha at sourceware dot org
- Date: Mon, 18 Aug 2014 21:20:50 -0300
- Subject: Re: Possible fix for bug #13165
- Authentication-results: sourceware.org; auth=none
- References: <20140818195344 dot GA1187 at brightrain dot aerifal dot cx> <53F26769 dot 20801 at linux dot vnet dot ibm dot com> <20140818212710 dot GV12888 at brightrain dot aerifal dot cx> <1408400297 dot 2220 dot 26 dot camel at triegel dot csb>
On 18-08-2014 19:18, Torvald Riegel wrote:
> On Mon, 2014-08-18 at 17:27 -0400, Rich Felker wrote:
>> On Mon, Aug 18, 2014 at 05:51:53PM -0300, Adhemerval Zanella wrote:
>>> On 18-08-2014 16:53, Rich Felker wrote:
>>>> A couple days ago I posted ideas for a fix for this issue on the bug
>>>> tracker:
>>>>
>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=13165#c38
>>>>
>>>> Anybody who does glibc development/builds/testing up for trying my
>>>> idea and seeing if it works?
>>>>
>>>> Rich
>>>>
>>> If I understood correctly, you are proposing something like:
>>>
>>> diff --git a/nptl/pthread_cond_wait.c b/nptl/pthread_cond_wait.c
>>> index fc5eac4..a16c5d5 100644
>>> --- a/nptl/pthread_cond_wait.c
>>> +++ b/nptl/pthread_cond_wait.c
>>> @@ -118,14 +118,6 @@ __pthread_cond_wait (cond, mutex)
>>> /* Make sure we are alone. */
>>> lll_lock (cond->__data.__lock, pshared);
>>>
>>> - /* Now we can release the mutex. */
>>> - err = __pthread_mutex_unlock_usercnt (mutex, 0);
>>> - if (__glibc_unlikely (err))
>>> - {
>>> - lll_unlock (cond->__data.__lock, pshared);
>>> - return err;
>>> - }
>>> -
>>> /* We have one new user of the condvar. */
>>> ++cond->__data.__total_seq;
>>> ++cond->__data.__futex;
>>> @@ -153,6 +145,14 @@ __pthread_cond_wait (cond, mutex)
>>> /* Remember the broadcast counter. */
>>> cbuffer.bc_seq = cond->__data.__broadcast_seq;
>>>
>>> + /* Now we can release the mutex. */
>>> + err = __pthread_mutex_unlock_usercnt (mutex, 0);
>>> + if (__glibc_unlikely (err))
>>> + {
>>> + lll_unlock (cond->__data.__lock, pshared);
>>> + return err;
>>> + }
>>> +
>>> do
>>> {
>>> unsigned int futex_val = cond->__data.__futex;
>>>
>>> correct?
>> Yes, that looks like what I had in mind. But on second thought I'm not
>> sure it should be necessary; I missed the internal lock that's being
>> taken before the mutex is unlocked.
>>
>>> I saw not NPTL issues in nether powerpc64 or x86_64.
>> You mean you can't reproduce the issue on these targets? Maybe it's
>> only present on targets that are using non-default versions of the
>> code. If so, and if the offending target-specific versions have been
>> removed, maybe the bug is fixed now? I haven't had a chance to try the
>> test case on latest glibc, but the bug was present on x86 (32-bit)
>> last time I checked.
>>
>> It would be nice if this bug has already been fixed without taking any
>> specific action to do so...
> The bug is not fixed. It is an algorithmic issue -- though of course it
> may or may not trigger in a particular execution depending on the
> respective interleavings.
>
I just noted I wasn't specific, by 'not issues' I meant NPTL testcase ran fine,
however the testcase from bug stills fails.