This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Is this pthread_cond_wait() race possible


On Thu, 2017-06-08 at 17:45 +0200, Sebastian Andrzej Siewior wrote:
> On 2017-06-08 14:28:09 [+0200], Torvald Riegel wrote:
> > > The counterpart (pthread_cond_signal()) uses __condvar_acquire_lock()
> > > which is a handcrafted lock with atomics and two bits. Do you have an
> > > integer field left?
> > 
> > No, otherwise I would have avoided the complexity of embedding the lock
> > bits :)  One could rearrange the bits in some other way to get a full
> > 32b for the PI lock -- but I don't think that's the fundamental problem.
> > 
> > > This seems to be only part that would break PI. That
> > > is a low priority thread that owns the lock (and is scheduled out) will
> > > make every subsequent thread end up in futex_wait(). An integer with
> > > futex_lock_pi() should fix that.
> > 
> > Have you looked at the presentation I gave at Real-Time Summit last
> > year?  I think the problem we're facing re PI support is more
> > fundamental than getting a PI lock in there.
> 
> You refer to
>   https://wiki.linuxfoundation.org/_media/realtime/events/rt-summit2016/pthread-condvars-posix-compliance-and-the-pi-gap_darren-hart_torvald-riegel.pdf
> 
> the last two slides?

Yes.  Do these bring across what the underlying algorithmic problem is?

> FUTEX_WAIT_REQUEUE_PI / FUTEX_CMP_REQUEUE_PI handled the user-mutex.
> This is gone now. I don't know how much trouble it will cost us. Let's
> see…
> In the old implementation

The old implementation did not handle all cases correctly.

> if you had a high-prio task in
> pthread_cond_wait() then it already had the mutex on its return path and
> it would boost the waker if needed. We don't have this "fast" kernel
> boost, only after the task has been woken up and is going for the
> userland mutex (which should be owned by the pthread_cond_signal()
> caller). So this context switch is not optimal. *But* the signal path
> does not hold any locks which made you lose PI while trying to obtain
> the internal condvar mutex via futex_wait().

I can't follow you.  What are you trying to say?

> In the broadcast case all threads will compete for the one lock so
> thundering herd here we are. But ideally the thread with highest
> priority should get the lock first.
> So based on that, the only thing I see is, that you end up in
> futex_wait() via the one lock in pthread_cond_signal() (or
> pthread_cond_timed_wait() ETIMEDOUT path).

I can't follow.  What is the "thing" you see, a problem, or specific
execution, or something else?

I think it would make sense to focus on correctness first.



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]