Bug 12875

Summary: pthread_cond_timedwait can steal the wakeup of slower thread in pthread_cond_wait
Product: glibc Reporter: martin
Component: nptlAssignee: Not yet assigned to anyone <unassigned>
Status: REOPENED ---    
Severity: normal CC: bugdal, jakub, siddhesh, triegel
Priority: P2 Flags: fweimer: security-
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:
Attachments: Example source code
Test case with explicit happens-before logic rather than usleep

Description martin 2011-06-10 17:56:56 UTC
Created attachment 5787 [details]
Example source code

According to the definition of pthread_cond_signal, it should only unblock threads that are blocked at the time it is called.

The attached example demonstrates a bug in pthread_cond_timedwait that can allow it to "steal" the signal from a thread that was blocked in pthread_cond_wait when pthread_cond_signal was called, even though pthread_cond_timedwait was called after pthread_cond_signal.

This was tested on an Intel Core i7 970 CPU (6 cores, 12 threads) running Fedora 14 x86_64 with the master branch of glibc from 2011-06-10 and also with older releases.

There is no easy way to repeat this, because it depends very much on the timing of the threads, so I've had to cheat by using pthread_kill to artificially delay the thread that called pthread_cond_wait (cs_ptA).

The expected output of the program is

A waits
cs_timewaster starts
B signals
cs_delaywaster starts
C waits
D waits
B signals
C wakes
D wakes with ETIMEDOUT
cs_delaywaster ends
A wakes

Note that D wakes with ETIMEDOUT and A wakes afterwards.

Often the program hangs after outputing

A waits
cs_timewaster starts
B signals
cs_delaywaster starts
C waits
D waits
B signals
C wakes
D wakes with code 0
cs_delaywaster ends

Note that D wakes with code 0 and A never wakes.

The calls to usleep in the example are such that the first signal from thread B should cause thread A to wake and second signal should cause thread C to wake.  Thread D should wake on the timeout.

I think the problem is that if thread A wakes from the futex but doesn't execute the rest of pthread_cond_wait quickly enough (e.g. because the signal handler cs_delaywaster runs), then thread D can steal the wakeup when its futex returns on the timeout.
Comment 1 Ulrich Drepper 2011-06-11 13:35:58 UTC
The current implementation is correct.  There is no guarantee of fairness.
Comment 2 martin 2011-06-13 12:39:04 UTC
Can you take another look at this please?  It isn't about fairness because the man page (and spec AFAIK) for pthread_cond_signal says:

"The pthread_cond_signal() function shall unblock at least one of the threads that are blocked on the specified condition variable cond (if any threads are blocked on cond)."

In the example, only thread A is blocked at the time thread B signals for the first time, so I think it should wake whatever happens later  The blocking of threads C and D occurs after thread B signals for the first time so shouldn't be affected by that signal.

The problem is that if pthread_cond_timedwait in thread D reaches its timeout before thread A has had a chance to increment __woken_seq, then thread D will claim the signal even though its true reason for waking is the timeout.
Comment 3 Rich Felker 2011-09-28 21:55:17 UTC
This is definitely a bug. I hope this won't be another case where Mr. Drepper has got too much time invested in the over-engineered code for minimizing spurious wakes to admit that it needs to be thrown out and replaced with a correct implementation.

Actually I have a possible alternate explanation for his position: it's possible that he's viewing condition signals as events that always correspond to a new item in a queue that will be removed by any waiter who wakes. If this is all you use cond vars for, there's no need to avoid having a new waiter (arriving after the signal) avoid stealing an existing waiter's wake, because the new waiter could just as easily have experienced a spurious wake, and upon waking up and checking the queue, found and removed the next queued item - at which point, even if the existing waiter woke, it would find an empty queue again and wait again.

Of course condition variables have plenty of other legitimate uses, and the requirements on the implementation are governed by the specification, not by one narrow-minded idea of what they should be used for...
Comment 4 Rich Felker 2012-03-17 20:40:32 UTC
Ping. There was a lot of analysis on this bug but as far as I know, no work towards fixing it...
Comment 5 Torvald Riegel 2012-09-18 14:18:13 UTC
(In reply to comment #2)
> Can you take another look at this please?  It isn't about fairness because the
> man page (and spec AFAIK) for pthread_cond_signal says:
> 
> "The pthread_cond_signal() function shall unblock at least one of the threads
> that are blocked on the specified condition variable cond (if any threads are
> blocked on cond)."
> 
> In the example, only thread A is blocked at the time thread B signals for the
> first time, so I think it should wake whatever happens later  The blocking of
> threads C and D occurs after thread B signals for the first time so shouldn't
> be affected by that signal.

Please revise your test case so that it is properly synchronized for what you intend to test.  usleep() is not a reliable way to enforce a happens-before order between operations in different threads.  Cond vars, locks, or something similar can enforce happens-before.

> The problem is that if pthread_cond_timedwait in thread D reaches its timeout
> before thread A has had a chance to increment __woken_seq, then thread D will
> claim the signal even though its true reason for waking is the timeout.

If A didn't consume a signal before D did, why shouldn't D consume an available signal?  The manpage bit you quote says that _at least_ one of the blocked threads should be unblocked.  This seems to be what's happening.

If you find a statement in the spec that disallows the behavior you see, please quote this statement instead.  I guess what you might find to be surprising is that a signal operation might not have finished wakening a thread even though it already returned to the caller.
Comment 6 Torvald Riegel 2012-09-18 14:28:05 UTC
(In reply to comment #3)
> This is definitely a bug.

If so, please quote the specification or manpage that this violates.  At the very least, we need this for documentation purposes.

> I hope this won't be another case where Mr. Drepper
> has got too much time invested in the over-engineered code for minimizing
> spurious wakes to admit that it needs to be thrown out and replaced with a
> correct implementation.

This is a bug report.  Please focus on the technical matter.

> Actually I have a possible alternate explanation for his position: it's
> possible that he's viewing condition signals as events that always correspond
> to a new item in a queue that will be removed by any waiter who wakes. If this
> is all you use cond vars for, there's no need to avoid having a new waiter
> (arriving after the signal) avoid stealing an existing waiter's wake, because
> the new waiter could just as easily have experienced a spurious wake, and upon
> waking up and checking the queue, found and removed the next queued item - at
> which point, even if the existing waiter woke, it would find an empty queue
> again and wait again.

What you describe is a fairness issue.  I'm not aware of any guarantee of fairness or absence of starvation.  The cond var should try to be fair if possible, but the test case is not even for a PI cond var, so threads aren't even guaranteed to get to run on the CPU.

> Of course condition variables have plenty of other legitimate uses, and the
> requirements on the implementation are governed by the specification, not by
> one narrow-minded idea of what they should be used for...

So, please cite the bits of the specification that disallow the behavior.  What Martin cite from the manpage does not seem to conflict with the current behavior.
Comment 7 Rich Felker 2012-09-19 03:21:17 UTC
> If so, please quote the specification or manpage that this violates.  At the
> very least, we need this for documentation purposes.

"The pthread_cond_signal() function shall unblock at least one of the threads that are blocked on the specified condition variable cond (if any threads are blocked on cond)."

Source: http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_cond_signal.html

If it's provable that threads 1 through N are waiting on a condition variable C when thread X calls pthread_cond_signal on C N times, then in order to satisfy the above, all N must be unblocked. If another waiter arriving after the N signals consumes any of the signals and prevents those N from being blocked, then the above cited requirement has been violated; at least one call to pthread_cond_signal did not "unblock at least one of the threads that are blocked".

I admit the test case is poorly written and rather obfuscated, but I believe that it is showing a bug of this nature. Contrary to yours and Mr. Dreppers characterizations, this is NOT A FAIRNESS ISSUE. No complaint about fairness has been made by myself or the original poster of the bug. If the bug really exists in the described form -- and it seems to me that it does, although I'd like to find a simpler test case -- then it's an issue of the interface violating its contract.
Comment 8 Jakub Jelinek 2012-09-19 06:23:24 UTC
(In reply to comment #7)
> I admit the test case is poorly written and rather obfuscated, but I believe
> that it is showing a bug of this nature. Contrary to yours and Mr. Dreppers
> characterizations, this is NOT A FAIRNESS ISSUE. No complaint about fairness
> has been made by myself or the original poster of the bug. If the bug really
> exists in the described form -- and it seems to me that it does, although I'd
> like to find a simpler test case -- then it's an issue of the interface
> violating its contract.

The testcase doesn't show anything like that, there are absolutely no guarantees that the usleeps result in whatever ordering of events in the threaded program.
Comment 9 Torvald Riegel 2012-09-19 08:21:49 UTC
(In reply to comment #7)
> > If so, please quote the specification or manpage that this violates.  At the
> > very least, we need this for documentation purposes.
> 
> "The pthread_cond_signal() function shall unblock at least one of the threads
> that are blocked on the specified condition variable cond (if any threads are
> blocked on cond)."
> 
> Source:
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_cond_signal.html
> 
> If it's provable that threads 1 through N are waiting on a condition variable C
> when thread X calls pthread_cond_signal on C N times, then in order to satisfy
> the above, all N must be unblocked.

First, I don't think you can prove that they are waiting in this test case -- usleep doesn't give you any happens-before or other ordering guarantees.

Second, the sentence from the spec that you quote is vague on when those other threads would block (e.g., is there any guarantee of linearizability or such?). Specifically, it doesn't state which blocked threads should be wakened, or which threads blocked at which time.

> If another waiter arriving after the N
> signals consumes any of the signals and prevents those N from being blocked,
> then the above cited requirement has been violated;

I don't see how this is violated. You say it unblocked the waiter and N-1 other threads, so N overall for N signals.  That's what the guarantee is, isn't it?

> at least one call to
> pthread_cond_signal did not "unblock at least one of the threads that are
> blocked".

It did unblock one of the threads that are blocked.  Even at the end right before their timeout they are still blocked, right?

> I admit the test case is poorly written and rather obfuscated but I believe
> that it is showing a bug of this nature.

> Contrary to yours and Mr. Dreppers
> characterizations, this is NOT A FAIRNESS ISSUE.

Please look again at the paragraph that you wrote that I replied to in comment #6.  You describe a fairness issue there (a newer waiter "stealing" an older waiter's signal).  I did not say that this bug report is just a fairness issue.  In some executions it might show that there's no fairness, but the core problems with this bug report are the two items I point out above.
Comment 10 Rich Felker 2012-09-19 08:23:42 UTC
Use of usleep does not automatically invalidate the test. However, looking at the test source again, I see this is a different test case (and different bug report) from the one I was thinking of, which claims the same issue (stolen wakes). I had a conversation with the author of that other bug report and was convinced it's valid; I'll have to go back and find it so that this report can be considered as a possible duplicate...
Comment 11 Rich Felker 2012-09-19 08:49:36 UTC
Here's the related bug report, whose thread concluded with the claim that this bug (12875) is probably a manifestation of it: http://sourceware.org/bugzilla/show_bug.cgi?id=13165

With that said, some comments on Torvald's last reply:

> Second, the sentence from the spec that you quote is vague on when those other
> threads would block (e.g., is there any guarantee of linearizability or such?).
> Specifically, it doesn't state which blocked threads should be wakened, or
> which threads blocked at which time.

The relevant text is in the specification for pthread_cond_wait regarding atomically unlocking the mutex and blocking:

"These functions atomically release mutex and cause the calling thread to block on the condition variable cond; atomically here means "atomically with respect to access by another thread to the mutex and then the condition variable". That is, if another thread is able to acquire the mutex after the about-to-block thread has released it, then a subsequent call to pthread_cond_broadcast() or pthread_cond_signal() in that thread shall behave as if it were issued after the about-to-block thread has blocked."

That is, if the mutex was held by a thread entering pthread_cond_wait and another thread successfully acquires the mutex, the first thread "has blocked".

> > If another waiter arriving after the N
> > signals consumes any of the signals and prevents those N from being blocked,
> > then the above cited requirement has been violated;
> 
> I don't see how this is violated. You say it unblocked the waiter and N-1
> other threads, so N overall for N signals.  That's what the guarantee is,
> isn't it?

Perhaps my typo in the above-quoted text is the source of confusion. It should have read "prevents those N from being _unblocked_".

With the above reading of the standard in mind, at the moment pthread_cond_signal is called, it's provable that exactly those N, and not the new waiter that's about to arrive, "have blocked" on the condition variable. So if the N signals don't wake all N of them, it's a bug - each signal is required to unblock at least one thread that "has blocked" in the above sense.
Comment 12 Torvald Riegel 2012-09-19 15:34:48 UTC
(In reply to comment #11)
> Here's the related bug report, whose thread concluded with the claim that this
> bug (12875) is probably a manifestation of it:
> http://sourceware.org/bugzilla/show_bug.cgi?id=13165
> 
> With that said, some comments on Torvald's last reply:
> 
> > Second, the sentence from the spec that you quote is vague on when those other
> > threads would block (e.g., is there any guarantee of linearizability or such?).
> > Specifically, it doesn't state which blocked threads should be wakened, or
> > which threads blocked at which time.
> 
> The relevant text is in the specification for pthread_cond_wait regarding
> atomically unlocking the mutex and blocking:
> 
> "These functions atomically release mutex and cause the calling thread to block
> on the condition variable cond; atomically here means "atomically with respect
> to access by another thread to the mutex and then the condition variable". That
> is, if another thread is able to acquire the mutex after the about-to-block
> thread has released it, then a subsequent call to pthread_cond_broadcast() or
> pthread_cond_signal() in that thread shall behave as if it were issued after
> the about-to-block thread has blocked."

That states that the signaler needs to respect a happens-before relation established by the mutex.  Thus, the prior cond_wait will be considered as a wake-up target.  There is no guarantee there that it is the first to be wakened. There is no statement there about the relationship to other writers, nor when the signal is actually delivered.

> That is, if the mutex was held by a thread entering pthread_cond_wait and
> another thread successfully acquires the mutex, the first thread "has blocked".

Indeed.  However, that doesn't mean that it will be the first to be unblocked.  If there are no other waiters, it is the only candidate though, and will be unblocked (i.e., there are no lost wake-ups).

> > > If another waiter arriving after the N
> > > signals consumes any of the signals and prevents those N from being blocked,
> > > then the above cited requirement has been violated;
> > 
> > I don't see how this is violated. You say it unblocked the waiter and N-1
> > other threads, so N overall for N signals.  That's what the guarantee is,
> > isn't it?
> 
> Perhaps my typo in the above-quoted text is the source of confusion. It should
> have read "prevents those N from being _unblocked_".

I read this as unblocked.

> With the above reading of the standard in mind, at the moment
> pthread_cond_signal is called, it's provable that exactly those N, and not the
> new waiter that's about to arrive, "have blocked" on the condition variable. > So

It says that those have blocked, not that those are exactly the ones that can be considered to have blocked (i.e., the first "if" in what you quoted is an "if", not an "iff" (if and only if)).

> if the N signals don't wake all N of them, it's a bug - each signal is required
> to unblock at least one thread that "has blocked" in the above sense.

No.  See above.
Comment 13 Rich Felker 2012-09-19 17:30:17 UTC
> It says that those have blocked, not that those are exactly the ones that can
> be considered to have blocked (i.e., the first "if" in what you quoted is an
> "if", not an "iff" (if and only if)).

Short of an implementation-defined extension that must be documented, there is no way a waiter can block on a condition variable short of calling pthread_cond_wait or pthread_cond_timedwait. Under your proposed interpretation, pthread_cond_signal would be useless; in order to be useful, at least one thread needs to unblock, and the application has to be able to know (based on its knowledge of which threads have blocked) that the signal will unblock a thread that will allow the program to make forward progress. This of course requires that every thread which has blocked on the condition at the time of pthread_cond_signal be a thread whose unblocking would allow forward progress; if other threads have blocked on the same condition, then it's an application bug.

As I stated before, I'm not sure this bug report is valid and I was thinking of the other one. But there is a real issue here that the implementation needs to take care to satisfy.
Comment 14 Torvald Riegel 2012-09-20 12:28:36 UTC
(In reply to comment #13)
> > It says that those have blocked, not that those are exactly the ones that can
> > be considered to have blocked (i.e., the first "if" in what you quoted is an
> > "if", not an "iff" (if and only if)).
> 
> Short of an implementation-defined extension that must be documented, there is
> no way a waiter can block on a condition variable short of calling
> pthread_cond_wait or pthread_cond_timedwait. Under your proposed
> interpretation, pthread_cond_signal would be useless;

It's not useless. It just doesn't give all the guarantees you thought it would give.

> in order to be useful, at
> least one thread needs to unblock, and the application has to be able to know
> (based on its knowledge of which threads have blocked) that the signal will
> unblock a thread that will allow the program to make forward progress.

If the application needs to associate certain waiters with signalers, it can just use a separate cond var for this, for example.
Comment 15 martin 2012-09-20 14:01:28 UTC
Created attachment 6639 [details]
Test case with explicit happens-before logic rather than usleep

As requested, I've attached a version of the test case that uses the lock, barriers and atomic instructions to enforce happens-before.  There are 2 remaining calls to usleep, which are needed to ensure ordering between the calling thread and the internals of pthread_cond_wait than can't be controlled.

This bug may be different from Bug 13165, because it is caused by waking from a timeout rather than an extra signal.

Why I think this is a bug: my reading of the sentence "The pthread_cond_signal() function shall unblock at least one of the threads that are blocked on the specified condition variable cond (if any threads are blocked on cond)." is that it only affects threads that "are blocked" at the time pthread_cond_signal() is called, not those that call pthread_cond_wait afterwards.  The wording for pthread_cond_broadcast() says "currently blocked", but is that an intentional difference?
Comment 16 Torvald Riegel 2012-09-20 16:59:10 UTC
(In reply to comment #15)
> Created attachment 6639 [details]
> Test case with explicit happens-before logic rather than usleep
> 
> As requested, I've attached a version of the test case that uses the lock,
> barriers and atomic instructions to enforce happens-before.

Thanks.

> This bug may be different from Bug 13165, because it is caused by waking from a
> timeout rather than an extra signal.

Even this happens with cond_timedwait, it seems to me that it is conceptually the same issue.

> Why I think this is a bug: my reading of the sentence "The
> pthread_cond_signal() function shall unblock at least one of the threads that
> are blocked on the specified condition variable cond (if any threads are
> blocked on cond)." is that it only affects threads that "are blocked" at the
> time pthread_cond_signal() is called, not those that call pthread_cond_wait
> afterwards.

I don't read it that way.  Please see the discussion in bug #13165 about this.  To summarize, the spec only requires the threads to be considered as blocked that happen before the signal, but it does not require threads that wait after the signal to NOT be considered blocked.  In your example, the waiter in thread D is not disallowed to be considered as a blocked thread.

> The wording for pthread_cond_broadcast() says "currently blocked",
> but is that an intentional difference?

I can't speak about the intent of the authors of the spec. A more detailed specification would certainly be easier to understand.  But unless there is a change, I would stick to what's allowed/required by the current wording.

OK to classify as not a bug?