Bug 13165

Summary: pthread_cond_wait() can consume a signal that was sent before it started waiting
Product: glibc Reporter: Mihail Mihaylov <mihaylov.mihail>
Component: nptlAssignee: Torvald Riegel <triegel>
Status: ASSIGNED ---    
Severity: normal CC: bugdal, scot4spam, siddhesh, triegel
Priority: P2    
Version: 2.14   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:
Attachments: Test to observe the race
Simpler test to observe the race
Simpler test, converted to pure C

Description Mihail Mihaylov 2011-09-07 19:14:55 UTC
I was implementing something like a monitor on top of pthread condition
variables and I observed some strange behaviour. I was always holding the mutex
when calling pthread_cond_signal(). My code relied on only two assumptions
about the way pthread_cond_signal() works:

1) A call to pthread_cond_signal() will wake at least one thread which is
blocked on the condition, and the woken threads will start waiting on the
mutex.

2) If the signaling thread holds the mutex when it calls pthread_cond_signal(),
only threads which are already waiting on the condition variable may be woken.
In particular, if the signaling thread releases the mutex and then another
thread acquires the mutex and calls pthread_cond_wait(), the waiting thread
cannot be woken by this signal, no matter what other waiters are present before
or after the signal.

The only explanation that I could find for the observed behaviour was that my
second assumption was wrong. It seemed that I was hitting the following
scenario:

1) We have several threads which are blocked on the condvar in
pthread_cond_wait(). I'll call these threads "group A".

2) We then send N signals from another thread while holding the mutex. We are
releasing the mutex and acquiring it again between the signals.

3) Next we have several more threads (at least two) that acquire the mutex and
enter pthread_cond_wait(). I'll call these threads "group B"

4) Then we acquire the mutex in the signaling thread again and call
pthread_cond_signal() just once, then we release the mutex.

5) Two threads from group B wake up, and N-1 threads from group A wake up. In
effect one of the threads from group B has stolen a signal that was sent before
it started waiting from a thread from group A.

My expectation in this scenario is that at least N threads from group A should
wake up. I don't expect that exactly one thread from group B should wake up,
because spurious wakeups are possible. But this is not a spurious wakeup - I
have N signals, and N woken threads, it's just that the order is wrong.

I ran some experiments and they seemed to confirm my theory, so I looked at the
condvar implementation in nptl. I'm new to POSIX and Linux programing, but I
think I see how this can happen:

1) When we send the first N signals, N threads from group A that are waiting on
the cond->__data.__futex are woken and start waiting on cond->__data.__lock.

2) Then while the threads from group B enter pthread_cond_wait, some of the
woken threads from group A may remain waiting on the lock.

3) When we send the last signal, one thread from group B will wake and consume
this signal.

4) But suppose one more thread from group B wakes spuriously from
lll_futex_wait. At this moment it is possible that some of the woken threads
from group A will still be waiting on cond->__data.__lock. In that case the
spuriously woken thread from group B will see that cond->__data.__wakeup_seq
has changed (because of the last signal) and cond->__data._woken_seq has not
reached cond->__data.__wakeup_seq (because some of the woken threads in group A
are still waiting to acquire cond->__data.__lock), so it will exit the retry
loop and increase cond->__data.__woken_seq. The result is that the thread will
steal the signal.

Is this scenario really possible? And if it is, is this on purpose or is it a
bug?
Comment 1 Mihail Mihaylov 2011-09-21 09:12:26 UTC
It's been two weeks with no update on the bug. There is no way to tell if
someone has noticed it at all.

Is it possible to have at least some statement as of if this is considered a
valid bug that might be addressed at some point in the future?
Comment 2 Rich Felker 2011-09-21 18:18:43 UTC
I've waited months for a response to some of my bug reports for NPTL, all race
conditions, some of which make it impossible to use major features like
cancellation in a robust program. Nothing. Good luck, but don't hold your
breath. The best thing I can recommend to you is cross-posting the bug report
to distros' bug trackers, and getting other people interesting these issues to
follow up on the bug tracker with confirmations that they can reproduce it.
Comment 3 Rich Felker 2011-09-21 22:29:21 UTC
Can you explain how you know (2) is completed before (3) occurs, in your
scenario? If there's no synchronization to order these steps, then isn't it
possible that one or more of the signals happens after a thread from group B is
waiting?

If you have a minimal self-contained test case for the issue, I'd be interested
in seeing it.
Comment 4 Mihail Mihaylov 2011-09-22 22:21:10 UTC
Thank you for taking an interest in this issue.

(In reply to comment #3)
> Can you explain how you know (2) is completed before (3) occurs, in your
> scenario? If there's no synchronization to order these steps, then isn't it
> possible that one or more of the signals happens after a thread from group B is
> waiting?

Basically, because we are holding the mutex when signaling, we can tell exactly
which threads started waiting after we finished sending the first N signals.
These are the threads that I call "group B", so by this definition they cannot
start waiting before all signals from step 2 have been sent.

What I'm trying to say is that the scenario is not a test case, but rather a
hypothetical  sequence of events that can happen and can be observed, so it
doesn't specify why exactly no new threads started waiting during step 2, it
just says what happened. This left some ambiguity in my description.

One way to resolve this ambiguity is to say that if during step 2 some threads
acquired the mutex and called pthread_cond_wait(), they should be counted
towards group A.

Another way is to change step 2 and say that the signaling thread acquired the
mutex, sent N signals and only then released the mutex, without releasing it
between the signals.

The second way seems simpler and will probably make the race more likely, but
the first is closer to what I actually observed.

> If you have a minimal self-contained test case for the issue, I'd be interested
> in seeing it.

I don't have such a test case, but I'll try to find time in the next days to
write one and attach it to the bug.
Comment 5 Mihail Mihaylov 2011-09-25 21:32:35 UTC
Created attachment 5945 [details]
Test to observe the race

Attaching a self contained test. What the test does:

We have a mutex and a condition variable. We also have several auxiliary
condition variables and counters.

The main thread locks the mutex and creates as many waiter threads as possible.
The waiter threads start by waiting on the mutex. Then both the main thread and
the waiter thread start looping to perform iterations of the test until the
race condition in NPTL is hit.

The loops of the main thread and the waiter threads are synchronized and go
like this:

1) The main thread starts by releasing the mutex and blocking on an auxiliary
condvar. This unblocks the waiter threads which start by entering the first
wait on the condition variable. Each waiter thread increments a waiters counter
before waiting and the last one also signals the auxiliary condvar to notify
the main thread that all waiters are blocked on the first wait.

2) When all waiters are blocked on the first wait, the main thread is unblocked
and starts sending signals. It sends as many signals as there are waiters, so
all waiters should move (eventually) beyond the first wait. The main thread
holds the mutex while sending the signals. The 'releaseMutexBetweenSignals'
constant controls whether it will release and reacquire the mutex between
signals.

3) Each unblocked waiter decrements the waiters counter and moves to the second
wait. To simplify the test, the waiters don't enter the second wait until all
signals from step 2 have been sent. This is controlled through a sent signals
counter and another auxiliary condvar.

4) After the main thread has sent all signals, it starts waiting for at least
two waiters to block on the second wait. This is facilitated by a counter of
the threads that have reached the second wait and one more auxiliary condvar.

5) When at least two threads have blocked on the second wait, the main thread
sends one more signal. Threads that get unblocked from the second wait may
start a third wait to allow the test iteration to complete before they loop
back to the first wait (of course this actually happens when the main thread
releases the mutex in step 6)

6) The main thread starts waiting for all waiters to exit the first wait. Each
waiter that exits the first wait decrements the waiters count and the last one
signals the last auxiliary condvar that the main thread waits on. If the wait
times out, the test has failed, otherwise it has passed.

7) If the test has passed, all waiters are waiting on the condition variable in
the second wait or the third wait, so the main thread sends a broadcast to
unblock them and all waiters move back to the first wait. With this the test
iteration is complete and a new iteration begins.


The main point about this test is that at the point where the main thread sends
the single signal, all waiters should be:

1) either waiting on the mutex in the first wait,
2) or waiting on the condition variable in the second wait,
3) or waiting on the mutex in the wait on the auxiliary condvar from step 3

which means that if the mutex gets released for long enough, all threads should
acquire the mutex in the first wait and the waiters count should eventually
reach zero. Step 6 is meant to provide this time. At step 6, the main thread
releases the mutex and starts waiting, and all waiters that acquire the mutex
release it almost immediately and start waiting themselves, so there is nothing
to prevent the threads from group (1) above from acquiring the mutex one by one
and bringing the waiters counter back to 0. The only thing that can get in the
way is if there is a waiter which is still blocked on the condition variable in
the first wait, which is what the test aims to trigger and detect.
Comment 6 Mihail Mihaylov 2011-09-25 21:43:17 UTC
(In reply to comment #3)
> If you have a minimal self-contained test case for the issue, I'd be interested
> in seeing it.

I wrote a self-contained test case. It is not very simple, and may seem
contrived. My real code is actually much simpler (just one condition variable),
but it is not suitable as a test.

I ran the test on my dual-core laptop and had consistent results of one stolen
signal. I guess that with more cores it will be possible for more signals to be
stolen. Typically it took from 3 to 40 iterations to hit the race.

I would appreciate your feedback.
Comment 7 Mihail Mihaylov 2011-09-26 09:13:37 UTC
I ran the test on my workstation too. It took much longer to hit the race. The
difference is that on my workstation a process can start over 30000 threads as
opposed to about 400 on my laptop. Limiting the number of created threads to
several hundred made the race much easier to hit.
Comment 8 Rich Felker 2011-09-26 16:19:50 UTC
Could you simplify it? I'm not sure why the test has to be so much more
complicated than your real code. If you think there's a race/bug in cond vars,
then it's probably not a good idea to be using cond vars in the logic for the
test loop, since it will be hard to tell if the bug is occurring in the tested
code or the control code. Perhaps you could use barriers or semaphores to
synchronize the test control logic, if needed. Personally barriers are my
favorite synchronization primitive for that sort of thing.

Also, yes, get rid of the "make as many threads as possible" logic. That will
not make it any easier to find the race, and the limit to how many threads you
can make is usually dependent on available memory/virtual address space, not
kernel thread resources. Just pick a "sane" number (probably below 100) and go
with it.
Comment 9 Mihail Mihaylov 2011-09-27 10:10:23 UTC
Created attachment 5946 [details]
Simpler test to observe the race

This is a much simpler test case that demonstrates the race. The logic is
pretty much the same as with the previous test but it doesn't use any auxiliary
condition variables.
Comment 10 Mihail Mihaylov 2011-09-27 10:13:25 UTC
(In reply to comment #8)
> Could you simplify it? I'm not sure why the test has to be so much more
> complicated than your real code. If you think there's a race/bug in cond vars,
> then it's probably not a good idea to be using cond vars in the logic for the
> test loop, since it will be hard to tell if the bug is occurring in the tested
> code or the control code. Perhaps you could use barriers or semaphores to
> synchronize the test control logic, if needed. Personally barriers are my
> favorite synchronization primitive for that sort of thing.
> 
> Also, yes, get rid of the "make as many threads as possible" logic. That will
> not make it any easier to find the race, and the limit to how many threads you
> can make is usually dependent on available memory/virtual address space, not
> kernel thread resources. Just pick a "sane" number (probably below 100) and go
> with it.

I simplified the test significantly as you suggested. Now its somewhat simpler
than my actual code.
Comment 11 Rich Felker 2011-09-28 02:06:54 UTC
I've confirmed that the issue occurs on my Debian system with their libc6
package (eglibc 2.13-10, albeit slightly different from glibc). I've also
confirmed that the issue does not occur with my implementation of condition
variables in musl libc(*). I suspect it's a real bug, but I need to read the
code more closely to understand what's going on...

(*) I've converted the test to C (just replaced cout with printf) so I can run
it without C++ support. I'm attaching the modified version.
Comment 12 Rich Felker 2011-09-28 02:08:00 UTC
Created attachment 5947 [details]
Simpler test, converted to pure C
Comment 13 Mihail Mihaylov 2011-09-28 09:02:51 UTC
(In reply to comment #11)
> I've confirmed that the issue occurs on my Debian system with their libc6
> package (eglibc 2.13-10, albeit slightly different from glibc).

I originally observed the problem on a Debian stable. I've run my test case on
my laptop which is running Mint and on my office workstation which is running
kubuntu.

I looked at the eglibc source code before posting the bug and saw that the code
which causes the race is identical to the one in glibc, so the bug is in both
implementations.

> I've also confirmed that the issue does not occur with my implementation of
> condition variables in musl libc(*).

I took a look at your code. As far as I can tell, you are not trying to avoid
spurious wakeups as hard as glibc, that's why you don't have the same race.

> I suspect it's a real bug, but I need to read the code more closely to
> understand what's going on...

Here is my understanding of the root cause - an attempt to prevent spurious
wakeups that has gone too far and destroys ordering - waking future waiters
instead of present ones.

There are two checks that NPTL uses to prevent spurious wakeups:

1) It only allows a thread to wake if a signal has been sent after it started
waiting. This is achieved by checking if cond->__data.__wakeup_seq has remained
unchanged.

2) It only allows as many threads to wake up as there were signals. This is
achieved by checking if cond->__data._woken_seq equals
cond->__data.__wakeup_seq.

If any of this checks indicates a spurious wakeup the thread retries the wait.

The problem is in check 2, because the guard is triggered if any thread has
woken spuriously - not just the current thread. Worse - it is triggered only
after the spuriously woken thread consumed a signal. So in many cases the
spuriously woken thread consumes the signal, and a validly woken thread is
forced to retry. The result is that a spurious wakeup may steal signals that
were sent before it started waiting.

Now, I'm confident that the race is real. But maybe some people would disagree
that it is a bug. That's why I asked in my original message if this behaviour
is intentional or a bug.

It is a bug if pthread condition variables should support the following usage: 

   ...

   pthread_mutex_lock(&m);

   SomeType localState = f(sharedState);

   while ( predicate(sharedState, localState) ) {
      pthread_cond_wait(&c, &m);
   }

   ...

In this case it actually matters which thread will wake up, because if the
wrong thread wakes up, it will retry the wait and the signal will be lost (this
is what happened to me). Unfortunately the spec is not very clear on the issue.
But this is the pattern that the pthread_cond_wait implementation in glibc
itself uses to detect spurious wakeups on the futex.
Comment 14 Rich Felker 2011-09-28 16:06:21 UTC
I think it's clear that the usage you describe needs to be supported. Nowhere
does the standard state that predicate must be a pure function of the shared
(mutex-protected) state, or that it even need to depend on the shared state
whatsoever. For example, this is a valid albeit potentially stupid use of cond
variables:

Thread 1:

Lock mutex.
Create thread 2.
Wait on cond var.
Unlock mutex.
Continue with other work.

Thread 2:

Lock mutex.
Signal cond var.
Unlock mutex.
Continue with other work.

There is no guarantee that thread 1 will actually block until thread 2 starts
and signals (it could have a spurious wakeup), but there is a guarantee that
the signal will cause a wakeup if one has not already occurred. Naturally this
example is not analogous to your test (there's nobody else using the cond var),
but the point is that it's valid to use cond vars even without predicates. If
you read the language of the standard, which is in terms of threads currently
blocked rather than predicates, it's clear.

By the way, assuming spurious wakeups are rare, my above "stupid" use of cond
vars may actually be an efficient way to roughly synchronize typical start
times of parallel calculations, perhaps useful in benchmarks or in tweaking
cache utilization for a particular machine. Barriers would probably be more
appropriate, however.
Comment 15 Mihail Mihaylov 2011-09-28 20:59:15 UTC
Well, you don't need to convince me. I think this is a bug too. But based on
how Ulrich Drepper responded to Bug 12875, he might claim that this is not a
bug either. Bug 12875 is almost certainly a manifestation of the same race.
Comment 16 Torvald Riegel 2012-09-19 15:15:02 UTC
I'm not aware of any requirement that pthread_cond_signal should block until a
waiter has actually woken up. (Your test case relies on it to not block, so
that it can send out multiple signals while holding the mutex, right?)  I'm
also not aware of any ordering requirement wrt. waiters (i.e., fairness).  If
you combine both, you will see that the behavior you observe is a valid
execution.

(In reply to comment #5)
> 4) After the main thread has sent all signals, it starts waiting for at least
> two waiters to block on the second wait. This is facilitated by a counter of
> the threads that have reached the second wait and one more auxiliary cond var.

And here you do block for waiters to have consumed a signal (i.e., for a call
to pthread_cond_signal to have finished its work and delivered a signal), but
you do this just for two of the signals / calls.

If you do not want to wait for all signals being delivered yet still need a
fair cond var implementation, I suggest either (1) building your own (e.g., you
could build something like a simple queue lock based on pthread mutexes or cond
vars) or (2) propose an addition of a fair cond var to glibc or the respective
standards bodies.
Comment 17 Torvald Riegel 2012-09-19 15:20:45 UTC
(In reply to comment #13)
> It is a bug if pthread condition variables should support the following usage: 
> 
>    ...
> 
>    pthread_mutex_lock(&m);
> 
>    SomeType localState = f(sharedState);
> 
>    while ( predicate(sharedState, localState) ) {
>       pthread_cond_wait(&c, &m);
>    }
> 
>    ...
> 
> In this case it actually matters which thread will wake up, because if the
> wrong thread wakes up, it will retry the wait and the signal will be lost (this
> is what happened to me). Unfortunately the spec is not very clear on the issue.

If the spec doesn't guarantee something, it's usually best to not depend on
this.

> But this is the pattern that the pthread_cond_wait implementation in glibc
> itself uses to detect spurious wakeups on the futex.

Not quite.  For one, pthread_cond_wait doesn't try to establish the ordering
that you seem to depend on the piece of code above.
Comment 18 Rich Felker 2012-09-19 17:23:20 UTC
> I'm not aware of any requirement that pthread_cond_signal should block until a
> waiter has actually woken up. (Your test case relies on it to not block, so

Do you mean "should block" or "should not block"? POSIX's definition of threads
has general language that requires that forward process be possible, and
blocking in pthread_cond_signal until a waiter actually wakes up (including
acquiring the mutex) would be non-conformant; in fact it would be a guaranteed
deadlock.

> that it can send out multiple signals while holding the mutex, right?)  I'm

It is definitely required that pthread_cond_signal can be called more than
once.

> also not aware of any ordering requirement wrt. waiters (i.e., fairness).  If
> you combine both, you will see that the behavior you observe is a valid
> execution.

Nobody has asked for glibc to satisfy any fairness constraint. The standard
says it shall unblock at least one thread that "has blocked" on the condition
variable, not that it can randomly fail to do so or instead send the unblocking
event into the future to be consumed by another thread that has not yet blocked
but is about to block (after the signaling thread unlocks the mutex). The claim
in this bug report is that glibc is randomly failing (race condition) to
unblock ANY of the threads that have blocked. A viable mechanism of how this
failure is occurring has also been proposed.

You are free to test this hypothesis and claim that this is not what's
happening, or provide a rigorous proof that it couldn't happen. But what you've
been doing so far is mischaracterizing the bug report and my comments as a
complaint about scheduling fairness, which they are not, and this is getting
really frustrating...
Comment 19 Mihail Mihaylov 2012-09-20 10:21:39 UTC
(In reply to comment #16)
Sorry for the long reply. Please, bare with me, because this issue is very
subtle and I don't know how to explain it more succinctly.

First of all, let me clarify that this is a test that exposes the race, and not
the usage scenario that I claim should be supported. The usage scenario is
described in the bug description. Well, actually, I do claim that the scenario
in the test should be supported too, but the scenario in the description makes
more sense.

> I'm not aware of any requirement that pthread_cond_signal should block until a
> waiter has actually woken up. (Your test case relies on it to not block, so
> that it can send out multiple signals while holding the mutex, right?)  I'm
> also not aware of any ordering requirement wrt. waiters (i.e., fairness).  If
> you combine both, you will see that the behavior you observe is a valid
> execution.

I'm not making any assumptions about the state of the waiters when
pthread_cond_signal returns. All I'm assuming is that, no matter if the
signaling thread releases and reacquires the mutex after each sent signal or
sends all signals without releasing the mutex, at least as many waiters as the
number of signals will wake (eventually).

But even if this assumption is wrong (and it's not), if you set
releaseMutexBetweenSignals to true, the test will release the mutex after each
sent signal. In this case the test doesn't send multiple signals while holding
the mutex, and the problem still occurs.

As for fairness, this is not about fairness. It is also not about ordering
between the waiters. It's about ordering between waiters and signalers.

I'm getting tired of people jumping to fairness at the first mention of
ordering. You could say that I'm requesting fairness if I wanted the first
single signal to wake the waiter that blocked first. But all I'm requesting is
for the signal to wake at least one of the waiters that started waiting before
the signal was sent. I don't care which one of them.

This is guaranteed by the standard (from the documentation of pthread_cond_wait
and pthread_cond_signal on the opengroup site):

"The pthread_cond_signal() function shall unblock at least one of the threads
that are blocked on the specified condition variable cond (if any threads are
blocked on cond)."

And I think the next quote makes it very clear what threads are considered to
be blocked on the condvar at the time of the call to pthread_cond_signal():

"That is, if another thread is able to acquire the mutex after the
about-to-block thread has released it, then a subsequent call to
pthread_cond_broadcast() or pthread_cond_signal() in that thread shall behave
as if it were issued after the about-to-block thread has blocked."

In effect this means that each call to pthread_cond_signal() defines a point in
time and all waiters (or calls to pthread_cond_wait() if you prefer) are either
before this call, or after it. And only the ones that are before it are allowed
to consume the signal sent by this call.

Now, of course in a multiprocessor system it is hard to order events in time,
but that's where the mutex comes in. And if the signaling thread sends multiple
signals while holding the mutex, we can consider all these signals to be
simultaneous. But that doesn't change the validity of the test.

On the other hand, the standard doesn't guarantee that there won't be spurious
wakeups. However, glibc tries to prevent them. But the logic for this
prevention is flawed and causes the race that this bug is about.

So the net result is that glibc chose to provide a feature that is not
required, but dropped a much more important feature which is actually required.
Hence, this bug is not a fairness feature request, it is a correctness defect
report.
Comment 20 Torvald Riegel 2012-09-20 10:43:00 UTC
(In reply to comment #18)
> > I'm not aware of any requirement that pthread_cond_signal should block until a
> > waiter has actually woken up. (Your test case relies on it to not block, so
> 
> Do you mean "should block" or "should not block"?

"Block", as I wrote.

> POSIX's definition of threads
> has general language that requires that forward process be possible, and
> blocking in pthread_cond_signal until a waiter actually wakes up (including
> acquiring the mutex) would be non-conformant; in fact it would be a guaranteed
> deadlock.

That's what I'm pointing out.  Glad you agree on this point.

> 
> > that it can send out multiple signals while holding the mutex, right?)  I'm
> 
> It is definitely required that pthread_cond_signal can be called more than
> once.
> 
> > also not aware of any ordering requirement wrt. waiters (i.e., fairness).  If
> > you combine both, you will see that the behavior you observe is a valid
> > execution.
> 
> Nobody has asked for glibc to satisfy any fairness constraint.

Glad you agree on that too. Now please just think about the combination of both
for a minute.

> The standard
> says it shall unblock at least one thread that "has blocked" on the condition
> variable, not that it can randomly fail to do so or instead send the unblocking
> event into the future to be consumed by another thread that has not yet blocked
> but is about to block (after the signaling thread unlocks the mutex).

You assume that only some threads can count as blocked, which is not guaranteed
by the standard. Signalers are not required to have finished delivering the
signal when they return.  Thus, the scope of which threads are blocked can be
longer than you assume.  Combined with no fairness guarantee this exactly
allows the behavior that we're talking about here.

The standard indeed doesn't talk about the "future".  It doesn't make a sort of
lower-bound requirement on which threads have to be considered blocked, but no
upper bound.  If you think there's an upper bound, please point the requirement
in the standard.  If there is no required upper bound, it's up to the
implementation how to deal with that.

> The claim
> in this bug report is that glibc is randomly failing (race condition) to
> unblock ANY of the threads that have blocked.

No.  The claim is that the "wrong" threads have been unblocked, where "wrong"
is based on an assumption on ordering or an upper-bound-on-blocked-threads
guarantee that is not required by the standard.

> A viable mechanism of how this
> failure is occurring has also been proposed.
> 
> You are free to test this hypothesis and claim that this is not what's
> happening, or provide a rigorous proof that it couldn't happen.

I have argued that this is allowed behavior, because there is no requirement
that conflicts with it, and pointed out why.  You haven't done that; just
because you think the standard should have a certain requirement doesn't mean
it actually has.

> But what you've
> been doing so far is mischaracterizing the bug report and my comments as a
> complaint about scheduling fairness, which they are not, and this is getting
> really frustrating...

I've never talked about "scheduling fairness" (assuming you mean the OS
scheduler here), just about fairness regarding which thread gets wakened. So
much for mischaracterization, eh?

Sorry to hear that you feel frustrated, but I'd like to point out that I don't
think that I'm the cause for this.

Bottom line: In my opinion, this is not a bug.  However, it might be good to
explain why this behavior is allowed (e.g., somewhere in the docs or on the
wiki), so that this doesn't surprise other users.
Comment 21 Mihail Mihaylov 2012-09-20 11:05:21 UTC
(In reply to comment #20)
> The standard indeed doesn't talk about the "future".  It doesn't make a sort of
> lower-bound requirement on which threads have to be considered blocked, but no
> upper bound.  If you think there's an upper bound, please point the requirement
> in the standard.  If there is no required upper bound, it's up to the
> implementation how to deal with that.

"The pthread_cond_broadcast() and pthread_cond_signal() functions shall have no
effect if there are no threads currently blocked on cond."

How about this as an upper bound? If implementations are allowed to determine
the set of blocked threads at any point in time they see fit, there would be no
way to define "currently blocked" at all and this sentence couldn't make any
sense.

And also:

".... however, if predictable scheduling behavior is required, then that mutex
shall be locked by the thread calling pthread_cond_broadcast() or
pthread_cond_signal()."

If I accept your argument, there will be no way to determine at least a set of
threads from which the woken thread will be chosen, so why does the standard
talk about predictability?
Comment 22 Torvald Riegel 2012-09-20 11:22:45 UTC
(In reply to comment #19)
> (In reply to comment #16)
> Sorry for the long reply. Please, bare with me, because this issue is very
> subtle and I don't know how to explain it more succinctly.

No worries.  If it would be straight-forward, we wouldn't be talking about it
here :)

> 
> First of all, let me clarify that this is a test that exposes the race, and not
> the usage scenario that I claim should be supported. The usage scenario is
> described in the bug description. Well, actually, I do claim that the scenario
> in the test should be supported too, but the scenario in the description makes
> more sense.

I think your test is good and helps to point out the issue (including,
actually, why it's not a bug).  In your comment #1, I agree on assumption 1),
but not on assumption 2). The latter (2) is not required by the standard.

> > I'm not aware of any requirement that pthread_cond_signal should block until a
> > waiter has actually woken up. (Your test case relies on it to not block, so
> > that it can send out multiple signals while holding the mutex, right?)  I'm
> > also not aware of any ordering requirement wrt. waiters (i.e., fairness).  If
> > you combine both, you will see that the behavior you observe is a valid
> > execution.
> 
> I'm not making any assumptions about the state of the waiters when
> pthread_cond_signal returns.

You do assume that only those will be associated with this particular signal
call.  We can argue that whether this is part of the state of waiters or not,
but I hope you'll agree that it's at least a property that the waiters are part
of.

> All I'm assuming is that, no matter if the
> signaling thread releases and reacquires the mutex after each sent signal or
> sends all signals without releasing the mutex, at least as many waiters as the
> number of signals will wake (eventually).

This assumption is correct, but you assume more:  That only the threads before
cond_signal returns will be considered as blocked.

So, if you have waiters W and signal S, and an execution of (W1->W2->S->W3),
then the standard requires that W1 and W2 need to be considered as blocked
threads by S.  It does not require that W3 is _not_ considered as a blocked
thread.

Informally, S is required to be ordered after W1 and W2, but it is unspecified
whether it is ordered before W3. S does not block; it is like starting an
asynchronous operation that will eventually deliver a signal.  S can return
earlier, and the earlier return can happen before W3, but that doesn't mean
anything for the delivery of the signal.

> But even if this assumption is wrong (and it's not), if you set
> releaseMutexBetweenSignals to true, the test will release the mutex after each
> sent signal. In this case the test doesn't send multiple signals while holding
> the mutex, and the problem still occurs.
> 
> As for fairness, this is not about fairness. It is also not about ordering
> between the waiters. It's about ordering between waiters and signalers.

The ordering between waiters and signalers is the first point, which I
illustrated with "signalers don't block for waiters".  After that, we have the
problem of which of the blocked threads the signal chooses to wake up, so this
becomes an ordering problem (i.e., a selection problem if applied
continuously). On an abstract level, this is a typical fairness problem.

As I pointed out, it's the combination of both these issues.

> I'm getting tired of people jumping to fairness at the first mention of
> ordering.

If you're tired, get some sleep before suggesting that the participants of this
discussion don't understand ordering.

> You could say that I'm requesting fairness if I wanted the first
> single signal to wake the waiter that blocked first. But all I'm requesting is
> for the signal to wake at least one of the waiters that started waiting before
> the signal was sent. I don't care which one of them.

That's the first wrong assumption ("before the signal" [call returned]).  If
you don't make that incorrect assumption, then it becomes a fairness issue
among classes of waiters, where the classes are defined by whether they
happened before a certain signal call.

> This is guaranteed by the standard (from the documentation of pthread_cond_wait
> and pthread_cond_signal on the opengroup site):
> 
> "The pthread_cond_signal() function shall unblock at least one of the threads
> that are blocked on the specified condition variable cond (if any threads are
> blocked on cond)."
>
> And I think the next quote makes it very clear what threads are considered to
> be blocked on the condvar at the time of the call to pthread_cond_signal():

It says that a thread must be considered blocked if the cond_wait call happened
before the signal call.  It does NOT say that ONLY those threads need to be
considered blocked.

> "That is, if another thread is able to acquire the mutex after the
> about-to-block thread has released it, then a subsequent call to
> pthread_cond_broadcast() or pthread_cond_signal() in that thread shall behave
> as if it were issued after the about-to-block thread has blocked."
> 
> In effect this means that each call to pthread_cond_signal() defines a point in
> time and all waiters (or calls to pthread_cond_wait() if you prefer) are either
> before this call, or after it. And only the ones that are before it are allowed
> to consume the signal sent by this call.

No it does not.  It only talks about what happens before (=> meaning it
implies):

waiter happens-before signaler => waiter is considered blocked

It does not say (<=> meaning being equivalent to):

waiter happens-before signaler <=> waiter is considered blocked

(In fact, if you spin the second (<=>) further, you could only expect / build
an ordering based on the mutex, but not for other kinds of enforced
happens-before order. That's not what we'd want.)

> On the other hand, the standard doesn't guarantee that there won't be spurious
> wakeups. However, glibc tries to prevent them. But the logic for this
> prevention is flawed and causes the race that this bug is about.

The term "race" here is misleading.  If there's a race, in it's how you use the
cond var and expect it to behave.  The logic would be flawed if it allowed
incorrect behavior, which it doesn't.

> So the net result is that glibc chose to provide a feature that is not
> required, but dropped a much more important feature which is actually required.
> Hence, this bug is not a fairness feature request, it is a correctness defect
> report.

No.  You assume a guarantee that isn't required by the standard.

The comment that you are asking for a fairness feature is an attempt at an
explanation for you that should point out the difference to what the standard
guarantees.  This was meant to help in this discussion.  If we want to further
discuss this, I believe this needs to focus around whether the standard
requires that certain threads must not be considered blocked.
Comment 23 Torvald Riegel 2012-09-20 11:58:26 UTC
(In reply to comment #21)
> (In reply to comment #20)
> > The standard indeed doesn't talk about the "future".  It doesn't make a sort of
> > lower-bound requirement on which threads have to be considered blocked, but no
> > upper bound.  If you think there's an upper bound, please point the requirement
> > in the standard.  If there is no required upper bound, it's up to the
> > implementation how to deal with that.
> 
> "The pthread_cond_broadcast() and pthread_cond_signal() functions shall have no
> effect if there are no threads currently blocked on cond."
> 
> How about this as an upper bound?

This states something in relation to those threads that are considered to be
blocked.  It does not state anything about which threads can or have to be
considered to be blocked.  So, it can't be an upper bound.

> If implementations are allowed to determine
> the set of blocked threads at any point in time they see fit, there would be no
> way to define "currently blocked" at all and this sentence couldn't make any
> sense.

There is a lower bound (or minimum requirement) based on the happens-before via
the mutex (hence "currently").  The sentence allows the implementation to let
the signal have no effect if there is no thread that has to be considered
blocked with the assumption of the lower bound.  Assuming more threads to be
blocked is the same as allowing spurious wake-ups.

> And also:
> 
> ".... however, if predictable scheduling behavior is required, then that mutex
> shall be locked by the thread calling pthread_cond_broadcast() or
> pthread_cond_signal()."
> 
> If I accept your argument, there will be no way to determine at least a set of
> threads from which the woken thread will be chosen, so why does the standard
> talk about predictability?

There is the lower bound, which does determine properties of this set.
Comment 24 Mihail Mihaylov 2012-09-20 12:46:24 UTC
(In reply to comment #23)
> This states something in relation to those threads that are considered to be
> blocked.  It does not state anything about which threads can or have to be
> considered to be blocked.  So, it can't be an upper bound.

It doesn't say "blocked at the time the signal is delivered". It says
"currently blocked". And it doesn't talk about the effect of the delivery of
the signal, it talks about the effect of the function. So "current" and
"function" should be related somehow. I'm taking the relation to be that
"current" means "at the time of the call to the function".

But the language of the standard is too vague to make it explicit what
"currently" means here, so I guess you are free to interpret it otherwise.

> There is a lower bound (or minimum requirement) based on the happens-before via
> the mutex (hence "currently").  The sentence allows the implementation to let
> the signal have no effect if there is no thread that has to be considered
> blocked with the assumption of the lower bound.  Assuming more threads to be
> blocked is the same as allowing spurious wake-ups.

I don't really buy your argument here, but I just realized that this sentence
only talks about the situation when there are no blocked threads, so I cannot
use it to reason about the case when there are blocked waiters before the call.

Again, the vague language of the standard allows your interpretation.

> > If I accept your argument, there will be no way to determine at least a set of
> > threads from which the woken thread will be chosen, so why does the standard
> > talk about predictability?
> 
> There is the lower bound, which does determine properties of this set.

What I meant is that if we accept your interpretation, there are no threads
that can be excluded from the set of blocked threads - except the ones that
never called, nor will ever call pthread_cond_wait() and the ones that have
already returned from all the calls to pthread_cond_wait() they will ever make.

To illustrate this, let me take what you're saying to its logical conclusion.

Suppose for simplicity that we have a single call to pthread_cond_signal() and
many calls to pthread_cond_wait(), both before and after the call to
pthread_cond_signal(). What you are actually saying is that it is correct for
the call to pthread_cond_signal() to consider all of these threads to be
blocked, and is allowed to wake any of them, even threads that are not created
yet.

Even more, if the signaling thread calls pthread_cond_wait() some time (even
hours) after it called pthread_cond_signal(). it is correct (although
undesirable) for it to consume its own signal and all the rest waiters to
become blocked.

Although I agree that the spec allows this interpretation, I think it is highly
impractical and unintuitive, that's why I believe it is not what the authors
had in mind. But of course, the only resolution of this argument at this point
is to ask them for clarification. So I don't see any point in arguing any more
without involving them.

Meanwhile, just consider this: you have code in the implementation which tries
to prevent spurious wakeups and it basically aims to establish a timeline of
the calls to pthread_cond_signal() and pthread_cond_wait() and assumes that a
wakeup is spurious if it occurred without a signal being sent after the
respective thread blocked. I dare say that the code suggests that whoever wrote
this code had the same assumptions about ordering as me. And it actually
contradicts your interpretation, because with your interpretation a single
counter of outstanding signals would be enough.

BTW, the reason why this code fails to work correctly is very simple - you
can't detect spurious wakeups reliably using constant memory without giving up
all ordering guarantees.
Comment 25 Mihail Mihaylov 2012-09-20 12:49:05 UTC
(In reply to comment #24)
> Even more, if the signaling thread calls pthread_cond_wait() some time (even
> hours) after it called pthread_cond_signal(). it is correct (although
> undesirable) for it to consume its own signal and all the rest waiters to
> become blocked.

I meant "to remain blocked"...
Comment 26 Torvald Riegel 2012-09-20 16:21:01 UTC
(In reply to comment #24)
> Although I agree that the spec allows this interpretation, I think it is highly
> impractical and unintuitive, that's why I believe it is not what the authors
> had in mind. But of course, the only resolution of this argument at this point
> is to ask them for clarification. So I don't see any point in arguing any more
> without involving them.

If the requirements in the specification should change in the future, we can
look at this again. I personally don't think that this makes cond vars
significantly impractical or unintuitive, but that is a trade-off that the
standards body will have to make (or probably did when it produced the current
spec).

Until then, I suppose that you now agree that this is correct behavior
according to the current spec.

> Meanwhile, just consider this: you have code in the implementation which tries
> to prevent spurious wakeups and it basically aims to establish a timeline of
> the calls to pthread_cond_signal() and pthread_cond_wait()

Not a timeline, but just overall number of signals or threads that started
waiting.

> and assumes that a
> wakeup is spurious if it occurred without a signal being sent after the
> respective thread blocked.

It is spurious if there were not enough signals being sent, or other threads
consumed the signals that were sent.

> I dare say that the code suggests that whoever wrote
> this code had the same assumptions about ordering as me.

Looking at the git log for the respective files, you should take that question
to Ulrich or Jakub.

> And it actually
> contradicts your interpretation, because with your interpretation a single
> counter of outstanding signals would be enough.

I doubt that (if you want to do this efficiently).

> BTW, the reason why this code fails to work correctly is very simple - you
> can't detect spurious wakeups reliably using constant memory without giving up
> all ordering guarantees.

I think you can (just combine something like a ticket lock with the futex) --
but this won't be efficient.
Comment 27 Rich Felker 2012-09-20 18:39:10 UTC
I disagree strongly that the spec even allows Torvald's interpretation.
Torvald's claim is essentially that an implementation can consider an
unspecified set of threads beyond those which "have blocked" per the
specification of pthread_cond_wait to also "have blocked" on the condition. Not
only is there no language in the standard to support this (the only definition
of "blocked" on a condition variable is the one we've cited); it would also
make the specification of pthread_cond_signal completely useless, as the "at
least one" could always be taken to mean "the one invisible thread your
application can't see that's always blocked on every possible condition
variable". This is the pinnacle of absurdity in an attempt to language-lawyer
out of fixing a bug.

The fact of the matter is that POSIX, along with common sense and the entire
intended usage of condition variables, requires pthread_cond_signal to unblock
at least one thread that "has blocked", in the sense of the blocking that
happens atomically with unlocking the mutex as described in the specification
of pthread_cond_wait.

The situation we're looking at here is that the authors of NPTL came up with a
clever hack to reduce spurious wakeups in pthread_cond_wait, but failed to
realize the hack was broken in some corner cases, and now Torvald is unwilling
to admit that the hack is broken and take it out.

It should also be noted that I have been unable to demonstrate any case where
NPTL's hack to prevent spurious wakeups actually has any positive effect. A
while back I wrote a test program to hammer condition variables and look for
spurious wakeups, and I could not generate any with either NPTL's buggy
implementation OR with musl's implementation which does not use any similar
hack and does not suffer from this bug. Thus, in addition to being wrong and
broken, it's my conclusion, unless anybody else can produce evidence to the
contrary, that the hack is also useless (does not have the intended effect of
improving performance).
Comment 28 Mihail Mihaylov 2012-09-20 19:48:25 UTC
(In reply to comment #27)
> I disagree strongly that the spec even allows Torvald's interpretation.
> Torvald's claim is essentially that an implementation can consider an
> unspecified set of threads beyond those which "have blocked" per the
> specification of pthread_cond_wait to also "have blocked" on the condition.

Yes, that's what he claims.

> Not
> only is there no language in the standard to support this (the only definition
> of "blocked" on a condition variable is the one we've cited);

Yes, there is no language to support it, but I must admit that there is also no
language to explicitly prevent it, even though I too consider this
interpretation completely unreasonable as I tried to explain several times.

Anyway, this whole dispute has been reduced to the question of which threads
are eligible for wakeup, so I've taken the liberty to post a clarification
request to the Austin Group, asking them to add explicit text explaining which
threads should be considered blocked with respect to a pthread_cond_signal()
call. The clarification request is at
http://austingroupbugs.net/view.php?id=609. Torvald, please correct me if have
inadvertently misrepresented your position.
Comment 29 Rich Felker 2012-09-20 20:31:26 UTC
The lack of language to explicitly prevent something is not necessary. Do you
see any language that explicitly prevents an implementation from writing the
text "42" to stdout when you call strlen? I would grant that Torvald has an
argument if the application had called any glibc functions not specified by
POSIX (which could be defined by the implementation to do all sorts of crazy
things) or if the application had passed constants other than those defined by
POSIX to standard functions (e.g. a special attribute for the condition
variable). But in the absence of that, no interface in the standard library can
have further side effects on other interfaces/objects than what it's specified
to do.
Comment 30 Torvald Riegel 2012-09-21 08:03:55 UTC
(In reply to comment #27)
> I disagree strongly that the spec even allows Torvald's interpretation.
> Torvald's claim is essentially that an implementation can consider an
> unspecified set of threads beyond those which "have blocked" per the
> specification of pthread_cond_wait to also "have blocked" on the condition.

No it's not unspecified. The specification requires a lower bound on this set.
If you think that this is not just a lower bound, argue against the reasons I
gave for this, don't just repeat your opinion here, please.

> Not
> only is there no language in the standard to support this (the only definition
> of "blocked" on a condition variable is the one we've cited);

Which is the lower bound. And this is a meaningful requirement, and provides
cond vars that are useful.

> it would also
> make the specification of pthread_cond_signal completely useless, as the "at
> least one" could always be taken to mean "the one invisible thread your
> application can't see that's always blocked on every possible condition
> variable".

You do see that there is a difference between this and requiring "blocked" but
not requiring an upper bound based on happens before?

> This is the pinnacle of absurdity in an attempt to language-lawyer
> out of fixing a bug.

Frankly, I don't care what you think that my motivations are. And I also don't
speculate here about your motivations, nor about your abilities to distinguish
an implication from equivalence. So please stop making such statements.

What I care about is whether this is indeed a bug or not.  Taking this question
to the standards body for clarification is the right way to go if people think
that the standard should require something else, or be more precise is what is
allowed or not.

> The fact of the matter is that POSIX, along with common sense and the entire
> intended usage of condition variables, requires pthread_cond_signal to unblock
> at least one thread that "has blocked", in the sense of the blocking that
> happens atomically with unlocking the mutex as described in the specification
> of pthread_cond_wait.

That doesn't conflict with what I said. You mention an entirely different point
here.

> The situation we're looking at here is that the authors of NPTL came up with a
> clever hack to reduce spurious wakeups in pthread_cond_wait, but failed to
> realize the hack was broken in some corner cases, and now Torvald is unwilling
> to admit that the hack is broken and take it out.

See above. Aside from this being impolite, speculation like this does not
belong in a bug report.
Comment 31 Rich Felker 2012-09-21 08:52:51 UTC
Such "speculation" does belong in a bug report so that there is public record
of this behavior of trying to weasel out of conformance requirements. Public
opinion counts, and there should be a cost in public opinion to software
maintainers who refuse to fix bugs and try to argue that their bugs are not
bugs.

Certainly I cannot read your mind to prove claims about your motivation. It
really just boils down to a case of Occam's razor. When fixing the issue would
basically amount to an all-minuses patch that greatly simplifies the code, yet
the maintainers refuse to do it, the simplest explanation is that somebody has
an attachment to the code and the work that went into writing it. If your
motivation is something else, like perhaps concerns about making a mistake and
breaking something in the process of fixing it, why not say that outright
rather than leaving everybody stuck having to speculate about your motivations?
Then the pros and cons of fixing it can be argued rationally. Or do you claim
you have no further motivation here than "I believe the standard allows
low-quality implementations with bad properties like this, so I want to
exercise my right to do a bad job and still claim conformance"?

With the motivation topic out of the way, let's get back to the requirements of
the standard. My understanding is that you claim "the threads that are blocked
on the specified condition variable cond" is a set consisting of at least those
threads which provably (by virtue of the mutex providing ordering) called
pthread_cond_[timed]wait before the caller of pthread_cond_signal obtained the
mutex, but which also may contain other threads. Not, as I originally accused
you of, a completely arbitrary set of other threads, but rather threads which
are "about to" wait on the condition variable in the immediate future after the
signaling thread unlocks the mutex. Is this a correct assessment of your
position?

If so, can you clarify what condition would qualify such threads for membership
in this set? Certainly it can't be any thread that could ever conceivably wait
on the condition variable at any later point in the program flow; if nothing
else, that set is self-referential and thus not even a set (because which
threads are candidates for membership may depend on which thread the signal
unblocked).

My claim that the set of candidate threads pthread_cond_signal can unblock is a
fixed set is based on the following:

1. The status of being blocked on a condition variable is defined only in the
specification of pthread_cond_[timed]wait. I think we both agree on this. There
is no way a thread is ever blocked on a condition variable except by calling
pthread_cond_[timed]wait on it.

2. Sequencing of events between threads is not defined in general, but the use
of mutexes with condition variables imposes a sequencing. In particular, from
the point of view of any other thread, a given thread's status as being blocked
on a condition variable is acquired after it obtains the mutex and before
pthread_cond_[timed]wait releases the mutex. The former sequencing requirement
comes from the fact that you can only call pthread_cond_[timed]wait with the
mutex held, and the fact that pthread_cond_[timed]wait is the function
specified to block; the latter sequencing requirement is the atomicity of
blocking and unlocking the mutex.

3. The pthread_cond_signal function "shall unblock at least one of the threads
that are blocked on the specified condition variable". There is no language
about queuing an unblock request; the language of the standard is "shall
unblock". This means that even if the mechanism is some sort of queued request,
in the sense of the formal, abstract machine, one of the threads in the set of
blocked threads goes from blocked status to unblocked status during the call to
pthread_cond_signal (and, from the stantpoint of observable sequencing, between
the calling thread's calls to pthread_mutex_lock and pthread_mutex_unlock).
Since the mutex is locked during this time, there is no way additional threads
can become blocked on the condition variable. On the other hand, some could
become unblocked due to spurious wakes, so the set could shrink, but it could
not grow.

If you still claim my reasoning is wrong, please cite the specific steps you
believe are hand-waving or misinterpretation of the standard.
Comment 32 Torvald Riegel 2012-09-21 15:44:30 UTC
(In reply to comment #28)
> (In reply to comment #27)
> > I disagree strongly that the spec even allows Torvald's interpretation.
> > Torvald's claim is essentially that an implementation can consider an
> > unspecified set of threads beyond those which "have blocked" per the
> > specification of pthread_cond_wait to also "have blocked" on the condition.
> 
> Yes, that's what he claims.

Not quite, as I point out in comment #30.

> Anyway, this whole dispute has been reduced to the question of which threads
> are eligible for wakeup, so I've taken the liberty to post a clarification
> request to the Austin Group, asking them to add explicit text explaining which
> threads should be considered blocked with respect to a pthread_cond_signal()
> call. The clarification request is at
> http://austingroupbugs.net/view.php?id=609. Torvald, please correct me if have
> inadvertently misrepresented your position.

Thanks.  I have added a note there that summarizes the interpretation that the
current implementation is based on.  Let's see what they think about this
issue.
Comment 33 Mihail Mihaylov 2012-10-18 06:26:22 UTC
(In reply to comment #32)
> (In reply to comment #28)
> ....
> ....
> > Anyway, this whole dispute has been reduced to the question of which threads
> > are eligible for wakeup, so I've taken the liberty to post a clarification
> > request to the Austin Group, asking them to add explicit text explaining which
> > threads should be considered blocked with respect to a pthread_cond_signal()
> > call. The clarification request is at
> > http://austingroupbugs.net/view.php?id=609. Torvald, please correct me if have
> > inadvertently misrepresented your position.
> 
> Thanks.  I have added a note there that summarizes the interpretation that the
> current implementation is based on.  Let's see what they think about this
> issue.

The Austin Group have reached an official position. They have decided to make
changes to some of the texts related to condition variables. I believe that the
changes as they announced them yesterday invalidate glibc's interpretation of
the spec.

Let me point out that these changes do not add new requirements to the spec.
They just make explicit the requirements that were already suggested by the
spec.

In my opinion, at this point it is already clear that NPTL's implementation of
condition variables does not conform to the POSIX spec, therefore this bug is
actually really a bug. I hope that the NPTL team will acknowledge it as such.
Comment 34 Rich Felker 2012-10-18 12:25:16 UTC
It should be noted that no changes were made to the requirements the standard
places on implementations; the changes made are only clarifications, since
apparently the original language was not clear enough.
Comment 35 Torvald Riegel 2012-10-24 20:25:32 UTC
(In reply to comment #33)
> The Austin Group have reached an official position. They have decided to make
> changes to some of the texts related to condition variables. I believe that the
> changes as they announced them yesterday invalidate glibc's interpretation of
> the spec.

I agree.  Those changes disallow glibc's current behavior.

> Let me point out that these changes do not add new requirements to the spec.
> They just make explicit the requirements that were already suggested by the
> spec.

I disagree.  It's a specification -- it has to be explicit.

(In reply to comment #34)
> It should be noted that no changes were made to the requirements the standard
> places on implementations; the changes made are only clarifications, since
> apparently the original language was not clear enough.

How is that not a change in the spec:
http://austingroupbugs.net/view.php?id=609#c1403 ?  Are we talking about the
same thing here?

Apparently, the original language was not clear enough.  Otherwise, why change
the spec?  If it were clear enough, an interpretation explanation or something
like that would have been sufficient, right.  They even say that they see the
need to "produce some interpretation text to precede these changes".
Comment 36 Rich Felker 2012-10-25 04:07:39 UTC
The note #0001403 you cited begins:

"In the Oct 17th conference call we agreed to make the following changes.
Further work is needed to produce some interpretation text to precede these
changes."

Said "interpretation text to precede these changes", as I understand it, is an
explanation of how the current text is meant to be interpreted. The "following
changes" then are clarifications to appear in the TC or next version of the
standard, but the interpretation is (or will be) that the current standard
already requires exactly the same thing, at least from an observable-behavior
standpoint.

As I have stated before, I do not see any way one could interpret the current
glibc behavior as conformant. Under the current glibc behavior, events that
should be synchronized by a mutex observably/provably occur in an order
different from the order the synchronization imposes.