<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "http://sourceware.org/bugzilla/page.cgi?id=bugzilla.dtd">

<bugzilla version="4.4+"
          urlbase="http://sourceware.org/bugzilla/"
          
          maintainer="overseers@sourceware.org"
>

    <bug>
          <bug_id>13165</bug_id>
          
          <creation_ts>2011-09-07 19:14:00 +0000</creation_ts>
          <short_desc>pthread_cond_wait() can consume a signal that was sent before it started waiting</short_desc>
          <delta_ts>2013-01-19 16:18:55 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>glibc</product>
          <component>nptl</component>
          <version>2.14</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>ASSIGNED</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Mihail Mihaylov">mihaylov.mihail</reporter>
          <assigned_to name="Torvald Riegel">triegel</assigned_to>
          <cc>bugdal</cc>
    
    <cc>scot4spam</cc>
    
    <cc>siddhesh</cc>
    
    <cc>triegel</cc>
          <cf_gcchost></cf_gcchost>
          <cf_gcctarget></cf_gcctarget>
          <cf_gccbuild></cf_gccbuild>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>50510</commentid>
    <comment_count>0</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2011-09-07 19:14:55 +0000</bug_when>
    <thetext>I was implementing something like a monitor on top of pthread condition variables and I observed some strange behaviour. I was always holding the mutex when calling pthread_cond_signal(). My code relied on only two assumptions about the way pthread_cond_signal() works:

1) A call to pthread_cond_signal() will wake at least one thread which is blocked on the condition, and the woken threads will start waiting on the mutex.

2) If the signaling thread holds the mutex when it calls pthread_cond_signal(), only threads which are already waiting on the condition variable may be woken. In particular, if the signaling thread releases the mutex and then another thread acquires the mutex and calls pthread_cond_wait(), the waiting thread cannot be woken by this signal, no matter what other waiters are present before or after the signal.

The only explanation that I could find for the observed behaviour was that my second assumption was wrong. It seemed that I was hitting the following scenario:

1) We have several threads which are blocked on the condvar in pthread_cond_wait(). I&apos;ll call these threads &quot;group A&quot;.

2) We then send N signals from another thread while holding the mutex. We are releasing the mutex and acquiring it again between the signals.

3) Next we have several more threads (at least two) that acquire the mutex and enter pthread_cond_wait(). I&apos;ll call these threads &quot;group B&quot;

4) Then we acquire the mutex in the signaling thread again and call pthread_cond_signal() just once, then we release the mutex.

5) Two threads from group B wake up, and N-1 threads from group A wake up. In effect, one of the threads from group B has stolen, from a thread in group A, a signal that was sent before the group B thread started waiting.

My expectation in this scenario is that at least N threads from group A should wake up. I don&apos;t expect that exactly one thread from group B should wake up, because spurious wakeups are possible. But this is not a spurious wakeup - I have N signals, and N woken threads, it&apos;s just that the order is wrong.

I ran some experiments and they seemed to confirm my theory, so I looked at the condvar implementation in nptl. I&apos;m new to POSIX and Linux programming, but I think I see how this can happen:

1) When we send the first N signals, N threads from group A that are waiting on the cond-&gt;__data.__futex are woken and start waiting on cond-&gt;__data.__lock.

2) Then while the threads from group B enter pthread_cond_wait, some of the woken threads from group A may remain waiting on the lock.

3) When we send the last signal, one thread from group B will wake and consume this signal.

4) But suppose one more thread from group B wakes spuriously from lll_futex_wait. At this moment it is possible that some of the woken threads from group A will still be waiting on cond-&gt;__data.__lock. In that case the spuriously woken thread from group B will see that cond-&gt;__data.__wakeup_seq has changed (because of the last signal) and that cond-&gt;__data.__woken_seq has not reached cond-&gt;__data.__wakeup_seq (because some of the woken threads in group A are still waiting to acquire cond-&gt;__data.__lock), so it will exit the retry loop and increase cond-&gt;__data.__woken_seq. The result is that the thread will steal the signal.
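
The four steps above can be replayed in a minimal, single-threaded sketch. It models only the three counters named in this report and the two checks of the retry loop; it is not the glibc code. The struct and all function names (cv_model, model_wait_begin, model_signal, model_try_leave, model_signal_is_stolen) are hypothetical, and the futex and the internal lock are reduced to the order in which the calls are made.

<![CDATA[
```c
#include <stdbool.h>

/* Hypothetical model of the three counters named in the report. */
struct cv_model {
    unsigned long total_seq;   /* waiters that have started waiting */
    unsigned long wakeup_seq;  /* signals recorded for those waiters */
    unsigned long woken_seq;   /* waiters that have consumed a signal */
};

/* A waiter arrives: it is counted and remembers wakeup_seq at entry. */
static unsigned long model_wait_begin(struct cv_model *cv)
{
    cv->total_seq++;
    return cv->wakeup_seq;
}

/* A signal is recorded only if some waiter is not yet covered by one. */
static void model_signal(struct cv_model *cv)
{
    if (cv->total_seq > cv->wakeup_seq)
        cv->wakeup_seq++;
}

/* The check a waiter runs after returning from the futex wait.
 * Returns true (and consumes a signal) if the waiter may leave the loop,
 * false if it must go back to waiting. */
static bool model_try_leave(struct cv_model *cv, unsigned long seq_at_entry)
{
    if (cv->wakeup_seq == seq_at_entry)    /* no signal since we arrived */
        return false;
    if (cv->woken_seq == cv->wakeup_seq)   /* all signals already consumed */
        return false;
    cv->woken_seq++;
    return true;
}

/* Replays the scenario with one group A waiter and two group B waiters.
 * Returns true if the group A waiter is the one forced to retry. */
static bool model_signal_is_stolen(void)
{
    struct cv_model cv = {0, 0, 0};

    unsigned long a1 = model_wait_begin(&cv);  /* group A starts waiting */
    model_signal(&cv);                         /* signal sent for group A */
    /* a1 is woken but has not yet run its check (still waiting on the lock) */

    unsigned long b1 = model_wait_begin(&cv);  /* group B starts waiting */
    unsigned long b2 = model_wait_begin(&cv);
    model_signal(&cv);                         /* the single last signal */

    bool b1_left = model_try_leave(&cv, b1);   /* woken by the last signal */
    bool b2_left = model_try_leave(&cv, b2);   /* spurious futex return */
    bool a1_left = model_try_leave(&cv, a1);   /* finally runs its check */

    return b1_left && b2_left && !a1_left;
}
```
]]>

Under this model model_signal_is_stolen() returns true: both group B waiters pass the checks, and by the time the group A waiter runs its check the two counters are equal, so it goes back to waiting.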

Is this scenario really possible? And if it is, is this on purpose or is it a bug?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50705</commentid>
    <comment_count>1</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2011-09-21 09:12:26 +0000</bug_when>
    <thetext>It&apos;s been two weeks with no update on the bug. There is no way to tell if someone has noticed it at all.

Is it possible to have at least some statement as to whether this is considered a valid bug that might be addressed at some point in the future?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50717</commentid>
    <comment_count>2</comment_count>
    <who name="Rich Felker">bugdal</who>
    <bug_when>2011-09-21 18:18:43 +0000</bug_when>
    <thetext>I&apos;ve waited months for a response to some of my bug reports for NPTL, all race conditions, some of which make it impossible to use major features like cancellation in a robust program. Nothing. Good luck, but don&apos;t hold your breath. The best thing I can recommend to you is cross-posting the bug report to distros&apos; bug trackers, and getting other people interested in these issues to follow up on the bug tracker with confirmations that they can reproduce it.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50724</commentid>
    <comment_count>3</comment_count>
    <who name="Rich Felker">bugdal</who>
    <bug_when>2011-09-21 22:29:21 +0000</bug_when>
    <thetext>Can you explain how you know (2) is completed before (3) occurs, in your scenario? If there&apos;s no synchronization to order these steps, then isn&apos;t it possible that one or more of the signals happens after a thread from group B is waiting?

If you have a minimal self-contained test case for the issue, I&apos;d be interested in seeing it.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50740</commentid>
    <comment_count>4</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2011-09-22 22:21:10 +0000</bug_when>
    <thetext>Thank you for taking an interest in this issue.

(In reply to comment #3)
&gt; Can you explain how you know (2) is completed before (3) occurs, in your
&gt; scenario? If there&apos;s no synchronization to order these steps, then isn&apos;t it
&gt; possible that one or more of the signals happens after a thread from group B is
&gt; waiting?

Basically, because we are holding the mutex when signaling, we can tell exactly which threads started waiting after we finished sending the first N signals. These are the threads that I call &quot;group B&quot;, so by this definition they cannot start waiting before all signals from step 2 have been sent.

What I&apos;m trying to say is that the scenario is not a test case, but rather a hypothetical sequence of events that can happen and can be observed, so it doesn&apos;t specify why exactly no new threads started waiting during step 2; it just says what happened. This left some ambiguity in my description.

One way to resolve this ambiguity is to say that if during step 2 some threads acquired the mutex and called pthread_cond_wait(), they should be counted towards group A.

Another way is to change step 2 and say that the signaling thread acquired the mutex, sent N signals and only then released the mutex, without releasing it between the signals.

The second way seems simpler and will probably make the race more likely, but the first is closer to what I actually observed.
 
&gt; If you have a minimal self-contained test case for the issue, I&apos;d be interested
&gt; in seeing it.

I don&apos;t have such a test case, but I&apos;ll try to find time in the next days to write one and attach it to the bug.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50759</commentid>
    <comment_count>5</comment_count>
      <attachid>5945</attachid>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2011-09-25 21:32:35 +0000</bug_when>
    <thetext>Created attachment 5945
Test to observe the race

Attaching a self contained test. What the test does:

We have a mutex and a condition variable. We also have several auxiliary condition variables and counters.

The main thread locks the mutex and creates as many waiter threads as possible. The waiter threads start by waiting on the mutex. Then both the main thread and the waiter thread start looping to perform iterations of the test until the race condition in NPTL is hit.

The loops of the main thread and the waiter threads are synchronized and go like this:

1) The main thread starts by releasing the mutex and blocking on an auxiliary condvar. This unblocks the waiter threads which start by entering the first wait on the condition variable. Each waiter thread increments a waiters counter before waiting and the last one also signals the auxiliary condvar to notify the main thread that all waiters are blocked on the first wait.

2) When all waiters are blocked on the first wait, the main thread is unblocked and starts sending signals. It sends as many signals as there are waiters, so all waiters should move (eventually) beyond the first wait. The main thread holds the mutex while sending the signals. The &apos;releaseMutexBetweenSignals&apos; constant controls whether it will release and reacquire the mutex between signals.

3) Each unblocked waiter decrements the waiters counter and moves to the second wait. To simplify the test, the waiters don&apos;t enter the second wait until all signals from step 2 have been sent. This is controlled through a sent signals counter and another auxiliary condvar.

4) After the main thread has sent all signals, it starts waiting for at least two waiters to block on the second wait. This is facilitated by a counter of the threads that have reached the second wait and one more auxiliary condvar.

5) When at least two threads have blocked on the second wait, the main thread sends one more signal. Threads that get unblocked from the second wait may start a third wait to allow the test iteration to complete before they loop back to the first wait (of course this actually happens when the main thread releases the mutex in step 6)

6) The main thread starts waiting for all waiters to exit the first wait. Each waiter that exits the first wait decrements the waiters count and the last one signals the last auxiliary condvar that the main thread waits on. If the wait times out, the test has failed, otherwise it has passed.

7) If the test has passed, all waiters are waiting on the condition variable in the second wait or the third wait, so the main thread sends a broadcast to unblock them and all waiters move back to the first wait. With this the test iteration is complete and a new iteration begins.


The main point about this test is that at the point where the main thread sends the single signal, all waiters should be:

1) either waiting on the mutex in the first wait,
2) or waiting on the condition variable in the second wait,
3) or waiting on the mutex in the wait on the auxiliary condvar from step 3

which means that if the mutex gets released for long enough, all threads should acquire the mutex in the first wait and the waiters count should eventually reach zero. Step 6 is meant to provide this time. At step 6, the main thread releases the mutex and starts waiting, and all waiters that acquire the mutex release it almost immediately and start waiting themselves, so there is nothing to prevent the threads from group (1) above from acquiring the mutex one by one and bringing the waiters counter back to 0. The only thing that can get in the way is if there is a waiter which is still blocked on the condition variable in the first wait, which is what the test aims to trigger and detect.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50760</commentid>
    <comment_count>6</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2011-09-25 21:43:17 +0000</bug_when>
    <thetext>(In reply to comment #3)
&gt; If you have a minimal self-contained test case for the issue, I&apos;d be interested
&gt; in seeing it.

I wrote a self-contained test case. It is not very simple, and may seem contrived. My real code is actually much simpler (just one condition variable), but it is not suitable as a test.

I ran the test on my dual-core laptop and had consistent results of one stolen signal. I guess that with more cores it will be possible for more signals to be stolen. Typically it took from 3 to 40 iterations to hit the race.

I would appreciate your feedback.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50763</commentid>
    <comment_count>7</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2011-09-26 09:13:37 +0000</bug_when>
    <thetext>I ran the test on my workstation too. It took much longer to hit the race. The difference is that on my workstation a process can start over 30000 threads as opposed to about 400 on my laptop. Limiting the number of created threads to several hundred made the race much easier to hit.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50766</commentid>
    <comment_count>8</comment_count>
    <who name="Rich Felker">bugdal</who>
    <bug_when>2011-09-26 16:19:50 +0000</bug_when>
    <thetext>Could you simplify it? I&apos;m not sure why the test has to be so much more complicated than your real code. If you think there&apos;s a race/bug in cond vars, then it&apos;s probably not a good idea to be using cond vars in the logic for the test loop, since it will be hard to tell if the bug is occurring in the tested code or the control code. Perhaps you could use barriers or semaphores to synchronize the test control logic, if needed. Personally barriers are my favorite synchronization primitive for that sort of thing.

Also, yes, get rid of the &quot;make as many threads as possible&quot; logic. That will not make it any easier to find the race, and the limit to how many threads you can make is usually dependent on available memory/virtual address space, not kernel thread resources. Just pick a &quot;sane&quot; number (probably below 100) and go with it.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50768</commentid>
    <comment_count>9</comment_count>
      <attachid>5946</attachid>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2011-09-27 10:10:23 +0000</bug_when>
    <thetext>Created attachment 5946
Simpler test to observe the race

This is a much simpler test case that demonstrates the race. The logic is pretty much the same as with the previous test but it doesn&apos;t use any auxiliary condition variables.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50769</commentid>
    <comment_count>10</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2011-09-27 10:13:25 +0000</bug_when>
    <thetext>(In reply to comment #8)
&gt; Could you simplify it? I&apos;m not sure why the test has to be so much more
&gt; complicated than your real code. If you think there&apos;s a race/bug in cond vars,
&gt; then it&apos;s probably not a good idea to be using cond vars in the logic for the
&gt; test loop, since it will be hard to tell if the bug is occurring in the tested
&gt; code or the control code. Perhaps you could use barriers or semaphores to
&gt; synchronize the test control logic, if needed. Personally barriers are my
&gt; favorite synchronization primitive for that sort of thing.
&gt; 
&gt; Also, yes, get rid of the &quot;make as many threads as possible&quot; logic. That will
&gt; not make it any easier to find the race, and the limit to how many threads you
&gt; can make is usually dependent on available memory/virtual address space, not
&gt; kernel thread resources. Just pick a &quot;sane&quot; number (probably below 100) and go
&gt; with it.

I simplified the test significantly as you suggested. Now it&apos;s somewhat simpler than my actual code.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50780</commentid>
    <comment_count>11</comment_count>
    <who name="Rich Felker">bugdal</who>
    <bug_when>2011-09-28 02:06:54 +0000</bug_when>
    <thetext>I&apos;ve confirmed that the issue occurs on my Debian system with their libc6 package (eglibc 2.13-10, albeit slightly different from glibc). I&apos;ve also confirmed that the issue does not occur with my implementation of condition variables in musl libc(*). I suspect it&apos;s a real bug, but I need to read the code more closely to understand what&apos;s going on...

(*) I&apos;ve converted the test to C (just replaced cout with printf) so I can run it without C++ support. I&apos;m attaching the modified version.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50781</commentid>
    <comment_count>12</comment_count>
      <attachid>5947</attachid>
    <who name="Rich Felker">bugdal</who>
    <bug_when>2011-09-28 02:08:00 +0000</bug_when>
    <thetext>Created attachment 5947
Simpler test, converted to pure C</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50785</commentid>
    <comment_count>13</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2011-09-28 09:02:51 +0000</bug_when>
    <thetext>(In reply to comment #11)
&gt; I&apos;ve confirmed that the issue occurs on my Debian system with their libc6
&gt; package (eglibc 2.13-10, albeit slightly different from glibc).

I originally observed the problem on Debian stable. I&apos;ve run my test case on my laptop, which is running Mint, and on my office workstation, which is running Kubuntu.

I looked at the eglibc source code before posting the bug and saw that the code which causes the race is identical to the one in glibc, so the bug is in both implementations.

&gt; I&apos;ve also confirmed that the issue does not occur with my implementation of
&gt; condition variables in musl libc(*).

I took a look at your code. As far as I can tell, you are not trying to avoid spurious wakeups as hard as glibc, that&apos;s why you don&apos;t have the same race.

&gt; I suspect it&apos;s a real bug, but I need to read the code more closely to
&gt; understand what&apos;s going on...

Here is my understanding of the root cause - an attempt to prevent spurious wakeups that has gone too far and destroys ordering - waking future waiters instead of present ones.

There are two checks that NPTL uses to prevent spurious wakeups:

1) It only allows a thread to wake if a signal has been sent after it started waiting. This is achieved by checking whether cond-&gt;__data.__wakeup_seq has changed since the thread started waiting; if it is unchanged, the wakeup is treated as spurious.

2) It only allows as many threads to wake up as there were signals. This is achieved by checking whether cond-&gt;__data.__woken_seq equals cond-&gt;__data.__wakeup_seq; if they are equal, every signal has already been consumed.

If either of these checks indicates a spurious wakeup, the thread retries the wait.

The problem is in check 2, because the guard is triggered if any thread has woken spuriously - not just the current thread. Worse - it is triggered only after the spuriously woken thread consumed a signal. So in many cases the spuriously woken thread consumes the signal, and a validly woken thread is forced to retry. The result is that a spurious wakeup may steal signals that were sent before it started waiting.

Now, I&apos;m confident that the race is real. But maybe some people would disagree that it is a bug. That&apos;s why I asked in my original message if this behaviour is intentional or a bug.

It is a bug if pthread condition variables should support the following usage: 

   ...

   pthread_mutex_lock(&amp;m);

   SomeType localState = f(sharedState);

   while ( predicate(sharedState, localState) ) {
      pthread_cond_wait(&amp;c, &amp;m);
   }

   ...

In this case it actually matters which thread will wake up, because if the wrong thread wakes up, it will retry the wait and the signal will be lost (this is what happened to me). Unfortunately the spec is not very clear on the issue. But this is the pattern that the pthread_cond_wait implementation in glibc itself uses to detect spurious wakeups on the futex.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50791</commentid>
    <comment_count>14</comment_count>
    <who name="Rich Felker">bugdal</who>
    <bug_when>2011-09-28 16:06:21 +0000</bug_when>
    <thetext>I think it&apos;s clear that the usage you describe needs to be supported. Nowhere does the standard state that predicate must be a pure function of the shared (mutex-protected) state, or that it even need to depend on the shared state whatsoever. For example, this is a valid albeit potentially stupid use of cond variables:

Thread 1:

Lock mutex.
Create thread 2.
Wait on cond var.
Unlock mutex.
Continue with other work.

Thread 2:

Lock mutex.
Signal cond var.
Unlock mutex.
Continue with other work.

There is no guarantee that thread 1 will actually block until thread 2 starts and signals (it could have a spurious wakeup), but there is a guarantee that the signal will cause a wakeup if one has not already occurred. Naturally this example is not analogous to your test (there&apos;s nobody else using the cond var), but the point is that it&apos;s valid to use cond vars even without predicates. If you read the language of the standard, which is in terms of threads currently blocked rather than predicates, it&apos;s clear.
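
A minimal sketch of the two threads described above, assuming default mutex and cond var attributes. The names lock, cond, thread2 and run_thread1 are hypothetical. Because thread 1 holds the mutex until pthread_cond_wait releases it, thread 2 cannot signal before thread 1 has started waiting.

<![CDATA[
```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

/* Thread 2: lock the mutex, signal the cond var, unlock the mutex. */
static void *thread2(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Thread 1: lock the mutex, create thread 2, wait once, unlock.
 * Returns 0 on success, nonzero if a pthread call fails. */
static int run_thread1(void)
{
    pthread_t t2;

    pthread_mutex_lock(&lock);
    if (pthread_create(&t2, NULL, thread2, NULL) != 0) {
        pthread_mutex_unlock(&lock);
        return -1;
    }

    /* Deliberately no predicate loop: a spurious wakeup may end the wait
     * early, but the signal from thread 2 must end it if nothing else has. */
    pthread_cond_wait(&cond, &lock);
    pthread_mutex_unlock(&lock);

    /* Joining is not part of the scenario; it only keeps the sketch tidy. */
    return pthread_join(t2, NULL);
}
```
]]>

A caller would simply invoke run_thread1() and then continue with other work.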

By the way, assuming spurious wakeups are rare, my above &quot;stupid&quot; use of cond vars may actually be an efficient way to roughly synchronize typical start times of parallel calculations, perhaps useful in benchmarks or in tweaking cache utilization for a particular machine. Barriers would probably be more appropriate, however.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>50802</commentid>
    <comment_count>15</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2011-09-28 20:59:15 +0000</bug_when>
    <thetext>Well, you don&apos;t need to convince me. I think this is a bug too. But based on how Ulrich Drepper responded to Bug 12875, he might claim that this is not a bug either. Bug 12875 is almost certainly a manifestation of the same race.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57484</commentid>
    <comment_count>16</comment_count>
    <who name="Torvald Riegel">triegel</who>
    <bug_when>2012-09-19 15:15:02 +0000</bug_when>
    <thetext>I&apos;m not aware of any requirement that pthread_cond_signal should block until a waiter has actually woken up. (Your test case relies on it to not block, so that it can send out multiple signals while holding the mutex, right?)  I&apos;m also not aware of any ordering requirement wrt. waiters (i.e., fairness).  If you combine both, you will see that the behavior you observe is a valid execution.

(In reply to comment #5)
&gt; 4) After the main thread has sent all signals, it starts waiting for at least
&gt; two waiters to block on the second wait. This is facilitated by a counter of
&gt; the threads that have reached the second wait and one more auxiliary cond var.

And here you do block for waiters to have consumed a signal (i.e., for a call to pthread_cond_signal to have finished its work and delivered a signal), but you do this just for two of the signals / calls.

If you do not want to wait for all signals to be delivered yet still need a fair cond var implementation, I suggest either (1) building your own (e.g., you could build something like a simple queue lock based on pthread mutexes or cond vars) or (2) proposing the addition of a fair cond var to glibc or to the respective standards bodies.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57485</commentid>
    <comment_count>17</comment_count>
    <who name="Torvald Riegel">triegel</who>
    <bug_when>2012-09-19 15:20:45 +0000</bug_when>
    <thetext>(In reply to comment #13)
&gt; It is a bug if pthread condition variables should support the following usage: 
&gt; 
&gt;    ...
&gt; 
&gt;    pthread_mutex_lock(&amp;m);
&gt; 
&gt;    SomeType localState = f(sharedState);
&gt; 
&gt;    while ( predicate(sharedState, localState) ) {
&gt;       pthread_cond_wait(&amp;c, &amp;m);
&gt;    }
&gt; 
&gt;    ...
&gt; 
&gt; In this case it actually matters which thread will wake up, because if the
&gt; wrong thread wakes up, it will retry the wait and the signal will be lost (this
&gt; is what happened to me). Unfortunately the spec is not very clear on the issue.

If the spec doesn&apos;t guarantee something, it&apos;s usually best to not depend on this.

&gt; But this is the pattern that the pthread_cond_wait implementation in glibc
&gt; itself uses to detect spurious wakeups on the futex.

Not quite.  For one, pthread_cond_wait doesn&apos;t try to establish the ordering that you seem to depend on in the piece of code above.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57488</commentid>
    <comment_count>18</comment_count>
    <who name="Rich Felker">bugdal</who>
    <bug_when>2012-09-19 17:23:20 +0000</bug_when>
    <thetext>&gt; I&apos;m not aware of any requirement that pthread_cond_signal should block until a
&gt; waiter has actually woken up. (Your test case relies on it to not block, so

Do you mean &quot;should block&quot; or &quot;should not block&quot;? POSIX&apos;s definition of threads has general language that requires that forward progress be possible, and blocking in pthread_cond_signal until a waiter actually wakes up (including acquiring the mutex) would be non-conformant; in fact it would be a guaranteed deadlock.

&gt; that it can send out multiple signals while holding the mutex, right?)  I&apos;m

It is definitely required that pthread_cond_signal can be called more than once.

&gt; also not aware of any ordering requirement wrt. waiters (i.e., fairness).  If
&gt; you combine both, you will see that the behavior you observe is a valid
&gt; execution.

Nobody has asked for glibc to satisfy any fairness constraint. The standard says it shall unblock at least one thread that &quot;has blocked&quot; on the condition variable, not that it can randomly fail to do so or instead send the unblocking event into the future to be consumed by another thread that has not yet blocked but is about to block (after the signaling thread unlocks the mutex). The claim in this bug report is that glibc is randomly failing (race condition) to unblock ANY of the threads that have blocked. A viable mechanism of how this failure is occurring has also been proposed.

You are free to test this hypothesis and claim that this is not what&apos;s happening, or provide a rigorous proof that it couldn&apos;t happen. But what you&apos;ve been doing so far is mischaracterizing the bug report and my comments as a complaint about scheduling fairness, which they are not, and this is getting really frustrating...</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57494</commentid>
    <comment_count>19</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2012-09-20 10:21:39 +0000</bug_when>
    <thetext>(In reply to comment #16)
Sorry for the long reply. Please bear with me, because this issue is very subtle and I don&apos;t know how to explain it more succinctly.

First of all, let me clarify that this is a test that exposes the race, and not the usage scenario that I claim should be supported. The usage scenario is described in the bug description. Well, actually, I do claim that the scenario in the test should be supported too, but the scenario in the description makes more sense.

&gt; I&apos;m not aware of any requirement that pthread_cond_signal should block until a
&gt; waiter has actually woken up. (Your test case relies on it to not block, so
&gt; that it can send out multiple signals while holding the mutex, right?)  I&apos;m
&gt; also not aware of any ordering requirement wrt. waiters (i.e., fairness).  If
&gt; you combine both, you will see that the behavior you observe is a valid
&gt; execution.

I&apos;m not making any assumptions about the state of the waiters when pthread_cond_signal returns. All I&apos;m assuming is that, no matter if the signaling thread releases and reacquires the mutex after each sent signal or sends all signals without releasing the mutex, at least as many waiters as the number of signals will wake (eventually).

But even if this assumption is wrong (and it&apos;s not), if you set releaseMutexBetweenSignals to true, the test will release the mutex after each sent signal. In this case the test doesn&apos;t send multiple signals while holding the mutex, and the problem still occurs.

As for fairness, this is not about fairness. It is also not about ordering between the waiters. It&apos;s about ordering between waiters and signalers.

I&apos;m getting tired of people jumping to fairness at the first mention of ordering. You could say that I&apos;m requesting fairness if I wanted the first single signal to wake the waiter that blocked first. But all I&apos;m requesting is for the signal to wake at least one of the waiters that started waiting before the signal was sent. I don&apos;t care which one of them.

This is guaranteed by the standard (from the documentation of pthread_cond_wait and pthread_cond_signal on the opengroup site):

&quot;The pthread_cond_signal() function shall unblock at least one of the threads that are blocked on the specified condition variable cond (if any threads are blocked on cond).&quot;

And I think the next quote makes it very clear what threads are considered to be blocked on the condvar at the time of the call to pthread_cond_signal():

&quot;That is, if another thread is able to acquire the mutex after the about-to-block thread has released it, then a subsequent call to pthread_cond_broadcast() or pthread_cond_signal() in that thread shall behave as if it were issued after the about-to-block thread has blocked.&quot;

In effect this means that each call to pthread_cond_signal() defines a point in time and all waiters (or calls to pthread_cond_wait() if you prefer) are either before this call, or after it. And only the ones that are before it are allowed to consume the signal sent by this call.
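
To make this concrete, here is a minimal sketch of the usage pattern from comment #13, with a counter as the shared state and a snapshot of it as the local state. All names (signals_sent, blocked, waiter, signal_one, run_one_waiter) are hypothetical. Each waiter only accepts a signal that was sent after it took its snapshot, so a waiter that arrives after a signal and is woken by it will go back to waiting, and that signal is lost for the waiters it was meant for.

<![CDATA[
```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t c = PTHREAD_COND_INITIALIZER;
static unsigned long signals_sent;  /* shared state, protected by m */
static int blocked;                 /* waiters inside the wait loop, protected by m */

/* Waiter: snapshot the shared state, then wait until it changes. */
static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&m);
    unsigned long seen = signals_sent;  /* local state taken on arrival */
    blocked++;
    while (signals_sent == seen)        /* only a later signal ends the wait */
        pthread_cond_wait(&c, &m);
    blocked--;
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Signaler: update the shared state and signal while holding the mutex. */
static void signal_one(void)
{
    pthread_mutex_lock(&m);
    signals_sent++;
    pthread_cond_signal(&c);
    pthread_mutex_unlock(&m);
}

/* One waiter, one signal. Returns 0 once the waiter has been released. */
static int run_one_waiter(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, waiter, NULL) != 0)
        return -1;

    /* Poll until the waiter has taken its snapshot. Seeing blocked == 1
     * while holding m means the waiter released m inside pthread_cond_wait. */
    for (;;) {
        pthread_mutex_lock(&m);
        int ready = blocked;
        pthread_mutex_unlock(&m);
        if (ready == 1)
            break;
        sched_yield();
    }

    signal_one();
    return pthread_join(t, NULL);
}
```
]]>

With a single waiter the signal cannot go to the wrong thread, so run_one_waiter() completes. The race in this report needs additional waiters that arrive after the signal.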

Now, of course in a multiprocessor system it is hard to order events in time, but that&apos;s where the mutex comes in. And if the signaling thread sends multiple signals while holding the mutex, we can consider all these signals to be simultaneous. But that doesn&apos;t change the validity of the test.

On the other hand, the standard doesn&apos;t guarantee that there won&apos;t be spurious wakeups. However, glibc tries to prevent them. But the logic for this prevention is flawed and causes the race that this bug is about.

So the net result is that glibc chose to provide a feature that is not required, but dropped a much more important feature which is actually required. Hence, this bug is not a fairness feature request, it is a correctness defect report.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57495</commentid>
    <comment_count>20</comment_count>
    <who name="Torvald Riegel">triegel</who>
    <bug_when>2012-09-20 10:43:00 +0000</bug_when>
    <thetext>(In reply to comment #18)
&gt; &gt; I&apos;m not aware of any requirement that pthread_cond_signal should block until a
&gt; &gt; waiter has actually woken up. (Your test case relies on it to not block, so
&gt; 
&gt; Do you mean &quot;should block&quot; or &quot;should not block&quot;?

&quot;Block&quot;, as I wrote.

&gt; POSIX&apos;s definition of threads
&gt; has general language that requires that forward progress be possible, and
&gt; blocking in pthread_cond_signal until a waiter actually wakes up (including
&gt; acquiring the mutex) would be non-conformant; in fact it would be a guaranteed
&gt; deadlock.

That&apos;s what I&apos;m pointing out.  Glad you agree on this point.

&gt; 
&gt; &gt; that it can send out multiple signals while holding the mutex, right?)  I&apos;m
&gt; 
&gt; It is definitely required that pthread_cond_signal can be called more than
&gt; once.
&gt; 
&gt; &gt; also not aware of any ordering requirement wrt. waiters (i.e., fairness).  If
&gt; &gt; you combine both, you will see that the behavior you observe is a valid
&gt; &gt; execution.
&gt; 
&gt; Nobody has asked for glibc to satisfy any fairness constraint.

Glad you agree on that too. Now please just think about the combination of both for a minute.

&gt; The standard
&gt; says it shall unblock at least one thread that &quot;has blocked&quot; on the condition
&gt; variable, not that it can randomly fail to do so or instead send the unblocking
&gt; event into the future to be consumed by another thread that has not yet blocked
&gt; but is about to block (after the signaling thread unlocks the mutex).

You assume that only some threads can count as blocked, which is not guaranteed by the standard. Signalers are not required to have finished delivering the signal when they return.  Thus, the scope of which threads are blocked can be longer than you assume.  Combined with no fairness guarantee this exactly allows the behavior that we&apos;re talking about here.

The standard indeed doesn&apos;t talk about the &quot;future&quot;.  It does make a sort of lower-bound requirement on which threads have to be considered blocked, but sets no upper bound.  If you think there&apos;s an upper bound, please point to the requirement in the standard.  If there is no required upper bound, it&apos;s up to the implementation how to deal with that.

&gt; The claim
&gt; in this bug report is that glibc is randomly failing (race condition) to
&gt; unblock ANY of the threads that have blocked.

No.  The claim is that the &quot;wrong&quot; threads have been unblocked, where &quot;wrong&quot; is based on an assumption on ordering or an upper-bound-on-blocked-threads guarantee that is not required by the standard.

&gt; A viable mechanism of how this
&gt; failure is occurring has also been proposed.
&gt; 
&gt; You are free to test this hypothesis and claim that this is not what&apos;s
&gt; happening, or provide a rigorous proof that it couldn&apos;t happen.

I have argued that this is allowed behavior, because there is no requirement that conflicts with it, and pointed out why.  You haven&apos;t done that; just because you think the standard should have a certain requirement doesn&apos;t mean it actually has one.

&gt; But what you&apos;ve
&gt; been doing so far is mischaracterizing the bug report and my comments as a
&gt; complaint about scheduling fairness, which they are not, and this is getting
&gt; really frustrating...

I&apos;ve never talked about &quot;scheduling fairness&quot; (assuming you mean the OS scheduler here), just about fairness regarding which thread gets wakened. So much for mischaracterization, eh?

Sorry to hear that you feel frustrated, but I&apos;d like to point out that I don&apos;t think that I&apos;m the cause for this.

Bottom line: In my opinion, this is not a bug.  However, it might be good to explain why this behavior is allowed (e.g., somewhere in the docs or on the wiki), so that this doesn&apos;t surprise other users.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57496</commentid>
    <comment_count>21</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2012-09-20 11:05:21 +0000</bug_when>
    <thetext>(In reply to comment #20)
&gt; The standard indeed doesn&apos;t talk about the &quot;future&quot;.  It does make a sort of
&gt; lower-bound requirement on which threads have to be considered blocked, but sets no
&gt; upper bound.  If you think there&apos;s an upper bound, please point to the requirement
&gt; in the standard.  If there is no required upper bound, it&apos;s up to the
&gt; implementation how to deal with that.

&quot;The pthread_cond_broadcast() and pthread_cond_signal() functions shall have no effect if there are no threads currently blocked on cond.&quot;

How about this as an upper bound? If implementations are allowed to determine the set of blocked threads at any point in time they see fit, there would be no way to define &quot;currently blocked&quot; at all and this sentence couldn&apos;t make any sense.

And also:

&quot;.... however, if predictable scheduling behavior is required, then that mutex shall be locked by the thread calling pthread_cond_broadcast() or pthread_cond_signal().&quot;

If I accept your argument, there will be no way to determine at least a set of threads from which the woken thread will be chosen, so why does the standard talk about predictability?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57497</commentid>
    <comment_count>22</comment_count>
    <who name="Torvald Riegel">triegel</who>
    <bug_when>2012-09-20 11:22:45 +0000</bug_when>
    <thetext>(In reply to comment #19)
&gt; (In reply to comment #16)
&gt; Sorry for the long reply. Please, bear with me, because this issue is very
&gt; subtle and I don&apos;t know how to explain it more succinctly.

No worries.  If it were straightforward, we wouldn&apos;t be talking about it here :)

&gt; 
&gt; First of all, let me clarify that this is a test that exposes the race, and not
&gt; the usage scenario that I claim should be supported. The usage scenario is
&gt; described in the bug description. Well, actually, I do claim that the scenario
&gt; in the test should be supported too, but the scenario in the description makes
&gt; more sense.

I think your test is good and helps to point out the issue (including, actually, why it&apos;s not a bug).  In your comment #1, I agree on assumption 1), but not on assumption 2). The latter (2) is not required by the standard.

&gt; &gt; I&apos;m not aware of any requirement that pthread_cond_signal should block until a
&gt; &gt; waiter has actually woken up. (Your test case relies on it to not block, so
&gt; &gt; that it can send out multiple signals while holding the mutex, right?)  I&apos;m
&gt; &gt; also not aware of any ordering requirement wrt. waiters (i.e., fairness).  If
&gt; &gt; you combine both, you will see that the behavior you observe is a valid
&gt; &gt; execution.
&gt; 
&gt; I&apos;m not making any assumptions about the state of the waiters when
&gt; pthread_cond_signal returns.

You do assume that only those will be associated with this particular signal call.  We can argue about whether this is part of the state of the waiters or not, but I hope you&apos;ll agree that it&apos;s at least a property that the waiters are part of.

&gt; All I&apos;m assuming is that, no matter if the
&gt; signaling thread releases and reacquires the mutex after each sent signal or
&gt; sends all signals without releasing the mutex, at least as many waiters as the
&gt; number of signals will wake (eventually).

This assumption is correct, but you assume more: that only the threads that started waiting before cond_signal returns will be considered blocked.

So, if you have waiters W and signal S, and an execution of (W1-&gt;W2-&gt;S-&gt;W3), then the standard requires that W1 and W2 need to be considered as blocked threads by S.  It does not require that W3 is _not_ considered as a blocked thread.

Informally, S is required to be ordered after W1 and W2, but it is unspecified whether it is ordered before W3. S does not block; it is like starting an asynchronous operation that will eventually deliver a signal.  S can return earlier, and the earlier return can happen before W3, but that doesn&apos;t mean anything for the delivery of the signal.
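
A minimal sketch of the two readings, with hypothetical names (must_count_as_blocked, may_be_woken) and the execution encoded as a plain list of events. It only models the sets under discussion, not any real implementation:

```python
def must_count_as_blocked(events, signal):
    """Waiters ordered before the signal: the lower bound on the blocked set."""
    return set(events[:events.index(signal)])

def may_be_woken(events, signal, upper_bound):
    """Candidate waiters for the signal under each reading."""
    if upper_bound:
        # Reading with an upper bound: only earlier waiters are eligible.
        return must_count_as_blocked(events, signal)
    # Reading without an upper bound: later waiters may be eligible too.
    return {e for e in events if e != signal}

events = ["W1", "W2", "S", "W3"]
print(sorted(must_count_as_blocked(events, "S")))
print(sorted(may_be_woken(events, "S", upper_bound=True)))
print(sorted(may_be_woken(events, "S", upper_bound=False)))
```

The only difference between the two readings is whether W3 appears in the candidate set for S.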

&gt; But even if this assumption is wrong (and it&apos;s not), if you set
&gt; releaseMutexBetweenSignals to true, the test will release the mutex after each
&gt; sent signal. In this case the test doesn&apos;t send multiple signals while holding
&gt; the mutex, and the problem still occurs.
&gt; 
&gt; As for fairness, this is not about fairness. It is also not about ordering
&gt; between the waiters. It&apos;s about ordering between waiters and signalers.

The ordering between waiters and signalers is the first point, which I illustrated with &quot;signalers don&apos;t block for waiters&quot;.  After that, we have the problem of which of the blocked threads the signal chooses to wake up, so this becomes an ordering problem (i.e., a selection problem if applied continuously). On an abstract level, this is a typical fairness problem.

As I pointed out, it&apos;s the combination of both these issues.

&gt; I&apos;m getting tired of people jumping to fairness at the first mention of
&gt; ordering.

If you&apos;re tired, get some sleep before suggesting that the participants of this discussion don&apos;t understand ordering.

&gt; You could say that I&apos;m requesting fairness if I wanted the first
&gt; single signal to wake the waiter that blocked first. But all I&apos;m requesting is
&gt; for the signal to wake at least one of the waiters that started waiting before
&gt; the signal was sent. I don&apos;t care which one of them.

That&apos;s the first wrong assumption (&quot;before the signal&quot; [call returned]).  If you don&apos;t make that incorrect assumption, then it becomes a fairness issue among classes of waiters, where the classes are defined by whether they happened before a certain signal call.

&gt; This is guaranteed by the standard (from the documentation of pthread_cond_wait
&gt; and pthread_cond_signal on the opengroup site):
&gt; 
&gt; &quot;The pthread_cond_signal() function shall unblock at least one of the threads
&gt; that are blocked on the specified condition variable cond (if any threads are
&gt; blocked on cond).&quot;
&gt;
&gt; And I think the next quote makes it very clear what threads are considered to
&gt; be blocked on the condvar at the time of the call to pthread_cond_signal():

It says that a thread must be considered blocked if the cond_wait call happened before the signal call.  It does NOT say that ONLY those threads need to be considered blocked.

&gt; &quot;That is, if another thread is able to acquire the mutex after the
&gt; about-to-block thread has released it, then a subsequent call to
&gt; pthread_cond_broadcast() or pthread_cond_signal() in that thread shall behave
&gt; as if it were issued after the about-to-block thread has blocked.&quot;
&gt; 
&gt; In effect this means that each call to pthread_cond_signal() defines a point in
&gt; time and all waiters (or calls to pthread_cond_wait() if you prefer) are either
&gt; before this call, or after it. And only the ones that are before it are allowed
&gt; to consume the signal sent by this call.

No, it does not.  It only talks about what happens before (=&gt; meaning it implies):

waiter happens-before signaler =&gt; waiter is considered blocked

It does not say (&lt;=&gt; meaning being equivalent to):

waiter happens-before signaler &lt;=&gt; waiter is considered blocked

(In fact, if you spin the second (&lt;=&gt;) further, you could only expect / build an ordering based on the mutex, but not for other kinds of enforced happens-before order. That&apos;s not what we&apos;d want.)
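
A minimal sketch of the difference between the two formulas, with hypothetical helper names. It enumerates every combination of "waiter happens-before signaler" and "waiter is considered blocked" and reports the cases that the implication permits but the equivalence forbids:

```python
from itertools import product

def implies(a, b):
    # a => b is false only when a holds and b does not.
    return (not a) or b

def equivalent(a, b):
    # a equivalent to b holds only when both have the same truth value.
    return a == b

def differing_cases():
    cases = []
    for before, blocked in product([True, False], repeat=2):
        if implies(before, blocked) != equivalent(before, blocked):
            cases.append((before, blocked))
    return cases

# Each tuple is (happens_before_signaler, considered_blocked).
print(differing_cases())
```

The single differing case is a waiter that does not happen-before the signaler but is still considered blocked, which is exactly the W3 situation.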

&gt; On the other hand, the standard doesn&apos;t guarantee that there won&apos;t be spurious
&gt; wakeups. However, glibc tries to prevent them. But the logic for this
&gt; prevention is flawed and causes the race that this bug is about.

The term &quot;race&quot; here is misleading.  If there&apos;s a race, it&apos;s in how you use the cond var and expect it to behave.  The logic would be flawed if it allowed incorrect behavior, which it doesn&apos;t.

&gt; So the net result is that glibc chose to provide a feature that is not
&gt; required, but dropped a much more important feature which is actually required.
&gt; Hence, this bug is not a fairness feature request, it is a correctness defect
&gt; report.

No.  You assume a guarantee that isn&apos;t required by the standard.

The comment that you are asking for a fairness feature is an attempt at an explanation for you that should point out the difference from what the standard guarantees.  This was meant to help in this discussion.  If we want to discuss this further, I believe the discussion needs to focus on whether the standard requires that certain threads must not be considered blocked.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57499</commentid>
    <comment_count>23</comment_count>
    <who name="Torvald Riegel">triegel</who>
    <bug_when>2012-09-20 11:58:26 +0000</bug_when>
    <thetext>(In reply to comment #21)
&gt; (In reply to comment #20)
&gt; &gt; The standard indeed doesn&apos;t talk about the &quot;future&quot;.  It does make a sort of
&gt; &gt; lower-bound requirement on which threads have to be considered blocked, but sets no
&gt; &gt; upper bound.  If you think there&apos;s an upper bound, please point to the requirement
&gt; &gt; in the standard.  If there is no required upper bound, it&apos;s up to the
&gt; &gt; implementation how to deal with that.
&gt; 
&gt; &quot;The pthread_cond_broadcast() and pthread_cond_signal() functions shall have no
&gt; effect if there are no threads currently blocked on cond.&quot;
&gt; 
&gt; How about this as an upper bound?

This states something in relation to those threads that are considered to be blocked.  It does not state anything about which threads can or have to be considered to be blocked.  So, it can&apos;t be an upper bound.

&gt; If implementations are allowed to determine
&gt; the set of blocked threads at any point in time they see fit, there would be no
&gt; way to define &quot;currently blocked&quot; at all and this sentence couldn&apos;t make any
&gt; sense.

There is a lower bound (or minimum requirement) based on the happens-before via the mutex (hence &quot;currently&quot;).  The sentence allows the implementation to let the signal have no effect if there is no thread that has to be considered blocked with the assumption of the lower bound.  Assuming more threads to be blocked is the same as allowing spurious wake-ups.

&gt; And also:
&gt; 
&gt; &quot;.... however, if predictable scheduling behavior is required, then that mutex
&gt; shall be locked by the thread calling pthread_cond_broadcast() or
&gt; pthread_cond_signal().&quot;
&gt; 
&gt; If I accept your argument, there will be no way to determine at least a set of
&gt; threads from which the woken thread will be chosen, so why does the standard
&gt; talk about predictability?

There is the lower bound, which does determine properties of this set.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57503</commentid>
    <comment_count>24</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2012-09-20 12:46:24 +0000</bug_when>
    <thetext>(In reply to comment #23)
&gt; This states something in relation to those threads that are considered to be
&gt; blocked.  It does not state anything about which threads can or have to be
&gt; considered to be blocked.  So, it can&apos;t be an upper bound.

It doesn&apos;t say &quot;blocked at the time the signal is delivered&quot;. It says &quot;currently blocked&quot;. And it doesn&apos;t talk about the effect of the delivery of the signal, it talks about the effect of the function. So &quot;current&quot; and &quot;function&quot; should be related somehow. I&apos;m taking the relation to be that &quot;current&quot; means &quot;at the time of the call to the function&quot;.

But the language of the standard is too vague to make it explicit what &quot;currently&quot; means here, so I guess you are free to interpret it otherwise.

&gt; There is a lower bound (or minimum requirement) based on the happens-before via
&gt; the mutex (hence &quot;currently&quot;).  The sentence allows the implementation to let
&gt; the signal have no effect if there is no thread that has to be considered
&gt; blocked with the assumption of the lower bound.  Assuming more threads to be
&gt; blocked is the same as allowing spurious wake-ups.

I don&apos;t really buy your argument here, but I just realized that this sentence only talks about the situation when there are no blocked threads, so I cannot use it to reason about the case when there are blocked waiters before the call.

Again, the vague language of the standard allows your interpretation.

&gt; &gt; If I accept your argument, there will be no way to determine at least a set of
&gt; &gt; threads from which the woken thread will be chosen, so why does the standard
&gt; &gt; talk about predictability?
&gt; 
&gt; There is the lower bound, which does determine properties of this set.

What I meant is that if we accept your interpretation, there are no threads that can be excluded from the set of blocked threads - except the ones that never called, nor will ever call pthread_cond_wait() and the ones that have already returned from all the calls to pthread_cond_wait() they will ever make.

To illustrate this, let me take what you&apos;re saying to its logical conclusion.

Suppose for simplicity that we have a single call to pthread_cond_signal() and many calls to pthread_cond_wait(), both before and after the call to pthread_cond_signal(). What you are actually saying is that it is correct for the call to pthread_cond_signal() to consider all of these threads to be blocked, and that it is allowed to wake any of them, even threads that are not created yet.

Even more, if the signaling thread calls pthread_cond_wait() some time (even hours) after it called pthread_cond_signal(), it is correct (although undesirable) for it to consume its own signal and for all the remaining waiters to become blocked.

Although I agree that the spec allows this interpretation, I think it is highly impractical and unintuitive, which is why I believe it is not what the authors had in mind. But of course, the only resolution of this argument at this point is to ask them for clarification. So I don&apos;t see any point in arguing any more without involving them.

Meanwhile, just consider this: you have code in the implementation which tries to prevent spurious wakeups and it basically aims to establish a timeline of the calls to pthread_cond_signal() and pthread_cond_wait() and assumes that a wakeup is spurious if it occurred without a signal being sent after the respective thread blocked. I dare say that the code suggests that whoever wrote this code had the same assumptions about ordering as me. And it actually contradicts your interpretation, because with your interpretation a single counter of outstanding signals would be enough.

BTW, the reason why this code fails to work correctly is very simple - you can&apos;t detect spurious wakeups reliably using constant memory without giving up all ordering guarantees.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57504</commentid>
    <comment_count>25</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2012-09-20 12:49:05 +0000</bug_when>
    <thetext>(In reply to comment #24)
&gt; Even more, if the signaling thread calls pthread_cond_wait() some time (even
&gt; hours) after it called pthread_cond_signal(), it is correct (although
&gt; undesirable) for it to consume its own signal and for all the remaining waiters to
&gt; become blocked.

I meant &quot;to remain blocked&quot;...</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57506</commentid>
    <comment_count>26</comment_count>
    <who name="Torvald Riegel">triegel</who>
    <bug_when>2012-09-20 16:21:01 +0000</bug_when>
    <thetext>(In reply to comment #24)
&gt; Although I agree that the spec allows this interpretation, I think it is highly
&gt; impractical and unintuitive, that&apos;s why I believe it is not what the authors
&gt; had in mind. But of course, the only resolution of this argument at this point
&gt; is to ask them for clarification. So I don&apos;t see any point in arguing any more
&gt; without involving them.

If the requirements in the specification should change in the future, we can look at this again. I personally don&apos;t think that this makes cond vars significantly impractical or unintuitive, but that is a trade-off that the standards body will have to make (or probably did when it produced the current spec).

Until then, I suppose that you now agree that this is correct behavior according to the current spec.

&gt; Meanwhile, just consider this: you have code in the implementation which tries
&gt; to prevent spurious wakeups and it basically aims to establish a timeline of
&gt; the calls to pthread_cond_signal() and pthread_cond_wait()

Not a timeline, but just the overall number of signals or threads that started waiting.

&gt; and assumes that a
&gt; wakeup is spurious if it occurred without a signal being sent after the
&gt; respective thread blocked.

It is spurious if there were not enough signals being sent, or other threads consumed the signals that were sent.

&gt; I dare say that the code suggests that whoever wrote
&gt; this code had the same assumptions about ordering as me.

Looking at the git log for the respective files, you should take that question to Ulrich or Jakub.

&gt; And it actually
&gt; contradicts your interpretation, because with your interpretation a single
&gt; counter of outstanding signals would be enough.

I doubt that (if you want to do this efficiently).

&gt; BTW, the reason why this code fails to work correctly is very simple - you
&gt; can&apos;t detect spurious wakeups reliably using constant memory without giving up
&gt; all ordering guarantees.

I think you can (just combine something like a ticket lock with the futex) -- but this won&apos;t be efficient.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57509</commentid>
    <comment_count>27</comment_count>
    <who name="Rich Felker">bugdal</who>
    <bug_when>2012-09-20 18:39:10 +0000</bug_when>
    <thetext>I disagree strongly that the spec even allows Torvald&apos;s interpretation. Torvald&apos;s claim is essentially that an implementation can consider an unspecified set of threads beyond those which &quot;have blocked&quot; per the specification of pthread_cond_wait to also &quot;have blocked&quot; on the condition. Not only is there no language in the standard to support this (the only definition of &quot;blocked&quot; on a condition variable is the one we&apos;ve cited); it would also make the specification of pthread_cond_signal completely useless, as the &quot;at least one&quot; could always be taken to mean &quot;the one invisible thread your application can&apos;t see that&apos;s always blocked on every possible condition variable&quot;. This is the pinnacle of absurdity in an attempt to language-lawyer out of fixing a bug.

The fact of the matter is that POSIX, along with common sense and the entire intended usage of condition variables, requires pthread_cond_signal to unblock at least one thread that &quot;has blocked&quot;, in the sense of the blocking that happens atomically with unlocking the mutex as described in the specification of pthread_cond_wait.

The situation we&apos;re looking at here is that the authors of NPTL came up with a clever hack to reduce spurious wakeups in pthread_cond_wait, but failed to realize the hack was broken in some corner cases, and now Torvald is unwilling to admit that the hack is broken and take it out.

It should also be noted that I have been unable to demonstrate any case where NPTL&apos;s hack to prevent spurious wakeups actually has any positive effect. A while back I wrote a test program to hammer condition variables and look for spurious wakeups, and I could not generate any with either NPTL&apos;s buggy implementation OR with musl&apos;s implementation which does not use any similar hack and does not suffer from this bug. Thus, in addition to being wrong and broken, it&apos;s my conclusion, unless anybody else can produce evidence to the contrary, that the hack is also useless (does not have the intended effect of improving performance).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57512</commentid>
    <comment_count>28</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2012-09-20 19:48:25 +0000</bug_when>
    <thetext>(In reply to comment #27)
&gt; I disagree strongly that the spec even allows Torvald&apos;s interpretation.
&gt; Torvald&apos;s claim is essentially that an implementation can consider an
&gt; unspecified set of threads beyond those which &quot;have blocked&quot; per the
&gt; specification of pthread_cond_wait to also &quot;have blocked&quot; on the condition.

Yes, that&apos;s what he claims.

&gt; Not
&gt; only is there no language in the standard to support this (the only definition
&gt; of &quot;blocked&quot; on a condition variable is the one we&apos;ve cited);

Yes, there is no language to support it, but I must admit that there is also no language to explicitly prevent it, even though I too consider this interpretation completely unreasonable as I tried to explain several times.

Anyway, this whole dispute has been reduced to the question of which threads are eligible for wakeup, so I&apos;ve taken the liberty to post a clarification request to the Austin Group, asking them to add explicit text explaining which threads should be considered blocked with respect to a pthread_cond_signal() call. The clarification request is at http://austingroupbugs.net/view.php?id=609. Torvald, please correct me if I have inadvertently misrepresented your position.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57514</commentid>
    <comment_count>29</comment_count>
    <who name="Rich Felker">bugdal</who>
    <bug_when>2012-09-20 20:31:26 +0000</bug_when>
    <thetext>Explicit language forbidding something is not necessary. Do you see any language that explicitly prevents an implementation from writing the text &quot;42&quot; to stdout when you call strlen? I would grant that Torvald has an argument if the application had called any glibc functions not specified by POSIX (which could be defined by the implementation to do all sorts of crazy things) or if the application had passed constants other than those defined by POSIX to standard functions (e.g. a special attribute for the condition variable). But in the absence of that, no interface in the standard library can have further side effects on other interfaces/objects than what it&apos;s specified to do.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57519</commentid>
    <comment_count>30</comment_count>
    <who name="Torvald Riegel">triegel</who>
    <bug_when>2012-09-21 08:03:55 +0000</bug_when>
    <thetext>(In reply to comment #27)
&gt; I disagree strongly that the spec even allows Torvald&apos;s interpretation.
&gt; Torvald&apos;s claim is essentially that an implementation can consider an
&gt; unspecified set of threads beyond those which &quot;have blocked&quot; per the
&gt; specification of pthread_cond_wait to also &quot;have blocked&quot; on the condition.

No, it&apos;s not unspecified. The specification requires a lower bound on this set. If you think that this is not just a lower bound, argue against the reasons I gave for this; don&apos;t just repeat your opinion here, please.

&gt; Not
&gt; only is there no language in the standard to support this (the only definition
&gt; of &quot;blocked&quot; on a condition variable is the one we&apos;ve cited);

Which is the lower bound. And this is a meaningful requirement, and provides cond vars that are useful.

&gt; it would also
&gt; make the specification of pthread_cond_signal completely useless, as the &quot;at
&gt; least one&quot; could always be taken to mean &quot;the one invisible thread your
&gt; application can&apos;t see that&apos;s always blocked on every possible condition
&gt; variable&quot;.

You do see that there is a difference between this and requiring &quot;blocked&quot; but not requiring an upper bound based on happens before?

&gt; This is the pinnacle of absurdity in an attempt to language-lawyer
&gt; out of fixing a bug.

Frankly, I don&apos;t care what you think that my motivations are. And I also don&apos;t speculate here about your motivations, nor about your abilities to distinguish an implication from equivalence. So please stop making such statements.

What I care about is whether this is indeed a bug or not.  Taking this question to the standards body for clarification is the right way to go if people think that the standard should require something else, or be more precise about what is allowed or not.

&gt; The fact of the matter is that POSIX, along with common sense and the entire
&gt; intended usage of condition variables, requires pthread_cond_signal to unblock
&gt; at least one thread that &quot;has blocked&quot;, in the sense of the blocking that
&gt; happens atomically with unlocking the mutex as described in the specification
&gt; of pthread_cond_wait.

That doesn&apos;t conflict with what I said. You mention an entirely different point here.

&gt; The situation we&apos;re looking at here is that the authors of NPTL came up with a
&gt; clever hack to reduce spurious wakeups in pthread_cond_wait, but failed to
&gt; realize the hack was broken in some corner cases, and now Torvald is unwilling
&gt; to admit that the hack is broken and take it out.

See above. Aside from this being impolite, speculation like this does not belong in a bug report.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57521</commentid>
    <comment_count>31</comment_count>
    <who name="Rich Felker">bugdal</who>
    <bug_when>2012-09-21 08:52:51 +0000</bug_when>
    <thetext>Such &quot;speculation&quot; does belong in a bug report so that there is public record of this behavior of trying to weasel out of conformance requirements. Public opinion counts, and there should be a cost in public opinion to software maintainers who refuse to fix bugs and try to argue that their bugs are not bugs.

Certainly I cannot read your mind to prove claims about your motivation. It really just boils down to a case of Occam&apos;s razor. When fixing the issue would basically amount to an all-minuses patch that greatly simplifies the code, yet the maintainers refuse to do it, the simplest explanation is that somebody has an attachment to the code and the work that went into writing it. If your motivation is something else, like perhaps concerns about making a mistake and breaking something in the process of fixing it, why not say that outright rather than leaving everybody stuck having to speculate about your motivations? Then the pros and cons of fixing it can be argued rationally. Or do you claim you have no further motivation here than &quot;I believe the standard allows low-quality implementations with bad properties like this, so I want to exercise my right to do a bad job and still claim conformance&quot;?

With the motivation topic out of the way, let&apos;s get back to the requirements of the standard. My understanding is that you claim &quot;the threads that are blocked on the specified condition variable cond&quot; is a set consisting of at least those threads which provably (by virtue of the mutex providing ordering) called pthread_cond_[timed]wait before the caller of pthread_cond_signal obtained the mutex, but which also may contain other threads. Not, as I originally accused you of, a completely arbitrary set of other threads, but rather threads which are &quot;about to&quot; wait on the condition variable in the immediate future after the signaling thread unlocks the mutex. Is this a correct assessment of your position?

If so, can you clarify what condition would qualify such threads for membership in this set? Certainly it can&apos;t be any thread that could ever conceivably wait on the condition variable at any later point in the program flow; if nothing else, that set is self-referential and thus not even a set (because which threads are candidates for membership may depend on which thread the signal unblocked).

My claim that the set of candidate threads pthread_cond_signal can unblock is a fixed set is based on the following:

1. The status of being blocked on a condition variable is defined only in the specification of pthread_cond_[timed]wait. I think we both agree on this. There is no way a thread is ever blocked on a condition variable except by calling pthread_cond_[timed]wait on it.

2. Sequencing of events between threads is not defined in general, but the use of mutexes with condition variables imposes a sequencing. In particular, from the point of view of any other thread, a given thread&apos;s status as being blocked on a condition variable is acquired after it obtains the mutex and before pthread_cond_[timed]wait releases the mutex. The former sequencing requirement comes from the fact that you can only call pthread_cond_[timed]wait with the mutex held, and the fact that pthread_cond_[timed]wait is the function specified to block; the latter sequencing requirement is the atomicity of blocking and unlocking the mutex.

3. The pthread_cond_signal function &quot;shall unblock at least one of the threads that are blocked on the specified condition variable&quot;. There is no language about queuing an unblock request; the language of the standard is &quot;shall unblock&quot;. This means that even if the mechanism is some sort of queued request, in the sense of the formal, abstract machine, one of the threads in the set of blocked threads goes from blocked status to unblocked status during the call to pthread_cond_signal (and, from the standpoint of observable sequencing, between the calling thread&apos;s calls to pthread_mutex_lock and pthread_mutex_unlock). Since the mutex is locked during this time, there is no way additional threads can become blocked on the condition variable. On the other hand, some could become unblocked due to spurious wakes, so the set could shrink, but it could not grow.

If you still claim my reasoning is wrong, please cite the specific steps you believe are hand-waving or misinterpretation of the standard.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>57527</commentid>
    <comment_count>32</comment_count>
    <who name="Torvald Riegel">triegel</who>
    <bug_when>2012-09-21 15:44:30 +0000</bug_when>
    <thetext>(In reply to comment #28)
&gt; (In reply to comment #27)
&gt; &gt; I disagree strongly that the spec even allows Torvald&apos;s interpretation.
&gt; &gt; Torvald&apos;s claim is essentially that an implementation can consider an
&gt; &gt; unspecified set of threads beyond those which &quot;have blocked&quot; per the
&gt; &gt; specification of pthread_cond_wait to also &quot;have blocked&quot; on the condition.
&gt; 
&gt; Yes, that&apos;s what he claims.

Not quite, as I point out in comment #30.

&gt; Anyway, this whole dispute has been reduced to the question of which threads
&gt; are eligible for wakeup, so I&apos;ve taken the liberty to post a clarification
&gt; request to the Austin Group, asking them to add explicit text explaining which
&gt; threads should be considered blocked with respect to a pthread_cond_signal()
&gt; call. The clarification request is at
&gt; http://austingroupbugs.net/view.php?id=609. Torvald, please correct me if I have
&gt; inadvertently misrepresented your position.

Thanks.  I have added a note there that summarizes the interpretation that the current implementation is based on.  Let&apos;s see what they think about this issue.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>58055</commentid>
    <comment_count>33</comment_count>
    <who name="Mihail Mihaylov">mihaylov.mihail</who>
    <bug_when>2012-10-18 06:26:22 +0000</bug_when>
    <thetext>(In reply to comment #32)
&gt; (In reply to comment #28)
&gt; ....
&gt; ....
&gt; &gt; Anyway, this whole dispute has been reduced to the question of which threads
&gt; &gt; are eligible for wakeup, so I&apos;ve taken the liberty to post a clarification
&gt; &gt; request to the Austin Group, asking them to add explicit text explaining which
&gt; &gt; threads should be considered blocked with respect to a pthread_cond_signal()
&gt; &gt; call. The clarification request is at
&gt; &gt; http://austingroupbugs.net/view.php?id=609. Torvald, please correct me if I have
&gt; &gt; inadvertently misrepresented your position.
&gt; 
&gt; Thanks.  I have added a note there that summarizes the interpretation that the
&gt; current implementation is based on.  Let&apos;s see what they think about this
&gt; issue.

The Austin Group have reached an official position. They have decided to make changes to some of the texts related to condition variables. I believe that the changes as they announced them yesterday invalidate glibc&apos;s interpretation of the spec.

Let me point out that these changes do not add new requirements to the spec. They just make explicit the requirements that were already suggested by the spec.

In my opinion, at this point it is already clear that NPTL&apos;s implementation of condition variables does not conform to the POSIX spec, and therefore this report describes a genuine bug. I hope that the NPTL team will acknowledge it as such.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>58062</commentid>
    <comment_count>34</comment_count>
    <who name="Rich Felker">bugdal</who>
    <bug_when>2012-10-18 12:25:16 +0000</bug_when>
    <thetext>It should be noted that no changes were made to the requirements the standard places on implementations; the changes made are only clarifications, since apparently the original language was not clear enough.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>58191</commentid>
    <comment_count>35</comment_count>
    <who name="Torvald Riegel">triegel</who>
    <bug_when>2012-10-24 20:25:32 +0000</bug_when>
    <thetext>(In reply to comment #33)
&gt; The Austin Group have reached an official position. They have decided to make
&gt; changes to some of the texts related to condition variables. I believe that the
&gt; changes as they announced them yesterday invalidate glibc&apos;s interpretation of
&gt; the spec.

I agree.  Those changes disallow glibc&apos;s current behavior.

&gt; Let me point out that these changes do not add new requirements to the spec.
&gt; They just make explicit the requirements that were already suggested by the
&gt; spec.

I disagree.  It&apos;s a specification -- it has to be explicit.

(In reply to comment #34)
&gt; It should be noted that no changes were made to the requirements the standard
&gt; places on implementations; the changes made are only clarifications, since
&gt; apparently the original language was not clear enough.

How is that not a change in the spec: http://austingroupbugs.net/view.php?id=609#c1403 ?  Are we talking about the same thing here?

Apparently, the original language was not clear enough.  Otherwise, why change the spec?  If it were clear enough, an interpretation explanation or something like that would have been sufficient, right?  They even say that they see the need to &quot;produce some interpretation text to precede these changes&quot;.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>58199</commentid>
    <comment_count>36</comment_count>
    <who name="Rich Felker">bugdal</who>
    <bug_when>2012-10-25 04:07:39 +0000</bug_when>
    <thetext>The note #0001403 you cited begins:

&quot;In the Oct 17th conference call we agreed to make the following changes. Further work is needed to produce some interpretation text to precede these changes.&quot;

Said &quot;interpretation text to precede these changes&quot;, as I understand it, is an explanation of how the current text is meant to be interpreted. The &quot;following changes&quot; then are clarifications to appear in the TC or next version of the standard, but the interpretation is (or will be) that the current standard already requires exactly the same thing, at least from an observable-behavior standpoint.

As I have stated before, I do not see any way one could interpret the current glibc behavior as conformant. Under the current glibc behavior, events that should be synchronized by a mutex observably/provably occur in an order different from the order the synchronization imposes.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="0"
              isprivate="0"
          >
            <attachid>5945</attachid>
            <date>2011-09-25 21:32:00 +0000</date>
            <delta_ts>2011-09-25 21:32:35 +0000</delta_ts>
            <desc>Test to observe the race</desc>
            <filename>main.cpp</filename>
            <type>text/x-c++src</type>
            <size>6341</size>
            <attacher name="Mihail Mihaylov">mihaylov.mihail</attacher>
            
              <data encoding="base64">I2luY2x1ZGUgPGlvc3RyZWFtPgojaW5jbHVkZSA8cHRocmVhZC5oPgojaW5jbHVkZSA8c3lzL3Rp
bWUuaD4NCiNpbmNsdWRlIDxlcnJuby5oPgoNCnVzaW5nIG5hbWVzcGFjZSBzdGQ7Cgpjb25zdCBi
b29sIHJlbGVhc2VNdXRleEJldHdlZW5TaWduYWxzID0gdHJ1ZTsKY29uc3QgaW50IHdhaXRUaW1l
U2VjID0gMTU7Cgpib29sIGV4aXRQcm9ncmFtOwppbnQgdGhyZWFkc0NvdW50OwppbnQgd2FpdGlu
Z0NvdW50OwppbnQgc2lnbmFsc1NlbnRDb3VudDsKaW50IHNlY29uZFdhaXRpbmdDb3VudDsKaW50
IHNlY29uZFdhaXRXb2tlbkNvdW50OwpwdGhyZWFkX211dGV4X3QgbXV0ZXg7CnB0aHJlYWRfY29u
ZF90IHRlc3RDb25kOwpwdGhyZWFkX2NvbmRfdCBhbGxUaHJlYWRzV2FpdGluZzsKcHRocmVhZF9j
b25kX3QgYWxsU2lnbmFsc1NlbnQ7CnB0aHJlYWRfY29uZF90IGFsbFRocmVhZHNVbmJsb2NrZWQ7
CnB0aHJlYWRfY29uZF90IHR3b09yQWxsQXRTZWNvbmRXYWl0OwoKdm9pZCogVGhyZWFkTWFpbih2
b2lkKiBkYXRhKQp7CiAgICAvLyBXYWl0IGZvciBhbGwgdGhyZWRzIHRvIGJlIGNyZWF0ZWQuCiAg
ICBwdGhyZWFkX211dGV4X2xvY2soJm11dGV4KTsKICAgIHdoaWxlICghZXhpdFByb2dyYW0pIHsK
ICAgIC8vIDEuIFdhaXQgb24gdGhlIGNvbmRpdGlvbiBmb3IgdGhlIGZpcnN0IHRpbWUuIENvdW50
IHRoZSB3YWl0ZXJzIGF0IHRoaXMgc3RlcAogICAgICAgIC8vIDEuMS4gSWYgYWxsIHRocmVhZHMg
YXJlIGJsb2NrZWQgYXQgdGhlIGZpcnN0IHdhaXQsIHNpZ25hbCB0aGUgbWFzdGVyIHRocmVhZAog
ICAgICAgIGlmICgrK3dhaXRpbmdDb3VudCA9PSB0aHJlYWRzQ291bnQpIHsKICAgICAgICAgICAg
cHRocmVhZF9jb25kX3NpZ25hbCgmYWxsVGhyZWFkc1dhaXRpbmcpOwogICAgICAgIH0KCiAgICAg
ICAgLy8gMS4yLiBXYWl0IG9uIHRoZSBjb25kaXRpb24gZm9yIHRoZSBmaXJzdCB0aW1lLgogICAg
ICAgIHB0aHJlYWRfY29uZF93YWl0KCZ0ZXN0Q29uZCwgJm11dGV4KTsKCiAgICAgICAgLy8gMS4z
LiBEZWNyZW1lbnQgdGhlIG51bWJlciBvZiB3YWl0aW5nIHRocmVhZHMKICAgICAgICBpZiAoLS13
YWl0aW5nQ291bnQgPT0gMCkgewogICAgICAgICAgICBwdGhyZWFkX2NvbmRfc2lnbmFsKCZhbGxU
aHJlYWRzVW5ibG9ja2VkKTsKICAgICAgICB9CgogICAgICAgIGlmIChleGl0UHJvZ3JhbSkgewog
ICAgICAgICAgICBicmVhazsKICAgICAgICB9CgogICAgLy8gMi4gV2FpdCBvbiB0aGUgY29uZGl0
aW9uIGZvciB0aGUgc2Vjb25kIHRpbWUuIENvdW50IHdhaXRlcnMgYXQgdGhpcyBzdGVwIHRvbwog
ICAgICAgIC8vIDIuMS4gRG9uJ3Qgc3RhcnQgd2FpdGluZyB1bnRpbCB0aGUgbWFpbiB0aHJlYWQg
aGFzIHNlbnQgYWxsIHNpZ25hbHMKICAgICAgICB3aGlsZSAoc2lnbmFsc1NlbnRDb3VudCA8IHRo
cmVhZHNDb3VudCAmJiAhZXhpdFByb2dyYW0pIHsKICAgICAgICAgICAgcHRocmVhZF9jb25kX3dh
aXQoJmFsbFNpZ25hbHNTZW50LCAmbXV0ZXgpOwogICAgICAgIH0KCiAgICAgICAgaWYgKGV4aXRQ
cm9ncmFtKSB7CiAgICAgICAgICAgIGJyZWFrOwogICAgICAgIH0KCiAgICAgICAgLy8gMi4yLiBX
aGVuIHRoZXJlIGFyZSB0d28gdGhyZWFkcyBibG9ja2VkIG9uIHRoZSBzZWNvbmQgd2FpdCBpbmZv
cm0gdGhlIG1haW4KICAgICAgICAvLyAgICAgIHRocmVhZCB0byBzZW5kIG9uZSBtb3JlIHNpZ25h
bC4gV2hlbiBhbGwgdGhyZWFkcyBoYXZlIHJlYWNoZWQgdGhlIHNlY29uZAogICAgICAgIC8vICAg
ICAgd2FpdCwgc2lnbmFsIHRoZSBzYW1lIGNvbmRpdGlvbiB0byBpbmZvcm0gdGhlIG1haW4gdGhy
ZWFkIHRoYXQgaXQgY2FuCiAgICAgICAgLy8gICAgICBpbml0aWF0ZSB0aGUgbmV4dCB0ZXN0IGl0
ZXJhdGlvbi4KICAgICAgICBpZiAoKytzZWNvbmRXYWl0aW5nQ291bnQgPT0gMiB8fCBzZWNvbmRX
YWl0aW5nQ291bnQgPT0gdGhyZWFkc0NvdW50KSB7CiAgICAgICAgICAgIHB0aHJlYWRfY29uZF9z
aWduYWwoJnR3b09yQWxsQXRTZWNvbmRXYWl0KTsKICAgICAgICB9CgogICAgICAgIC8vIDIuMy4g
V2FpdCBvbiB0aGUgY29uZGl0aW9uIGZvciB0aGUgc2Vjb25kIHRpbWUKICAgICAgICBwdGhyZWFk
X2NvbmRfd2FpdCgmdGVzdENvbmQsICZtdXRleCk7CiAgICAgICAgKytzZWNvbmRXYWl0V29rZW5D
b3VudDsKCiAgICAgICAgaWYgKGV4aXRQcm9ncmFtKSB7CiAgICAgICAgICAgIGJyZWFrOwogICAg
ICAgIH0KCiAgICAvLyAzLiBXYWl0IGZvciB0aGUgdGVzdCBpdGVyYXRpb24gdG8gZW5kCiAgICAg
ICAgd2hpbGUgKHNpZ25hbHNTZW50Q291bnQgPiAwICYmICFleGl0UHJvZ3JhbSkgewogICAgICAg
ICAgICBwdGhyZWFkX2NvbmRfd2FpdCgmdGVzdENvbmQsICZtdXRleCk7CiAgICAgICAgfQoKICAg
ICAgICBpZiAoZXhpdFByb2dyYW0pIHsKICAgICAgICAgICAgYnJlYWs7CiAgICAgICAgfQogICB9
CiAgICBwdGhyZWFkX211dGV4X3VubG9jaygmbXV0ZXgpOwogICAgcHRocmVhZF9leGl0KE5VTEwp
Owp9Cg0KaW50IG1haW4oKQ0KewovLyAxLiBUZXN0IHNldHVwOgogICAgLy8gMS4xLiBJbml0aWFs
aXplIHRoZSBtdXRleCBhbmQgdGhlIGNvbmRpdGlvbiB2YXJpYWJsZXMKICAgIHB0aHJlYWRfbXV0
ZXhfaW5pdCgmbXV0ZXgsIE5VTEwpOwogICAgcHRocmVhZF9jb25kX2luaXQoJmFsbFRocmVhZHNX
YWl0aW5nLCBOVUxMKTsKICAgIHB0aHJlYWRfY29uZF9pbml0KCZ0ZXN0Q29uZCwgTlVMTCk7CiAg
ICBwdGhyZWFkX2NvbmRfaW5pdCgmYWxsU2lnbmFsc1NlbnQsIE5VTEwpOwogICAgcHRocmVhZF9j
b25kX2luaXQoJmFsbFRocmVhZHNVbmJsb2NrZWQsIE5VTEwpOwogICAgcHRocmVhZF9jb25kX2lu
aXQoJnR3b09yQWxsQXRTZWNvbmRXYWl0LCBOVUxMKTsKDQogICAgLy8gMS4yLiBDcmVhdGUgYXMg
bWFueSB0aHJlYWRzIGFzIHBvc3NpYmxlLiBBbGwgY3JlYXRlZCB0aHJlYWRzIHdpbGwgaW1lZGlh
dGVseQogICAgLy8gICAgICBibG9jayBvbiB0aGUgbXV0ZXgKICAgIHB0aHJlYWRfbXV0ZXhfbG9j
aygmbXV0ZXgpOwogICAgZXhpdFByb2dyYW0gPSBmYWxzZTsKCiAgICBmb3IgKHRocmVhZHNDb3Vu
dCA9IDA7IDsgKyt0aHJlYWRzQ291bnQpIHsKICAgICAgICBwdGhyZWFkX3QgdGhyZWFkOwoKICAg
ICAgICBpZiAocHRocmVhZF9jcmVhdGUoJnRocmVhZCwgTlVMTCwgVGhyZWFkTWFpbiwgTlVMTCkg
IT0gMCkgewogICAgICAgICAgICBicmVhazsKICAgICAgICB9CgogICAgICAgIHB0aHJlYWRfZGV0
YWNoKHRocmVhZCk7CiAgICB9CgogICAgY291dCA8PCAiQ3JlYXRlZCAiIDw8IHRocmVhZHNDb3Vu
dCA8PCAiIHRocmVhZHMuIiA8PCBlbmRsOwoKLy8gMi4gVGVzdCBib2R5CiAgICB3aGlsZSAoIWV4
aXRQcm9ncmFtKSB7CiAgICAgICAgLy8gMi4xLiBSZWxlYXNlIHRoZSBtdXRleCBhbmQgc3RhcnQg
d2FpdGluZyBmb3IgYWxsIHRocmVhZHMgdG8gYmxvY2sgb24gdGhlCiAgICAgICAgLy8gICAgICBj
b25kaXRpb24gdmFyaWFibGUKICAgICAgICB3YWl0aW5nQ291bnQgPSAwOwogICAgICAgIHNpZ25h
bHNTZW50Q291bnQgPSAwOwoKICAgICAgICB3aGlsZSAod2FpdGluZ0NvdW50IDwgdGhyZWFkc0Nv
dW50KSB7CiAgICAgICAgICAgIHB0aHJlYWRfY29uZF93YWl0KCZhbGxUaHJlYWRzV2FpdGluZywg
Jm11dGV4KTsKICAgICAgICB9CgogICAgICAgIC8vIDIuMi4gVW5ibG9jayBhbGwgdGhyZWFkcyBi
dXQgdXNlIGluZGl2aWR1YWwgc2lnbmFscyBpbnN0ZWFkIG9mIGJyb2FkY2FzdAogICAgICAgIHNl
Y29uZFdhaXRpbmdDb3VudCA9IDA7CiAgICAgICAgc2Vjb25kV2FpdFdva2VuQ291bnQgPSAwOwoK
ICAgICAgICBmb3IgKHNpZ25hbHNTZW50Q291bnQgPSAwOyBzaWduYWxzU2VudENvdW50IDwgdGhy
ZWFkc0NvdW50OyArK3NpZ25hbHNTZW50Q291bnQpIHsKICAgICAgICAgICAgcHRocmVhZF9jb25k
X3NpZ25hbCgmdGVzdENvbmQpOwoKICAgICAgICAgICAgaWYgKHJlbGVhc2VNdXRleEJldHdlZW5T
aWduYWxzKSB7CiAgICAgICAgICAgICAgICBwdGhyZWFkX211dGV4X3VubG9jaygmbXV0ZXgpOwog
ICAgICAgICAgICAgICAgcHRocmVhZF9tdXRleF9sb2NrKCZtdXRleCk7CiAgICAgICAgICAgIH0K
ICAgICAgICB9CgogICAgICAgIC8vIDIuMy4gU2VuZCBhIGJyb2FkY2FzdCBvbiB0aGUgYWxsU2ln
bmFsc1NlbnQgY29uZGl0aW9uIGluIGNhc2UgdGhlcmUgYXJlCiAgICAgICAgLy8gICAgICB0aHJl
YWRzIHRoYXQgd2VyZSB1bmJsb2NrZWQgZnJvbSB0aGUgZmlyc3Qgd2FpdCBhbmQgYXJlIHdhaXRp
bmcgdG8KICAgICAgICAvLyAgICAgIGVudGVyIHRoZSBzZWNvbmQgd2FpdAogICAgICAgIHB0aHJl
YWRfY29uZF9icm9hZGNhc3QoJmFsbFNpZ25hbHNTZW50KTsKCiAgICAgICAgLy8gMi4zLiBXYWl0
IGZvciBhdCBsZWFzdCB0d28gdGhyZWFkcyB0byBiZSBibG9ja2VkIG9uIHRoZSBzZWNvbmQgd2Fp
dAogICAgICAgIHdoaWxlIChzZWNvbmRXYWl0aW5nQ291bnQgPCAyKSB7CiAgICAgICAgICAgIHB0
aHJlYWRfY29uZF93YWl0KCZ0d29PckFsbEF0U2Vjb25kV2FpdCwgJm11dGV4KTsKICAgICAgICB9
CgogICAgICAgIC8vIDIuNC4gU2VuZCBhIHNpbmdsZSBzaWduYWwKICAgICAgICBwdGhyZWFkX2Nv
bmRfc2lnbmFsKCZ0ZXN0Q29uZCk7CgogICAgICAgIC8vIDIuNS4gV2FpdCBmb3IgYWxsIHRocmVh
ZHMgdG8gbW92ZSBiZXlvbmQgdGhlIGZpcnN0IHdhaXQKICAgICAgICBzdHJ1Y3QgdGltZXZhbCBj
dXJyZW50VGltZTsKICAgICAgICAgICAgICAgIHN0cnVjdCB0aW1lc3BlYyBlbmRUaW1lOwoKICAg
ICAgICBnZXR0aW1lb2ZkYXkoJmN1cnJlbnRUaW1lLCBOVUxMKTsKICAgICAgICBlbmRUaW1lLnR2
X3NlYyA9IGN1cnJlbnRUaW1lLnR2X3NlYzsKICAgICAgICBlbmRUaW1lLnR2X25zZWMgPSBjdXJy
ZW50VGltZS50dl91c2VjICogMTAwMDsKICAgICAgICBlbmRUaW1lLnR2X3NlYyArPSB3YWl0VGlt
ZVNlYzsKCiAgICAgICAgd2hpbGUgKHdhaXRpbmdDb3VudCA+IDApIHsKICAgICAgICAgICAgaWYg
KHB0aHJlYWRfY29uZF90aW1lZHdhaXQoJmFsbFRocmVhZHNVbmJsb2NrZWQsICZtdXRleCwgJmVu
ZFRpbWUpID09CiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgICAgICAgICAgICAgICAgICAgICBFVElNRURPVVQpIHsKICAgICAgICAgICAgICAg
IGJyZWFrOwogICAgICAgICAgICB9CiAgICAgICAgfQoKICAgICAgICAvLyAyLjYuIElmIHNvbWUg
dGhyZWFkcyBhcmUgc3RpbGwgYmxvY2tlZCBvbiB0aGUgZmlyc3Qgd2FpdCwgd2UgaGF2ZQogICAg
ICAgIC8vICAgICAgcmVwcm9kdWNlZCB0aGUgcHJvYmxlbS4gRW5kIHRoZSB0ZXN0LgogICAgICAg
IGlmICh3YWl0aW5nQ291bnQgPiAwICl7CiAgICAgICAgICAgIGNvdXQgPDwgIkFmdGVyICIgPDwg
d2FpdFRpbWVTZWMgPDwgIiBzZWNvbmRzICIgPDwgd2FpdGluZ0NvdW50CiAgICAgICAgICAgICAg
ICAgPDwgIiB0aHJlYWRzIGFyZSBzdGlsbCBibG9ja2VkIG9uIHRoZSBmaXJzdCB3YWl0IGFuZCAi
CiAgICAgICAgICAgICAgICAgPDwgc2Vjb25kV2FpdFdva2VuQ291bnQgPDwgIiB0aHJlYWRzIGhh
dmUgd29rZW4gZnJvbSB0aGUgc2Vjb25kIHdhaXQiCiAgICAgICAgICAgICAgICAgPDwgZW5kbDsK
ICAgICAgICAgICAgZXhpdFByb2dyYW0gPSB0cnVlOwogICAgICAgICAgICAvLyBMZXQgYWxsIHRo
cmVhZHMgbW92ZSB0byB0aGUgc2Vjb25kIHdhaXQKICAgICAgICAgICAgcHRocmVhZF9jb25kX2Jy
b2FkY2FzdCgmdGVzdENvbmQpOwogICAgICAgICAgICAvLyBFeGl0IHRoZSB0ZXN0IGl0ZXJhdGlv
bnMgbG9vcAogICAgICAgICAgICBicmVhazsKICAgICAgICB9IGVsc2UgewogICAgICAgICAgICBj
b3V0IDw8ICJBbGwgdGhyZWFkcyB3ZXJlIHVuYmxvY2tlZC4iIDw8IGVuZGw7CiAgICAgICAgfQoK
ICAgICAgICAvLyAyLjcuIFdhaXQgZm9yIGFsbCB0aHJlYWRzIHRvIHJlYWNoIHRoZSBzZWNvbmQg
d2FpdC4KICAgICAgICB3aGlsZSAoc2Vjb25kV2FpdGluZ0NvdW50IDwgdGhyZWFkc0NvdW50KSB7
CiAgICAgICAgICAgIHB0aHJlYWRfY29uZF93YWl0KCZ0d29PckFsbEF0U2Vjb25kV2FpdCwgJm11
dGV4KTsKICAgICAgICB9CgogICAgICAgIC8vIDIuOC4gU2VuZCBhIGJyb2FkY2FzdCB0byBsZXQg
YWxsIHRocmVhZHMgbW92ZSB0byB0aGUgc3RhcnQgb2YgdGhlIG5leHQKICAgICAgICAvLyAgICAg
IHRlc3QgaXRlcmF0aW9uCiAgICAgICAgcHRocmVhZF9jb25kX2Jyb2FkY2FzdCgmdGVzdENvbmQp
OwogICAgfQoKICAgIHB0aHJlYWRfbXV0ZXhfdW5sb2NrKCZtdXRleCk7CiAgICBwdGhyZWFkX2V4
aXQoTlVMTCk7DQp9DQo=
</data>

          </attachment>
          <attachment
              isobsolete="0"
              ispatch="0"
              isprivate="0"
          >
            <attachid>5946</attachid>
            <date>2011-09-27 10:10:00 +0000</date>
            <delta_ts>2011-09-27 10:10:23 +0000</delta_ts>
            <desc>Simpler test to observe the race</desc>
            <filename>main.cpp</filename>
            <type>text/x-c++src</type>
            <size>5584</size>
            <attacher name="Mihail Mihaylov">mihaylov.mihail</attacher>
            
              <data encoding="base64">I2luY2x1ZGUgPGlvc3RyZWFtPg0KI2luY2x1ZGUgPHB0aHJlYWQuaD4NCiNpbmNsdWRlIDxzY2hl
ZC5oPg0KI2luY2x1ZGUgPHN5cy90aW1lLmg+DQojaW5jbHVkZSA8dGltZS5oPg0KI2luY2x1ZGUg
PGVycm5vLmg+DQoNCnVzaW5nIG5hbWVzcGFjZSBzdGQ7DQoNCmNvbnN0IGludCBtYXhXYWl0ZXJU
aHJlYWRzQ291bnQgPSAzMDsNCmNvbnN0IGJvb2wgcmVsZWFzZU11dGV4QmV0d2VlblNpZ25hbHMg
PSBmYWxzZTsNCmNvbnN0IGludCB3YWl0VGltZVNlYyA9IDE1Ow0KDQpib29sIGV4aXRQcm9ncmFt
Ow0KaW50IHRocmVhZHNDb3VudDsNCmludCBiZWZvcmVXYWl0ZXJzQ291bnQ7DQppbnQgYWZ0ZXJX
YWl0ZXJzQ291bnQ7DQppbnQgc2lnbmFsc1NlbnRDb3VudDsNCmludCBiZWZvcmVXYWl0ZXJzV29r
ZW5Db3VudDsNCmludCBhZnRlcldhaXRlcnNXb2tlbkNvdW50Ow0KaW50IGl0ZXJhdGlvbkNvdW50
Ow0KDQpwdGhyZWFkX211dGV4X3QgbXV0ZXg7DQpwdGhyZWFkX2NvbmRfdCB0ZXN0Q29uZDsNCg0K
dm9pZCogVGhyZWFkTWFpbih2b2lkKiBkYXRhKQ0Kew0KICAgIC8vIFdhaXQgZm9yIHRoZSBtYWlu
IHRocmVhZCB0byBjcmVhdGUgYWxsIHdhaXRlciB0aHJlYWRzDQogICAgcHRocmVhZF9tdXRleF9s
b2NrKCZtdXRleCk7DQoNCiAgICB3aGlsZSAoIWV4aXRQcm9ncmFtKSB7DQogICAgICAgIC8vIDEu
IFJlbWVtYmVyIHdoaWNoIGlzIHRoZSBjdXJyZW50IHRlc3QgaXRlcmF0aW9uDQogICAgICAgIGlu
dCBjdXJySXRlcmF0aW9uQ291bnQgPSBpdGVyYXRpb25Db3VudDsNCg0KICAgICAgICAvLyAyLiBD
aGVjayBpZiB3ZSBhcmUgc3RhcnRpbmcgdGhlIHdhaXQgYmVmb3JlIGFsbCBzaWduYWxzIGhhdmUg
YmVlbiBzZW50Lg0KICAgICAgICBib29sIGJlZm9yZUFsbFNpZ25hbHNTZW50ID0gc2lnbmFsc1Nl
bnRDb3VudCA8IHRocmVhZHNDb3VudDsNCg0KICAgICAgICAvLyAzLiBDb3VudCB0aGUgdGhyZWFk
IHRvd2FyZHMgdGhlIGFwcHJvcHJpYXRlIGdyb3VwIG9mIHdhaXRlcnMgKGJlZm9yZS9hZnRlcg0K
ICAgICAgICAvLyAgICBhbGwgc2lnbmFscyBhcmUgc2VudCkNCiAgICAgICAgaWYgKGJlZm9yZUFs
bFNpZ25hbHNTZW50KSB7DQogICAgICAgICAgICArK2JlZm9yZVdhaXRlcnNDb3VudDsNCiAgICAg
ICAgfSBlbHNlIHsNCiAgICAgICAgICAgICsrYWZ0ZXJXYWl0ZXJzQ291bnQ7DQogICAgICAgIH0N
Cg0KICAgICAgICAvLyA0LiBXYWl0IG9uIHRoZSBjb25kaXRpb24gdmFyaWFibGUuDQogICAgICAg
IHB0aHJlYWRfY29uZF93YWl0KCZ0ZXN0Q29uZCwgJm11dGV4KTsNCg0KICAgICAgICAvLyA1LiBJ
ZiB0aGUgaXRlcmF0aW9uIGNvdW50IGhhcyBjaGFuZ2VkLCBzdGFydCBvdmVyDQogICAgICAgIGlm
IChjdXJySXRlcmF0aW9uQ291bnQgIT0gaXRlcmF0aW9uQ291bnQpIHsNCiAgICAgICAgICAgIGNv
bnRpbnVlOw0KICAgICAgICB9DQoNCiAgICAgICAgLy8gNi4gQ291bnQgdGhlIHdva2VuIHRocmVh
ZCB0b3dhcmRzIHRoZSBhcHByb3ByaWF0ZSBncm91cCBvZiB3b2tlbiB0aHJlYWRzICh0aGUNCiAg
ICAgICAgLy8gICAgc2FtZSBncm91cCBhcyBzdGVwIDIpLg0KICAgICAgICBpZiAoYmVmb3JlQWxs
U2lnbmFsc1NlbnQpIHsNCiAgICAgICAgICAgICsrYmVmb3JlV2FpdGVyc1dva2VuQ291bnQ7DQog
ICAgICAgIH0gZWxzZSB7DQogICAgICAgICAgICArK2FmdGVyV2FpdGVyc1dva2VuQ291bnQ7DQog
ICAgICAgIH0NCiAgIH0NCiAgICBwdGhyZWFkX211dGV4X3VubG9jaygmbXV0ZXgpOw0KICAgIHB0
aHJlYWRfZXhpdChOVUxMKTsNCn0NCg0KaW50IG1haW4oKQ0Kew0KLy8gMS4gVGVzdCBzZXR1cDoN
CiAgICAvLyAxLjEuIEluaXRpYWxpemUgdGhlIG11dGV4IGFuZCB0aGUgY29uZGl0aW9uIHZhcmlh
YmxlDQogICAgcHRocmVhZF9tdXRleF9pbml0KCZtdXRleCwgTlVMTCk7DQogICAgcHRocmVhZF9j
b25kX2luaXQoJnRlc3RDb25kLCBOVUxMKTsNCg0KICAgIC8vIDEuMi4NCg0KICAgIGV4aXRQcm9n
cmFtID0gZmFsc2U7DQoNCiAgICAvLyAxLjMuIENyZWF0ZSBzb21lIHdhaXRlciB0aHJlYWRzLiBB
bGwgd2FpdGVycyB3aWxsIGltZWRpYXRlbHkgYmxvY2sgb24gdGhlIG11dGV4DQogICAgcHRocmVh
ZF9tdXRleF9sb2NrKCZtdXRleCk7DQoNCiAgICBmb3IgKHRocmVhZHNDb3VudCA9IDA7IHRocmVh
ZHNDb3VudCA8IG1heFdhaXRlclRocmVhZHNDb3VudDsgKyt0aHJlYWRzQ291bnQpIHsNCiAgICAg
ICAgcHRocmVhZF90IHRocmVhZDsNCg0KICAgICAgICBpZiAocHRocmVhZF9jcmVhdGUoJnRocmVh
ZCwgTlVMTCwgVGhyZWFkTWFpbiwgTlVMTCkgIT0gMCkgew0KICAgICAgICAgICAgYnJlYWs7DQog
ICAgICAgIH0NCg0KICAgICAgICBwdGhyZWFkX2RldGFjaCh0aHJlYWQpOw0KICAgIH0NCg0KICAg
IGNvdXQgPDwgIkNyZWF0ZWQgIiA8PCB0aHJlYWRzQ291bnQgPDwgIiB0aHJlYWRzLiIgPDwgZW5k
bDsNCg0KLy8gMi4gVGVzdCBib2R5LiBDb3VudCB0aGUgdGVzdCBib2R5IGl0ZXJhdGlvbnMgdG8g
aGVscCB3YWl0ZXJzIGRldGVjdCB3aGVuIGENCi8vICAgIG5ldyBpdGVyYXRpb24gaGFzIHN0YXJ0
ZWQNCiAgICBmb3IgKGl0ZXJhdGlvbkNvdW50ID0gMDsgIWV4aXRQcm9ncmFtOyArK2l0ZXJhdGlv
bkNvdW50KSB7DQogICAgICAgIC8vIDIuMS4gUmVzZXQgdGhlIGNvdW50ZXJzDQogICAgICAgIGJl
Zm9yZVdhaXRlcnNDb3VudCA9IDA7DQogICAgICAgIGFmdGVyV2FpdGVyc0NvdW50ID0gMDsNCiAg
ICAgICAgYmVmb3JlV2FpdGVyc1dva2VuQ291bnQgPSAwOw0KICAgICAgICBhZnRlcldhaXRlcnNX
b2tlbkNvdW50ID0gMDsNCiAgICAgICAgc2lnbmFsc1NlbnRDb3VudCA9IDA7DQoNCiAgICAgICAg
Ly8gMi4yLiBXYWl0IGZvciBhbGwgd2FpdGVycyB0byBibG9jayBvbiB0aGUgY29uZGl0aW9uIHZh
cmlhYmxlDQogICAgICAgIHdoaWxlIChiZWZvcmVXYWl0ZXJzQ291bnQgPCB0aHJlYWRzQ291bnQp
IHsNCiAgICAgICAgICAgIHB0aHJlYWRfbXV0ZXhfdW5sb2NrKCZtdXRleCk7DQogICAgICAgICAg
ICBzY2hlZF95aWVsZCgpOw0KICAgICAgICAgICAgcHRocmVhZF9tdXRleF9sb2NrKCZtdXRleCk7
DQogICAgICAgIH0NCg0KICAgICAgICAvLyAyLjMuIFNlbmQgYXMgbWFueSBpbmlkaXZpZHVhbCBz
aWduYWxzIGFzIHRoZXJlIGFyZSB3YWl0ZXIgdGhyZWFkcw0KICAgICAgICBmb3IgKDsgc2lnbmFs
c1NlbnRDb3VudCA8IHRocmVhZHNDb3VudDsgKytzaWduYWxzU2VudENvdW50KSB7DQogICAgICAg
ICAgICBwdGhyZWFkX2NvbmRfc2lnbmFsKCZ0ZXN0Q29uZCk7DQoNCiAgICAgICAgICAgIGlmIChy
ZWxlYXNlTXV0ZXhCZXR3ZWVuU2lnbmFscykgew0KICAgICAgICAgICAgICAgIHB0aHJlYWRfbXV0
ZXhfdW5sb2NrKCZtdXRleCk7DQogICAgICAgICAgICAgICAgc2NoZWRfeWllbGQoKTsNCiAgICAg
ICAgICAgICAgICBwdGhyZWFkX211dGV4X2xvY2soJm11dGV4KTsNCiAgICAgICAgICAgIH0NCiAg
ICAgICAgfQ0KDQogICAgICAgIC8vIDIuNC4gV2FpdCBmb3IgYXQgbGVhc3QgdHdvIHRocmVhZHMg
dG8gYmxvY2sgYWdhaW4gYWZ0ZXIgdGhlIHNpZ25hbHMgd2VyZQ0KICAgICAgICAvLyAgICAgIHNl
bnQuIEJ1dCBpZiB3ZSB3ZXJlIHJlbGVhc2luZyB0aGUgbXV0ZXggYmV0d2VlbiBzaWduYWxzIGl0
IGlzIHBvc3NpYmxlDQogICAgICAgIC8vICAgICAgdGhhdCB0b28gbXVjaCB0aHJlYWRzIHdva2Ug
dXAgYW5kIHJlZW50ZXJlZCB0aGUgd2FpdCBzbyB3ZSBuZWVkIHRvIG1ha2UNCiAgICAgICAgLy8g
ICAgICBzdXJlIHRoYXQgdGhlcmUgYXJlIGVub3VnaCB0aHJlYWRzIHRoYXQgaGF2ZW50IHdva2Vu
IHlldC4NCiAgICAgICAgaWYgKHNpZ25hbHNTZW50Q291bnQgLSBiZWZvcmVXYWl0ZXJzV29rZW5D
b3VudCA+IDIpIHsNCiAgICAgICAgICB3aGlsZSAoYWZ0ZXJXYWl0ZXJzQ291bnQgPCAyKSB7DQog
ICAgICAgICAgICAgIHB0aHJlYWRfbXV0ZXhfdW5sb2NrKCZtdXRleCk7DQogICAgICAgICAgICAg
IHNjaGVkX3lpZWxkKCk7DQogICAgICAgICAgICAgIHB0aHJlYWRfbXV0ZXhfbG9jaygmbXV0ZXgp
Ow0KICAgICAgICAgIH0NCiAgICAgICAgfQ0KDQogICAgICAgIC8vIDIuNS4gU2VuZCBhIHNpbmds
ZSBzaWduYWwNCiAgICAgICAgcHRocmVhZF9jb25kX3NpZ25hbCgmdGVzdENvbmQpOw0KDQoNCiAg
ICAgICAgLy8gMi42LiBXYWl0IGZvciBhIHByZWRlZmluZWQgdGltZSBmb3IgYXQgbGVhc3Qgc2ln
bmFsc1NlbnRDb3VudCAnYmVmb3JlJw0KICAgICAgICAvLyAgICAgIHdhaXRlcnMgdG8gd2FrZSB1
cC4NCiAgICAgICAgdGltZV90IGVuZFRpbWU7DQogICAgICAgIHRpbWVfdCBzdGFydFRpbWUgPSB0
aW1lKCZlbmRUaW1lKTsNCiAgICAgICAgd2hpbGUgKGJlZm9yZVdhaXRlcnNXb2tlbkNvdW50IDwg
c2lnbmFsc1NlbnRDb3VudCkgew0KICAgICAgICAgICAgcHRocmVhZF9tdXRleF91bmxvY2soJm11
dGV4KTsNCiAgICAgICAgICAgIHNjaGVkX3lpZWxkKCk7DQogICAgICAgICAgICBwdGhyZWFkX211
dGV4X2xvY2soJm11dGV4KTsNCg0KICAgICAgICAgICAgaWYgKHRpbWUoJmVuZFRpbWUpIC0gc3Rh
cnRUaW1lID4gd2FpdFRpbWVTZWMpIHsNCiAgICAgICAgICAgICAgICBicmVhazsNCiAgICAgICAg
ICAgIH0NCiAgICAgICAgfQ0KDQogICAgICAgIC8vIDIuNy4gSWYgc29tZSB0aHJlYWRzIHRoYXQg
c2hvdWxkIGhhdmUgYmVlbiB1bmJsb2NrZWQgYnkgdGhlIGZpcnN0DQogICAgICAgIC8vICAgICAg
c2lnbmFsc1NlbnRDb3VudCBzaWduYWxzIGhhdmUgcmVtYWluZWQgYmxvY2tlZCwgcmVwb3J0IHRo
YXQgdGhlDQogICAgICAgIC8vICAgICAgcmFjZSB3YXMgaGl0DQogICAgICAgIGlmIChiZWZvcmVX
YWl0ZXJzV29rZW5Db3VudCA8IHNpZ25hbHNTZW50Q291bnQpIHsNCiAgICAgICAgICAgIGNvdXQg
PDwgIlJhY2UgaGl0LiIgPDwgZW5kbA0KICAgICAgICAgICAgICAgICA8PCAiXHRXYWl0ZWQ6XHRc
dCIgPDwgZW5kVGltZSAtIHN0YXJ0VGltZSA8PCAicyIgPDwgZW5kbA0KICAgICAgICAgICAgICAg
ICA8PCAiXHRGYWlsZWQgdG8gd2FrZVx0OiAiIDw8IHNpZ25hbHNTZW50Q291bnQgLSBiZWZvcmVX
YWl0ZXJzV29rZW5Db3VudCA8PCBlbmRsDQogICAgICAgICAgICAgICAgIDw8ICJcdEV4dHJhIHdv
a2VuOlx0IiA8PCBhZnRlcldhaXRlcnNXb2tlbkNvdW50IC0gMSA8PCBlbmRsOw0KICAgICAgICAg
ICAgLy8gTm90aWZ5IGFsbCB3YWl0ZXJzIHRoYXQgdGhleSBzaG91bGQgZXhpdA0KICAgICAgICAg
ICAgZXhpdFByb2dyYW0gPSB0cnVlOw0KICAgICAgICB9IGVsc2Ugew0KICAgICAgICAgICAgY291
dCA8PCAiUmFjZSBub3QgaGl0LiIgPDwgZW5kbDsNCiAgICAgICAgfQ0KDQogICAgICAgIC8vIDIu
OC4gU2VuZCBhIGJyb2FkY2FzdCB0byBsZXQgYWxsIHdhaXRlcnMgbW92ZSB0byB0aGUgc3RhcnQg
b2YgdGhlIG5leHQNCiAgICAgICAgLy8gICAgICB0ZXN0IGl0ZXJhdGlvbiBvciBleGl0DQogICAg
ICAgIHB0aHJlYWRfY29uZF9icm9hZGNhc3QoJnRlc3RDb25kKTsNCiAgICB9DQoNCiAgICBwdGhy
ZWFkX211dGV4X3VubG9jaygmbXV0ZXgpOw0KICAgIHB0aHJlYWRfZXhpdChOVUxMKTsNCn0NCg==
</data>

          </attachment>
          <attachment
              isobsolete="0"
              ispatch="0"
              isprivate="0"
          >
            <attachid>5947</attachid>
            <date>2011-09-28 02:08:00 +0000</date>
            <delta_ts>2011-09-28 02:08:00 +0000</delta_ts>
            <desc>Simpler test, converted to pure C</desc>
            <filename>condvar_race_2.c</filename>
            <type>text/plain</type>
            <size>5728</size>
            <attacher name="Rich Felker">bugdal</attacher>
            
              <data encoding="base64">I2luY2x1ZGUgPHN0ZGlvLmg+CiNpbmNsdWRlIDxzdGRib29sLmg+CiNpbmNsdWRlIDxwdGhyZWFk
Lmg+CiNpbmNsdWRlIDxzY2hlZC5oPgojaW5jbHVkZSA8c3lzL3RpbWUuaD4KI2luY2x1ZGUgPHRp
bWUuaD4KI2luY2x1ZGUgPGVycm5vLmg+Cgpjb25zdCBpbnQgbWF4V2FpdGVyVGhyZWFkc0NvdW50
ID0gMzA7CmNvbnN0IGJvb2wgcmVsZWFzZU11dGV4QmV0d2VlblNpZ25hbHMgPSBmYWxzZTsKY29u
c3QgaW50IHdhaXRUaW1lU2VjID0gMTU7Cgpib29sIGV4aXRQcm9ncmFtOwppbnQgdGhyZWFkc0Nv
dW50OwppbnQgYmVmb3JlV2FpdGVyc0NvdW50OwppbnQgYWZ0ZXJXYWl0ZXJzQ291bnQ7CmludCBz
aWduYWxzU2VudENvdW50OwppbnQgYmVmb3JlV2FpdGVyc1dva2VuQ291bnQ7CmludCBhZnRlcldh
aXRlcnNXb2tlbkNvdW50OwppbnQgaXRlcmF0aW9uQ291bnQ7CgpwdGhyZWFkX211dGV4X3QgbXV0
ZXg7CnB0aHJlYWRfY29uZF90IHRlc3RDb25kOwoKdm9pZCogVGhyZWFkTWFpbih2b2lkKiBkYXRh
KQp7CiAgICAvLyBXYWl0IGZvciB0aGUgbWFpbiB0aHJlYWQgdG8gY3JlYXRlIGFsbCB3YWl0ZXIg
dGhyZWFkcwogICAgcHRocmVhZF9tdXRleF9sb2NrKCZtdXRleCk7CgogICAgd2hpbGUgKCFleGl0
UHJvZ3JhbSkgewogICAgICAgIC8vIDEuIFJlbWVtYmVyIHdoaWNoIGlzIHRoZSBjdXJyZW50IHRl
c3QgaXRlcmF0aW9uCiAgICAgICAgaW50IGN1cnJJdGVyYXRpb25Db3VudCA9IGl0ZXJhdGlvbkNv
dW50OwoKICAgICAgICAvLyAyLiBDaGVjayBpZiB3ZSBhcmUgc3RhcnRpbmcgdGhlIHdhaXQgYmVm
b3JlIGFsbCBzaWduYWxzIGhhdmUgYmVlbiBzZW50LgogICAgICAgIGJvb2wgYmVmb3JlQWxsU2ln
bmFsc1NlbnQgPSBzaWduYWxzU2VudENvdW50IDwgdGhyZWFkc0NvdW50OwoKICAgICAgICAvLyAz
LiBDb3VudCB0aGUgdGhyZWFkIHRvd2FyZHMgdGhlIGFwcHJvcHJpYXRlIGdyb3VwIG9mIHdhaXRl
cnMgKGJlZm9yZS9hZnRlcgogICAgICAgIC8vICAgIGFsbCBzaWduYWxzIGFyZSBzZW50KQogICAg
ICAgIGlmIChiZWZvcmVBbGxTaWduYWxzU2VudCkgewogICAgICAgICAgICArK2JlZm9yZVdhaXRl
cnNDb3VudDsKICAgICAgICB9IGVsc2UgewogICAgICAgICAgICArK2FmdGVyV2FpdGVyc0NvdW50
OwogICAgICAgIH0KCiAgICAgICAgLy8gNC4gV2FpdCBvbiB0aGUgY29uZGl0aW9uIHZhcmlhYmxl
LgogICAgICAgIHB0aHJlYWRfY29uZF93YWl0KCZ0ZXN0Q29uZCwgJm11dGV4KTsKCiAgICAgICAg
Ly8gNS4gSWYgdGhlIGl0ZXJhdGlvbiBjb3VudCBoYXMgY2hhbmdlZCwgc3RhcnQgb3ZlcgogICAg
ICAgIGlmIChjdXJySXRlcmF0aW9uQ291bnQgIT0gaXRlcmF0aW9uQ291bnQpIHsKICAgICAgICAg
ICAgY29udGludWU7CiAgICAgICAgfQoKICAgICAgICAvLyA2LiBDb3VudCB0aGUgd29rZW4gdGhy
ZWFkIHRvd2FyZHMgdGhlIGFwcHJvcHJpYXRlIGdyb3VwIG9mIHdva2VuIHRocmVhZHMgKHRoZQog
ICAgICAgIC8vICAgIHNhbWUgZ3JvdXAgYXMgc3RlcCAyKS4KICAgICAgICBpZiAoYmVmb3JlQWxs
U2lnbmFsc1NlbnQpIHsKICAgICAgICAgICAgKytiZWZvcmVXYWl0ZXJzV29rZW5Db3VudDsKICAg
ICAgICB9IGVsc2UgewogICAgICAgICAgICArK2FmdGVyV2FpdGVyc1dva2VuQ291bnQ7CiAgICAg
ICAgfQogICB9CiAgICBwdGhyZWFkX211dGV4X3VubG9jaygmbXV0ZXgpOwogICAgcHRocmVhZF9l
eGl0KE5VTEwpOwp9CgppbnQgbWFpbigpCnsKLy8gMS4gVGVzdCBzZXR1cDoKICAgIC8vIDEuMS4g
SW5pdGlhbGl6ZSB0aGUgbXV0ZXggYW5kIHRoZSBjb25kaXRpb24gdmFyaWFibGUKICAgIHB0aHJl
YWRfbXV0ZXhfaW5pdCgmbXV0ZXgsIE5VTEwpOwogICAgcHRocmVhZF9jb25kX2luaXQoJnRlc3RD
b25kLCBOVUxMKTsKCiAgICAvLyAxLjIuCgogICAgZXhpdFByb2dyYW0gPSBmYWxzZTsKCiAgICAv
LyAxLjMuIENyZWF0ZSBzb21lIHdhaXRlciB0aHJlYWRzLiBBbGwgd2FpdGVycyB3aWxsIGltZWRp
YXRlbHkgYmxvY2sgb24gdGhlIG11dGV4CiAgICBwdGhyZWFkX211dGV4X2xvY2soJm11dGV4KTsK
CiAgICBmb3IgKHRocmVhZHNDb3VudCA9IDA7IHRocmVhZHNDb3VudCA8IG1heFdhaXRlclRocmVh
ZHNDb3VudDsgKyt0aHJlYWRzQ291bnQpIHsKICAgICAgICBwdGhyZWFkX3QgdGhyZWFkOwoKICAg
ICAgICBpZiAocHRocmVhZF9jcmVhdGUoJnRocmVhZCwgTlVMTCwgVGhyZWFkTWFpbiwgTlVMTCkg
IT0gMCkgewogICAgICAgICAgICBicmVhazsKICAgICAgICB9CgogICAgICAgIHB0aHJlYWRfZGV0
YWNoKHRocmVhZCk7CiAgICB9CgogICAgcHJpbnRmKCJDcmVhdGVkICVkIHRocmVhZHMuXG4iLCB0
aHJlYWRzQ291bnQpOwovLyAgICBjb3V0IDw8ICJDcmVhdGVkICIgPDwgdGhyZWFkc0NvdW50IDw8
ICIgdGhyZWFkcy4iIDw8IGVuZGw7CgovLyAyLiBUZXN0IGJvZHkuIENvdW50IHRoZSB0ZXN0IGJv
ZHkgaXRlcmF0aW9ucyB0byBoZWxwIHdhaXRlcnMgZGV0ZWN0IHdoZW4gYQovLyAgICBuZXcgaXRl
cmF0aW9uIGhhcyBzdGFydGVkCiAgICBmb3IgKGl0ZXJhdGlvbkNvdW50ID0gMDsgIWV4aXRQcm9n
cmFtOyArK2l0ZXJhdGlvbkNvdW50KSB7CiAgICAgICAgLy8gMi4xLiBSZXNldCB0aGUgY291bnRl
cnMKICAgICAgICBiZWZvcmVXYWl0ZXJzQ291bnQgPSAwOwogICAgICAgIGFmdGVyV2FpdGVyc0Nv
dW50ID0gMDsKICAgICAgICBiZWZvcmVXYWl0ZXJzV29rZW5Db3VudCA9IDA7CiAgICAgICAgYWZ0
ZXJXYWl0ZXJzV29rZW5Db3VudCA9IDA7CiAgICAgICAgc2lnbmFsc1NlbnRDb3VudCA9IDA7Cgog
ICAgICAgIC8vIDIuMi4gV2FpdCBmb3IgYWxsIHdhaXRlcnMgdG8gYmxvY2sgb24gdGhlIGNvbmRp
dGlvbiB2YXJpYWJsZQogICAgICAgIHdoaWxlIChiZWZvcmVXYWl0ZXJzQ291bnQgPCB0aHJlYWRz
Q291bnQpIHsKICAgICAgICAgICAgcHRocmVhZF9tdXRleF91bmxvY2soJm11dGV4KTsKICAgICAg
ICAgICAgc2NoZWRfeWllbGQoKTsKICAgICAgICAgICAgcHRocmVhZF9tdXRleF9sb2NrKCZtdXRl
eCk7CiAgICAgICAgfQoKICAgICAgICAvLyAyLjMuIFNlbmQgYXMgbWFueSBpbmlkaXZpZHVhbCBz
aWduYWxzIGFzIHRoZXJlIGFyZSB3YWl0ZXIgdGhyZWFkcwogICAgICAgIGZvciAoOyBzaWduYWxz
U2VudENvdW50IDwgdGhyZWFkc0NvdW50OyArK3NpZ25hbHNTZW50Q291bnQpIHsKICAgICAgICAg
ICAgcHRocmVhZF9jb25kX3NpZ25hbCgmdGVzdENvbmQpOwoKICAgICAgICAgICAgaWYgKHJlbGVh
c2VNdXRleEJldHdlZW5TaWduYWxzKSB7CiAgICAgICAgICAgICAgICBwdGhyZWFkX211dGV4X3Vu
bG9jaygmbXV0ZXgpOwogICAgICAgICAgICAgICAgc2NoZWRfeWllbGQoKTsKICAgICAgICAgICAg
ICAgIHB0aHJlYWRfbXV0ZXhfbG9jaygmbXV0ZXgpOwogICAgICAgICAgICB9CiAgICAgICAgfQoK
ICAgICAgICAvLyAyLjQuIFdhaXQgZm9yIGF0IGxlYXN0IHR3byB0aHJlYWRzIHRvIGJsb2NrIGFn
YWluIGFmdGVyIHRoZSBzaWduYWxzIHdlcmUKICAgICAgICAvLyAgICAgIHNlbnQuIEJ1dCBpZiB3
ZSB3ZXJlIHJlbGVhc2luZyB0aGUgbXV0ZXggYmV0d2VlbiBzaWduYWxzIGl0IGlzIHBvc3NpYmxl
CiAgICAgICAgLy8gICAgICB0aGF0IHRvbyBtdWNoIHRocmVhZHMgd29rZSB1cCBhbmQgcmVlbnRl
cmVkIHRoZSB3YWl0IHNvIHdlIG5lZWQgdG8gbWFrZQogICAgICAgIC8vICAgICAgc3VyZSB0aGF0
IHRoZXJlIGFyZSBlbm91Z2ggdGhyZWFkcyB0aGF0IGhhdmVudCB3b2tlbiB5ZXQuCiAgICAgICAg
aWYgKHNpZ25hbHNTZW50Q291bnQgLSBiZWZvcmVXYWl0ZXJzV29rZW5Db3VudCA+IDIpIHsKICAg
ICAgICAgIHdoaWxlIChhZnRlcldhaXRlcnNDb3VudCA8IDIpIHsKICAgICAgICAgICAgICBwdGhy
ZWFkX211dGV4X3VubG9jaygmbXV0ZXgpOwogICAgICAgICAgICAgIHNjaGVkX3lpZWxkKCk7CiAg
ICAgICAgICAgICAgcHRocmVhZF9tdXRleF9sb2NrKCZtdXRleCk7CiAgICAgICAgICB9CiAgICAg
ICAgfQoKICAgICAgICAvLyAyLjUuIFNlbmQgYSBzaW5nbGUgc2lnbmFsCiAgICAgICAgcHRocmVh
ZF9jb25kX3NpZ25hbCgmdGVzdENvbmQpOwoKCiAgICAgICAgLy8gMi42LiBXYWl0IGZvciBhIHBy
ZWRlZmluZWQgdGltZSBmb3IgYXQgbGVhc3Qgc2lnbmFsc1NlbnRDb3VudCAnYmVmb3JlJwogICAg
ICAgIC8vICAgICAgd2FpdGVycyB0byB3YWtlIHVwLgogICAgICAgIHRpbWVfdCBlbmRUaW1lOwog
ICAgICAgIHRpbWVfdCBzdGFydFRpbWUgPSB0aW1lKCZlbmRUaW1lKTsKICAgICAgICB3aGlsZSAo
YmVmb3JlV2FpdGVyc1dva2VuQ291bnQgPCBzaWduYWxzU2VudENvdW50KSB7CiAgICAgICAgICAg
IHB0aHJlYWRfbXV0ZXhfdW5sb2NrKCZtdXRleCk7CiAgICAgICAgICAgIHNjaGVkX3lpZWxkKCk7
CiAgICAgICAgICAgIHB0aHJlYWRfbXV0ZXhfbG9jaygmbXV0ZXgpOwoKICAgICAgICAgICAgaWYg
KHRpbWUoJmVuZFRpbWUpIC0gc3RhcnRUaW1lID4gd2FpdFRpbWVTZWMpIHsKICAgICAgICAgICAg
ICAgIGJyZWFrOwogICAgICAgICAgICB9CiAgICAgICAgfQoKICAgICAgICAvLyAyLjcuIElmIHNv
bWUgdGhyZWFkcyB0aGF0IHNob3VsZCBoYXZlIGJlZW4gdW5ibG9ja2VkIGJ5IHRoZSBmaXJzdAog
ICAgICAgIC8vICAgICAgc2lnbmFsc1NlbnRDb3VudCBzaWduYWxzIGhhdmUgcmVtYWluZWQgYmxv
Y2tlZCwgcmVwb3J0IHRoYXQgdGhlCiAgICAgICAgLy8gICAgICByYWNlIHdhcyBoaXQKICAgICAg
ICBpZiAoYmVmb3JlV2FpdGVyc1dva2VuQ291bnQgPCBzaWduYWxzU2VudENvdW50KSB7CiAgICAg
ICAgICAgIHByaW50ZigiUmFjZSBoaXQuXG5cdFdhaXRlZDpcdFx0JWRzXG5cdEZhaWxlZCB0byB3
YWtlOlx0JWRcblx0RXh0cmEgd29rZW46XHQlZFxuIiwKICAgICAgICAgICAgICAgIGVuZFRpbWUt
c3RhcnRUaW1lLCBzaWduYWxzU2VudENvdW50LWJlZm9yZVdhaXRlcnNXb2tlbkNvdW50LAogICAg
ICAgICAgICAgICAgYWZ0ZXJXYWl0ZXJzV29rZW5Db3VudC0xKTsKLy8gICAgICAgICAgICBjb3V0
IDw8ICJSYWNlIGhpdC4iIDw8IGVuZGwKLy8gICAgICAgICAgICAgICAgIDw8ICJcdFdhaXRlZDpc
dFx0IiA8PCBlbmRUaW1lIC0gc3RhcnRUaW1lIDw8ICJzIiA8PCBlbmRsCi8vICAgICAgICAgICAg
ICAgICA8PCAiXHRGYWlsZWQgdG8gd2FrZVx0OiAiIDw8IHNpZ25hbHNTZW50Q291bnQgLSBiZWZv
cmVXYWl0ZXJzV29rZW5Db3VudCA8PCBlbmRsCi8vICAgICAgICAgICAgICAgICA8PCAiXHRFeHRy
YSB3b2tlbjpcdCIgPDwgYWZ0ZXJXYWl0ZXJzV29rZW5Db3VudCAtIDEgPDwgZW5kbDsKICAgICAg
ICAgICAgLy8gTm90aWZ5IGFsbCB3YWl0ZXJzIHRoYXQgdGhleSBzaG91bGQgZXhpdAogICAgICAg
ICAgICBleGl0UHJvZ3JhbSA9IHRydWU7CiAgICAgICAgfSBlbHNlIHsKICAgICAgICAgICAgcHJp
bnRmKCJSYWNlIG5vdCBoaXQuXG4iKTsKLy8gICAgICAgICAgICBjb3V0IDw8ICJSYWNlIG5vdCBo
aXQuIiA8PCBlbmRsOwogICAgICAgIH0KCiAgICAgICAgLy8gMi44LiBTZW5kIGEgYnJvYWRjYXN0
IHRvIGxldCBhbGwgd2FpdGVycyBtb3ZlIHRvIHRoZSBzdGFydCBvZiB0aGUgbmV4dAogICAgICAg
IC8vICAgICAgdGVzdCBpdGVyYXRpb24gb3IgZXhpdAogICAgICAgIHB0aHJlYWRfY29uZF9icm9h
ZGNhc3QoJnRlc3RDb25kKTsKICAgIH0KCiAgICBwdGhyZWFkX211dGV4X3VubG9jaygmbXV0ZXgp
OwogICAgcHRocmVhZF9leGl0KE5VTEwpOwp9Cg==
</data>

          </attachment>
      

    </bug>

</bugzilla>