This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Thread-, Signal- and Cancellation-safety documentation


On Sun, 2013-06-02 at 02:06 -0300, Alexandre Oliva wrote:
> On Jun  1, 2013, Alexandre Oliva <aoliva@redhat.com> wrote:
> 
> > What must not happen is for mem synch primitives, called by users, to
> > fail to propagate effects of posix API calls executed locally to be
> > published to global memory, and make effects of API calls executed in
> > other threads before their latest mem synch primitives visible to the
> > local thread.  Other than that, all bets are off.
> 
> Now, bringing that back to the topic that matters to the project I'm on,
> and trying to show that the guarantees you're looking for are not
> given by posix, consider this:
> 
> char a, b; // globals
> int pipefd[2]; // the two ends of a pipe internal to this process
> 
> thread1 {
>   a = 42;
>   b = 6 * 9;
>   write (pipefd[1], &b, 1);
> }
> 
> thread2 {
>   read (pipefd[0], &a, 1);
>   printf ("%i\n", a);
> }
> 
> 
> Now, obviously there's a happens-before relationship between the write
> and the read, right?

More precisely, there is an ordering implied between the start of the
write and the return of the read.
This would still leave the program with a data race, because read()
accesses a before that ordering point (i.e., before the read returns).

So let's use this example instead:

thread1 {
  a = 42;
  b = a;  // Reveal the program logic / intra-thread order to thread2
  write (pipefd[1], &b, 1);
}

thread2 {
  tmp = 23;
  read (pipefd[0], &tmp, 1);
  printf ("%i %i\n", tmp, a); // Show both a and what we read.
}



> However, since neither read nor write are memory
> synchronization operations, nothing in posix guarantees that the write
> to a in thread1 won't prevail over the write to a implied by the read
> call in thread2: in the absence of a memory sync operation, it is a data
> race, that invokes undefined behavior, even when there's an obvious
> happens-before.

Okay so let's look at the revised example from the perspective of a
programmer.  You say there's an *obvious* happens-before (or, IOW, an
implied ordering).  And this makes sense, because otherwise we wouldn't
read the value 42 at all.  But at the same time, you're saying that this
doesn't actually have a happens-before order.  That's just confusing
to programmers.

> Now, both read and write are thread safe, so it follows that being
> thread safe doesn't offer ordering guarantees, not even when a
> happens-before is present!

First, let's keep the terminology straight: happens-before is for the
memory model.  Second, we already established that there are ordering
guarantees, notably that thread-safe functions respect happens-before
relations established by the caller.

So, you are arguing that logical orderings that are implied by how
functions behave (eg, which values they return) do not imply
happens-before relations *of any kind*.  In turn, this means that we can
get behavior that would not be possible in a sequentially consistent
execution of your program (ie, in which all thread-safe functions are
executed sequentially).  Do you see the difficulties that can arise from
this for programmers?

Furthermore, you'd still need to make an atomicity requirement for
thread-safe functions.  Otherwise, if they can consist of several steps,
the sequential specifications we started with aren't sufficient anymore
to describe their behavior.

Thread-safe functions are supposed to make concurrency *easier* for
programmers.  We could specify that thread-safe functions behave like
C/C++ atomics with relaxed memory order, but I don't think this makes
sense as the default because it's much harder to use.  There's a reason
why C++11 atomics have sequential consistency as default memory order...

> How could this be possible?  Well, precisely because the internal
> implementation of the read and write calls could use internal mechanisms
> that, in spite of flushing the data in the write buffer all the way to
> the read buffer in another thread, does not guarantee that any other
> data is flushed.

I asked you whether you wanted the semantics of release/consume, and you
declined.  Now you want to have similarly weak semantics as default for
everyone?  I'm a little confused :)

What we are arguing about here, in terms of the C/C++ memory model, is
essentially the difference between requiring release and consume memory
orders.  In terms of implementations, there's no difference between the
two in the x86 and SPARC memory models.  On Power, consume allows for
using somewhat weaker memory barriers, but only on the consumer side.

So, you *might* get a little benefit on some architectures and if the
implementation doesn't use locks to synchronize on the pipe (because the
standard locks give you the stronger release/acquire by default; who
would have thought? :) ).  Comparing that to all the other
synchronization overheads that we have on a pipe, does it really matter
that much that we need to make the complicated semantics the *default*?
IMO, it would be much better to have a more intuitive default that is
easier to use, and provide a version with weaker guarantees -- if
there's actually demand for it.

> Now, using streams rather than file descriptors would not make any
> difference as far as ordering guarantees are concerned: in spite of the
> obvious happens-before, no memory synchronization is guaranteed.
> Indeed, nothing in posix precludes the FILE* to be fake pointers used
> just as identifiers, or pointers to memory accessible only to the
> kernel, as long as the stream manipulating interfaces behave as
> specified; they could all be atomic system calls, or they could use any
> form of magic (or sufficiently advanced technology ;-) to implement
> behavior that meets the specification while ensuring consistency of the
> internal data structures even in the presence of concurrent calls.
> 
> Even the availability of explicit stream locking (flockfile) does not
> bring memory synchronization with it, so in spite of mutual exclusion,
> ordering is not guaranteed.  I don't see any requirement that would
> render non-compliant an implementation of stream write operations that,
> given multiple writes within a flockfile/funlockfile pair, queued them
> up in memory local to the thread, getting them all ordered and written
> out at subsequent explicit mem sync calls, or implicitly.
> 
> Any evidence that this is not so?

Trying to answer this question doesn't make sense because we don't know
whether the incomplete definition in the standard is meant to represent
weak guarantees or stronger guarantees (remember that it doesn't seem to
specify even the minimal guarantees we need, eg, atomicity).  Yeah we
can interpret the standard as not giving anything stronger, because it
doesn't even give the weaker guarantees.

So from a language/standards lawyer perspective, I agree that the
implementation can be considered compliant, because it does provide a
guarantee -- just one that is too vague and too weak to be useful.(*)

But should this guide what we want to guarantee to users?  I don't think
so.

((*) Let's ignore that our locks -- rightfully so -- don't guarantee to
be like full memory barriers, so you don't get the total order for
synchronization operations.)


Torvald

