This is the mail archive of the
mailing list for the glibc project.
RE: Implementing C++1x and C1x atomics (really an aside on SFENCE)
> -----Original Message-----
> From: Lawrence Crowl [mailto:firstname.lastname@example.org]
> The problem is that gcc does support 80386. It also supports
> other processors that have less-than-complete support for
> concurrency. Just in the x86 line, we get some additional
> capability in many new layers.
> 8086 LOCK XCHG
> 80486 CMPXCHG XADD
> Pentium CMPXCHG8B
> SSE SFENCE
Aside to an interesting discussion:
I believe the current conclusion is that SFENCE should be ignored, except in library or compiler-generated code that uses non-temporal/coalescing stores, which I believe are also a recent addition. Ordinary x86 stores already become visible in program order, so SFENCE adds nothing for them. Thus you are faced with a choice: either (a) implement fences on the assumption that ordinary code may contain non-temporal stores, or (b) make sure that non-temporal stores are always surrounded by the appropriate fences.
This is really an important ABI issue, but as far as I know no ABI currently specifies it. Our conclusion in earlier discussions among a different group of people was that (b) made more sense, since non-temporal stores of various kinds seemed to be largely confined to a few library routines.
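To make option (b) concrete, here is a minimal sketch in C with the SSE2 intrinsics from <emmintrin.h>, assuming an x86 target with SSE2. The routine name nt_fill is hypothetical and is not taken from any real library. The routine issues its own SFENCE before returning, so callers never see an unfenced non-temporal store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2: _mm_stream_si32; also provides _mm_sfence */

/* Hypothetical library routine (name is an assumption for illustration):
 * fill a buffer using non-temporal stores. */
static void nt_fill(int32_t *dst, int32_t value, size_t n)
{
    assert(dst != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        _mm_stream_si32((int *)&dst[i], value);  /* MOVNTI, weakly ordered */

    /* Option (b): the code that issued the non-temporal stores fences
     * them itself, so ordinary stores executed after this routine
     * returns cannot become visible before the data written above. */
    _mm_sfence();
}
```

Under this convention, fences and lock releases elsewhere can keep assuming that only ordinary, ordered stores are outstanding.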
It would be really nice if everyone somehow managed to agree on this. Inconsistency here, probably even between Windows and Linux, seems likely to result in really subtle bugs.
Note that this also affects correctness of spinlock implementations, not just atomics. A plain store to release a lock is not sufficient if the critical section may contain unfenced non-temporal stores, since those stores can become visible to other processors after the lock already appears free.
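The spinlock case can be sketched as follows, in C11 with <stdatomic.h> and the SSE2 intrinsics, again assuming an x86 target. The type demo_spinlock and the functions demo_lock, demo_unlock and demo_unlock_fenced are hypothetical names for illustration, not part of glibc or of any ABI. The first release is the usual plain store, which is only safe under convention (b). The second shows what convention (a) would require of every release.

```c
#include <assert.h>
#include <stdatomic.h>
#include <emmintrin.h>  /* _mm_sfence, _mm_pause, _mm_stream_si32 */

/* Hypothetical spinlock type: 0 = free, 1 = held. */
typedef struct { atomic_int locked; } demo_spinlock;

static void demo_lock(demo_spinlock *l)
{
    /* XCHG with a memory operand is implicitly locked on x86. */
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire))
        _mm_pause();
}

/* Usual x86 release: a plain store. Sufficient only if the critical
 * section left no unfenced non-temporal stores behind (convention (b)). */
static void demo_unlock(demo_spinlock *l)
{
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Release under convention (a): assume the critical section may have
 * issued non-temporal stores and fence them before freeing the lock. */
static void demo_unlock_fenced(demo_spinlock *l)
{
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
    _mm_sfence();
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The cost of (a) is an SFENCE on every unlock, whether or not the critical section used non-temporal stores, which is one reason (b) looks more attractive.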
> SSE2 MFENCE
> late AMD64 CMPXCHG16B
> So, we do not get to ignore the problem as a relic of 80386.