Bug 17428 - lll_trylock barrier semantics when lock was not acquired differ between architectures
Status: NEW
Alias: None
Product: glibc
Classification: Unclassified
Component: nptl
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2014-09-23 17:54 UTC by Torvald Riegel
Modified: 2018-01-24 16:21 UTC
fweimer: security-


Description Torvald Riegel 2014-09-23 17:54:54 UTC
This matters because lll_trylock is used by pthread_mutex_trylock. POSIX's barrier requirements are impractical IMO (the wording would require full barriers for every synchronization function), but they are stronger than what all archs provide.  C++11 only requires weak barrier semantics (essentially memory_order_relaxed, and spurious false negatives, i.e., trylock reporting that it can't acquire the lock although it could, are allowed); these semantics make more sense IMHO.

The default (/sysdeps/nptl/lowlevellock.h) uses atomic_compare_and_exchange_bool_acq, which is memory_order_relaxed on the failure path of the CAS (lock not acquired).  ARM uses the default.

Power has memory_order_acquire on the failure path.  x86/x86_64 as well, effectively.
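
To make the difference concrete, here is a minimal sketch written with C11 atomics rather than glibc's internal macros. The function names trylock_relaxed_fail and trylock_acquire_fail are hypothetical, and both the lock-word encoding (0 = unlocked, 1 = locked) and the "0 means acquired" return convention are assumptions of the sketch, not copies of the actual lowlevellock code:

```c
#include <stdatomic.h>

/* Hypothetical lock word for illustration: 0 = unlocked, 1 = locked. */

/* Variant A: acquire on success, relaxed on failure.
   A failed attempt imposes no ordering on surrounding accesses. */
static int trylock_relaxed_fail(atomic_int *futex)
{
    int expected = 0;
    return !atomic_compare_exchange_strong_explicit(
        futex, &expected, 1,
        memory_order_acquire,    /* ordering if the lock was acquired */
        memory_order_relaxed);   /* ordering if it was not */
}

/* Variant B: acquire on both paths.
   Even a failed attempt acts as an acquire load of the lock word. */
static int trylock_acquire_fail(atomic_int *futex)
{
    int expected = 0;
    return !atomic_compare_exchange_strong_explicit(
        futex, &expected, 1,
        memory_order_acquire,
        memory_order_acquire);
}
```

Both variants return the same values for the same lock state; they differ only in what a caller may infer about memory ordering after a failed attempt.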

We should both unify this and, IMO, work towards improving POSIX' semantics.
Comment 1 Rich Felker 2014-09-23 18:28:38 UTC
Could you explain the basis of your claim that C11 mtx_trylock allows spurious failures? I suspect this is a misstatement of something rather different: roughly, that there are situations where operations are non-ordered to an extent that both a view that the mutex is already locked, and a view that it's not already locked, are both consistent with everything else observable. But in cases where the mutex is, from the calling thread's perspective, provably in either the locked or unlocked state, I think mtx_trylock is required to produce an accurate result. I don't see any language that would exempt it from this requirement that's specified in its normative Description/Returns text (7.26.4.5 p2-3).

As for "improving" POSIX's semantics, that's another matter, and it really depends on the outcome of the alignment process and further action by the Austin Group and sponsors. I'm still not clear on whether weakening trylock would be an improvement, and my view largely depends on the degree to which such a weakening could be inconsistent/counter-intuitive and lead to bugs, especially in applications which were correct with respect to a previous issue of the standard. But I think it's outside the scope of this glibc tracker issue.
Comment 2 Torvald Riegel 2014-09-23 20:58:00 UTC
See the C++ standard 30.4.1.2/16 (note that I spoke about C++11).

C11 doesn't spell out the spurious failures, but C++11 allows them, IIRC, to be able to give the guarantee to programmers that a data-race-free program will behave like under sequential consistency if only locks and sequentially-consistent atomics are used; that won't work with a trylock because then you can use a lock like a single-assignment atomic variable, revealing non-seq_cst semantics.  C11 wants to follow the same memory model, so I believe it should also allow the spurious failures.  There are several things that C11 doesn't say although it should...
Anyway, even C11 doesn't give a synchronizes-with guarantee for trylock that fails to acquire.

POSIX' semantics do matter because that's what pthread_mutex_trylock is meant to implement.
Comment 3 Torvald Riegel 2017-04-06 12:11:29 UTC
I do not remember who brought this up, but one can argue that the POSIX requirements on synchronization only apply to function calls that do not fail.  This seems sensible to me.  It means that trylocks that fail by returning EBUSY do not imply any synchronization effects, so implementing lock acquisition attempts in trylock with an atomic_compare_exchange_weak_acquire (which has relaxed MO semantics on its failure path and can fail spuriously) would be okay.
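
As a rough illustration of that approach, here is a sketch using the C11 equivalent of such a weak acquire CAS. The name sketch_trylock_weak and the 0 = unlocked / 1 = locked encoding are hypothetical; this is not the glibc implementation:

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical trylock: a single weak CAS, acquire on success and
   relaxed on failure. Lock word encoding assumed: 0 = unlocked, 1 = locked. */
static int sketch_trylock_weak(atomic_int *lock_word)
{
    int expected = 0;
    if (atomic_compare_exchange_weak_explicit(
            lock_word, &expected, 1,
            memory_order_acquire,    /* lock acquired: synchronize */
            memory_order_relaxed))   /* not acquired: no ordering */
        return 0;
    /* Reached when the lock is held, but possibly also when the weak
       CAS failed spuriously while the lock was free. */
    return EBUSY;
}
```

Note that this version can report EBUSY for an unlocked mutex on architectures where a weak CAS can fail spuriously.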
Comment 4 Torvald Riegel 2017-04-06 12:21:40 UTC
See http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html, subsection "Memory Synchronization":

"Unless explicitly stated otherwise, if one of the above functions returns an error, it is unspecified whether the invocation causes memory to be synchronized."

This allows for using just relaxed memory order when observing that a lock is already acquired and returning EBUSY due to that; however, it is unclear whether this allows for using a weak CAS that may fail spuriously (i.e., returning EBUSY because of a spurious failure).
Comment 5 Rich Felker 2017-04-11 01:28:48 UTC
Perhaps I'm being redundant saying this again, but I believe that whether the function synchronizes memory on failure (I agree that it doesn't have to) is a separate issue from whether it's permitted to return an error when the conditions for that error code are not met. The answer to the latter is no, but there are subtleties to the condition itself that make the topic nontrivial still.

I don't believe there is any viable argument for returning EBUSY simply due to a weak CAS, when the mutex was never locked or the semaphore value was never zero, and regardless of any ordering (perhaps this is even a single-threaded process!) the operation should succeed. Likewise if you have a strong CAS, but the CAS failure is due to something that wouldn't preclude the operation from succeeding (like waiting on a semaphore whose value is >1 from another thread), this should not be able to cause EBUSY. The only case in which there is an argument to be made that a "spurious" EBUSY is valid is when, despite a lack of other operations that synchronize memory, some other side channel (like a pipe) establishes an ordering whereby the operation "should have" succeeded.

If no such order-imposing operation is present at all, then I think it's clear that you can "spuriously" fail whenever there's any possible ordering that would have produced failure.
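
One way to keep a weak CAS while never reporting EBUSY for a lock that was observed to be free is to retry when the failure was spurious. The following is only a sketch under assumptions: the name sketch_trylock_no_spurious and the 0 = unlocked / 1 = locked encoding are hypothetical, and it does not address the side-channel ordering question discussed above:

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical trylock that filters out spurious weak-CAS failures.
   Lock word encoding assumed: 0 = unlocked, 1 = locked. */
static int sketch_trylock_no_spurious(atomic_int *lock_word)
{
    int expected = 0;
    while (!atomic_compare_exchange_weak_explicit(
               lock_word, &expected, 1,
               memory_order_acquire,    /* lock acquired: synchronize */
               memory_order_relaxed)) { /* not acquired: no ordering */
        /* On failure, expected holds the value that was observed. */
        if (expected != 0)
            return EBUSY;   /* the lock really was held */
        /* Observed 0, so the failure was spurious: try again. */
    }
    return 0;
}
```

The failure path stays relaxed, so a genuine EBUSY still implies no synchronization; only the spurious case is retried.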