This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [RFC] mutex destruction (#13690): problem description and workarounds
- From: "Carlos O'Donell" <carlos at redhat dot com>
- To: Rich Felker <dalias at libc dot org>
- Cc: Torvald Riegel <triegel at redhat dot com>, GLIBC Devel <libc-alpha at sourceware dot org>
- Date: Mon, 01 Dec 2014 12:30:13 -0500
- Subject: Re: [RFC] mutex destruction (#13690): problem description and workarounds
- References: <1396621230 dot 10643 dot 7191 dot camel at triegel dot csb> <54762514 dot 2030301 at redhat dot com> <20141201154420 dot GW29621 at brightrain dot aerifal dot cx>
On 12/01/2014 10:44 AM, Rich Felker wrote:
> On Wed, Nov 26, 2014 at 02:08:04PM -0500, Carlos O'Donell wrote:
>>> === Workaround 2: New FUTEX_UNLOCK operation that makes resetting the
>>> futex var and unblocking other threads atomic wrt. other FUTEX_WAIT ops
>>> This is like UNLOCK_PI, except for not doing PI. Variations of this new
>>> futex op could store user-supplied values, or do a compare-and-set (or
>>> similar) read-modify-write operation. FUTEX_WAKE_OP could be used as
>>> well, but we don't need to wake another futex (unnecessary overhead).
>>> (I haven't checked the kernel's FUTEX_WAKE_OP implementation, and there
>>> might be reasons why it can't be used as is, e.g., due to how things
>>> are locked regarding the second futex.)
>>> The biggest performance drawback that I see is a potential increase in
>>> the latency of unlocking any thread (blocked *or* spinning) when any
>>> thread is blocked. This is because we'll have to ask the kernel to
>>> reset the futex var (instead of, like now, userspace doing it), which
>>> means that we'll have to enter the kernel first before a spinning thread
>>> can get the okay to acquire the lock. This could decrease lock
>>> scalability for short critical sections in particular because those
>>> effectively get longer.
>>> I don't think it's sufficient to merely count the number of waiting
>>> blocked threads in the futex var to get around pending FUTEX_WAKE calls.
>>> If there is *any* potentially blocked waiter, we'll have to use the
>>> kernel to reset the futex var.
>>> Perhaps this could be mitigated if we'd do a lot more spinning in the
>>> futexes, so that it's unlikely to slow down spinning waiters just
>>> because there's some blocked thread. For blocked threads, the slowdown
>>> should be smaller: if a waiter is going to block, there is only a
>>> small time window in which the FUTEX_WAIT will actually fail
>>> (EWOULDBLOCK) due to the futex var changing concurrently.
>>> * Correct futex uses will need no changes.
>>> * glibc implementation will have to change (mutexes, semaphores,
>>> barriers, perhaps condvars).
>>> * Needs a new futex operation (or might use FUTEX_WAKE_OP with some
>>> performance penalty).
>>> * Potential decrease in lock scalability unless it can be mitigated by
>>> more aggressive spinning.
>> This is a "strict semantic" workaround, and the kernel does the work.
>> I don't like this solution; the performance impact is not worth it given
>> the other workarounds. Without any measurements I don't know exactly how
>> bad it is, but having to enter the kernel is costly enough that we
>> shouldn't consider it.
>>> === Summary / RFC
>>> IMO, workarounds 1, 1a, or 2 might be the best ones, although 4 might
>>> also be good.
>> I vote 2. I don't want to relax the semantics of the original operations
>> to support spurious wakes. Such spurious wakes might be fixed in future
>> kernels and maybe other applications can start depending on them even if
>> glibc doesn't.
> You said you don't like 2, but voted for 2? I'm confused.
That was a mistake.
I did not notice Torvald numbered it 1a instead of 2.
I vote 1a :}