This is the mail archive of the
mailing list for the glibc project.
Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
- From: Rich Felker <dalias at aerifal dot cx>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>, "GNU C. Library" <libc-alpha at sourceware dot org>
- Date: Sun, 14 Sep 2014 20:46:23 -0400
On Sun, Sep 14, 2014 at 08:00:49PM +0200, Torvald Riegel wrote:
> > Actually, cancellation of pthread_cond_[timed]wait is complicated.
> > Depending on how unblocking a waiter works, it's possible that the
> > thread being cancelled has already "consumed the signal", and
> > therefore can't act on cancellation. This is a case where the program
> > counter at cancellation signal time is not sufficient to determine if
> > cancellation can be acted upon; the decision needs to be made later
> > based on userspace criteria (cond var state), not based on the
> > completion or non-completion of the futex syscall.
> If something that got cancelled has consumed a signal already, then this
> isn't visible to other threads yet except that they don't wake up. Have
> you considered sending another signal (which is indistinguishable from
> the one consumed by the cancelled thread) to undo the consumption?
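For concreteness, here is a minimal sketch of the kind of program under discussion: two threads block in pthread_cond_wait, one of them is cancelled, and a single pthread_cond_signal is sent. The names (waiter, run_demo, the counters) are made up for illustration and are not from glibc or any patch in this thread. To keep the result deterministic, the sketch joins the cancelled thread before signalling, so it deliberately does not exercise the race itself; the hard case is when the cancel and the signal overlap.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical shared state for this illustration only. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool ready;   /* predicate, guarded by lock */
static int waiting;  /* threads that have reached the wait loop */
static int woken;    /* threads that saw the predicate and returned */

static void must(int rc) { if (rc != 0) abort(); }

/* Cleanup handler: pthread_cond_wait reacquires the mutex before a
   cancelled thread runs its handlers, so we have to release it. */
static void unlock_cleanup(void *arg)
{
    pthread_mutex_unlock(arg);
}

static void *waiter(void *arg)
{
    (void)arg;
    must(pthread_mutex_lock(&lock));
    pthread_cleanup_push(unlock_cleanup, &lock);
    waiting++;
    while (!ready)
        must(pthread_cond_wait(&cond, &lock));  /* cancellation point */
    woken++;
    pthread_cleanup_pop(1);  /* runs unlock_cleanup on the normal path */
    return NULL;
}

/* Returns the number of waiters that were woken by the single signal. */
int run_demo(void)
{
    pthread_t a, b;
    void *res_a = NULL;

    must(pthread_mutex_lock(&lock));
    ready = false;
    waiting = 0;
    woken = 0;
    must(pthread_mutex_unlock(&lock));

    must(pthread_create(&a, NULL, waiter, NULL));
    must(pthread_create(&b, NULL, waiter, NULL));

    /* Wait until both threads are blocked in pthread_cond_wait. */
    for (;;) {
        must(pthread_mutex_lock(&lock));
        int n = waiting;
        must(pthread_mutex_unlock(&lock));
        if (n == 2)
            break;
        sched_yield();
    }

    /* Cancel one waiter and let the cancellation finish first. */
    must(pthread_cancel(a));
    must(pthread_join(a, &res_a));
    assert(res_a == PTHREAD_CANCELED);

    /* One signal: it must reach the remaining waiter, not be lost. */
    must(pthread_mutex_lock(&lock));
    ready = true;
    must(pthread_cond_signal(&cond));
    must(pthread_mutex_unlock(&lock));

    must(pthread_join(b, NULL));
    return woken;
}
```

If pthread_cancel and pthread_cond_signal were instead issued concurrently, the implementation would have to decide whether the cancelled thread had already consumed the signal, which is exactly the question at issue here.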
It's not that simple, because it's hard to guarantee waking a waiter
from the right set. The signal that was consumed by the cancelled
thread must wake a waiter which was already a waiter at the time of
that signal, not another waiter that arrives later. There may be other