pselect is an operation that must be performed atomically. As such, the only
race-free implementation is one done in the kernel. If the race exists, then it
is possible that "select" will hang until the timeout (or forever), because the
signal that the programmer thought would wake it up arrived before "select" was
called. The glibc implementation exists only as a stopgap for platforms where the
function is not defined, to encourage people to use it anyway, and is known not
to cover 100% of the cases.
That being said, the current pselect implementation makes the race condition
worse, almost guaranteeing that the race will take place.
The current implementation looks like this:
1: sigprocmask // Enable the signals
2: select // Perform the actual select
3: sigprocmask // Re-disable the signals
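The three steps above can be sketched in C roughly as follows. This is a minimal illustration, not the actual glibc source: the names emulated_pselect, lost_wakeup_demo, on_signal and got_signal are hypothetical, and SIGUSR1 and the 200 ms timeout are arbitrary choices. The demo makes a signal pending while it is blocked, so it is delivered inside step 1 and the select in step 2 sleeps for the full timeout instead of being interrupted.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

/* Set by the handler so the caller can tell the signal was delivered. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Hypothetical name: a userspace emulation following the three steps. */
static int emulated_pselect(int nfds, fd_set *readfds, fd_set *writefds,
                            fd_set *exceptfds, const struct timespec *timeout,
                            const sigset_t *sigmask)
{
    struct timeval tv;
    struct timeval *tvp = NULL;
    sigset_t saved;
    int ret;

    if (timeout != NULL) {
        tv.tv_sec = timeout->tv_sec;
        tv.tv_usec = timeout->tv_nsec / 1000;
        tvp = &tv;
    }

    if (sigmask != NULL)
        sigprocmask(SIG_SETMASK, sigmask, &saved);          /* 1: enable */
    /* A signal that was pending is delivered right here, before select()
       has started, so it cannot interrupt the wait below. */
    ret = select(nfds, readfds, writefds, exceptfds, tvp);  /* 2: wait */
    if (sigmask != NULL)
        sigprocmask(SIG_SETMASK, &saved, NULL);             /* 3: re-disable */
    return ret;
}

/* Hypothetical demo: make SIGUSR1 pending while blocked, then wait 200 ms.
   Returns whatever emulated_pselect() returns. */
static int lost_wakeup_demo(void)
{
    struct sigaction sa;
    sigset_t block, unblocked;
    struct timespec timeout = { 0, 200 * 1000 * 1000 };

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &unblocked);
    sigdelset(&unblocked, SIGUSR1);

    got_signal = 0;
    raise(SIGUSR1);  /* stays pending because SIGUSR1 is blocked */

    return emulated_pselect(0, NULL, NULL, NULL, &timeout, &unblocked);
}
```

With this emulation, lost_wakeup_demo() should return 0 (a timeout) even though the handler has run, whereas an atomic pselect would be expected to fail with EINTR as soon as the pending signal is delivered.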
A typical use scenario would be:
6: if( signal happened ) ...
7: Do something not signal related
8: loop over the while
In the current implementation, any signal arriving after the sigprocmask in line
3, and before the "select" in line 2 is GUARANTEED to trigger the race
condition, as the signal will take effect as soon as the sigprocmask in line 1
takes place, necessarily before the select in line 2. This means the chances for
the race are directly proportional to the relative amount of time the program
spends doing something other than waiting on the select.
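For illustration, one common shape of such a loop is sketched below. It is not the reporter's code: the names setup_signal, wait_once, note_signal and signal_happened are hypothetical, and SIGUSR1 is an arbitrary choice. The signal is kept blocked everywhere except inside pselect, which is only safe if the mask switch and the wait happen atomically.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

/* Flag set by the handler and tested by the main loop. */
static volatile sig_atomic_t signal_happened = 0;

static void note_signal(int signo)
{
    (void)signo;
    signal_happened = 1;
}

/* Hypothetical helper: install the handler, block SIGUSR1 for normal
   execution, and store in *wait_mask the mask to use while waiting
   (the previous mask with SIGUSR1 removed). Returns 0 on success. */
static int setup_signal(sigset_t *wait_mask)
{
    struct sigaction sa;
    sigset_t block;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = note_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, wait_mask) == -1)
        return -1;
    sigdelset(wait_mask, SIGUSR1);
    return 0;
}

/* Hypothetical helper: one pass of the loop body. Waits in pselect(), then
   checks the flag. Returns 1 if the signal was seen, 0 if not, -1 on an
   error other than EINTR. */
static int wait_once(const sigset_t *wait_mask, const struct timespec *timeout)
{
    int ret = pselect(0, NULL, NULL, NULL, timeout, wait_mask);

    if (ret == -1 && errno != EINTR)
        return -1;
    if (signal_happened) {
        signal_happened = 0;
        return 1;
    }
    return 0;
}
```

A caller would run setup_signal() once and then call wait_once() in a loop, doing its non-signal work between calls. Any signal that arrives during that work stays pending until the next pselect, which is exactly the case the emulation mishandles.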
I am attaching a modified implementation of pselect that greatly reduces the
window in which the race can take effect, limiting it to only within the actual
Created attachment 3710
Proposed patch to narrow the race window
Proposed patch to the problem
Forgot to add - in the above patch, NSIG_LONGS is undefined. Here is its definition:
// Number of __vals in sigset_t that actually contain useful data
#define NSIG_LONGS (_NSIG/(8*sizeof(((sigset_t *)NULL)->__val[0])))
Created attachment 3712
Program demonstrating the problem
This program demonstrates the problem. Under a kernel with pselect support, it
And exits almost immediately.
Shachar, I suspect that it's not worth trying to make the fix you suggest. The fix will only appear in modern glibc, and any modern system will have a kernel-supported pselect(). The fundamental problem can't be remedied: the idea to add a userspace implementation of pselect() was extremely muddleheaded, and worsens portability problems for applications. The portability question goes from being "do I have pselect() or not?" to "do I have a pselect() or not, and if I do, is it one that works?"; the last part of the second question can only be verified with a check of the kernel (and glibc) versions.
According to http://lwn.net/Articles/176911/ pselect appeared in kernel 2.6.16, and the minimum required kernel version is now 2.6.16, so this patch is moot.
On various architectures, pselect was only added in later kernel versions. Please carefully check *all* kernel-features.h files in glibc, or kernel sources of appropriate versions for *all* architectures, before making assertions about syscall availability. News sources likely to focus mainly on x86 are not sufficient.
On Mon, Oct 14, 2013 at 02:24:37PM +0000, jsm28 at gcc dot gnu.org wrote:
> Joseph Myers <jsm28 at gcc dot gnu.org> changed:
> What |Removed |Added
> Status|RESOLVED |REOPENED
> Resolution|FIXED |---
> --- Comment #6 from Joseph Myers <jsm28 at gcc dot gnu.org> ---
> On various architectures, pselect was only added in later kernel versions.
> Please carefully check *all* kernel-features.h files in glibc, or kernel
> sources of appropriate versions for *all* architectures, before making
> assertions about syscall availability. News sources likely to focus mainly on
> x86 are not sufficient.
Then the patch in bugzilla is still valid. Could you review it and send to