Bug 9813 - pselect implementation (when not implemented by the kernel) aggravates the race
Summary: pselect implementation (when not implemented by the kernel) aggravates the race
Status: REOPENED
Alias: None
Product: glibc
Classification: Unclassified
Component: libc
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-02-04 10:05 UTC by Shachar Shemesh
Modified: 2020-03-04 21:24 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
Proposed patch to narrow the race window (1.08 KB, patch)
2009-02-04 10:36 UTC, Shachar Shemesh
Program demonstrating the problem (387 bytes, text/x-csrc)
2009-02-04 10:52 UTC, Shachar Shemesh

Description Shachar Shemesh 2009-02-04 10:05:33 UTC
pselect is an operation that must be performed atomically. As such, the only
race-free implementation is one done in the kernel. If the race exists, then
"select" may hang until the timeout (or forever), because the signal that the
programmer expected to wake it up arrived before "select" was called. The glibc
implementation is only a stopgap for platforms where the kernel does not provide
the function, meant to encourage people to use it anyway, and is known not to
cover 100% of the cases.

That being said, the current pselect implementation makes the race condition
worse, almost guaranteeing that the race will take place.

The current implementation looks like this:
1: sigprocmask // Enable the signals
2: select // Perform the actual select
3: sigprocmask // Re-disable the signals

A typical use scenario would be:

4: while
5: pselect
6: if( signal happened ) ...
7: Do something not signal related
8: loop over the while

In the current implementation, any signal arriving after the sigprocmask in line
3, and before the "select" in line 2 of the next iteration, is GUARANTEED to
trigger the race condition: it stays pending while blocked and is delivered as
soon as the sigprocmask in line 1 takes place, necessarily before the select in
line 2. This means the chance of hitting the race is directly proportional to
the fraction of time the program spends doing something other than waiting in
the select.

I am attaching a modified implementation of pselect that greatly reduces the
window in which the race can take effect, limiting it to only within the actual
pselect function.
Comment 1 Shachar Shemesh 2009-02-04 10:36:40 UTC
Created attachment 3710 [details]
Proposed patch to narrow the race window

Proposed patch to the problem
Comment 2 Shachar Shemesh 2009-02-04 10:42:25 UTC
Forgot to add - in the above patch, NSIG_LONGS is undefined. Here is its definition:

// Number of __vals in sigset_t that actually contain useful data
#define NSIG_LONGS (_NSIG/(8*sizeof(((sigset_t *)NULL)->__val[0])))

Shachar
Comment 3 Shachar Shemesh 2009-02-04 10:52:12 UTC
Created attachment 3712 [details]
Program demonstrating the problem

This program demonstrates the problem. Under a kernel with pselect support, it
prints:
sig_happened=1
sig_happened=1
sig_happened=1
sig_happened=1
sig_happened=1

And exits almost immediately.
Comment 4 Michael Kerrisk 2012-02-19 22:06:49 UTC
Shachar, I suspect that it's not worth trying to make the fix you suggest. The fix will only appear in modern glibc, and any modern system will have a kernel-supported pselect(). The fundamental problem can't be remedied: the idea to add a userspace implementation of pselect() was extremely muddleheaded, and worsens portability problems for applications. The portability question goes from being "do I have pselect() or not?" to "do I have a pselect() or not, and if I do, is it one that works?"; the last part of the second question can only be answered by checking the kernel (and glibc) versions.
Comment 5 Ondrej Bilka 2013-10-13 08:28:27 UTC
According to http://lwn.net/Articles/176911/, pselect appeared in kernel 2.6.16, and the required kernel version is 2.6.16, so this patch is moot now.
Comment 6 Joseph Myers 2013-10-14 14:24:37 UTC
On various architectures, pselect was only added in later kernel versions.  Please carefully check *all* kernel-features.h files in glibc, or kernel sources of appropriate versions for *all* architectures, before making assertions about syscall availability.  News sources likely to focus mainly on x86 are not sufficient.
Comment 7 Ondrej Bilka 2013-10-14 14:39:29 UTC
On Mon, Oct 14, 2013 at 02:24:37PM +0000, jsm28 at gcc dot gnu.org wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=9813
> 
> Joseph Myers <jsm28 at gcc dot gnu.org> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>              Status|RESOLVED                    |REOPENED
>          Resolution|FIXED                       |---
> 
 
Then the patch in Bugzilla is still valid. Could you review it and send it to
libc-alpha?
Comment 8 Adhemerval Zanella 2020-03-04 21:24:08 UTC
Currently only microblaze-linux-gnu is affected by this issue and the faulty fallback code is not used when --enable-kernel=3.15 or newer is used.