Bug 18433

Summary: posix_spawn does not return correctly upon failure to execute
Product: glibc Reporter: Nach M. S. <nachms+sourceware>
Component: libcAssignee: navid R <rahimi.nv>
Status: RESOLVED FIXED    
Severity: normal CC: adhemerval.zanella, drepper.fsp, fweimer, pb, rahimi.nv, vtjnash
Priority: P2 Flags: fweimer: security-
Version: 2.19   
Target Milestone: 2.24   
Host: Target:
Build: Last reconfirmed:
Bug Depends on: 20178    
Bug Blocks:    
Attachments: Checking existence of file before forking
using pipe approach to check status of child

Description Nach M. S. 2015-05-19 20:34:29 UTC
posix_spawn is supposed to return a value indicating an error when it fails.
The specification for it is here: http://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_spawn.html

From the spec: Otherwise, no child process shall be created, the value stored into the variable pointed to by a non-NULL pid is unspecified, and *an error number shall be returned as the function return value* to indicate the error.

However, glibc is returning 0 (success) instead of some kind of indication of error (such as 2 for ENOENT).

Test program:
-------------------------
#include <stdio.h>
#include <spawn.h>

int main()
{
  extern char **environ;
  char *argv[] = { "/bin/blah_blah-blah", 0 };
  pid_t pid = -1;
  int ret = posix_spawn(&pid, argv[0], 0, 0, argv, environ);
  printf("ret: %d\npid: %ld\n", ret, (long)pid);
  return(0);
}
-------------------------

"/bin/blah_blah-blah" doesn't exist, so it should not successfully spawn.

GLIBC for this program is returning:
ret: 0
pid: 17302

FreeBSD, DragonFlyBSD, NetBSD, and musl-libc all handle this correctly.
With them:
ret: 2
pid: -1 (some will put a different value here, which is allowed by the standard)
Comment 1 navid R 2015-09-10 14:25:34 UTC
Created attachment 8598 [details]
Checking existence of file before forking
Comment 2 Phil Blundell 2015-09-10 18:20:07 UTC
The standard does not seem to specifically require this situation to be diagnosed in the way you suggest.  In particular, it permits;

If [an] error occurs after the calling process successfully returns from the posix_spawn() or posix_spawnp() function, the child process may exit with exit status 127.

and notes:

A library implementation of posix_spawn() or posix_spawnp() may not be able to detect all possible errors before it forks the child process. POSIX.1-2008 provides for an error indication returned from a child process which could not successfully complete the spawn operation via a special exit status which may be detected using the status value returned by wait(), waitid(), and waitpid().
Comment 3 Nach M. S. 2015-09-12 19:03:44 UTC
Note the return value (emphasis included):
Upon successful completion, posix_spawn() and posix_spawnp() shall return the process ID of the child process to the parent process, in the variable pointed to by a non-NULL pid argument, and shall return zero as the function return value. Otherwise, no child process shall be created, the value stored into the variable pointed to by a non-NULL pid is unspecified, ***and an error number shall be returned as the function return value to indicate the error***. If the pid argument is a null pointer, the process ID of the child is not returned to the caller.

You are correct that the standard allows to exit with 127, however it discourages that behavior and says only if the library is unable to detect all issues is it okay to do so. I can't speak for other platforms, but it is perfectly possible to implement this function in a library on Linux in a feature-complete fashion.

FreeBSD, NetBSD, and musl-libc (Linux) at the library level are all capable of detecting the errors and properly returning them without any special kernel supports. I've also implemented this function on Linux correctly and feature-complete with proper error handling, and it's not really a big deal to do so.

Using any simple form of IPC, the error can be returned from the child to the parent. The BSDs use vfork() with a variable that the child modifies and the parent looks at. musl-libc uses a pipe. Personally, I recommend the latter approach.
Comment 4 navid R 2015-09-12 20:56:41 UTC
I developed a pipe solution but didn't get accepted . the problem raise because glibc is supporting system without pipe2 , (we can use pipe for creating file descriptor and  use fcntl and FD_CLOEXEC to set it proper flag , but there is race condition between two calls . and the overall code is so messy .


#ifdef __ASSUME_PIPE2
  if (__pipe2 (pipefd, O_CLOEXEC))
    return errno; 
#else
  if (__pipe (pipefd))
    return errno; 
  if (__fcntl (pipefd[0], FD_CLOEXEC) == -1 ||
 __fcntl (pipefd[1], FD_CLOEXEC) == -1
    return errno;
Comment 5 navid R 2015-09-12 21:00:10 UTC
I developed a pipe solution but didn't get accepted in mailing list . The problem raise because glibc is supporting system without pipe2 , (we can use pipe for creating file descriptor and  use fcntl and FD_CLOEXEC to set it proper flag , but there is race condition between two calls . and the overall code is so messy .


#ifdef __ASSUME_PIPE2
  if (__pipe2 (pipefd, O_CLOEXEC))
    return errno; 
#else  // FALLBACK mode in systems which does not have pipe2 call
  if (__pipe (pipefd))
    return errno; 
  /*  <==== here we have race condition ==> */
  if (__fcntl (pipefd[0], FD_CLOEXEC) == -1 ||
 __fcntl (pipefd[1], FD_CLOEXEC) == -1)
    return errno;   

With this explanation , I have change of heart , I think vfork solution would be better .
Comment 6 navid R 2015-09-12 21:07:11 UTC
By the way for further reading this the link of 
http://git.musl-libc.org/cgit/musl/tree/src/process/posix_spawn.c
musl(pipe2 approach) implementation.
and this is the link of freebsd (vfork approach) implementation :
https://github.com/freebsd/freebsd/blob/master/lib/libc/gen/posix_spawn.c
Comment 7 navid R 2015-09-13 12:15:33 UTC
Created attachment 8602 [details]
using pipe approach to check status of child
Comment 8 cvs-commit@gcc.gnu.org 2016-03-07 04:59:12 UTC
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  9ff72da471a509a8c19791efe469f47fa6977410 (commit)
       via  1eb8930608705702d5746e5491bab4e4429fcb83 (commit)
       via  f83bb9b8e97656ae0d3e2a31e859363e2d4d5832 (commit)
      from  fee9eb6200f0e44a4b684903bc47fde36d46f1a5 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=9ff72da471a509a8c19791efe469f47fa6977410

commit 9ff72da471a509a8c19791efe469f47fa6977410
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Tue Jan 19 17:33:32 2016 -0200

    posix: New Linux posix_spawn{p} implementation
    
    This patch implements a new posix_spawn{p} implementation for Linux.  The main
    difference is it uses the clone syscall directly with CLONE_VM and CLONE_VFORK
    flags and a direct allocated stack.  The new stack and start function solves
    most the vfork limitation (possible parent clobber due stack spilling).  The
    remaning issue are related to signal handling:
    
      1. That no signal handlers must run in child context, to avoid corrupt
         parent's state.
      2. Child must synchronize with parent to enforce stack deallocation and
         to possible return execv issues.
    
    The first one is solved by blocking all signals in child, even NPTL-internal
    ones (SIGCANCEL and SIGSETXID).  The second issue is done by a stack allocation
    in parent and a synchronization with using a pipe or waitpid (in case or error).
    The pipe has the advantage of allowing the child signal an exec error (checked
    with new tst-spawn2 test).
    
    There is an inherent race condition in pipe2 usage for architectures that do not
    support the syscall directly.  In such cases the a pipe plus fctnl is used
    instead and it may lead to file descriptor leak in parent (as decribed by fcntl
    documentation).
    
    The child process stack is allocate with a mmap with MAP_STACK flag using
    default architecture stack size.  Although it is slower than use a stack buffer
    from parent, it allows some slack for the compatibility code to run scripts
    with no shebang (which may use a buffer with size depending of argument list
    count).
    
    Performance should be similar to the vfork default posix implementation and
    way faster than fork path (vfork on mostly linux ports are basically
    clone with CLONE_VM plus CLONE_VFORK).  The only difference is the syscalls
    required for the stack allocation/deallocation.
    
    It fixes BZ#10354, BZ#14750, and BZ#18433.
    
    Tested on i386, x86_64, powerpc64le, and aarch64.
    
    	[BZ #14750]
    	[BZ #10354]
    	[BZ #18433]
    	* include/sched.h (__clone): Add hidden prototype.
    	(__clone2): Likewise.
    	* include/unistd.h (__dup): Likewise.
    	* posix/Makefile (tests): Add tst-spawn2.
    	* posix/tst-spawn2.c: New file.
    	* sysdeps/posix/dup.c (__dup): Add hidden definition.
    	* sysdeps/unix/sysv/linux/aarch64/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/alpha/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/arm/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/hppa/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/i386/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/ia64/clone2.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/m68k/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/microblaze/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/mips/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/nios2/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S (__clone):
    	Likewise.
    	* sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S (__clone):
    	Likewise.
    	* sysdeps/unix/sysv/linux/s390/s390-32/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/s390/s390-64/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/sh/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/sparc/sparc32/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/sparc/sparc64/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/tile/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/x86_64/clone.S (__clone): Likewise.
    	* sysdeps/unix/sysv/linux/nptl-signals.h
    	(____nptl_is_internal_signal): New function.
    	* sysdeps/unix/sysv/linux/spawni.c: New file.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=1eb8930608705702d5746e5491bab4e4429fcb83

commit 1eb8930608705702d5746e5491bab4e4429fcb83
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Fri Jan 22 09:58:49 2016 -0200

    posix: execvpe cleanup
    
    This patch removes all the dynamic allocation on execvpe code and
    instead use direct stack allocation.  This is QoI approach to make
    it possible use in scenarios where memory is shared with parent
    (vfork or clone with CLONE_VM).
    
    For default process spawn (script file without a shebang), stack
    allocation is bounded by NAME_MAX plus PATH_MAX plus 1.  Large
    file arguments returns an error (ENAMETOOLONG).  This differs than
    current GLIBC pratice in general, but it used to limit stack
    allocation for large inputs.  Also, path in PATH environment variable
    larger than PATH_MAX are ignored.
    
    The shell direct execution exeception, where execve returns ENOEXEC,
    might requires a large stack allocation due large input argument list.
    
    Tested on i686, x86_64, powerpc64le, and aarch64.
    
    	* posix/execvpe.c (__execvpe): Remove dynamic allocation.
    	* posix/Makefile (tests): Add tst-execvpe{1,2,3,4,5,6}.
    	* posix/tst-execvp1.c (do_test): Use a macro to call execvp.
    	* posix/tst-execvp2.c (do_test): Likewise.
    	* posix/tst-execvp3.c (do_test): Likewise.
    	* posix/tst-execvp4.c (do_test): Likewise.
    	* posix/tst-execvpe1.c: New file.
    	* posix/tst-execvpe2.c: Likewise.
    	* posix/tst-execvpe3.c: Likewise.
    	* posix/tst-execvpe4.c: Likewise.
    	* posix/tst-execvpe5.c: Likewise.
    	* posix/tst-execvpe6.c: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f83bb9b8e97656ae0d3e2a31e859363e2d4d5832

commit f83bb9b8e97656ae0d3e2a31e859363e2d4d5832
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Fri Jan 29 11:43:40 2016 -0200

    posix: Remove dynamic memory allocation from execl{e,p}
    
    GLIBC execl{e,p} implementation might use malloc if the total number of
    arguments exceed initial assumption size (1024).  This might lead to
    issues in two situations:
    
    1. execl/execle is stated to be async-signal-safe by POSIX [1].  However
       if execl is used in a signal handler with a large argument set (that
       may call malloc internally) and if the resulting call fails it might
       lead malloc in the program in a bad state.
    
    2. If the functions are used in a vfork/clone(VFORK) situation it also
       might issue malloc internal bad state.
    
    This patch fixes it by using stack allocation instead.  It also fixes
    BZ#19534.
    
    Tested on x86_64.
    
    [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html
    
    	[BZ #19534]
    	* posix/execl.c (execl): Remove dynamic memory allocation.
    	* posix/execle.c (execle): Likewise.
    	* posix/execlp.c (execlp): Likewise.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                                         |   54 +++
 include/sched.h                                   |    2 +
 include/unistd.h                                  |    1 +
 posix/Makefile                                    |    5 +-
 posix/execl.c                                     |   68 ++---
 posix/execle.c                                    |   70 ++---
 posix/execlp.c                                    |   66 ++--
 posix/execvpe.c                                   |  255 ++++++--------
 posix/tst-execvp1.c                               |    6 +-
 posix/tst-execvp2.c                               |    5 +-
 posix/tst-execvp3.c                               |    5 +-
 posix/tst-execvp4.c                               |    6 +-
 posix/tst-execvpe1.c                              |   20 +
 posix/tst-execvpe2.c                              |   20 +
 posix/tst-execvpe3.c                              |   20 +
 posix/tst-execvpe4.c                              |   20 +
 posix/tst-execvpe5.c                              |  157 ++++++++
 posix/tst-execvpe6.c                              |  150 ++++++++
 posix/tst-spawn2.c                                |   72 ++++
 sysdeps/posix/dup.c                               |    2 +-
 sysdeps/unix/sysv/linux/aarch64/clone.S           |    1 +
 sysdeps/unix/sysv/linux/alpha/clone.S             |    1 +
 sysdeps/unix/sysv/linux/arm/clone.S               |    1 +
 sysdeps/unix/sysv/linux/hppa/clone.S              |    1 +
 sysdeps/unix/sysv/linux/i386/clone.S              |    1 +
 sysdeps/unix/sysv/linux/ia64/clone2.S             |    2 +
 sysdeps/unix/sysv/linux/m68k/clone.S              |    1 +
 sysdeps/unix/sysv/linux/microblaze/clone.S        |    1 +
 sysdeps/unix/sysv/linux/mips/clone.S              |    1 +
 sysdeps/unix/sysv/linux/nios2/clone.S             |    1 +
 sysdeps/unix/sysv/linux/nptl-signals.h            |   10 +
 sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S |    1 +
 sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S |    1 +
 sysdeps/unix/sysv/linux/s390/s390-32/clone.S      |    2 +
 sysdeps/unix/sysv/linux/s390/s390-64/clone.S      |    2 +
 sysdeps/unix/sysv/linux/sh/clone.S                |    1 +
 sysdeps/unix/sysv/linux/sparc/sparc32/clone.S     |    1 +
 sysdeps/unix/sysv/linux/sparc/sparc64/clone.S     |    1 +
 sysdeps/unix/sysv/linux/spawni.c                  |  404 +++++++++++++++++++++
 sysdeps/unix/sysv/linux/tile/clone.S              |    1 +
 sysdeps/unix/sysv/linux/x86_64/clone.S            |    1 +
 41 files changed, 1162 insertions(+), 278 deletions(-)
 create mode 100644 posix/tst-execvpe1.c
 create mode 100644 posix/tst-execvpe2.c
 create mode 100644 posix/tst-execvpe3.c
 create mode 100644 posix/tst-execvpe4.c
 create mode 100644 posix/tst-execvpe5.c
 create mode 100644 posix/tst-execvpe6.c
 create mode 100644 posix/tst-spawn2.c
 create mode 100644 sysdeps/unix/sysv/linux/spawni.c
Comment 9 Adhemerval Zanella 2016-03-07 05:38:17 UTC
Fixed by 9ff72da471a509a8c19791efe469f47fa6977410.
Comment 10 Jameson 2021-07-26 19:03:58 UTC
Sorry to ping an old issue, but I was noticing the documentation appears to explicitly state the old behavior (https://man7.org/linux/man-pages/man3/posix_spawn.3.html#ERRORS), in contradiction to this change and the updated opengroups standard at the link in the original issue or at https://linux.die.net/man/3/posix_spawn. Is this text now incorrect:

ERRORS
       The posix_spawn() and posix_spawnp() functions fail only in the case where the underlying fork(2) or vfork(2) call fails;  in these cases, these  functions
       return an error number, which will be one of the errors described for fork(2) or vfork(2).
Comment 11 Adhemerval Zanella 2021-07-26 19:33:30 UTC
(In reply to Jameson from comment #10)
> Sorry to ping an old issue, but I was noticing the documentation appears to
> explicitly state the old behavior
> (https://man7.org/linux/man-pages/man3/posix_spawn.3.html#ERRORS), in
> contradiction to this change and the updated opengroups standard at the link
> in the original issue or at https://linux.die.net/man/3/posix_spawn. Is this
> text now incorrect:
> 
> ERRORS
>        The posix_spawn() and posix_spawnp() functions fail only in the case
> where the underlying fork(2) or vfork(2) call fails;  in these cases, these 
> functions
>        return an error number, which will be one of the errors described for
> fork(2) or vfork(2).

You will need to check with linux-man list:
https://www.kernel.org/doc/man-pages/linux-man-ml.html