Bug 22834 - Subprocess forked by popen may crash in Linux when multithreads call popen
Summary: Subprocess forked by popen may crash in Linux when multithreads call popen
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: stdio (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: 2.29
Assignee: Adhemerval Zanella
URL:
Keywords:
: 26342 (view as bug list)
Depends on:
Blocks:
 
Reported: 2018-02-11 14:54 UTC by huanjiajia
Modified: 2020-08-07 07:16 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed: 2018-02-11 00:00:00
fweimer: security-


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description huanjiajia 2018-02-11 14:54:44 UTC
When multithreads concurrently call popen to execute command, there is some chance that subprocess forked by popen may crash in Linux.

The popen use proc_file_chain to save _IO_proc_file structure of pipe fds, and proc_file_chain is the head of a list of _IO_proc_file.In _IO_new_proc_open in libio/Iopopen.c, after fork a subprocess, the parent process will add the _IO_proc_file structure at the beginning of the list.The code is:

  ((_IO_proc_file *) fp)->next = proc_file_chain;
  proc_file_chain = (_IO_proc_file *) fp;

It seems that the order is right, but in fact, it depends on the behaviour of the compiler. For example, in my centos7.1, the assembly is:

   6db27:       48 8b 05 72 ee 34 00    mov    0x34ee72(%rip),%rax        # 3bc9a0 <proc_file_chain>
   6db2e:       4c 89 25 6b ee 34 00    mov    %r12,0x34ee6b(%rip)        # 3bc9a0 <proc_file_chain>
   6db35:       49 89 84 24 e8 00 00    mov    %rax,0xe8(%r12)

the real effect is:

instruction 1 : $rax = proc_file_chain;
instruction 2 : proc_file_chain = fp;
instruction 3 : fp->next = $rax (or proc_file_chain->next = $rax)

between instruction 2 and 3, there is a wrong state: proc_file_chain has been pointed to fp, but the pointer next of new proc_file_chain(fp) haven't been pointed to original proc_file_chain, it points to an invalid address. 

Consider the following scene:
Process A has two thread, thread 1 and thread 2. Thread 1 calls popen, and after instruction 2 being executed, thread 1 scheduled by system. Then thread 2 also calls popen and forks process B before instruction 3 being executed by thread 1. Because fork copies the mem of process A to process B, so process B also will also see an invalid next pointer of proc_file_chain, when process B traverse proc_file_chain:

 for (p = proc_file_chain; p; p = p->next)
	{
	  int fd = _IO_fileno ((_IO_FILE *) p);

	  /* If any stream from previous popen() calls has fileno
	     child_std_end, it has been already closed by the dup2 syscall
	     above.  */
	  if (fd != child_std_end)
	    __close_nocancel (fd);
	}

it will crash.

I find this bug in glibc-2.17, and have checked the newest stable version glibc-2.27, this bus still exists.

If any question, contact me with my
private email: huanjiajia1983@gmail.com
company email: huan.jiajia@zte.com.cn
Comment 1 Florian Weimer 2018-02-11 15:02:14 UTC
Do we actually need the proc_file_chain variable anymore?  I think CLOEXEC takes care of closing all of the descriptors: The parent pipe end always has it set, and the child end loses it only in the child process, so pending child pipe ends from other threads have CLOEXEC set as well.
Comment 2 Florian Weimer 2018-02-11 15:13:10 UTC
So the strangeness here is that the O_CLOEXEC is optional in the parent process.  The implementation assumes that other popen descriptors always closed by popen in its subprocess, but for execve, the closing is optional.

I'm not sure what POSIX intends here.
Comment 3 huanjiajia 2018-02-12 01:53:47 UTC
(In reply to Florian Weimer from comment #2)
> So the strangeness here is that the O_CLOEXEC is optional in the parent
> process.  The implementation assumes that other popen descriptors always
> closed by popen in its subprocess, but for execve, the closing is optional.
> 
> I'm not sure what POSIX intends here.

I think you are right. The list proc_file_chain only handle the fds of pipes created by popen, and these fds have been carefully handled by popen if
O_CLOEXEC has been defined. But it is tricky that the define of O_CLOEXEC depends on parent process. :(
Comment 4 Andreas Schwab 2018-02-14 11:45:41 UTC
Adding an atomic_write_barrier inbetween should fix that, but maybe a compiler barrier is enough.
Comment 5 Sourceware Commits 2018-11-30 21:10:32 UTC
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  5fb7fc96350575c9adb1316833e48ca11553be49 (commit)
       via  14d0e87d9b8caaa2eca7ca81f1189596671fe4fb (commit)
      from  e5d262effe3a87164308a3f37e61b32d0348692a (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5fb7fc96350575c9adb1316833e48ca11553be49

commit 5fb7fc96350575c9adb1316833e48ca11553be49
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Wed Oct 24 16:29:38 2018 -0300

    posix: Use posix_spawn on system
    
    This patch uses posix_spawn on system implementation.  On Linux this has
    the advantage of much lower memory consumption (usually 32 Kb minimum for
    the mmap stack area).
    
    Although POSIX does not require, glibc system implementation aims to be
    thread and cancellation safe.  The cancellation code is moved to generic
    implementation and enabled iff SIGCANCEL is defined (similar on how the
    cancellation handler is enabled on nptl-init.c).
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
    arm-linux-gnueabihf, and powerpc64le-linux-gnu.
    
    	* sysdeps/unix/sysv/linux/spawni.c (__spawni_child): Use
    	__sigismember instead of sigismember.
    	* sysdeps/posix/system.c [SIGCANCEL] (cancel_handler_args,
    	cancel_handler): New definitions.
    	(CLEANUP_HANDLER, CLEANUP_RESET): Likewise.
    	(DO_LOCK, DO_UNLOCK, INIT_LOCK, ADD_REF, SUB_REF): Remove.
    	(do_system): Use posix_spawn instead of fork and execl and remove
    	reentracy code.
    	* sysdeps/generic/not-errno.h (__kill_noerrno): New prototype.
    	* sysdeps/unix/sysv/linux/not-errno.h (__kill_noerrno): Likewise.
    	* sysdeps/unix/sysv/linux/ia64/system.c: Remove file.
    	* sysdeps/unix/sysv/linux/s390/system.c: Likewise.
    	* sysdeps/unix/sysv/linux/sparc/system.c: Likewise.
    	* sysdeps/unix/sysv/linux/system.c: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=14d0e87d9b8caaa2eca7ca81f1189596671fe4fb

commit 14d0e87d9b8caaa2eca7ca81f1189596671fe4fb
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Wed Sep 12 10:32:05 2018 -0300

    posix: Use posix_spawn on popen
    
    This patch uses posix_spawn on popen instead of fork and execl.  On Linux
    this has the advantage of much lower memory consumption (usually 32 Kb
    minimum for the mmap stack area).
    
    Two issues are also fixed with this change:
    
      * BZ#17490: although POSIX pthread_atfork description only list 'fork'
        as the function that should execute the atfork handlers, popen
        description states that:
    
          '[...] shall be *as if* a child process were created within the popen()
           call using the fork() function [...]'
    
        Other libc/system seems to follow the idea atfork handlers should not be
        executed for popen:
    
        libc/system	| run atfork handles   | notes
        ------------|----------------------|---------------------------------------
        Freebsd	|        no            | uses vfork
        Solaris 11	|        no            |
        MacOSX 11   |        no            | implemented through posix_spawn syscall
        ------------|----------------------|----------------------------------------
    
        Similar to posix_spawn and system, popen idea is to spawn a different
        binary so all the POSIX rationale to run the atfork handlers to avoid
        internal process inconsistency is not really required and in some cases
        might be unsafe.
    
      * BZ#22834: the described scenario, where the forked process might access
        invalid memory due an inconsistent state in multithreaded environment,
        should not happen because posix_spawn does not access the affected
        data structure (proc_file_chain).
    
    Checked on x86_64-linux-gnu and i686-linux-gnu.
    
    	[BZ #22834]
    	[BZ #17490]
    	* NEWS: Add new semantic for atfork with popen and system.
    	* libio/iopopen.c (_IO_new_proc_open): use posix_spawn instead of
    	fork and execl.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                              |   21 ++++
 NEWS                                   |    6 +
 libio/iopopen.c                        |  134 +++++++++++++++--------
 sysdeps/generic/not-errno.h            |    2 +
 sysdeps/posix/system.c                 |  189 +++++++++++++++++---------------
 sysdeps/unix/sysv/linux/ia64/system.c  |   30 -----
 sysdeps/unix/sysv/linux/not-errno.h    |   14 +++
 sysdeps/unix/sysv/linux/s390/system.c  |   29 -----
 sysdeps/unix/sysv/linux/sparc/system.c |   29 -----
 sysdeps/unix/sysv/linux/spawni.c       |    4 +-
 sysdeps/unix/sysv/linux/system.c       |   76 -------------
 11 files changed, 231 insertions(+), 303 deletions(-)
 delete mode 100644 sysdeps/unix/sysv/linux/ia64/system.c
 delete mode 100644 sysdeps/unix/sysv/linux/s390/system.c
 delete mode 100644 sysdeps/unix/sysv/linux/sparc/system.c
 delete mode 100644 sysdeps/unix/sysv/linux/system.c
Comment 6 Adhemerval Zanella 2018-11-30 21:13:33 UTC
Fixed on 2.29 (commit 14d0e87d9b8caaa2eca7ca81f1189596671fe4fb)
Comment 7 Florian Weimer 2020-08-07 07:16:33 UTC
*** Bug 26342 has been marked as a duplicate of this bug. ***