When multithreads concurrently call popen to execute command, there is some chance that subprocess forked by popen may crash in Linux. The popen use proc_file_chain to save _IO_proc_file structure of pipe fds, and proc_file_chain is the head of a list of _IO_proc_file.In _IO_new_proc_open in libio/Iopopen.c, after fork a subprocess, the parent process will add the _IO_proc_file structure at the beginning of the list.The code is: ((_IO_proc_file *) fp)->next = proc_file_chain; proc_file_chain = (_IO_proc_file *) fp; It seems that the order is right, but in fact, it depends on the behaviour of the compiler. For example, in my centos7.1, the assembly is: 6db27: 48 8b 05 72 ee 34 00 mov 0x34ee72(%rip),%rax # 3bc9a0 <proc_file_chain> 6db2e: 4c 89 25 6b ee 34 00 mov %r12,0x34ee6b(%rip) # 3bc9a0 <proc_file_chain> 6db35: 49 89 84 24 e8 00 00 mov %rax,0xe8(%r12) the real effect is: instruction 1 : $rax = proc_file_chain; instruction 2 : proc_file_chain = fp; instruction 3 : fp->next = $rax (or proc_file_chain->next = $rax) between instruction 2 and 3, there is a wrong state: proc_file_chain has been pointed to fp, but the pointer next of new proc_file_chain(fp) haven't been pointed to original proc_file_chain, it points to an invalid address. Consider the following scene: Process A has two thread, thread 1 and thread 2. Thread 1 calls popen, and after instruction 2 being executed, thread 1 scheduled by system. Then thread 2 also calls popen and forks process B before instruction 3 being executed by thread 1. Because fork copies the mem of process A to process B, so process B also will also see an invalid next pointer of proc_file_chain, when process B traverse proc_file_chain: for (p = proc_file_chain; p; p = p->next) { int fd = _IO_fileno ((_IO_FILE *) p); /* If any stream from previous popen() calls has fileno child_std_end, it has been already closed by the dup2 syscall above. */ if (fd != child_std_end) __close_nocancel (fd); } it will crash. I find this bug in glibc-2.17, and have checked the newest stable version glibc-2.27, this bus still exists. If any question, contact me with my private email: huanjiajia1983@gmail.com company email: huan.jiajia@zte.com.cn
Do we actually need the proc_file_chain variable anymore? I think CLOEXEC takes care of closing all of the descriptors: The parent pipe end always has it set, and the child end loses it only in the child process, so pending child pipe ends from other threads have CLOEXEC set as well.
So the strangeness here is that the O_CLOEXEC is optional in the parent process. The implementation assumes that other popen descriptors always closed by popen in its subprocess, but for execve, the closing is optional. I'm not sure what POSIX intends here.
(In reply to Florian Weimer from comment #2) > So the strangeness here is that the O_CLOEXEC is optional in the parent > process. The implementation assumes that other popen descriptors always > closed by popen in its subprocess, but for execve, the closing is optional. > > I'm not sure what POSIX intends here. I think you are right. The list proc_file_chain only handle the fds of pipes created by popen, and these fds have been carefully handled by popen if O_CLOEXEC has been defined. But it is tricky that the define of O_CLOEXEC depends on parent process. :(
Adding an atomic_write_barrier inbetween should fix that, but maybe a compiler barrier is enough.
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, master has been updated via 5fb7fc96350575c9adb1316833e48ca11553be49 (commit) via 14d0e87d9b8caaa2eca7ca81f1189596671fe4fb (commit) from e5d262effe3a87164308a3f37e61b32d0348692a (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5fb7fc96350575c9adb1316833e48ca11553be49 commit 5fb7fc96350575c9adb1316833e48ca11553be49 Author: Adhemerval Zanella <adhemerval.zanella@linaro.org> Date: Wed Oct 24 16:29:38 2018 -0300 posix: Use posix_spawn on system This patch uses posix_spawn on system implementation. On Linux this has the advantage of much lower memory consumption (usually 32 Kb minimum for the mmap stack area). Although POSIX does not require, glibc system implementation aims to be thread and cancellation safe. The cancellation code is moved to generic implementation and enabled iff SIGCANCEL is defined (similar on how the cancellation handler is enabled on nptl-init.c). Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, arm-linux-gnueabihf, and powerpc64le-linux-gnu. * sysdeps/unix/sysv/linux/spawni.c (__spawni_child): Use __sigismember instead of sigismember. * sysdeps/posix/system.c [SIGCANCEL] (cancel_handler_args, cancel_handler): New definitions. (CLEANUP_HANDLER, CLEANUP_RESET): Likewise. (DO_LOCK, DO_UNLOCK, INIT_LOCK, ADD_REF, SUB_REF): Remove. (do_system): Use posix_spawn instead of fork and execl and remove reentracy code. * sysdeps/generic/not-errno.h (__kill_noerrno): New prototype. * sysdeps/unix/sysv/linux/not-errno.h (__kill_noerrno): Likewise. * sysdeps/unix/sysv/linux/ia64/system.c: Remove file. * sysdeps/unix/sysv/linux/s390/system.c: Likewise. * sysdeps/unix/sysv/linux/sparc/system.c: Likewise. * sysdeps/unix/sysv/linux/system.c: Likewise. https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=14d0e87d9b8caaa2eca7ca81f1189596671fe4fb commit 14d0e87d9b8caaa2eca7ca81f1189596671fe4fb Author: Adhemerval Zanella <adhemerval.zanella@linaro.org> Date: Wed Sep 12 10:32:05 2018 -0300 posix: Use posix_spawn on popen This patch uses posix_spawn on popen instead of fork and execl. On Linux this has the advantage of much lower memory consumption (usually 32 Kb minimum for the mmap stack area). Two issues are also fixed with this change: * BZ#17490: although POSIX pthread_atfork description only list 'fork' as the function that should execute the atfork handlers, popen description states that: '[...] shall be *as if* a child process were created within the popen() call using the fork() function [...]' Other libc/system seems to follow the idea atfork handlers should not be executed for popen: libc/system | run atfork handles | notes ------------|----------------------|--------------------------------------- Freebsd | no | uses vfork Solaris 11 | no | MacOSX 11 | no | implemented through posix_spawn syscall ------------|----------------------|---------------------------------------- Similar to posix_spawn and system, popen idea is to spawn a different binary so all the POSIX rationale to run the atfork handlers to avoid internal process inconsistency is not really required and in some cases might be unsafe. * BZ#22834: the described scenario, where the forked process might access invalid memory due an inconsistent state in multithreaded environment, should not happen because posix_spawn does not access the affected data structure (proc_file_chain). Checked on x86_64-linux-gnu and i686-linux-gnu. [BZ #22834] [BZ #17490] * NEWS: Add new semantic for atfork with popen and system. * libio/iopopen.c (_IO_new_proc_open): use posix_spawn instead of fork and execl. ----------------------------------------------------------------------- Summary of changes: ChangeLog | 21 ++++ NEWS | 6 + libio/iopopen.c | 134 +++++++++++++++-------- sysdeps/generic/not-errno.h | 2 + sysdeps/posix/system.c | 189 +++++++++++++++++--------------- sysdeps/unix/sysv/linux/ia64/system.c | 30 ----- sysdeps/unix/sysv/linux/not-errno.h | 14 +++ sysdeps/unix/sysv/linux/s390/system.c | 29 ----- sysdeps/unix/sysv/linux/sparc/system.c | 29 ----- sysdeps/unix/sysv/linux/spawni.c | 4 +- sysdeps/unix/sysv/linux/system.c | 76 ------------- 11 files changed, 231 insertions(+), 303 deletions(-) delete mode 100644 sysdeps/unix/sysv/linux/ia64/system.c delete mode 100644 sysdeps/unix/sysv/linux/s390/system.c delete mode 100644 sysdeps/unix/sysv/linux/sparc/system.c delete mode 100644 sysdeps/unix/sysv/linux/system.c
Fixed on 2.29 (commit 14d0e87d9b8caaa2eca7ca81f1189596671fe4fb)
*** Bug 26342 has been marked as a duplicate of this bug. ***