Created attachment 6870 [details]
Testcase

When _IO_flush_all_lockp is called from _IO_cleanup, it doesn't do any locking on _IO_list_all, which races with fopen/fclose from other threads. This can result in heap corruption.
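For illustration, a rough standalone sketch of the shape of this race (not the attached testcase; the thread and timing details are mine):

  /* One thread churns the stream list with fopen/fclose while the main
     thread exits, so _IO_cleanup may walk _IO_list_all while it is
     being modified.  Illustrative only. */
  #include <pthread.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  static void *
  churn (void *arg)
  {
    (void) arg;
    for (;;)
      {
        FILE *f = fopen ("/dev/null", "r");
        if (f != NULL)
          fclose (f);
      }
    return NULL;
  }

  int
  main (void)
  {
    pthread_t t;
    pthread_create (&t, NULL, churn, NULL);
    usleep (1000);   /* give the churn thread a chance to run */
    exit (0);        /* runs _IO_cleanup -> _IO_flush_all_lockp over the live list */
  }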
I have two related issues open on the Austin Group bug tracker:

http://austingroupbugs.net/view.php?id=610
http://austingroupbugs.net/view.php?id=611

Unfortunately, I believe the current glibc behavior of not performing appropriate locking is intentional, so that exit works even when locks would/should block exit. This is contrary to the requirements of the standard and harmful to applications that have expectations on the atomicity/integrity of stdio operations performed under lock.
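As an illustration of that expectation, a hedged sketch (my own example, not from the Austin Group reports) of a writer that relies on flockfile for record atomicity while another thread exits:

  /* The writer treats the three fputs calls as one atomic record by
     holding the stream lock.  If exit's flushing does not honor that
     lock, the record can be torn or the buffer seen in an inconsistent
     state. */
  #include <pthread.h>
  #include <stdio.h>
  #include <stdlib.h>

  static void *
  writer (void *arg)
  {
    (void) arg;
    for (;;)
      {
        flockfile (stdout);
        fputs ("BEGIN ", stdout);
        fputs ("payload ", stdout);
        fputs ("END\n", stdout);
        funlockfile (stdout);
      }
    return NULL;
  }

  int
  main (void)
  {
    pthread_t t;
    pthread_create (&t, NULL, writer, NULL);
    exit (0);   /* flushes stdout; without locking, this can interleave
                   with the writer's critical section */
  }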
There doesn't seem to be any recent progress on these issues.
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  19f82f358670f4b80533156b9edbf81223358bf9 (commit)
      from  91e7cf982d0104f0e71770f5ae8e3faf352dea9f (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=19f82f358670f4b80533156b9edbf81223358bf9

commit 19f82f358670f4b80533156b9edbf81223358bf9
Author: Andreas Schwab <schwab@suse.de>
Date:   Mon Aug 21 16:07:29 2017 +0200

    Always do locking when iterating over list of streams (bug 15142)

    _IO_list_all should only be traversed while locking list_all_lock.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog      |  8 +++++++
 libio/genops.c | 60 ++++++++++++++++---------------------------------------
 2 files changed, 26 insertions(+), 42 deletions(-)
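In outline, the invariant the commit enforces looks roughly like this; the names below (stream_t, io_list_all, flush_all) are simplified stand-ins, not the actual libio/genops.c code:

  /* Sketch of the rule: the global list of open streams is only
     traversed with list_all_lock held, including from exit-time
     cleanup and fflush (NULL). */
  #include <pthread.h>
  #include <stddef.h>

  typedef struct stream stream_t;
  struct stream
  {
    stream_t *chain;   /* next open stream */
    /* buffers, flags, per-stream lock ... */
  };

  static pthread_mutex_t list_all_lock = PTHREAD_MUTEX_INITIALIZER;
  static stream_t *io_list_all;   /* head of the list of open streams */

  static void
  flush_all (void)
  {
    pthread_mutex_lock (&list_all_lock);   /* taken unconditionally now */
    for (stream_t *fp = io_list_all; fp != NULL; fp = fp->chain)
      {
        /* lock fp, flush its buffer, unlock fp (elided) */
      }
    pthread_mutex_unlock (&list_all_lock);
  }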
*** Bug 30510 has been marked as a duplicate of this bug. ***
Fixed for 2.38 via:

commit af130d27099651e0d27b2cf2cfb44dafd6fe8a26
Author: Andreas Schwab <schwab@suse.de>
Date:   Tue Jan 30 10:16:00 2018 +0100

    Always do locking when accessing streams (bug 15142, bug 14697)

    Now that abort no longer calls fflush there is no reason to avoid
    locking the stdio streams anywhere.  This fixes a conformance issue
    and potential heap corruption during exit.
We started getting hangs on the following program:

https://github.com/llvm/llvm-project/blob/995d1d114e4e4ff708a03cdb0a975209c6197f9f/compiler-rt/test/tsan/getline_nohang.cpp#L28

It basically just calls a blocking getline in one thread while another thread tries to exit. Does this mean it's illegal to exit while there are any blocking stream calls anywhere in the program?
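Reduced to a standalone sketch (my own approximation of the linked test; the pipe setup and sleep-based synchronization are illustrative):

  /* One thread blocks in getline on a pipe that never receives data
     while the main thread calls exit.  With exit-time flushing now
     taking each stream's lock, exit can block waiting for the lock
     held by the reader thread. */
  #define _GNU_SOURCE
  #include <pthread.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  static int fds[2];

  static void *
  reader (void *arg)
  {
    (void) arg;
    FILE *stream = fdopen (fds[0], "r");
    char *line = NULL;
    size_t size = 0;
    /* Blocks forever: the write end is never written to.  */
    getline (&line, &size, stream);
    return NULL;
  }

  int
  main (void)
  {
    pthread_t t;
    pipe (fds);
    pthread_create (&t, NULL, reader, NULL);
    sleep (1);   /* let the reader block inside getline */
    exit (0);    /* may hang if exit needs the reader's stream lock */
  }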
(In reply to Dmitry Vyukov from comment #6)
> We started getting hangs on the following program:
>
> https://github.com/llvm/llvm-project/blob/995d1d114e4e4ff708a03cdb0a975209c6197f9f/compiler-rt/test/tsan/getline_nohang.cpp#L28
>
> Basically just calls a blocking getline in one thread and another thread
> tries to exit.

It's blocking on this:

  FILE *stream = fdopen(fd[0], "r");
  while (1) {
    volatile int res = getline(&line, &size, stream);
    (void)res;
  }

It's not a writable stream, so we could avoid the blocking with a more complex handshake between stdio streams and exit. I'm not sure if it's worth doing that.

We could perhaps add another flag to fopen/fdopen that indicates that the stream should not participate in fflush (NULL) or exit flushing.

For streams which are blocked in writing, POSIX does not really give us a way to make forward progress because we have to flush the unwritten data before exiting.
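For what such a flag might look like from the application side, a purely hypothetical sketch (the mode character is invented and does not exist in glibc):

  /* Purely hypothetical: 'F' is an invented placeholder for the proposed
     "do not participate in fflush (NULL) or exit-time flushing" flag;
     no such mode character exists in glibc today.  */
  FILE *stream = fdopen (fd[0], "rF");
  /* ... getline loop as in the test above ... */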
> For streams which are blocked in writing, POSIX does not really give us a
> way to make forward progress because we have to flush the unwritten data
> before exiting.

Is it really the case for this program? If a write does not happen before exit (which is the case in any such blocking), then the program cannot possibly know that the write has even started before fflush/exit, so it cannot expect the write's side effects to be flushed.

What am I missing?

> We could perhaps add another flag to fopen/fdopen that indicates that the
> stream should not participate in fflush (NULL) or exit flushing.

Should we worry about all of the existing programs that will start hanging?
(In reply to Dmitry Vyukov from comment #8)
> > For streams which are blocked in writing, POSIX does not really give us a
> > way to make forward progress because we have to flush the unwritten data
> > before exiting.
>
> Is it really the case for this program?

No, this program does not have any unflushed data to be written, hence my comment about a more complex locking protocol avoiding the issue. Exit flushing is special and not specified as equivalent to fflush (NULL), so maybe it's sufficient to put read-only streams on a separate list and flush only writable streams on exit. But it's not clear to me whether it's worth making changes here if that only fixes this LLVM test case, while the real-world issues are with applications exiting with pending unwritten data.

> If a write does not happen before exit (which is the case in any such
> blocking), then the program cannot possibly know that the write has even
> started before fflush/exit, so it cannot expect the write's side effects
> to be flushed.
>
> What am I missing?

There are cases where we must block according to POSIX. Lack of blocking is observable by another process.

> > We could perhaps add another flag to fopen/fdopen that indicates that the
> > stream should not participate in fflush (NULL) or exit flushing.
>
> Should we worry about all of the existing programs that will start hanging?

Andreas Schwab wrote this:

“This has been part of SUSE/openSUSE for several years, and I have not seen any complaints so far. It's more likely that you get a crash during the unlocked access to the streams.”

<https://inbox.sourceware.org/libc-alpha/mvmr0pptpmm.fsf@suse.de/>

This reduced my worries considerably.
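A hedged sketch of the write-side case (my own example): the stream has buffered data that exit must flush, but the pipe is already full, so the flush has to block until the other end reads:

  /* Fill the pipe with raw writes, then leave a record sitting in the
     stdio buffer.  exit must flush it, and that flush blocks on the
     full pipe; discarding the record instead would be observable by
     whoever eventually reads the other end. */
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int
  main (void)
  {
    int fds[2];
    pipe (fds);

    /* Fill the pipe so the fd cannot accept more data.  */
    fcntl (fds[1], F_SETFL, O_NONBLOCK);
    char junk[4096] = { 0 };
    while (write (fds[1], junk, sizeof junk) > 0)
      ;
    fcntl (fds[1], F_SETFL, 0);   /* back to blocking mode */

    /* Buffer a little data in stdio; it is not written out yet.  */
    FILE *out = fdopen (fds[1], "w");
    fputs ("pending record\n", out);

    exit (0);   /* the flush of 'out' blocks on the full pipe */
  }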