This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

RFC: mutexes acquired before fork() remain to be acquired by the parent process after fork()

From: Torvald Riegel <triegel at redhat dot com>
To: GLIBC Devel <libc-alpha at sourceware dot org>
Cc: Rich Felker <dalias at libc dot org>, "Carlos O'Donell" <codonell at redhat dot com>, Florian Weimer <fweimer at redhat dot com>
Date: Tue, 20 Dec 2016 18:02:17 +0100
Subject: RFC: mutexes acquired before fork() remain to be acquired by the parent process after fork()

I'm looking for discussion and consensus about what happens when a
process is forked that has acquired mutexes. I'm not aware of a
definite answer given by POSIX, and this affects robust mutexes in
particular.
I'm first looking for opinions and hopefully consensus within glibc, and
would then follow up with the Austin Group if necessary.

I think the most practical choice would be one of these two requirements
(I prefer R1 for reasons I'll mention below):

(R1) Any interaction of the child process with mutexes that are in an
acquired state when fork() was called in the parent is undefined.

(R2) Any mutexes that are in an acquired state when fork() was called in
the parent remain locked by the parent process.

If a mutex is process-shared, it should not have two owners after fork()
because this is against the whole principle of exclusive-ownership
mutexes. It would also be hard to implement for the error-checking and
PI mutex kinds.
Thus, there should be just one owner, and there is no reason to prefer
the child over the parent. Both R1 and R2 would make no difference for
the parent process, only the child process would have to take care.
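To illustrate what "only the child has to take care" could look like in
application code, here is a minimal sketch. The names demo_lock and
fork_while_holding_lock are made up for this example, and the exit status
7 is arbitrary. The parent forks while holding a mutex; the child never
touches that mutex and only calls _exit(), so the sketch is valid under
both R1 and R2.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical process-private mutex used only for this illustration. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Fork while holding demo_lock.  Returns the child's exit status,
   or -1 if anything went wrong. */
int fork_while_holding_lock(void)
{
    if (pthread_mutex_lock(&demo_lock) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        pthread_mutex_unlock(&demo_lock);
        return -1;
    }

    if (pid == 0) {
        /* Child: demo_lock was in an acquired state when fork() was
           called.  Under R1 any interaction with it is undefined, and
           under R2 it is still owned by the parent, so the child does
           not lock, unlock, or destroy it.  It only calls an
           async-signal-safe function and leaves. */
        _exit(7);
    }

    /* Parent: nothing changes; it still owns the mutex and unlocks it
       as usual. */
    int unlock_rc = pthread_mutex_unlock(&demo_lock);

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || unlock_rc != 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would simply invoke fork_while_holding_lock() and check that it
returns the child's exit status.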

That leaves non-process-shared mutexes and mutexes that are of the
process-shared kind but are not actually shared. However, I think we
should discard the latter distinction because it's too hard for
implementations to efficiently track which mutexes are actually shared
and which aren't; a process-shared kind mutex should just be assumed to
be process-shared. So, the remaining question is whether there is a
need to treat process-private mutexes differently.

Process-private mutexes in an acquired state could more easily be
"duplicated" because they are separate objects in the parent and child
process. However, this would mean that the glibc implementations of
recursive, error-checking, robust, and PI mutexes have to change because
these rely on TID to determine ownership. Recursive and error-checking
would have to just use a different ID, but robust and PI rely on kernel
functionality and can't simply use a different ID than TID; rewriting
the owner field of all those mutexes in the child process might be the
most practical solution, yet even that requires keeping a list of all
of them, which is hardly practical.
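The reason a TID-based owner field cannot simply be carried over can be
seen in a small Linux-specific sketch. The helper name child_tid_differs
is made up for this example, and it reads the TID with the raw gettid
system call rather than through any glibc-internal field. The thread in
the child gets a new TID, so an owner field copied from the parent would
not identify any thread in the child.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child's TID differs from the forking thread's TID,
   0 if they are equal, and -1 on error.  Linux-specific. */
int child_tid_differs(void)
{
    long parent_tid = syscall(SYS_gettid);
    int fds[2];

    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: report the TID of its only thread through the pipe,
           using only async-signal-safe calls. */
        long tid = syscall(SYS_gettid);
        ssize_t n = write(fds[1], &tid, sizeof tid);
        _exit(n == (ssize_t)sizeof tid ? 0 : 1);
    }

    /* Parent: read the child's TID and reap the child. */
    close(fds[1]);
    long child_tid = 0;
    ssize_t n = read(fds[0], &child_tid, sizeof child_tid);
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || n != (ssize_t)sizeof child_tid)
        return -1;

    return child_tid != parent_tid;
}
```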

I'm not aware of explicit wording in POSIX for what should happen to
mutex owners when fork() is called. It is said that a mutex is owned by
the thread that acquires it
(http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap03.html#tag_03_229),
which would align with specifying that the parent process remains the
owner of mutexes.
The fork() spec
(http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html)
states that file locks are not inherited by the child process, which
also aligns with not letting child processes be owners of mutexes.
There is also a statement that for multi-threaded programs, the child is
a replica of the parent, "possibly including the states of mutexes", and
that this may mean the child needs to only call signal-safe functions.

POSIX states in the rationale that fork() is only used to either create
(something like) a new thread or to call exec(). Both align well with
letting only the parent be the owner of a mutex.
It also states: "When a programmer is writing a multi-threaded program,
the first described use of fork(), creating new threads in the same
program, is provided by the pthread_create() function. The fork()
function is thus used only to run new programs, and the effects of
calling functions that require certain resources between the call to
fork() and the call to an exec function are undefined." Even though
this ignores the possibility of acquiring mutexes in a single-threaded
program, it states that requiring resources (e.g., attempting to lock a
mutex) between fork() and exec() is undefined -- which would align well
with R1.
It also explains that a forkall() idea was rejected that would have
"allow[ed] locks and the state to be preserved without explicit
pthread_atfork() code"; this is again an indication that R1 is the
intent or compatible with the intent.

I think that R1 is preferable over R2 because R2 is in a tough spot when
the parent process terminates: When the parent terminates, its TID may
be reused by the child for a newly created thread; so, there is an ABA
problem, which we can only solve by rewriting mutex owners in the child
after fork (which is not practical).

Does anyone disagree with R1? (Stating your agreement would be helpful
too for determining consensus.)

If we can agree on R1, then we can fix a current robust mutex problem
(https://sourceware.org/bugzilla/show_bug.cgi?id=19402) by simply
clearing the head of the robust-mutex list in the child before
registering the list with the kernel. We currently don't do that, and
it can mess up the list and prevent the kernel from recovering robust
mutexes (in particular those acquired by the child after fork()) when
the child dies.
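As a rough illustration of "clearing the head of the robust-mutex list",
here is a sketch that uses stand-in types modelled on the kernel's
robust-list layout, in which an empty list is a head whose next pointer
refers back to the head itself. All names prefixed with demo_ are made up
for this example; this is not glibc's internal code, and the sketch does
not register anything with the kernel.

```c
#include <stddef.h>

/* Hypothetical stand-ins modelled on the kernel's robust-list layout;
   these are not glibc's internal definitions. */
struct demo_robust_list {
    struct demo_robust_list *next;
};

struct demo_robust_list_head {
    struct demo_robust_list list;             /* circular list of held mutexes */
    long futex_offset;                        /* offset from entry to futex word */
    struct demo_robust_list *list_op_pending; /* lock/unlock in progress, if any */
};

/* Make the list empty: the head points to itself and no operation is
   pending.  Entries inherited from the parent are simply dropped. */
void demo_reset_robust_list(struct demo_robust_list_head *head)
{
    head->list.next = &head->list;
    head->list_op_pending = NULL;
}

/* Returns 1 if the list has no entries and nothing is pending. */
int demo_robust_list_is_empty(const struct demo_robust_list_head *head)
{
    return head->list.next == &head->list && head->list_op_pending == NULL;
}

/* Build a head that still references one "inherited" entry, reset it,
   and report whether it ended up empty (1) or not (0). */
int demo_reset_empties_list(void)
{
    struct demo_robust_list_head head;
    struct demo_robust_list inherited;

    inherited.next = &head.list;
    head.list.next = &inherited;
    head.futex_offset = 0;
    head.list_op_pending = &inherited;

    demo_reset_robust_list(&head);
    return demo_robust_list_is_empty(&head);
}
```

In a real implementation the reset would have to happen in the child
before the list is registered with the kernel (on Linux via the
set_robust_list system call), as described above; where exactly that
belongs in glibc's fork path is not something this sketch tries to show.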

Follow-Ups:
- Re: RFC: mutexes acquired before fork() remain to be acquired by the parent process after fork()
  - From: Florian Weimer
