This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: RFC: mutexes acquired before fork() remain to be acquired by the parent process after fork()


On 12/23/2016 10:18 PM, Torvald Riegel wrote:
On Thu, 2016-12-22 at 13:38 +0100, Florian Weimer wrote:
On 12/20/2016 06:02 PM, Torvald Riegel wrote:
I'm looking for discussion and consensus about what happens when a
process is forked that has acquired mutexes.  I'm not aware of a
definite answer given by POSIX, and this affects robust mutexes in
particular.
I'm first looking for opinions and hopefully consensus within glibc, and
would then follow up with the Austin Group if necessary.

I think the most practical choice would be one of these two requirements
(I prefer R1 for reasons I'll mention below):

(R1) Any interaction of the child process with mutexes that are in an
acquired state when fork() was called in the parent is undefined.

Can you clarify if this refers to robust mutexes, or ordinary mutexes alone?

The most pressing need to do something about this is for robust mutexes.
For everything else, there is a status quo that is not obviously broken,
but still makes specific choices regarding ownership.

Agreed.

Forking with acquired locks is quite common

Do you have examples?  Note that R1 and R2 do not forbid forking while
holding locks; even R1 only forbids touching them after forking.

and we do it in glibc as
well, to support malloc after fork even if the parent process was
multi-threaded at the time of the fork.

What we do in glibc is an entirely different question.  This RFC is
about what we need to guarantee to the user.

I believe interposable mallocs do something similar to support calling malloc after a fork of a multi-threaded process.

Even if we make an exception for that, it's somewhat worrying that this issue also affects recursive and error-checking mutexes (due to the embedded TID). This means that normal mutexes provide a superset of the functionality of those mutexes, and that's rather counter-intuitive.

But perhaps I'm mistaken and mutex reinitialization in the child process covers all cases where the mutex is not effectively shared.
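
To make the embedded-TID concern concrete, here is a minimal sketch that locks an error-checking mutex, forks, and reports what pthread_mutex_unlock() returns in the child. The function name child_unlock_result is hypothetical. The result is deliberately not asserted to be a particular value: under R1 the child's call would be undefined, and the sketch only probes what a given implementation happens to do (an implementation that compares the stored owner against the child's new TID would be expected to return EPERM).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical probe: lock an error-checking mutex, fork, and return the
 * value that pthread_mutex_unlock() produced in the child (0, EPERM, ...).
 * Returns -1 on setup failure or if the child did not exit normally.
 */
int child_unlock_result(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK) != 0 ||
        pthread_mutex_init(&m, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    if (pthread_mutex_lock(&m) != 0) {
        pthread_mutex_destroy(&m);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        pthread_mutex_unlock(&m);
        pthread_mutex_destroy(&m);
        return -1;
    }
    if (pid == 0) {
        /* Child: the mutex was acquired by the parent's thread before fork.
           Touching it here is exactly the case R1 would leave undefined. */
        int rc = pthread_mutex_unlock(&m);
        _exit(rc & 0xff);
    }

    int status;
    pid_t waited = waitpid(pid, &status, 0);

    /* Parent: still the owner, so this unlock is well defined. */
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);

    if (waited != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Replacing PTHREAD_MUTEX_ERRORCHECK with PTHREAD_MUTEX_NORMAL in the same probe is a quick way to compare the two mutex kinds on a given system.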

I'm not aware of any uses of robust mutexes in glibc.

The liveness detection for the nscd mapping will eventually have to use them (or something very similar) because the set_tid_address hack which is currently used (see sysdeps/unix/sysv/linux/nscd_setup_thread.c) does not work anymore.
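
For readers unfamiliar with how a robust mutex can serve as a liveness signal, here is a minimal sketch assuming Linux and glibc. It is not the nscd design, only an illustration of the general mechanism: a process-shared robust mutex lives in shared memory, a child acquires it and exits without unlocking, and the parent's next pthread_mutex_lock() is expected to report EOWNERDEAD. The names init_shared_robust and detect_dead_owner are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Initialize *m as a process-shared, robust mutex. Returns 0 on success. */
static int init_shared_robust(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    int rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0)
        rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc == 0 ? 0 : -1;
}

/*
 * Returns 0 if the parent saw EOWNERDEAD after the child died while
 * holding the mutex and could mark the mutex consistent again;
 * returns a nonzero value otherwise.
 */
int detect_dead_owner(void)
{
    pthread_mutex_t *m = mmap(NULL, sizeof *m, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;
    if (init_shared_robust(m) != 0) {
        munmap(m, sizeof *m);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        pthread_mutex_destroy(m);
        munmap(m, sizeof *m);
        return -1;
    }
    if (pid == 0) {
        /* Child: acquire the mutex after fork, then die while holding it. */
        pthread_mutex_lock(m);
        _exit(0);
    }

    int result = -1;
    if (waitpid(pid, NULL, 0) == pid) {
        int rc = pthread_mutex_lock(m);
        if (rc == EOWNERDEAD) {
            /* The previous owner is gone. Protected state would be
               repaired here before declaring the mutex usable again. */
            result = pthread_mutex_consistent(m) == 0 ? 0 : -1;
            pthread_mutex_unlock(m);
        } else if (rc == 0) {
            /* Lock acquired without any indication of a dead owner. */
            pthread_mutex_unlock(m);
            result = 1;
        }
    }

    pthread_mutex_destroy(m);
    munmap(m, sizeof *m);
    return result;
}
```

Note that the child acquires the mutex after fork() here, so the sketch sidesteps the question this thread is about, namely what should happen to a robust mutex that was already acquired when fork() was called.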

The dynamic linker uses a recursive mutex if I deciphered all those indirections correctly. I don't think we handle it quite correctly because we simply reinitialize it in the child.

(R2) Any mutexes that are in an acquired state when fork() was called in
the parent remain locked by the parent process.


If a mutex is process-shared, it should not have two owners after fork()
because this is against the whole principle of exclusive-ownership
mutexes.

I think you refer to the effectively process-shared case here, and not
just mutexes with a process-shared attribute.

No, generally.  See this paragraph in my previous email:

That leaves non-process-shared mutexes and mutexes that are of the
process-shared kind but are not actually shared.  However, I think we
should discard the latter distinction because it's too hard for
implementations to efficiently track which mutexes are actually shared
and which aren't; a process-shared kind mutex should just be assumed to
be process-shared.  So, the remaining question is whether there is a
need to treat process-private mutexes differently.

Do you think it's practical to figure out which regions of memory are
shared?  We'd really have to query the kernel to figure this out.

We'd need kernel help for this. I think it's possible in theory, with limited performance impact on fork operations and no impact anywhere else. But I don't think the added complexity is desirable.

I disagree with this conclusion.  pthread_atfork handlers typically
acquire locks to ensure that the parent process is in a specific state,
and release them in the parent and child after the fork.  With R1, this
common pattern is suddenly undefined, which is not what we want, I think.
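
The pattern described above can be sketched as follows. This is a minimal illustration with hypothetical names (state_lock, state_counter, fork_with_atfork_unlock) and a default, process-private mutex; it shows the status quo that would become undefined under R1, and makes no claim about what POSIX guarantees for the unlock in the child.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical application state guarded by a process-private mutex. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static int state_counter;

/* prepare: runs in the parent before fork, so that no other thread can
   leave the guarded state half-updated at the moment of the fork. */
static void atfork_prepare(void) { pthread_mutex_lock(&state_lock); }

/* parent: runs in the parent after fork and releases the lock. */
static void atfork_parent(void) { pthread_mutex_unlock(&state_lock); }

/* child: runs in the child after fork and releases the inherited lock.
   This is the step that R1 would make undefined. */
static void atfork_child(void) { pthread_mutex_unlock(&state_lock); }

/* Returns 0 if both parent and child could use the lock after fork. */
int fork_with_atfork_unlock(void)
{
    static int registered;
    if (!registered) {
        if (pthread_atfork(atfork_prepare, atfork_parent, atfork_child) != 0)
            return -1;
        registered = 1;
    }

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: try to use the lock that atfork_child() released. */
        int rc = pthread_mutex_lock(&state_lock);
        if (rc == 0) {
            state_counter++;
            rc = pthread_mutex_unlock(&state_lock);
        }
        _exit(rc == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    /* Parent: the lock was released by atfork_parent(). */
    if (pthread_mutex_lock(&state_lock) != 0)
        return -1;
    state_counter++;
    pthread_mutex_unlock(&state_lock);

    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```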

So, this must be a multi-threaded parent or you'd know exactly which
state you're in.  POSIX states this in the fork() description:

If a multi-threaded process calls fork(), the new process shall contain
a replica of the calling thread and its entire address space, possibly
including the states of mutexes and other resources. Consequently, to
avoid errors, the child process may only execute async-signal-safe
operations until such time as one of the exec functions is called.  Fork
handlers may be established by means of the pthread_atfork() function in
order to maintain application invariants across fork() calls.

This already says that you can neither lock nor unlock a mutex in the
child process.

Maybe we can support lock reinitialization in the child? I think this should cover the majority of pthread_atfork use cases. It would work nicely if we start the child process with an empty robust mutex list, too.
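
A minimal sketch of what reinitialization in the child could look like from the application's side, using hypothetical names (table_lock, init_table_lock, fork_with_reinit). Instead of unlocking, the child handler discards the inherited mutex state and initializes the mutex again. POSIX currently leaves initializing an already initialized mutex undefined, so this sketch assumes exactly the guarantee that the suggestion above would add; an error-checking mutex is used because that is one of the kinds with an embedded owner TID.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock protecting some application table. */
static pthread_mutex_t table_lock;

/* (Re)initialize table_lock as an error-checking mutex. */
static int init_table_lock(void)
{
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    int rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(&table_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc == 0 ? 0 : -1;
}

static void reinit_prepare(void) { pthread_mutex_lock(&table_lock); }
static void reinit_parent(void)  { pthread_mutex_unlock(&table_lock); }

/* Child: do not unlock. Throw away the inherited state, including the
   parent's owner TID, and start from a freshly initialized mutex.
   Assumption: the implementation permits this reinitialization. */
static void reinit_child(void) { init_table_lock(); }

/* Returns 0 if the child could lock and unlock the reinitialized mutex. */
int fork_with_reinit(void)
{
    static int ready;
    if (!ready) {
        if (init_table_lock() != 0 ||
            pthread_atfork(reinit_prepare, reinit_parent, reinit_child) != 0)
            return -1;
        ready = 1;
    }

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int rc = pthread_mutex_lock(&table_lock);
        if (rc == 0)
            rc = pthread_mutex_unlock(&table_lock);
        _exit(rc == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

The prepare handler still takes the lock in the parent, so the data guarded by table_lock is in a known state at the time of the fork; only the mutex object itself is rebuilt in the child.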

Thanks,
Florian

