This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: RFC: Deadlock in multithreaded application when doing IO and fork.


I think it certainly makes sense for the malloc atfork handlers always to
run last.  Otherwise a user atfork handler can produce deadlock with the
user's own code in other threads in ways that don't really seem to be under
the user's control.  For example:

T1					T2
					takes user lock L
fork					
	malloc atfork takes malloc locks
	user atfork blocks locking L
					calls malloc
						blocks on malloc locks

In fact, it's pretty easy to set this up so it always deadlocks, not even
needing a race (e.g. T2 creates T1 after locking L and then uses a
pthread_barrier to wait for T1 to enter its atfork handler before T2 calls
malloc).  That seems like a good test case to write, since I think you can
write one like this and pretty easily see that it is POSIX-compliant.

With user atfork handlers that call malloc themselves, it can probably get
even weirder.  (Calling malloc in an atfork handler seems like a bad idea
all around, but AFAICT it is kosher under POSIX.)

It's less clear whether that's really sufficient for all kosher scenarios.
If I understood you correctly, the scenario you cited is one that in the
best case would lead to a crash.  That is, the user has provoked undefined
behavior.  In that case, it's as kosher to deadlock as it is to crash
coherently with malloc assertions, albeit much less useful.  So we don't
have a hard mandate to avoid those deadlock cases.  Hence it's a tradeoff
of difficulty, maintenance burden, and performance hits vs being extra nice
in helping people diagnose their own bugs.  Similarly, setting malloc hooks
is something that really requires knowing about subtleties and internal
implementation issues already and probably always will, so putting the onus
on people who write their own malloc hooks (and especially people who think
that using malloc hooks in a multithreaded program is something they should
be doing) is fine.

I'm not all that clear on the details of the further mitigations you
suggest after the atfork change.  I think the right things to do are
(in this order):
1. write the aforementioned test and verify it always deadlocks
2. fix that test by making the malloc atfork handlers always run last
3. commit those
4. reconsider remaining undesirable scenarios in that context


Thanks,
Roland

