This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH] malloc: Run fork handler as late as possible [BZ #19431]
- From: Florian Weimer <fweimer at redhat dot com>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Tue, 12 Apr 2016 21:15:16 +0200
- Subject: Re: [PATCH] malloc: Run fork handler as late as possible [BZ #19431]
- Authentication-results: sourceware.org; auth=none
- References: <56BBAF3D dot 5030905 at redhat dot com> <1455559833 dot 20971 dot 11 dot camel at localhost dot localdomain> <56C5EE32 dot 1020605 at redhat dot com> <1460485015 dot 3869 dot 267 dot camel at localhost dot localdomain>
On 04/12/2016 08:16 PM, Torvald Riegel wrote:
On Thu, 2016-02-18 at 17:15 +0100, Florian Weimer wrote:
On 02/15/2016 07:10 PM, Torvald Riegel wrote:
It explicitly encodes the lock order in the implementations of fork,
and does not rely on the registration order, thus avoiding the deadlock.
I couldn't test the Hurd bits, but the changes look straightforward enough.
Are those changes, and thus the new scheme, documented anywhere?
Fork handlers used by the implementation should be invisible to
applications. The old one wasn't, and this was a bug.
But have you documented this and your understanding of the
synchronization scheme anywhere? Your explanation in the email with the
patch seems more detailed than the comments in the code.
That's because I'm talking mainly about removed code and explaining a
bug. I'm not sure how this information will be useful to future
developers once the bug is gone. We have a regression test, which
should avoid reintroducing precisely the same bug. For similar issues,
we need to rely on code review and collective memory.
I think it's really worthwhile to always update and improve the
documentation, especially if we're kind of (re)discovering knowledge
about the current implementation such as in this case. I know this is
extra work, but if this isn't documented, it's less likely anybody but
you will be aware of how this all works.
Do you mean something like this inside the fork implementation?
/* Acquire malloc locks. This needs to come last because fork
handlers may use malloc, and the libio list lock has an indirect
malloc dependency as well (via the getdelim function). */
Not quite. IIRC this is one comment in one part of the code, but it's
not easy to find it from the other parts of code that are related.
One of the main motivations to document the synchronization scheme is to
avoid requiring future developers that want to modify the code to
closely review all the code (eg, to look for comments such as the one
The cycle isn't there anymore, so it would not particularly helpful to
mention fork handlers in a comment on _IO_flush_all_lockp.
I would even go one step further: It would be misleading because it
implies that we are confident about the libio locking behavior and lack
of cycles. But libio calls user-supplied callbacks (set by fopencookie)
while implementation locks are acquired (again in _IO_flush_all_lockp).
Diagnostic functions in the malloc subsystem call into libio (via
stdio streams) while malloc locks are acquired, and libio may in turn
use malloc. I'm sure I can find more examples with more poking.
(The thinking is that, simplified, in sequential code it's
sufficient to look at the callers or the contract of your function,
whereas in concurrent code you don't easily know who is actually
affected by a particiular lock acquisition order.)
With callbacks, you can't be sure even with sequential code.
This is made easier if there's an overview of the synchronization
somewhere in the code, and it is referenced in affected parts of the
I can add some comments to libio with what I have found, but this
doesn't really related to the locking cycle I removed (because again,
it's no longer there).
diff --git a/malloc/malloc-private.h b/malloc/malloc-private.h
new file mode 100644
Should this perhaps be malloc-internal.h to match libc-internal.h?
We also have libio/libioP.h, and include/malloc.h. But I think a
separate header makes sense if its contents is truly internal.
What are the preferences here?
I think having an internal header is good, but I can't give a definitive
reply regarding what the project prefers.
See my reply to Roland. I was under the impression you objected to the