This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH 3/3] Refactor atfork handlers
On 23/02/2018 07:41, Florian Weimer wrote:
> On 02/20/2018 03:23 PM, Adhemerval Zanella wrote:
>
>> Aside from the two scenarios (callbacks issuing fork/pthread_atfork), the only
>> other scenario I see which might trigger a deadlock in this case is a signal
>> handler issuing fork/pthread_atfork.
>>
>> The former is BZ#4737, and my understanding is that it should be resolved as
>> EWONTFIX, since there are indications that a future POSIX specification will
>> treat fork as async-signal-unsafe (comment #19; I am also not sure fork could
>> be made async-signal-safe with ticket locks, as Rich stated in comment #21).
>>
>> Regarding the latter, I think pthread_atfork is inherently async-signal-unsafe,
>> because it might return ENOMEM, which indicates it might allocate memory, and
>> our malloc is also async-signal-unsafe.
>>
>> Am I missing a scenario you might be considering?
>
> I looked at the locks acquired during fork, and you are right: the corner cases where a deadlock can happen in the upstream sources are quite obscure. However, we do not currently acquire any ld.so locks, and I think I've seen patches which change that (because upstream is buggy and crashes in the new child process). If any ld.so locks are acquired around fork, then we have a lock ordering conflict in case an ELF constructor calls pthread_register_atfork (which is an extremely natural thing to do), like this:
>
> Fork:
>
> pthread_register_atfork lock
> rtld load lock
>
> dlopen:
>
> rtld load lock
> calling ELF constructors, and then:
> pthread_register_atfork lock
>
> The older lock-free code avoids this. You could do the same even with locks if you created a copy of the handler list on the heap.
My understanding is that ld.so locks might be acquired in the callbacks invoked
from __run_fork_handlers:
  fork:
    __run_fork_handlers (atfork_run_prepare)
      lll_lock (atfork_lock)
        <callback>
          rtld load lock
However, I do not see how a dlopen in a different thread would acquire the same
lock, since it has already been acquired by the callback. The only way is if
dlopen is called from a signal handler, which I think is another obscure
corner case.
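
For reference, the "copy of the handler list on the heap" idea mentioned above
could look roughly like the sketch below. It is only a toy model: every toy_*
name is hypothetical and none of this is the actual glibc code or the patch
under review. It assumes a fixed-size table and a plain, non-recursive pthread
mutex, and it ignores what a real fork would need on top of this (keeping the
list consistent in the child, and the fact that calling malloc at this point
has its own problems). The only point is that the callbacks run after the
registry lock is dropped, so a callback may register another handler without
deadlocking on that lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy registry; these names are NOT glibc internals.  */
typedef void (*toy_handler_fn) (void);

#define TOY_MAX_HANDLERS 16

static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static toy_handler_fn toy_handlers[TOY_MAX_HANDLERS];
static size_t toy_count;
static int toy_calls;

/* Register a handler.  Returns 0 on success, -1 if the table is full.  */
int
toy_register (toy_handler_fn fn)
{
  int ret = -1;
  pthread_mutex_lock (&toy_lock);
  if (toy_count < TOY_MAX_HANDLERS)
    {
      toy_handlers[toy_count++] = fn;
      ret = 0;
    }
  pthread_mutex_unlock (&toy_lock);
  return ret;
}

/* Run the handlers in reverse registration order, working from a heap
   snapshot so that toy_lock is NOT held while a callback executes.  */
int
toy_run_prepare (void)
{
  pthread_mutex_lock (&toy_lock);
  size_t n = toy_count;
  assert (n <= TOY_MAX_HANDLERS);
  /* Allocate at least one element so a zero-length list still works.  */
  toy_handler_fn *copy = malloc ((n ? n : 1) * sizeof *copy);
  if (copy != NULL)
    memcpy (copy, toy_handlers, n * sizeof *copy);
  pthread_mutex_unlock (&toy_lock);

  if (copy == NULL)
    return -1;
  for (size_t i = n; i > 0; i--)
    copy[i - 1] ();
  free (copy);
  return 0;
}

static void
toy_noop (void)
{
  toy_calls++;
}

/* A callback that registers another handler.  With the lock held across
   the callbacks this would self-deadlock on a non-recursive mutex.  */
static void
toy_reentrant (void)
{
  toy_calls++;
  toy_register (toy_noop);
}

/* Returns the number of registered handlers after one prepare pass,
   or -1 if the snapshot could not be allocated.  */
int
toy_demo (void)
{
  toy_register (toy_reentrant);
  if (toy_run_prepare () != 0)
    return -1;
  return (int) toy_count;
}
```

On a fresh process, the first call to toy_demo () registers one handler, runs
it from the snapshot, and that handler registers a second one, so the call
returns 2. A handler added during the pass is not part of the snapshot and
would only run on the next pass.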