Bug 1148 - Race condition between fork() and exit() when using pthread_atfork() from a shared library
Status: RESOLVED INVALID
Alias: None
Product: glibc
Classification: Unclassified
Component: nptl
Version: 2.3.5
Importance: P2 normal
Target Milestone: ---
Assignee: Ulrich Drepper
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-08-02 00:56 UTC by Andrew Suffield
Modified: 2019-04-10 10:07 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Description Andrew Suffield 2005-08-02 00:56:32 UTC
This is Debian bug #223110, still present in Debian release 2.3.5-2. I've looked
through the cvs changelogs and the relevant code (it looks broken to me) but not
tried it against a current cvs build yet.

I've attached this to nptl but it probably afflicts linuxthreads too; see below.

Minimised test case:

--8<--- foo.c ----------------
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/types.h>
#include <stdlib.h>

void exit_on_signal(int signr)
{
  fprintf(stderr, "Exiting on signal from child\n");
  exit(0);
}

extern void foo(void);

int main(void)
{
  foo();
  signal(SIGUSR2,exit_on_signal);
  pid_t parent = getpid();
  if (fork() == 0)
    kill(parent, SIGUSR2);
  else
    sleep(10);
  return 0;
}
------------------------------

--8<--- libfoo.c -------------
#include <pthread.h>

void
do_prepare(void)
{
}

void
do_child(void)
{
}

void
foo(void)
{
  pthread_atfork(&do_prepare, NULL, &do_child);
}
------------------------------

gcc -shared -o libfoo.so libfoo.c -pthread
gcc -o foo foo.c -L. -lfoo
LD_LIBRARY_PATH=. ./foo

This program should exit, but instead it hangs inside exit(). This is a race
condition, although I've never seen it avoid the hang on a 2.6 kernel. It doesn't
appear to be specific to nptl, since it also hangs with linuxthreads, but the
rest of this mail deals with the nptl version; I haven't investigated what's
going on with linuxthreads.

Here's my analysis of the problem (dates from libc 2.3.2, but I don't think
anything significant has changed):

Enter main()
 -> Enter foo()
     -> pthread_atfork() registers the handlers (it doesn't matter
        which ones are present; I think three NULLs will still break),
        and associates them with libfoo.so. refcntr on this handler is
        initialised to 1
 -> fork()
     -> Enter __libc_fork() (in nptl/sysdeps/unix/sysv/linux/fork.c)
         -> Call do_prepare()
         -> Increment refcntr on the atfork handler (refcntr == 2)
         -> Invoke the fork syscall
       child -> Call do_child()
             -> Decrement refcntr on the atfork handler (refcntr == 1)
 -> Send signal SIGUSR2 to the parent
 -> Exit
parent -> Enter exit_on_signal()
           -> Enter exit()
               ...
               -> Unload libfoo
                   -> Call __unregister_atfork() for libfoo (in
                      nptl/sysdeps/unix/sysv/linux/unregister-atfork.c)
                       -> Decrement refcntr on the atfork handler (refcntr == 1)
                       -> Wait for refcntr to reach zero

This condition will never be true. __libc_fork() incremented refcntr
on the atfork handler, but will never decrement it because in order
for that to happen, the signal handler would have to return, which
would require exit() to return. __unregister_atfork() will hang
waiting for this variable to reach zero.

Note that the hang requires that the parent has not yet returned from
the fork syscall when the child's signal arrives. That is the race
condition: the child must send the signal almost right away.
Comment 1 Ulrich Drepper 2005-08-02 01:19:36 UTC
You're not allowed to call exit from a signal handler.