This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

fork bug


The attached code tickles a bug in the implementation of fork on the
Hurd.

The scenario is as follows: a process is invoked which immediately
calls fork.  (In the example, although the initial process is linked
against libpthread, the new process never actually uses it with the
exception of the implicit initialization code which, in this case,
does not change anything.)  At this point, it has a signal thread and
a main thread (that is, the visible thread).  The fork code frees the
signal thread's sigstate structure, however it does not clear the
pointer at the base of the stack (c.f. <hurd/threadvar.h>
_HURD_THREADVAR_SIGSTATE).  As such, the signal thread believes that
the pointer is still valid and does not allocate a new sigstate
structure in its next call to _hurd_thread_sigstate.

This bug does not become obvious until later when the child process
creates a thread which immediately causes a segmentation fault.  The
signal thread receives the message and attempts to process it.  It
sees that the faulter thread's sigstate is not allocated and proceeds
to allocate it.  It just so happens that the pointer to the returned
memory block is the one that the signal thread is (wrongly) using for
its own sigstate structure.  At this point, all chaos ensures as the
signal thread and the faulter share the same state.  Eventually, we
see a dead lock in the expansion of __USEPORT in line 768 of
libc/hurd/hurdsig.c:

  err = __USEPORT (PROC, __proc_dostop (port, _hurd_msgport_thread));

Just before this call, the faulter's sigstate is locked.  During this
expansion, the signal thread will try to lock "its" sigstate in
_hurd_critical_section_unlock.

I tested this with both the current version of glibc in Debian
(2.2.5-13) and with a version from CVS.  The last change log entry
being:

2002-09-08  Roland McGrath  <roland@redhat.com>

        * resolv/resolv.h: Include <sys/types.h> for u_long even in
        the [__need_res_state] case.
        Reported by Bruno Haible <bruno@clisp.org>.


Rather than clear the signal thread's sigstate pointer, I thought it
better to not deallocate it.  Here is a patch which does just that and
also updates the comments to reflect reality a bit more accurately:


2002-09-19  Neal H. Walfield  <neal@cs.uml.edu>

	* sysdeps/mach/hurd/fork.c (__fork): Do not free the signal
	thread's sigstate data structure if it has been allocated.


Index: fork.c
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/mach/hurd/fork.c,v
retrieving revision 1.53
diff -u -r1.53 fork.c
--- fork.c	20 Jun 2002 22:43:02 -0000	1.53
+++ fork.c	20 Sep 2002 02:06:00 -0000
@@ -609,14 +609,17 @@
       for (i = 0; i < _hurd_nports; ++i)
 	__spin_unlock (&_hurd_ports[i].lock);
 
-      /* We are the only thread in this new task, so we will
-	 take the task-global signals.  */
+      /* We are one of the (exactly) two threads in this new task, we
+	 will take the task-global signals.  */
       _hurd_sigthread = ss->thread;
 
-      /* Unchain the sigstate structures for threads that existed in the
-	 parent task but don't exist in this task (the child process).
-	 Delay freeing them until later because some of the further setup
-	 and unlocking might be required for free to work.  */
+      /* Claim our sigstate structure and unchain the rest: the
+	 threads existed in the parent task but don't exist in this
+	 task (the child process).  Delay freeing them until later
+	 because some of the further setup and unlocking might be
+	 required for free to work.  Before we finish cleaning up,
+	 we will reclaim the signal thread's sigstate structure (if
+	 it had one).  */
       oldstates = _hurd_sigstates;
       if (oldstates == ss)
 	oldstates = ss->next;
@@ -650,13 +653,26 @@
       if (!err)
 	err = __thread_resume (_hurd_msgport_thread);
 
-      /* Free the old sigstate structures.  */
+      /* Reclaim the signal thread's sigstate structure and free the
+	 other old sigstate structures.  */
       while (oldstates != NULL)
 	{
 	  struct hurd_sigstate *next = oldstates->next;
-	  free (oldstates);
+
+	  if (oldstates->thread == _hurd_msgport_thread)
+	    {
+	      /* If we have a second signal state structure then we
+		 must have been through here before--not good.  */
+	      assert (_hurd_sigstates->next == 0);
+	      _hurd_sigstates->next = oldstates;
+	      oldstates->next = 0;
+	    }
+	  else
+	    free (oldstates);
+
 	  oldstates = next;
 	}
+
       /* XXX what to do if we have any errors here? */
 
       pid = 0;



Here is the test program:

#define _GNU_SOURCE

#include <pthread.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <error.h>
#include <assert.h>
#include <sys/resource.h>
#include <sys/wait.h>

void *
thr (void *arg)
{
  * (int *)0 = 0;
  return 0;
}

int foobar;

int
main (int argc, char *argv[])
{
  error_t err;
  pid_t child;

  struct rlimit limit;

  limit.rlim_cur = 0;
  limit.rlim_max = 0;

  err = setrlimit (RLIMIT_CORE, &limit);
  if (err)
    error (1, err, "setrlimit");

  child = fork ();
  switch (child)
    {
    case -1:
      error (1, errno, "fork");
      break;

    case 0:
      {
	pthread_t tid;
	void *ret;

	err = pthread_create (&tid, 0, thr, 0);
	if (err)
	  error (1, err, "pthread_create");

	err = pthread_join (tid, &ret);
	assert_perror (err);

	/* Should have never returned.  Our parent expects us to fail
	   thus we succeed and indicate the error.  */
	return 0;
      }

    default:
      {
	pid_t pid;
	int status;

	pid = waitpid (child, &status, 0);
	printf ("pid = %d; child = %d; status = %d\n", pid, child, status);
	assert (pid == child);
	assert (status != 0);
      }
    }

  return 0;
}


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]