[PATCH] Fix the race between atexit() and exit()

Anoop acv@linux.vnet.ibm.com
Mon Sep 15 13:09:00 GMT 2008


Carlos O'Donell wrote:
> On Tue, Jun 3, 2008 at 8:08 AM, Anoop <acv@linux.vnet.ibm.com> wrote:
>   
>> POSIX requires that at normal program termination, all functions
>> registered by the atexit() function shall be called. A race exits among
>> __cxa_atexit() and exit() on __exit_funcs can cause a successfully
>> registered handler not to be called. I have a test-case which reveals this
>> bug, which can be posted if required. The following patch solves this by
>> extending the locking
>> used by __new_exitfn() to exit() and failing atexit() registrations once the
>> exit() code completed traversing the __exit_funcs list.
>>
>> Feedbacks appreciated.
>>     
>
> 1. Describe in detail the conditions of the race and your proposed fix.
> 2. Provide the test case.
> 3. Run the testsuite for your target to show there were no
> regressions. Mention the target.
> 4. When quoting POSIX  please provide references.
>
> Cheers,
> Carlos.
>   

This has been further discussed on libc-help

where the patch is explained and a testcase is given for problem recreation.
http://sourceware.org/ml/libc-help/2008-06/msg00011.html
http://sourceware.org/ml/libc-help/2008-06/msg00011.html
http://sourceware.org/ml/libc-help/2008-06/msg00011.html
http://sourceware.org/ml/libc-help/2008-06/msg00011.html
http://sourceware.org/ml/libc-help/2008-06/msg00011.html
http://sourceware.org/ml/libc-help/2008-06/msg00011.html
http://sourceware.org/ml/libc-help/2008-06/msg00011.html
http://sourceware.org/ml/libc-help/2008-06/msg00011.html
http://sourceware.org/ml/libc-help/2008-06/msg00011.html
http://sourceware.org/ml/libc-help/2008-06/msg00011.html
http://sourceware.org/ml/libc-help/2008-06/msg00011.html

and opengroup
https://www.opengroup.org/austin/mailarchives/ag/msg11495.html
https://www.opengroup.org/austin/mailarchives/ag/msg11496.html
https://www.opengroup.org/austin/mailarchives/ag/msg11498.html
https://www.opengroup.org/austin/mailarchives/ag/msg11501.html
https://www.opengroup.org/austin/mailarchives/ag/msg11502.html
https://www.opengroup.org/austin/mailarchives/ag/msg11505.html
https://www.opengroup.org/austin/mailarchives/ag/msg11506.html
https://www.opengroup.org/austin/mailarchives/ag/msg11507.html
https://www.opengroup.org/austin/mailarchives/ag/msg11508.html
https://www.opengroup.org/austin/mailarchives/ag/msg11509.html
https://www.opengroup.org/austin/mailarchives/ag/msg11510.html
https://www.opengroup.org/austin/mailarchives/ag/msg11511.html
https://www.opengroup.org/austin/mailarchives/ag/msg11515.html
https://www.opengroup.org/austin/mailarchives/ag/msg11517.html
https://www.opengroup.org/austin/mailarchives/ag/msg11518.html

The result of this discussion as I would infer is that exit() and atexit()
need to be safe with each other in the matter of keeping the list
consistent.
But it CAN'T be guaranteed that, if atexit() is called, 'during or after'
exit() is called, from a different thread, the handler that atexit() tries
to register will be invoked by exit().

The proposed patch does 2 things -
1. Keeps the list in a consistent state by protecting it with lock
2. Fails the atexit() registrations after exit() finishes processing the
list

Accommodating as many handlers as possible is the important and right thing
to do. But the "visibility problem" cant be avoided where a new list 
head appears only after exit() processing is over.
This problem should be taken care of, more by an application programmer 
than by glibc. Making the registration fail may not be of any worth as the 
thread might not get a chance to take any action afterwards, but from glibc 
perspective this is just for the sake of keeping with the standard.

I was waiting for getting myself added to the FSF assignment contract to 
post this patch, and I got it now.

No regressions were reported by the glibc tests (x86_64) before or 
after including this path.

Thanks to Ryan for helping out with the ChangeLog.

Cheers!
- Anoop


2008-09-12  Anoop V Chakkalakkal  <anoop.vijayan@in.ibm.com>

	* stdlib/cxa_atexit.c (_libc_lock_define_initialized): Parameter
	'lock' changed to global lock '__exit_funcs_lock'.
	(__new_exitfn): Fail new registration if exit code finished processing
	all handlers.
	* stdlib/exit.c: Include atomic.h, bits/libc-lock.h
	(__exit_funcs_done_): New flag to indicate status of list traversal.
	(exit): Changes the handler processing style to that of
	__cxa_finalize, with locking.
	* stdlib/exit.h: Include bits/libc-lock.h
	(__exit_funcs_lock): Declaration for external access.
	(__exit_funcs_done): Declaration for external access.


diff -Naurp glibc-2.7.orig/stdlib/cxa_atexit.c glibc-2.7/stdlib/cxa_atexit.c
--- glibc-2.7.orig/stdlib/cxa_atexit.c	2006-07-26 12:54:45.000000000 +0530
+++ glibc-2.7/stdlib/cxa_atexit.c	2008-06-03 02:36:57.000000000 +0530
@@ -51,7 +51,7 @@ INTDEF(__cxa_atexit)
 
 
 /* We change global data, so we need locking.  */
-__libc_lock_define_initialized (static, lock)
+__libc_lock_define_initialized (, __exit_funcs_lock)
 
 
 static struct exit_function_list initial;
@@ -66,7 +66,14 @@ __new_exitfn (void)
   struct exit_function *r = NULL;
   size_t i = 0;
 
-  __libc_lock_lock (lock);
+  __libc_lock_lock (__exit_funcs_lock);
+  if (__exit_funcs_done == 1)
+  {
+    /* exit code finished processing all handlers
+       so fail this registration */
+    __libc_lock_unlock (__exit_funcs_lock);
+    return NULL;
+  }  
 
   for (l = __exit_funcs; l != NULL; p = l, l = l->next)
     {
@@ -117,7 +124,7 @@ __new_exitfn (void)
       ++__new_exitfn_called;
     }
 
-  __libc_lock_unlock (lock);
+  __libc_lock_unlock (__exit_funcs_lock);
 
   return r;
 }
diff -Naurp glibc-2.7.orig/stdlib/exit.c glibc-2.7/stdlib/exit.c
--- glibc-2.7.orig/stdlib/exit.c	2005-12-18 23:01:14.000000000 +0530
+++ glibc-2.7/stdlib/exit.c	2008-06-03 04:19:52.000000000 +0530
@@ -18,13 +18,17 @@
 
 #include <stdio.h>
 #include <stdlib.h>
+#include <atomic.h>
 #include <unistd.h>
 #include <sysdep.h>
+#include <bits/libc-lock.h>
 #include "exit.h"
 
 #include "set-hooks.h"
 DEFINE_HOOK (__libc_atexit, (void))
 
+/* initialise the processing complete flag to 0 */
+int __exit_funcs_done = 0;
 
 /* Call all functions registered with `atexit' and `on_exit',
    in the reverse of the order in which they were registered
@@ -36,53 +40,87 @@ exit (int status)
      the functions registered with `atexit' and `on_exit'. We call
      everyone on the list and use the status value in the last
      exit (). */
-  while (__exit_funcs != NULL)
+    while(1)
     {
-      struct exit_function_list *old;
-
-      while (__exit_funcs->idx > 0)
-	{
-	  const struct exit_function *const f =
-	    &__exit_funcs->fns[--__exit_funcs->idx];
-	  switch (f->flavor)
-	    {
-	      void (*atfct) (void);
-	      void (*onfct) (int status, void *arg);
-	      void (*cxafct) (void *arg, int status);
-
-	    case ef_free:
-	    case ef_us:
-	      break;
-	    case ef_on:
-	      onfct = f->func.on.fn;
+        struct exit_function_list *funcs;
+        struct exit_function *f;
+restart:
+        __libc_lock_lock (__exit_funcs_lock);
+        funcs = __exit_funcs;
+        if(funcs==NULL)
+        {
+            /* mark the processing complete
+               we wont allow to register more handlers */
+            __exit_funcs_done = 1;
+            __libc_lock_unlock (__exit_funcs_lock);
+            break;
+        }
+        f = &funcs->fns[funcs->idx - 1];
+        uint64_t check1 = __new_exitfn_called;
+        __libc_lock_unlock (__exit_funcs_lock);
+        for(; f >= &funcs->fns[0]; --f)
+        {
+            uint64_t check2 = __new_exitfn_called;
+            switch (f->flavor)
+            {
+                void (*atfct) (void);
+                void (*onfct) (int status, void *arg);
+                void (*cxafct) (void *arg, int status);
+                void *arg;
+
+                case ef_free:
+                                break;
+                case ef_us:
+                                goto restart;
+                case ef_on:
+                                onfct = f->func.on.fn;
+                                arg = f->func.on.arg;
+                while(atomic_compare_and_exchange_bool_acq (&f->flavor, ef_free, ef_on));
 #ifdef PTR_DEMANGLE
-	      PTR_DEMANGLE (onfct);
+                PTR_DEMANGLE (onfct);
 #endif
-	      onfct (status, f->func.on.arg);
-	      break;
-	    case ef_at:
-	      atfct = f->func.at;
+                                onfct (status, arg);
+                                break;
+                case ef_at:
+                                atfct = f->func.at;
+                while(atomic_compare_and_exchange_bool_acq (&f->flavor, ef_free, ef_at));
 #ifdef PTR_DEMANGLE
-	      PTR_DEMANGLE (atfct);
+                PTR_DEMANGLE (atfct);
 #endif
-	      atfct ();
-	      break;
-	    case ef_cxa:
-	      cxafct = f->func.cxa.fn;
+                                atfct ();
+                                break;
+                case ef_cxa:
+                                cxafct = f->func.cxa.fn;
+                                arg = f->func.cxa.arg;
+                while(atomic_compare_and_exchange_bool_acq (&f->flavor, ef_free, ef_cxa));
 #ifdef PTR_DEMANGLE
-	      PTR_DEMANGLE (cxafct);
+                PTR_DEMANGLE (cxafct);
 #endif
-	      cxafct (f->func.cxa.arg, status);
-	      break;
-	    }
-	}
-
-      old = __exit_funcs;
-      __exit_funcs = __exit_funcs->next;
-      if (__exit_funcs != NULL)
-	/* Don't free the last element in the chain, this is the statically
-	   allocate element.  */
-	free (old);
+                                cxafct (arg, status);
+                                break;
+            }
+
+            /* It is possible that that last exit function registered
+               more exit functions.  Start the loop over.  */
+            if (__builtin_expect (check2 != __new_exitfn_called, 0))
+                goto restart;
+        }
+
+        __libc_lock_lock (__exit_funcs_lock);
+        if (__builtin_expect (check1 != __new_exitfn_called, 0))
+        {
+            __libc_lock_unlock (__exit_funcs_lock);
+            goto restart;
+        }
+        else
+        {
+            __exit_funcs = __exit_funcs->next;
+            /* Don't free the last element in the chain, this is the statically
+               allocate element.  */
+            if(__exit_funcs != NULL)
+                free(funcs);
+            __libc_lock_unlock (__exit_funcs_lock);
+        }
     }
 
   RUN_HOOK (__libc_atexit, ());
diff -Naurp glibc-2.7.orig/stdlib/exit.h glibc-2.7/stdlib/exit.h
--- glibc-2.7.orig/stdlib/exit.h	2006-07-26 12:53:43.000000000 +0530
+++ glibc-2.7/stdlib/exit.h	2008-06-03 02:36:57.000000000 +0530
@@ -21,7 +21,7 @@
 #define _EXIT_H 1
 
 #include <stdint.h>
-
+#include <bits/libc-lock.h>
 enum
 {
   ef_free,	/* `ef_free' MUST be zero!  */
@@ -62,5 +62,9 @@ extern struct exit_function_list *__exit
 
 extern struct exit_function *__new_exitfn (void);
 extern uint64_t __new_exitfn_called attribute_hidden;
+__libc_lock_define (extern, __exit_funcs_lock);
+
+/* flag to mark that all registered atexit/onexit handlers are called */
+extern int __exit_funcs_done attribute_hidden;
 
 #endif	/* exit.h  */



More information about the Libc-alpha mailing list