This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] misc/tst-clone3: Fix waiting for exited thread.


Attached the updated patch.
See also notes below.

On 02/08/2019 07:37 PM, Adhemerval Zanella wrote:


On 08/02/2019 13:12, Stefan Liebler wrote:
Hi,

from time to time the test misc/tst-clone3 fails with an timeout.  Then futex_wait is blocking.  Usually ctid should be set to zero due to CLONE_CHILD_CLEARTID and the futex should be waken up.  But the fail occures if the thread has already exited before ctid is set to the return value of clone().  Then futex_wait() will block as there will be nobody who wakes the futex up again.

This patch initializes ctid to a known value before calling clone and the kernel is the only one who updates the value to zero after clone. If futex_wait is called then it is either waked up due to the exited thread or the futex syscall fails as *ctid_ptr is already zero instead of the specified value 1.

Okay to commit?

Bye
Stefan

ChangeLog:

     * sysdeps/unix/sysv/linux/tst-clone3.c (do_test):
     Initialize ctid with a known value and remove update of ctid
     after clone.
     (wait_tid): Adjust arguments and call futex_wait with ctid_val
     as assumed current value of ctid_ptr.

20190208_misc_tst-clone3.patch

commit 5a8f80973dbbf06a0eebc2064d2a14521c6a6131
Author: Stefan Liebler <stli@linux.ibm.com>
Date:   Fri Feb 8 12:42:52 2019 +0100

     misc/tst-clone3: Fix waiting for exited thread.
From time to time the test misc/tst-clone3 fails with an timeout.
     Then futex_wait is blocking.  Usually ctid should be set to zero
     due to CLONE_CHILD_CLEARTID and the futex should be waken up.
     But the fail occures if the thread has already exited before
     ctid is set to the return value of clone().  Then futex_wait() will
     block as there will be nobody who wakes the futex up again.
This patch initializes ctid to a known value before calling clone
     and the kernel is the only one who updates the value to zero after clone.
     If futex_wait is called then it is either waked up due to the exited thread
     or the futex syscall fails as *ctid_ptr is already zero instead of the
     specified value 1.
ChangeLog: * sysdeps/unix/sysv/linux/tst-clone3.c (do_test):
             Initialize ctid with a known value and remove update of ctid
             after clone.
             (wait_tid): Adjust arguments and call futex_wait with ctid_val
             as assumed current value of ctid_ptr.

Thanks for catching it.


diff --git a/sysdeps/unix/sysv/linux/tst-clone3.c b/sysdeps/unix/sysv/linux/tst-clone3.c
index aa8e718afe..ffa2056eb6 100644
--- a/sysdeps/unix/sysv/linux/tst-clone3.c
+++ b/sysdeps/unix/sysv/linux/tst-clone3.c
@@ -42,11 +42,11 @@ f (void *a)
/* Futex wait for TID argument, similar to pthread_join internal
     implementation.  */
-#define wait_tid(tid) \
+#define wait_tid(ctid_ptr, ctid_val)	\
    do {					\
-    __typeof (tid) __tid;		\
-    while ((__tid = (tid)) != 0)	\
-      futex_wait (&(tid), __tid);	\
+    __typeof (*(ctid_ptr)) __tid;	\
+    while ((__tid = *(ctid_ptr)) != 0)	\
+      futex_wait (ctid_ptr, ctid_val);	\
    } while (0)

lll_wait_tid uses an atomic load with acquire semantic, I think we should
use it as well. We can either include the atomic header or use c11 atomic.

I've first tried to include atomic.h, but it failed building on x86_64. Thus I'm using the c11 atomic load in the updated patch.
Okay to commit?


As information, I've observed those gcc errors on x86_64:
In file included from ../sysdeps/unix/sysv/linux/x86_64/sysdep.h:30,
                 from ../sysdeps/x86_64/nptl/tls.h:28,
                 from ../sysdeps/x86/atomic-machine.h:23,
                 from ../include/atomic.h:50,
                 from ../sysdeps/unix/sysv/linux/tst-clone3.c:29:
../sysdeps/unix/sysv/linux/dl-sysdep.h: In function ‘_dl_discover_osversion’: ../sysdeps/unix/sysv/linux/dl-sysdep.h:31:42: error: expected declaration specifiers before ‘attribute_hidden’
 extern int _dl_discover_osversion (void) attribute_hidden;
                                          ^~~~~~~~~~~~~~~~
In file included from ../sysdeps/x86_64/nptl/tls.h:31,
                 from ../sysdeps/x86/atomic-machine.h:23,
                 from ../include/atomic.h:50,
                 from ../sysdeps/unix/sysv/linux/tst-clone3.c:29:
../sysdeps/generic/dl-dtv.h:22:1: error: empty declaration [-Werror]
 struct dtv_pointer
 ^~~~~~
../sysdeps/generic/dl-dtv.h:33:3: error: storage class specified for parameter ‘dtv_t’
 } dtv_t;
   ^~~~~
...
many many more errors
...

I've also tried to add tst-clone3 to tests_internal instead of tests in the Makefile, but then I've got:
In file included from ../sysdeps/x86_64/nptl/tls.h:130,
                 from ../sysdeps/x86/atomic-machine.h:23,
                 from ../include/atomic.h:50,
                 from ../sysdeps/unix/sysv/linux/tst-clone3.c:29:
../nptl/descr.h:104:8: error: redefinition of ‘struct robust_list_head’
 struct robust_list_head
        ^~~~~~~~~~~~~~~~
In file included from ../sysdeps/unix/sysv/linux/tst-clone3.c:26:
/usr/include/linux/futex.h:70:8: note: originally defined here
 struct robust_list_head {
        ^~~~~~~~~~~~~~~~

static inline int
@@ -64,7 +64,11 @@ do_test (void)
    clone_flags |= CLONE_VM | CLONE_SIGHAND;
    /* We will used ctid to call on futex to wait for thread exit.  */
    clone_flags |= CLONE_CHILD_CLEARTID;
-  pid_t ctid, tid;
+  /* Initialize with a known value.  ctid is set to zero by the kernel after the
+     cloned thread has exited.  */
+#define CTID_INIT_VAL 1
+  pid_t ctid = CTID_INIT_VAL;
+  pid_t tid;
#ifdef __ia64__
    extern int __clone2 (int (*__fn) (void *__arg), void *__child_stack_base,
@@ -86,8 +90,7 @@ do_test (void)
    if (tid == -1)
      FAIL_EXIT1 ("clone failed: %m");
- ctid = tid;
-  wait_tid (ctid);
+  wait_tid (&ctid, CTID_INIT_VAL);
return 2;
  }



commit 3672d05bd3c5c51831ef8b91ee2b0ff491fd2986
Author: Stefan Liebler <stli@linux.ibm.com>
Date:   Fri Feb 8 12:42:52 2019 +0100

    misc/tst-clone3: Fix waiting for exited thread.
    
    From time to time the test misc/tst-clone3 fails with a timeout.
    Then futex_wait is blocking.  Usually ctid should be set to zero
    due to CLONE_CHILD_CLEARTID and the futex should be waken up.
    But the fail occures if the thread has already exited before
    ctid is set to the return value of clone().  Then futex_wait() will
    block as there will be nobody who wakes the futex up again.
    
    This patch initializes ctid to a known value before calling clone
    and the kernel is the only one who updates the value to zero after clone.
    If futex_wait is called then it is either waked up due to the exited thread
    or the futex syscall fails as *ctid_ptr is already zero instead of the
    specified value 1.
    
    ChangeLog:
    
            * sysdeps/unix/sysv/linux/tst-clone3.c (do_test):
            Initialize ctid with a known value and remove update of ctid
            after clone.
            (wait_tid): Adjust arguments and call futex_wait with ctid_val
            as assumed current value of ctid_ptr.

diff --git a/sysdeps/unix/sysv/linux/tst-clone3.c b/sysdeps/unix/sysv/linux/tst-clone3.c
index aa8e718afe..b345d04b4d 100644
--- a/sysdeps/unix/sysv/linux/tst-clone3.c
+++ b/sysdeps/unix/sysv/linux/tst-clone3.c
@@ -42,11 +42,13 @@ f (void *a)
 
 /* Futex wait for TID argument, similar to pthread_join internal
    implementation.  */
-#define wait_tid(tid) \
-  do {					\
-    __typeof (tid) __tid;		\
-    while ((__tid = (tid)) != 0)	\
-      futex_wait (&(tid), __tid);	\
+#define wait_tid(ctid_ptr, ctid_val)					\
+  do {									\
+    __typeof (*(ctid_ptr)) __tid;					\
+    /* We need acquire MO here so that we synchronize with the		\
+       kernel's store to 0 when the clone terminates.  */		\
+    while ((__tid = __atomic_load_n (ctid_ptr, __ATOMIC_ACQUIRE)) != 0)	\
+      futex_wait (ctid_ptr, ctid_val);					\
   } while (0)
 
 static inline int
@@ -64,7 +66,11 @@ do_test (void)
   clone_flags |= CLONE_VM | CLONE_SIGHAND;
   /* We will used ctid to call on futex to wait for thread exit.  */
   clone_flags |= CLONE_CHILD_CLEARTID;
-  pid_t ctid, tid;
+  /* Initialize with a known value.  ctid is set to zero by the kernel after the
+     cloned thread has exited.  */
+#define CTID_INIT_VAL 1
+  pid_t ctid = CTID_INIT_VAL;
+  pid_t tid;
 
 #ifdef __ia64__
   extern int __clone2 (int (*__fn) (void *__arg), void *__child_stack_base,
@@ -86,8 +92,7 @@ do_test (void)
   if (tid == -1)
     FAIL_EXIT1 ("clone failed: %m");
 
-  ctid = tid;
-  wait_tid (ctid);
+  wait_tid (&ctid, CTID_INIT_VAL);
 
   return 2;
 }

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]