This is the mail archive of the
libc-help@sourceware.org
mailing list for the glibc project.
Re: shared data protection failed in pthread_cond_timedwait
- From: Will Newton <will dot newton at linaro dot org>
- To: Yang Yingliang <yangyingliang at huawei dot com>
- Cc: libc-help at sourceware dot org, libc-alpha <libc-alpha at sourceware dot org>
- Date: Fri, 25 Apr 2014 10:43:25 +0100
- Subject: Re: shared data protection failed in pthread_cond_timedwait
- References: <535A078F dot 3050003 at huawei dot com>
On 25 April 2014 07:58, Yang Yingliang <yangyingliang@huawei.com> wrote:
> Hi,
>
>
> I have 22 threads waiting in pthread_cond_timedwait. When they are all woken up, I find
> that more than one thread can access the shared data in pthread_cond_timedwait at the same time.
>
> I added print messages as shown in the following code:
>
> --- libc/nptl/pthread_cond_timedwait.c
> +++ libc/nptl/pthread_cond_timedwait.c
> @@ -34,6 +34,7 @@
> #else
> # include <bits/libc-vdso.h>
> #endif
> +#include <stdio.h>
>
> /* Cleanup handler, defined in pthread_cond_wait.c. */
> extern void __condvar_cleanup (void *arg)
> @@ -235,7 +239,9 @@
>
> bc_out:
>
> +printf("start do sub :%d, lock:%d %p\n", cond->__data.__nwaiters, cond->__data.__lock, pthread_self());
> cond->__data.__nwaiters -= 1 << COND_NWAITERS_SHIFT;
> +printf("end do sub :%d, lock:%d %p\n", cond->__data.__nwaiters, cond->__data.__lock, pthread_self());
>
> /* If pthread_cond_destroy was called on this variable already,
> notify the pthread_cond_destroy caller all waiters have left
>
>
> I tested on Linux arma15el 3.10.37+ #2 SMP Fri Apr 25 11:23:25 CST 2014 armv7l GNU/Linux.
> Here is the result:
>
> start do sub :45, lock:1 0xb6d9a460
> end do sub :43, lock:1 0xb6d9a460
> start do sub :43, lock:1 0xb6d9e460
> end do sub :41, lock:2 0xb6d9e460
> start do sub :43, lock:2 0xb6dbe460 //two threads both access the shared data
> start do sub :41, lock:1 0xb6daa460
> end do sub :39, lock:2 0xb6daa460
> start do sub :39, lock:2 0xb6de6460
> end do sub :37, lock:2 0xb6de6460
> start do sub :37, lock:2 0xb6db6460
> end do sub :35, lock:2 0xb6db6460
> start do sub :35, lock:2 0xb6dc2460
> end do sub :33, lock:2 0xb6dc2460
> end do sub :37, lock:2 0xb6dbe460
> start do sub :33, lock:2 0xb6dc6460
> end do sub :31, lock:0 0xb6dc6460
> start do sub :31, lock:2 0xb6dae460
> end do sub :29, lock:2 0xb6dae460
> start do sub :29, lock:2 0xb6db2460
> end do sub :27, lock:2 0xb6db2460
> start do sub :27, lock:2 0xb6dba460
> end do sub :25, lock:2 0xb6dba460
> start do sub :25, lock:2 0xb6da2460
> end do sub :23, lock:2 0xb6da2460
>
> Did lll_lock (cond->__data.__lock, pshared) fail?
>
> pshared is LLL_SHARED.
I have had a quick look at this and I can see no obvious reason for
this behaviour, unless IO buffering is somehow causing the messages
to be interleaved in a misleading order. The other possibility that
may be worth investigating is whether ldrex/strex is working
correctly on your SMP system.
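
One rough way to exercise the atomic primitives independently of the
condvar code is a small stress test along the following lines. This is
only a sketch under some assumptions: it uses C11 <stdatomic.h>, which
I would expect the compiler to lower to an ldrex/strex loop on ARMv7
(worth confirming in the disassembly), and the names run_atomic_stress,
worker and MAX_THREADS are made up for illustration and are not part of
glibc.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define MAX_THREADS 64            /* hypothetical upper bound for this sketch */

static atomic_long counter;       /* shared counter, only touched atomically */

struct worker_arg { long iters; };

/* Each worker performs `iters` atomic increments on the shared counter. */
static void *worker(void *p)
{
    long iters = ((struct worker_arg *)p)->iters;
    for (long i = 0; i < iters; i++)
        atomic_fetch_add(&counter, 1);
    return NULL;
}

/* Runs `nthreads` workers and returns the final counter value.
   If the atomic read-modify-write works, this is nthreads * iters. */
long run_atomic_stress(int nthreads, long iters)
{
    pthread_t tid[MAX_THREADS];
    struct worker_arg arg = { iters };

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    atomic_store(&counter, 0);

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, &arg);
        assert(rc == 0);
        (void)rc;                 /* silence unused warning under NDEBUG */
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);

    return atomic_load(&counter);
}
```

Calling run_atomic_stress(22, 1000000) from a small main() built with
-pthread and comparing the result against 22 * 1000000 should show
whether any increments are lost. A mismatch would point at the atomics
(or the kernel/hardware underneath them) rather than at the condvar
code. A matching count does not prove the lock is correct, it only
makes a broken ldrex/strex less likely.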
--
Will Newton
Toolchain Working Group, Linaro