Summary: | PTHREAD_MUTEX_LOCK() in multiple threads RETURNING EOWNERDEAD | ||
---|---|---|---|
Product: | glibc | Reporter: | zhenzhong.duan |
Component: | nptl | Assignee: | Siddhesh Poyarekar <siddhesh> |
Status: | RESOLVED INVALID | ||
Severity: | critical | CC: | bugdal, drepper.fsp, siddhesh |
Priority: | P2 | Flags: | fweimer:
security-
|
Version: | 2.14 | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Last reconfirmed: | ||
Attachments: |
code to test concurrency of thread calling pthread_mutex_lock
rewrite futexCase1_r1.cpp to futexCase1_r1.c |
Description
zhenzhong.duan
2012-05-08 03:40:37 UTC
Could you simplify the test case a bit and narrow down the cause of the bug? I notice that you're using priority inheritance mutexes (which you didn't mention in the report) which have some different code paths, so it would be helpful to know if that's related to the problem. Also, I'm not one of the glibc developers so you can take this or leave this, but it would be nice if test cases could be C rather than C++, especially when the C++ is gratuitous (in this case, just one use of a struct name as a type, and a few references instead of pointers). Part of my reason for asking this is that I like to test glibc testcases against my implementation of the C library and on some of my test systems I don't have a C++ development environment. Created attachment 6442 [details]
rewrite futexCase1_r1.cpp to futexCase1_r1.c
Rewrite a c version. Per out customer's test, still see concurrency. @ $ gcc -Wall -O3 -m32 -march=i686 futexCase1_r1.c -o futexCase1_r1 -lpthread @ -D_GNU_SOURCE @ $ ./futexCase1_r1 @ 8279: created mutex: 0xf7f1a000 @ 8419: pthread_mutex_consistent_np failed: 0xf7f1a000 22 Invalid argument @ 8438: pthread_mutex_consistent_np failed: 0xf7f1a000 22 Invalid argument @ 8439: pthread_mutex_consistent_np failed: 0xf7f1a000 22 Invalid argument @ … @ 8279: Done! lock concurrency: 0, max: 7 @ $ The description is misleading. Your program does not use threads; it uses different processes using fork with mutexes in shared memory with pshared set. I have modified your program to the following and it shows that the problem is seen only with forked processes and not with threads: #include <stdio.h> #include <sys/types.h> #include <unistd.h> #include <sys/syscall.h> #include <pthread.h> #include <errno.h> #include <string.h> #include <stdlib.h> #include <wait.h> #include <sys/mman.h> #include <assert.h> #include <pthread.h> #define zero_or_die(ret, str) \ ({ \ if (ret) \ { \ printf ("%s: %s\n", str, strerror (ret)); \ abort (); \ } \ }) void initLock (pthread_mutex_t * mutex) { int ok = 0; pthread_mutexattr_t ma; ok = pthread_mutexattr_init (&ma); zero_or_die (ok, "pthread_mutexattr_init"); ok = pthread_mutexattr_settype (&ma, PTHREAD_MUTEX_ERRORCHECK); zero_or_die (ok, "pthread_mutexattr_settype"); #ifdef FORK ok = pthread_mutexattr_setpshared (&ma, PTHREAD_PROCESS_SHARED); zero_or_die (ok, "pthread_mutexattr_setpshared"); #endif ok = pthread_mutexattr_setprotocol (&ma, PTHREAD_PRIO_INHERIT); zero_or_die (ok, "pthread_mutexattr_setprotocol"); ok = pthread_mutexattr_setrobust (&ma, PTHREAD_MUTEX_ROBUST_NP); zero_or_die (ok, "pthread_mutexattr_setrobust"); ok = pthread_mutex_init (mutex, &ma); zero_or_die (ok, "pthread_mutex_init"); pthread_mutexattr_destroy (&ma); printf ("%u: created mutex: %p\n", getpid (), mutex); } int lock (pthread_mutex_t * mutex) { int rc = pthread_mutex_lock (mutex); if (rc == EOWNERDEAD) { int crc = pthread_mutex_consistent (mutex); if (crc != 0) fprintf (stderr, "%u: pthread_mutex_consistent failed: %p %d %s\n", getpid (), mutex, crc, strerror (crc)); } else if (rc != 0) fprintf (stderr, "%u: pthread_mutex_lock failed: %p %d %s\n", getpid (), mutex, rc, strerror (rc)); return rc; } void unlock (pthread_mutex_t * mutex) { int rc = pthread_mutex_unlock (mutex); if (rc != 0) fprintf (stderr, "%u: pthread_mutex_unlock failed: %p %d %s\n", getpid (), mutex, rc, strerror (rc)); } void child (pthread_mutex_t * mutex) { lock (mutex); sched_yield (); #ifdef VERBOSE fprintf (stderr, "%d process suicide - abandoning lock %p\n", getpid (), mutex); #endif /* */ // abandon the lock (i.e. dont unlock it) // unlock(mutex); } int main (int argc, char **argv) { #define iterations 500 #define parProcs 10 int i; #ifdef FORK void *shared = mmap (NULL, sizeof (pthread_mutex_t), PROT_WRITE | PROT_READ, MAP_SHARED | MAP_ANON, -1, 0); assert (shared != NULL); memset (shared, 0, sizeof (pthread_mutex_t)); pthread_mutex_t *mutex = (pthread_mutex_t *) shared; #else pthread_mutex_t *mutex = malloc (sizeof (pthread_mutex_t)); pthread_t t[iterations]; #endif initLock (mutex); lock (mutex); unlock (mutex); for (i = 0; i < (iterations + parProcs); i++) { if ((i < iterations)) { #ifdef FORK if (fork () == 0) { child (mutex); return -1; } #else pthread_create (&t[i], NULL, child, mutex); #endif } } return 0; } To build with fork (a variant of your original reproducer): gcc -DFORK foo.c -pthread To build it to use pthreads and non-shared memory: gcc foo.c -pthread Another thing to note is that the problem is reproducible only when PI is set. This is important because in the PI case, it is the kernel that synchronizes the threads (or in this case, processes) with the FUTEX_LOCK_PI operation. It may well end up being a kernel futex bug and not a glibc bug. I misunderstood the problem first. The race is present in threads also, albeit much less evident somehow. thanks for your reply. If you need other info, just let me know. As I suspected, this is a bug in the kernel. I have posted a patch that works for me: http://lkml.indiana.edu/hypermail/linux/kernel/1210.1/02508.html So this is not a bug in glibc. *** Bug 260998 has been marked as a duplicate of this bug. *** Seen from the domain http://volichat.com Page where seen: http://volichat.com/adult-chat-rooms Marked for reference. Resolved as fixed @bugzilla. |