This is the mail archive of the
libc-help@sourceware.org
mailing list for the glibc project.
Re: Crash deep inside pthread_cond_wait()
- From: Jan Wielemaker <J dot Wielemaker at cs dot vu dot nl>
- To: libc-help at sourceware dot org
- Date: Mon, 27 Jul 2009 16:55:27 +0200
- Subject: Re: Crash deep inside pthread_cond_wait()
- References: <1248467663.15229.9.camel@ct>
On Friday 24 July 2009 10:34:23 pm Jan Wielemaker wrote:
> Hi,
>
> I'm trying to debug a crash in pthread_cond_wait(), but I have had little
> luck so far. The condition variable guards a queue of alarms. There
> is a thread that goes over the queue and uses pthread_kill() to inform
> threads of alarms. The alarm signal handler itself sets a flag and
> returns.
>
> If I stress-test the design, both using SuSE-11.1 and Ubuntu-jaunty
> (both AMD64 dual-core machines), I get a crash like this every approx.
> 50,000 iterations:
>
> #0 0x00007f44ea95e5c4 in _L_unlock_56 () from /lib/libpthread.so.0
> #1 0x00007f44ea95e5c9 in _L_unlock_56 () from /lib/libpthread.so.0
> #2 0x00007f44ea95e206 in __pthread_mutex_unlock_usercnt
> (mutex=0x7f448a48a280, decr=-1974951296)
> at pthread_mutex_unlock.c:64
> #3 0x00007f44ea95f268 in pthread_cond_wait@@GLIBC_2.3.2 ()
> at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:201
> #4 0x00007f448a288b00 in alarm_loop (closure=<value optimized out>) at
> time.c:685
>
> The mutex holds this data:
>
> (gdb) p *mutex
> $7 = {__data = {__lock = 2, __count = 0, __owner = 0, __nusers = 1,
> __kind = 2, __spins = 0, __list = {
> __prev = 0x0, __next = 0x0}},
> __size = "\002", '\0' <repeats 11 times>, "\001\000\000\000\002", '\0'
> <repeats 22 times>, __align = 2}
>
> and the cond this:
>
> $8 = {__data = {__lock = 1, __futex = 131006, __total_seq = 65503,
> __wakeup_seq = 65503,
> __woken_seq = 65503, __mutex = 0x7f448a48a280, __nwaiters = 0,
> __broadcast_seq = 0},
> __size = "\001\000\000\000\276\377\001\000\337\377\000\000\000\000\000\000\337\377\000
> \000\000\000\000\000\337\377\000\000\000\000\000\000\200\242H\212D\177\000\000
> \000\000\000\000\000\000\000", __align = 562666485579777}
>
> Does this ring a bell with anyone? If not, I'd also be happy with
> a description of what the fields of the structure are supposed to mean
> or other tips for debugging this.
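
The design described in the quoted message (a condition variable guarding
a queue of alarms, a thread that notifies targets with pthread_kill(), and
a handler that only sets a flag) might look roughly like the sketch below.
This is not the original time.c code: the one-slot queue, the choice of
SIGUSR1, and every function and variable name are assumptions made for
illustration. The wait sits in a while loop over the predicate because
pthread_cond_wait() is allowed to wake up spuriously.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <string.h>

/* Flag set by the signal handler; sig_atomic_t is safe to write there. */
static volatile sig_atomic_t alarm_fired = 0;

/* The handler only sets a flag and returns, as in the quoted design. */
static void on_alarm(int sig)
{
    (void)sig;
    alarm_fired = 1;
}

/* Hypothetical one-slot "queue"; a real alarm queue would be a list. */
static pthread_mutex_t q_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_cond = PTHREAD_COND_INITIALIZER;
static pthread_t q_target;
static int q_pending = 0;
static int q_shutdown = 0;

/* Install on_alarm for SIGUSR1 (the signal number is an assumption). */
int install_alarm_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Queue an alarm for the given thread and wake the alarm thread. */
void schedule_alarm(pthread_t target)
{
    pthread_mutex_lock(&q_mutex);
    q_target = target;
    q_pending = 1;
    pthread_cond_signal(&q_cond);
    pthread_mutex_unlock(&q_mutex);
}

/* Ask the alarm thread to exit once the queue is empty. */
void stop_alarm_loop(void)
{
    pthread_mutex_lock(&q_mutex);
    q_shutdown = 1;
    pthread_cond_signal(&q_cond);
    pthread_mutex_unlock(&q_mutex);
}

/* Thread body: wait for a queued alarm, then signal the target thread. */
void *alarm_loop(void *closure)
{
    (void)closure;
    pthread_mutex_lock(&q_mutex);
    for (;;) {
        /* Re-check the predicate after every wakeup. */
        while (!q_pending && !q_shutdown)
            pthread_cond_wait(&q_cond, &q_mutex);
        if (!q_pending)
            break;              /* shutdown requested, nothing left to deliver */
        q_pending = 0;
        pthread_kill(q_target, SIGUSR1);
    }
    pthread_mutex_unlock(&q_mutex);
    return NULL;
}
```

A caller would install the handler once, start alarm_loop() with
pthread_create(), and then call schedule_alarm(pthread_self()) to have the
flag set in its own thread.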
False alarm. Eventually, it turned out to be a stack overflow due to
erroneous usage of alloca() somewhere in the code.
Sorry for the noise --- Jan
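
The message does not say what the erroneous alloca() call looked like, so
the following is only a sketch of one common way to avoid this class of
bug, not the fix that was actually applied. alloca() takes space from the
current stack frame and has no way to report failure, so an oversized
request on a thread with a small stack can corrupt nearby memory and
crash later in unrelated code. A fixed-size stack buffer with a heap
fallback keeps stack use bounded. The function name, the 256-byte limit,
and the checksum workload are all hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical limit: requests above this size go to the heap. */
#define SMALL_BUF_MAX 256

/*
 * Copy len bytes of src into a temporary buffer and sum them into *out.
 * Returns 0 on success, -1 if the heap allocation fails.
 * Stack use is bounded by SMALL_BUF_MAX regardless of len, which is
 * what an unchecked alloca(len) would not guarantee.
 */
int checksum_copy(const unsigned char *src, size_t len, unsigned long *out)
{
    unsigned char small[SMALL_BUF_MAX];
    unsigned char *buf = small;
    unsigned long sum = 0;
    size_t i;

    if (len > SMALL_BUF_MAX) {
        buf = malloc(len);      /* unlike alloca(), failure is detectable */
        if (buf == NULL)
            return -1;
    }

    memcpy(buf, src, len);
    for (i = 0; i < len; i++)
        sum += buf[i];

    if (buf != small)
        free(buf);

    *out = sum;
    return 0;
}
```

Small requests stay on the stack and cost nothing extra; large ones pay
for a malloc()/free() pair but cannot overrun the thread's stack.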