This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: bug in spinlock.c?
- From: Karsten Keil <kkeil at suse dot de>
- To: Wolfram Gloger <Wolfram dot Gloger at dent dot med dot uni-muenchen dot de>
- Cc: aj at suse dot de, libc-alpha at sources dot redhat dot com, kkeil at suse dot de
- Date: Fri, 21 Feb 2003 17:46:22 +0100
- Subject: Re: bug in spinlock.c?
- Organization: SuSE Linux AG
- References: <hoadgpn6a2.fsf@byrd.suse.de> <200302211617.RAA69021@max.zk-i.med.uni-muenchen.de>
On Fri, Feb 21, 2003 at 05:17:59PM +0100, Wolfram Gloger wrote:
> > Looking at the ex18 hang (sometimes ex18 even segfaulted) on x86-64,
> > Karsten noticed that we allocate a struct wait_node in
> > __pthread_alt_lock on the stack - and put it somehow also on the list
> > of waiting nodes.
>
> Yes.
>
> > In __pthread_alt_unlock we go through the waiting nodes and dequeue it.
> >
> > This looks broken, since we allocate something on the stack of a
> > function and leave the function with this data hanging around.
>
> ?? No, the function __pthread_alt_lock is not left until the
> wait_node has been signalled, check all the suspend(self); calls.
> Basically, all this works because only one function can acquire the
> lock.
>
Yes, I found this out myself after some brainstorming :-), but my
first impression was that this code might be the reason for the
corrupted p_lock->next (set to something that is not a valid pointer).
Thanks for the confirmation.
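
To make the argument above concrete, here is a minimal sketch of the
pattern. It is not the actual LinuxThreads code: the names alt_lock,
alt_lock_acquire, alt_lock_release and run_demo are made up for
illustration, and a plain pthread mutex plus a per-node condition
variable stand in for the real spinlock and the suspend()/restart()
calls. The point it shows is only that a wait node on the stack is safe
as long as the function does not return before the node has been
dequeued and signalled.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for LinuxThreads' struct wait_node. */
struct wait_node {
    struct wait_node *next;
    pthread_cond_t wake;   /* assumption: replaces suspend()/restart() */
    int signalled;
};

/* Hypothetical lock: 'guard' protects 'held' and the waiter list. */
struct alt_lock {
    pthread_mutex_t guard;
    int held;
    struct wait_node *waiters;
};

static void alt_lock_acquire(struct alt_lock *l)
{
    pthread_mutex_lock(&l->guard);
    if (!l->held) {
        l->held = 1;
        pthread_mutex_unlock(&l->guard);
        return;
    }

    struct wait_node node;          /* lives on this thread's stack */
    node.signalled = 0;
    pthread_cond_init(&node.wake, NULL);
    node.next = l->waiters;         /* simple LIFO list, not fair */
    l->waiters = &node;

    /* Do not leave this function until the unlocker has dequeued and
       signalled the node; only then may the stack frame go away. */
    while (!node.signalled)
        pthread_cond_wait(&node.wake, &l->guard);

    /* Ownership was handed over directly, so 'held' is still 1. */
    pthread_mutex_unlock(&l->guard);
    pthread_cond_destroy(&node.wake);
}

static void alt_lock_release(struct alt_lock *l)
{
    pthread_mutex_lock(&l->guard);
    assert(l->held);                /* trips if the lock is not held */

    struct wait_node *n = l->waiters;
    if (n != NULL) {
        l->waiters = n->next;       /* dequeue while holding 'guard' */
        n->signalled = 1;
        pthread_cond_signal(&n->wake);
    } else {
        l->held = 0;
    }
    pthread_mutex_unlock(&l->guard);
}

static struct alt_lock demo_lock = { PTHREAD_MUTEX_INITIALIZER, 0, NULL };
static long demo_counter;

static void *demo_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        alt_lock_acquire(&demo_lock);
        demo_counter++;
        alt_lock_release(&demo_lock);
    }
    return NULL;
}

/* Runs 4 workers and returns the final counter value. */
static long run_demo(void)
{
    pthread_t t[4];
    demo_counter = 0;
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, demo_worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return demo_counter;
}
```

The unlocker touches the node only while holding the guard, and the
waiter cannot return from pthread_cond_wait() before the guard is
released, so the node is never accessed after its stack frame is gone.
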
> > Can somebody confirm this? Or do you have other ideas that would
> > explain the segfaults we noticed? gdb pointed to this code,
>
> I've also seen gdb stacktraces like this many times, however the
> actual spinlock code in LinuxThreads never was at fault, it was always
> a double unlock or similar..
>
> Regards,
> Wolfram.
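
As an aside on the double-unlock case mentioned above: this is a small
sketch of how such a bug can be made visible with a POSIX
error-checking mutex. It uses the public pthread_mutex API rather than
the internal LinuxThreads spinlock code, and the function name
double_unlock_result is made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Returns the result of the second, erroneous unlock. */
static int double_unlock_result(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    /* Error-checking mutexes report misuse instead of silently
       corrupting the lock state. */
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&m, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);
    pthread_mutex_unlock(&m);        /* legitimate unlock */
    rc = pthread_mutex_unlock(&m);   /* bug: second unlock */

    pthread_mutex_destroy(&m);
    return rc;
}
```

With an error-checking mutex the second unlock returns an error
(EPERM) instead of leaving the lock in an inconsistent state, which
makes this class of caller bug easier to tell apart from a fault in
the lock implementation itself.
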
--
Karsten Keil
SuSE Labs
ISDN development