This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.

fork hang with corrupted list_all_lock

From: "Wayne H. Badger" <badger at yahoo-inc dot com>
To: libc-help at sourceware dot org
Date: Fri, 25 Jun 2010 17:42:42 -0500
Subject: fork hang with corrupted list_all_lock

I have hit an anomaly whose investigation has led me to glibc, and I'm wondering whether anyone has seen it before.

I have a cluster of machines running RHEL5.4 (glibc-2.5 based) on Nehalem E5530 processors (16 hyperthreaded CPUs, stepping 5) that are running a Java process (hadoop TaskTracker). TaskTracker is 32-bit and multithreaded (~80 threads). The kernel is 64-bit, version 2.6.18-164.2.1.el5.

I have caught the process in a relatively rare event that is one of those "can't happen" scenarios.

Whenever a process forks, __libc_fork (nptl/sysdeps/unix/sysv/linux/fork.c) calls _IO_list_lock() to acquire list_all_lock before calling the fork system call. list_all_lock contains three fields: lock, cnt, and owner. After the fork system call, the child resets the lock and the parent releases it.
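The same acquire/fork/reset-or-release shape can be sketched at application level with pthread_atfork(). This is only an analogy to illustrate the protocol described above, not glibc's internal code: demo_lock and the three handler names are made up for the example, and reinitializing the mutex in the child is a widely used idiom rather than something POSIX guarantees.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical application lock standing in for list_all_lock. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread before fork(): take the lock. */
static void before_fork(void) { pthread_mutex_lock(&demo_lock); }

/* Runs in the parent after fork(): release the lock. */
static void parent_after_fork(void) { pthread_mutex_unlock(&demo_lock); }

/* Runs in the child after fork(): reset the child's own copy. */
static void child_after_fork(void) { pthread_mutex_init(&demo_lock, NULL); }

/* Returns 0 if the lock is usable in both parent and child after fork(),
   -1 otherwise. */
int fork_with_lock_protocol(void)
{
    static int registered = 0;
    if (!registered) {
        if (pthread_atfork(before_fork, parent_after_fork,
                           child_after_fork) != 0)
            return -1;
        registered = 1;
    }

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the lock was reset, so it must be free here. */
        _exit(pthread_mutex_trylock(&demo_lock) == 0 ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* Parent: the lock was released, so it must be free here too. */
    if (pthread_mutex_trylock(&demo_lock) != 0)
        return -1;
    pthread_mutex_unlock(&demo_lock);

    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

In a healthy run, fork_with_lock_protocol() returns 0. The failure described in this message corresponds to the parent's copy of the lock changing between before_fork() and parent_after_fork().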

Normally this works as you would expect. When it fails, the parent's lock is zeroed (.lock=0, .cnt=0, .owner=0) while it is still held, and the subsequent release leaves the lock in an invalid state. At that point, the lock has these values:

  list_all_lock.lock:  2
  list_all_lock.cnt:   -1
  list_all_lock.owner: <thread that released the lock>
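To show how a zeroed recursive lock could end up with values like the ones reported, here is a single-threaded toy model. Everything in it is an assumption made for illustration: the model_* names, the integer thread IDs 1 to 3, and the convention that the lock word is 0 when free, 1 when held, and 2 when held with waiters. It is not glibc's code, and it shows only one possible sequence, not necessarily what happened on the affected machines.

```c
#include <string.h>

/* Toy stand-in for the three fields named in the report. */
typedef struct { int lock; int cnt; int owner; } model_lock;

/* Recursive acquire. Returns 1 on success, 0 where a real thread
   would block waiting for the lock word. */
static int model_acquire(model_lock *l, int self)
{
    if (l->owner != self) {
        if (l->lock != 0) {
            l->lock = 2;        /* held, and now with a waiter */
            return 0;
        }
        l->lock = 1;            /* take the free lock word */
        l->owner = self;
    }
    ++l->cnt;
    return 1;
}

/* Recursive release: the lock word is freed only when cnt reaches 0. */
static void model_release(model_lock *l)
{
    if (--l->cnt == 0) {
        l->owner = 0;
        l->lock = 0;
    }
}

/* Replays the hypothetical sequence and returns the final state. */
model_lock run_scenario(void)
{
    model_lock l = { 0, 0, 0 };

    model_acquire(&l, 1);       /* thread 1 forks:   {1,  1, 1} */
    memset(&l, 0, sizeof l);    /* lock is zeroed:   {0,  0, 0} */
    model_release(&l);          /* cnt skips past 0: {0, -1, 0} */

    model_acquire(&l, 2);       /* thread 2 forks:   {1,  0, 2} */
    model_release(&l);          /* never freed:      {1, -1, 2} */

    model_acquire(&l, 3);       /* thread 3 blocks:  {2, -1, 2} */
    return l;
}
```

In this model, once cnt has been decremented past zero, the release path never sees cnt == 0 again, so the lock word is never freed and every later thread that tries to take the lock blocks.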

From this state, no additional forks can be made. Many of the threads in the process are waiting for a lock in the malloc code (malloc_atfork) that runs when a fork is currently outstanding. The process is hung at this point.

So, the "can't happen" event is that some thread/process has scrozzled the lock while it is being held by a thread. Unless there is some glibc code that is just writing out-of-bounds zeroes, it looks like the lock is being reset with _IO_list_resetlock(). Since only the child calls this code in its own address space, it ought not affect the parent's version of the lock.
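The expectation that a reset in the child cannot touch the parent's copy can be checked with a small sketch. The struct and function names here are hypothetical, and the three-int struct is only a stand-in for the real lock.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a lock with lock, cnt, and owner fields. */
struct demo_lock { int lock; int cnt; int owner; };

/* Starts out "held": lock word set, one recursion level, made-up owner. */
static struct demo_lock demo = { 1, 1, 1234 };

/* Returns 1 if the parent's copy is unchanged after the child zeroes
   its own copy, 0 if it changed, -1 on a fork or wait error. */
int child_reset_leaves_parent_intact(void)
{
    const struct demo_lock before = demo;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this write lands in the child's private pages only. */
        memset(&demo, 0, sizeof demo);
        _exit(0);
    }

    if (waitpid(pid, NULL, 0) < 0)
        return -1;

    /* Parent: compare against the snapshot taken before fork(). */
    return memcmp(&before, &demo, sizeof before) == 0;
}
```

With normal copy-on-write semantics this returns 1, which is why a zeroed lock in the parent looks like a "can't happen" event.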

This anomaly occurs only when running RHEL5.4 on the Nehalem processors. I have not been able to reproduce the issue running either RHEL5.4 or RHEL5.1 on older E5420 processors.

Remediations tried so far have all resulted in the same TaskTracker hang.
  * latest java (jdk1.6_20)
  * set UseMemBar in java
  * use latest microcode from Intel for E5530
  * restrict the CPU set to all CPUs on a single processor
  * disable HyperThreading in the BIOS
  * latest RHEL glibc: glibc-2.5-49

I have tried a couple of tests that resulted in the issue not being reproduced.
  * restrict all threads to the same CPU
  * add glibc debugging so that cache line containing list_all_lock was rearranged

I have looked at http://sourceware.org/ml/libc-hacker/2007-02/msg00009.html , but this doesn't seem quite like the issue that I'm seeing. If that were the bug, then I would expect to see a deadlock situation, not corrupted lock fields.

While this looks like it may be a silicon bug, it may not be, so I'm looking for anyone who has seen this kind of behavior in glibc.

Wayne

--
Wayne Badger
Yahoo!
