Bug 19004 - tst-robust8 fails intermittently on x86_64
Summary: tst-robust8 fails intermittently on x86_64
Status: NEW
Alias: None
Product: glibc
Classification: Unclassified
Component: nptl (show other bugs)
Version: 2.23
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-09-25 08:50 UTC by Florian Weimer
Modified: 2016-12-20 00:05 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
robust_test.stp (420 bytes, text/plain)
2016-01-27 11:49 UTC, Paulo Andrade
Details
robust_test2.c (2.47 KB, text/plain)
2016-01-27 11:54 UTC, Paulo Andrade
Details
robust_test.stp (610 bytes, text/plain)
2016-01-27 18:44 UTC, Paulo Andrade
Details
linux-kernel-test.patch (375 bytes, patch)
2016-01-29 13:10 UTC, Paulo Andrade
Details | Diff
robust_test.c (2.83 KB, text/plain)
2016-01-29 13:12 UTC, Paulo Andrade
Details
robust_test.stp (626 bytes, text/plain)
2016-01-29 13:20 UTC, Paulo Andrade
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Florian Weimer 2015-09-25 08:50:41 UTC
The test machine is an x86_64 machine running the 3.10 kernel from Red Hat Enterprise Linux 7, two Xeon E5-2450 CPUs with 8 cores each, for a total number of 32 hardware threads.

I can reproduce this with 2.23 master as of today, but I have trouble getting proper backtraces because I cannot get GDB to use the right libthread_db, so I am not completely certain the observation below applies to glibc master.

With the glibc 2.17 in Red Hat Enterprise Linux 7, this pthread_mutex_lock call blocks indefinitely:

232           for (int n = 0; n < N; ++n)
233             {
234               int e = pthread_mutex_lock (&map[n]);
235               if (e != 0 && e != EOWNERDEAD)

Sometimes, this happens on the first round, sometimes on subsequent rounds.

(gdb) print/x map[n]
$4 = {__data = {__lock = 0x80000000, __count = 0x1, __owner = 0x0, 
    __nusers = 0x0, __kind = 0x90, __spins = 0x0, __list = {__prev = 0x0, 
      __next = 0x0}}, __size = {0x0, 0x0, 0x0, 0x80, 0x1, 
    0x0 <repeats 11 times>, 0x90, 0x0 <repeats 23 times>}, 
  __align = 0x180000000}
(gdb) print/x map[n-1]
$5 = {__data = {__lock = 0x800052b0, __count = 0x1, __owner = 0x7fffffff, 
    __nusers = 0x1, __kind = 0x90, __spins = 0x0, __list = {
      __prev = 0x7ffff7feca20, __next = 0x7ffff7ff8480}}, __size = {0xb0, 
    0x52, 0x0, 0x80, 0x1, 0x0, 0x0, 0x0, 0xff, 0xff, 0xff, 0x7f, 0x1, 0x0, 
    0x0, 0x0, 0x90, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x20, 0xca, 0xfe, 
    0xf7, 0xff, 0x7f, 0x0, 0x0, 0x80, 0x84, 0xff, 0xf7, 0xff, 0x7f, 0x0, 
    0x0}, __align = 0x1800052b0}
(gdb) print/x map[n+1]
$7 = {__data = {__lock = 0x40000000, __count = 0x1, __owner = 0x52b3, 
    __nusers = 0x1, __kind = 0x90, __spins = 0x0, __list = {
      __prev = 0x7ffff7ff8200, __next = 0x7ffff7ff8138}}, __size = {0x0, 
    0x0, 0x0, 0x40, 0x1, 0x0, 0x0, 0x0, 0xb3, 0x52, 0x0, 0x0, 0x1, 0x0, 0x0, 
    0x0, 0x90, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x82, 0xff, 0xf7, 
    0xff, 0x7f, 0x0, 0x0, 0x38, 0x81, 0xff, 0xf7, 0xff, 0x7f, 0x0, 0x0}, 
  __align = 0x140000000}
(gdb) print/x map[n+2]
$8 = {__data = {__lock = 0xc0000000, __count = 0x1, __owner = 0x52ba, 
    __nusers = 0x1, __kind = 0x90, __spins = 0x0, __list = {
      __prev = 0x7ffff7ff8d18, __next = 0x7ffff7ff8250}}, __size = {0x0, 
    0x0, 0x0, 0xc0, 0x1, 0x0, 0x0, 0x0, 0xba, 0x52, 0x0, 0x0, 0x1, 0x0, 0x0, 
    0x0, 0x90, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x18, 0x8d, 0xff, 0xf7, 
    0xff, 0x7f, 0x0, 0x0, 0x50, 0x82, 0xff, 0xf7, 0xff, 0x7f, 0x0, 0x0}, 
  __align = 0x1c0000000}

I don't know enough about futexes to debug this further.
Comment 1 Florian Weimer 2015-09-25 08:54:03 UTC
Full backtrace:

#1  0x00007ffff7bc4d7c in _L_robust_lock_205 () at pthread_mutex_lock.c:478
No symbol table info available.
#2  0x00007ffff7bc4a05 in __pthread_mutex_lock_full (mutex=0x7ffff7ff84b0)
    at pthread_mutex_lock.c:225
        result = <optimized out>
        oldval = <optimized out>
        id = 21168
#3  0x00007ffff7bc8c1a in __GI___pthread_mutex_lock (mutex=<optimized out>)
    at pthread_mutex_lock.c:55
        type = <optimized out>
        id = <optimized out>
#4  0x0000000000402349 in do_test () at tst-robust8.c:234
        e = <optimized out>
        n = 30
        p = 21169
        status = 9
        round = 1
        ma = {__size = "\000\000\000\300", __align = -1073741824}
#5  0x0000000000401a17 in main (argc=<optimized out>, argv=<optimized out>)
    at test-skeleton.c:412
        direct = <optimized out>
        status = 0
        opt = <optimized out>
        timeoutfactor = 1
        termpid = <optimized out>
        envstr_timeoutfactor = <optimized out>
        envstr_direct = <optimized out>
Comment 2 Florian Weimer 2015-09-25 09:45:00 UTC
I could reproduce this on a much smaller machine (with once i7-4810MQ CPU), with glibc-2.17-78.el7.x86_64.  The hang looks identical.
Comment 3 Paulo Andrade 2016-01-27 11:49:42 UTC
Created attachment 8924 [details]
robust_test.stp

  I believe I have a test case from an user that makes it
easier to debug this condition.

  Attached is my WIP stap script to debug it. It traces
the test case pretty much down to schedule calls in the
kernel side, but still may need some extra tweaks in the
user level trace (but should give enough hints to someone
familiar enough with the glibc code).
Comment 4 Paulo Andrade 2016-01-27 11:54:22 UTC
Created attachment 8925 [details]
robust_test2.c

  While it is very easy to reproduce in the command line, if
using stap it may take quite some time to trigger the (apparent)
race.

  Sample systemtap script output when a dead lock happens:

[29352] > syscall.wait4(upid=0x7562 stat_addr=0x7fff9f59152c options=0x0 ru=0x0)
[29352] > schedule()
[30050]  signal.send SIGCHLD to 29352
[30050] > schedule()
[29352] < schedule() 
[29352] < syscall.wait4() return=0x7562
[29352] > pthread_mutex_unlock(mutex=0x7fbf4b3ae000)
[29352] > pthread_mutex_unlock(decr=0x1 mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_unlock_full(mutex=0x7fbf4b3ae000 decr=0x1)
[29352] < 0x400fed() return=0x0
[29352] < 0x400fed() return=0x0
[30051] > syscall.set_robust_list(head=0x7fbf4b38aa20 len=0x18)
[30051] < syscall.set_robust_list() return=0x0
[29352]  signal.send SIGTERM to 30051
[30051] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[30051]  do_exit(code=0xf)
[29352] > __pthread_mutex_lock_full(mutex=0x7fbf4b3ae000)
[29352] < 0x400f98() return=0x0
[29352] < 0x400f98() return=0x0
[29352] > syscall.wait4(upid=0x7563 stat_addr=0x7fff9f59152c options=0x0 ru=0x0)
[29352] > schedule()
[30051]  signal.send SIGCHLD to 29352
[30051] > schedule()
[29352] < schedule() 
[29352] < syscall.wait4() return=0x7563
[29352] > pthread_mutex_unlock(mutex=0x7fbf4b3ae000)
[29352] > pthread_mutex_unlock(decr=0x1 mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_unlock_full(mutex=0x7fbf4b3ae000 decr=0x1)
[29352] < 0x400fed() return=0x0
[29352] < 0x400fed() return=0x0
[30052] > syscall.set_robust_list(head=0x7fbf4b38aa20 len=0x18)
[30052] < syscall.set_robust_list() return=0x0
[29352]  signal.send SIGTERM to 30052
[30052] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[30052]  do_exit(code=0xf)
[29352] > __pthread_mutex_lock_full(mutex=0x7fbf4b3ae000)
[29352] < 0x400f98() return=0x0
[29352] < 0x400f98() return=0x0
[29352] > syscall.wait4(upid=0x7564 stat_addr=0x7fff9f59152c options=0x0 ru=0x0)
[29352] > schedule()
[30052]  signal.send SIGCHLD to 29352
[30052] > schedule()
[29352] < schedule() 
[29352] < syscall.wait4() return=0x7564
[29352] > pthread_mutex_unlock(mutex=0x7fbf4b3ae000)
[29352] > pthread_mutex_unlock(decr=0x1 mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_unlock_full(mutex=0x7fbf4b3ae000 decr=0x1)
[29352] < 0x400fed() return=0x0
[29352] < 0x400fed() return=0x0
[30053] > syscall.set_robust_list(head=0x7fbf4b38aa20 len=0x18)
[30053] < syscall.set_robust_list() return=0x0
[29352]  signal.send SIGTERM to 30053
[29352] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[30053] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_lock_full(mutex=0x7fbf4b3ae000)
[29352] < 0x400f98() return=0x0
[29352] < 0x400f98() return=0x0
[30053]  do_exit(code=0xf)
[29352] > syscall.wait4(upid=0x7565 stat_addr=0x7fff9f59152c options=0x0 ru=0x0)
[29352] > schedule()
[30053]  signal.send SIGCHLD to 29352
[30053] > schedule()
[29352] < schedule() 
[29352] < syscall.wait4() return=0x7565
[29352] > pthread_mutex_unlock(mutex=0x7fbf4b3ae000)
[29352] > pthread_mutex_unlock(decr=0x1 mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_unlock_full(mutex=0x7fbf4b3ae000 decr=0x1)
[29352] < 0x400fed() return=0x0
[30054] > syscall.set_robust_list(head=0x7fbf4b38aa20 len=0x18)
[30054] < syscall.set_robust_list() return=0x0
[29352] < 0x400fed() return=0x0
[30054] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352]  signal.send SIGTERM to 30054
[30054]  do_exit(code=0xf)
[29352] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_lock_full(mutex=0x7fbf4b3ae000)
[29352] < 0x400f98() return=0x0
[29352] < 0x400f98() return=0x0
[29352] > syscall.wait4(upid=0x7566 stat_addr=0x7fff9f59152c options=0x0 ru=0x0)
[29352] > schedule()
[30054]  signal.send SIGCHLD to 29352
[30054] > schedule()
[29352] < schedule() 
[29352] < syscall.wait4() return=0x7566
[29352] > pthread_mutex_unlock(mutex=0x7fbf4b3ae000)
[29352] > pthread_mutex_unlock(decr=0x1 mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_unlock_full(mutex=0x7fbf4b3ae000 decr=0x1)
[29352] < 0x400fed() return=0x0
[29352] < 0x400fed() return=0x0
[30055] > syscall.set_robust_list(head=0x7fbf4b38aa20 len=0x18)
[30055] < syscall.set_robust_list() return=0x0
[30055] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352]  signal.send SIGTERM to 30055
[30055]  do_exit(code=0xf)
[29352] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_lock_full(mutex=0x7fbf4b3ae000)
[29352] < 0x400f98() return=0x0
[29352] < 0x400f98() return=0x0
[29352] > syscall.wait4(upid=0x7567 stat_addr=0x7fff9f59152c options=0x0 ru=0x0)
[29352] > schedule()
[30055]  signal.send SIGCHLD to 29352
[30055] > schedule()
[29352] < schedule() 
[29352] < syscall.wait4() return=0x7567
[29352] > pthread_mutex_unlock(mutex=0x7fbf4b3ae000)
[29352] > pthread_mutex_unlock(decr=0x1 mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_unlock_full(mutex=0x7fbf4b3ae000 decr=0x1)
[30056] > syscall.set_robust_list(head=0x7fbf4b38aa20 len=0x18)
[30056] < syscall.set_robust_list() return=0x0
[29352] < 0x400fed() return=0x0
[29352] < 0x400fed() return=0x0
[30056] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352]  signal.send SIGTERM to 30056
[30056]  do_exit(code=0xf)
[29352] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_lock_full(mutex=0x7fbf4b3ae000)
[29352] < 0x400f98() return=0x0
[29352] < 0x400f98() return=0x0
[29352] > syscall.wait4(upid=0x7568 stat_addr=0x7fff9f59152c options=0x0 ru=0x0)
[29352] > schedule()
[30056]  signal.send SIGCHLD to 29352
[30056] > schedule()
[29352] < schedule() 
[29352] < syscall.wait4() return=0x7568
[29352] > pthread_mutex_unlock(mutex=0x7fbf4b3ae000)
[29352] > pthread_mutex_unlock(decr=0x1 mutex=0x7fbf4b3ae000)
[30057] > syscall.set_robust_list(head=0x7fbf4b38aa20 len=0x18)
[30057] < syscall.set_robust_list() return=0x0
[29352] > __pthread_mutex_unlock_full(mutex=0x7fbf4b3ae000 decr=0x1)
[29352] < 0x400fed() return=0x0
[29352] < 0x400fed() return=0x0
[30057] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352]  signal.send SIGTERM to 30057
[30057]  do_exit(code=0xf)
[29352] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_lock_full(mutex=0x7fbf4b3ae000)
[29352] < 0x400f98() return=0x0
[29352] < 0x400f98() return=0x0
[29352] > syscall.wait4(upid=0x7569 stat_addr=0x7fff9f59152c options=0x0 ru=0x0)
[29352] > schedule()
[30057]  signal.send SIGCHLD to 29352
[30057] > schedule()
[29352] < schedule() 
[29352] < syscall.wait4() return=0x7569
[29352] > pthread_mutex_unlock(mutex=0x7fbf4b3ae000)
[29352] > pthread_mutex_unlock(decr=0x1 mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_unlock_full(mutex=0x7fbf4b3ae000 decr=0x1)
[29352] < 0x400fed() return=0x0
[29352] < 0x400fed() return=0x0
[30058] > syscall.set_robust_list(head=0x7fbf4b38aa20 len=0x18)
[30058] < syscall.set_robust_list() return=0x0
[30058] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352]  signal.send SIGTERM to 30058
[30058]  do_exit(code=0xf)
[29352] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_lock_full(mutex=0x7fbf4b3ae000)
[29352] < 0x400f98() return=0x0
[29352] < 0x400f98() return=0x0
[29352] > syscall.wait4(upid=0x756a stat_addr=0x7fff9f59152c options=0x0 ru=0x0)
[29352] > schedule()
[30058]  signal.send SIGCHLD to 29352
[30058] > schedule()
[29352] < schedule() 
[29352] < syscall.wait4() return=0x756a
[30059] > syscall.set_robust_list(head=0x7fbf4b38aa20 len=0x18)
[30059] < syscall.set_robust_list() return=0x0
[29352] > schedule()
[30059] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[30059] > __pthread_mutex_lock_full(mutex=0x7fbf4b3ae000)
[30059] > syscall.futex(uaddr=0x7fbf4b3ae000 op=0x0 val=0x800072a8 utime=0x0 uaddr2=0x80 val3=0x0)
[30059] > futex_wait(uaddr=0x7fbf4b3ae000 flags=0x1 val=0x800072a8 abs_time=0x0 bitset=0xffffffff)
[30059] > futex_wait_setup(uaddr=0x7fbf4b3ae000 val=0x800072a8 flags=0x1 q=0xffff88028616bdc8 hb=0xffff88028616bd58)
[30059] < futex_wait() return=0x0
[30059] > futex_wait_queue_me(hb=0xffffc9000189c840 q=0xffff88028616bdc8 timeout=0x0)
[30059] > schedule()
Done 700 iterations
[29352] > schedule()
[29352] > pthread_mutex_unlock(mutex=0x7fbf4b3ae000)
[29352] > pthread_mutex_unlock(decr=0x1 mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_unlock_full(mutex=0x7fbf4b3ae000 decr=0x1)
[29352] > syscall.futex(uaddr=0x7fbf4b3ae000 op=0x1 val=0x1 utime=0x0 uaddr2=0x7fbf4b3ae000 val3=0x8)
[29352] < syscall.futex() return=0x1
[30059] < schedule() 
[29352] < 0x400fed() return=0x0
[29352] < 0x400fed() return=0x0
[30059] < futex_wait() 
[30059] < do_futex() return=0x0
[30059] < syscall.futex() return=0x0
[29352]  signal.send SIGTERM to 30059
[30059]  do_exit(code=0xf)
[29352] > __pthread_mutex_lock(mutex=0x7fbf4b3ae000)
[29352] > __pthread_mutex_lock_full(mutex=0x7fbf4b3ae000)
[29352] > syscall.futex(uaddr=0x7fbf4b3ae000 op=0x0 val=0x8000756b utime=0x0 uaddr2=0x80 val3=0x0)
[29352] > futex_wait(uaddr=0x7fbf4b3ae000 flags=0x1 val=0x8000756b abs_time=0x0 bitset=0xffffffff)
[29352] > futex_wait_setup(uaddr=0x7fbf4b3ae000 val=0x8000756b flags=0x1 q=0xffff88027e8cbdc8 hb=0xffff88027e8cbd58)
[29352] < futex_wait() return=0x0
[29352] > futex_wait_queue_me(hb=0xffffc9000189c840 q=0xffff88027e8cbdc8 timeout=0x0)
[29352] > schedule()
[30059]  signal.send SIGCHLD to 29352
[30059] > schedule()

  and it is dead locked at the end.

29352 is the main process and 30059 is the last forked
process. (Note that the test case only forks one child
process at a time, and tries to "race" with it to get
the mutex lock).
Comment 5 Paulo Andrade 2016-01-27 18:42:38 UTC
  Adding extra kernel probes to handle_futex_death, mm_release and
exit_robust_list, it appears the robust list is in an inconsistent
state in exit_robust_list(), passing an invalid "user" pointer to
handle_futex_death(). This appears to happen when "pending" is not
null in exit_robust_list.

  Extra sample log, of the previous working process, and the one
locking:

[1648] > syscall.set_robust_list(head=0x7ff55806fa20 len=0x18)
[1648] < syscall.set_robust_list() return=0x0
[755]  signal.send SIGTERM to 1648
[1648] > __pthread_mutex_lock(mutex=0x7ff558093000)
[755] > __pthread_mutex_lock(mutex=0x7ff558093000)
[1648]  do_exit(code=0xf)
[755] > __pthread_mutex_lock_full(mutex=0x7ff558093000)
[1648]  exit_mm(tsk=0xffff8800d252e780)
[755] < 0x400f98() return=0x0
[1648] > mm_release(tsk=0xffff8800d252e780 mm=0xffff88019a64e400)
[755] < 0x400f98() return=0x0
[755] > syscall.wait4(upid=0x670 stat_addr=0x7ffdb7e2e80c options=0x0 ru=0x0)
[1648] > exit_robust_list(curr=0xffff8800d252e780)
head = 0x7ff55806fa20
entry = 0x7ff558093020
futex_offset = -32
pending = 0x0
[755] > schedule()
[1648] > handle_futex_death(uaddr=0x7ff558093000 curr=0xffff8800d252e780 pi=0x0)
[1648] < handle_futex_death() return=0x0
[1648]   exit_robust_list locals head=? entry=? next_entry=? pending=? limit=? pi=? pip=0x0 next_pi=? futex_offset=? rc=?
RIP: ffffffff81078553
RSP: ffff880118a63cd0  EFLAGS: 00000282
RAX: 0000000000000000 RBX: ffff88019a64e400 RCX: ffff880118a63fd8
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800d252e780
RBP: ffff880118a63ce0 R08: 0000000000000000 R09: 000000018020001f
R10: ffffea000c1e6400 R11: 0000000000000004 R12: ffff8800d252e780
R13: ffff8800d252e780 R14: ffff88011df60100 R15: ffff8800d252e780
FS:  00007ff55806f740(0000) GS:ffff88031e280000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff557955290 CR3: 00000002881d4000 CR4: 00000000001407e0
[1648] < exit_robust_list() 
[1648] < mm_release() 
[1648]  signal.send SIGCHLD to 755
[1648] > schedule()
[755] < schedule() 
[755] < syscall.wait4() return=0x670
[755] > schedule()
[1649] > syscall.set_robust_list(head=0x7ff55806fa20 len=0x18)
[1649] < syscall.set_robust_list() return=0x0
[1649] > __pthread_mutex_lock(mutex=0x7ff558093000)
[1649] > __pthread_mutex_lock_full(mutex=0x7ff558093000)
[1649] > syscall.futex(uaddr=0x7ff558093000 op=0x0 val=0x800002f3 utime=0x0 uaddr2=0x80 val3=0x0)
[1649] > futex_wait(uaddr=0x7ff558093000 flags=0x1 val=0x800002f3 abs_time=0x0 bitset=0xffffffff)
[1649] > futex_wait_setup(uaddr=0x7ff558093000 val=0x800002f3 flags=0x1 q=0xffff8801a2b0bdc8 hb=0xffff8801a2b0bd58)
[1649] < futex_wait() return=0x0
[1649] > futex_wait_queue_me(hb=0xffffc9000189fb00 q=0xffff8801a2b0bdc8 timeout=0x0)
[1649] > schedule()
[755] > pthread_mutex_unlock(mutex=0x7ff558093000)
[755] > pthread_mutex_unlock(decr=0x1 mutex=0x7ff558093000)
[755] > __pthread_mutex_unlock_full(mutex=0x7ff558093000 decr=0x1)
[755] > syscall.futex(uaddr=0x7ff558093000 op=0x1 val=0x1 utime=0x0 uaddr2=0x7ff558093000 val3=0x8)
[755] < syscall.futex() return=0x1
[1649] < schedule() 
[755] < 0x400fed() return=0x0
[755] < 0x400fed() return=0x0
[1649] < futex_wait() 
[1649] < do_futex() return=0x0
[1649] < syscall.futex() return=0x0
[755]  signal.send SIGTERM to 1649
[1649]  do_exit(code=0xf)
[755] > __pthread_mutex_lock(mutex=0x7ff558093000)
[755] > __pthread_mutex_lock_full(mutex=0x7ff558093000)
[1649]  exit_mm(tsk=0xffff8800d252a280)
[755] > syscall.futex(uaddr=0x7ff558093000 op=0x0 val=0x80000671 utime=0x0 uaddr2=0x80 val3=0x0)
[1649] > mm_release(tsk=0xffff8800d252a280 mm=0xffff88019a649f40)
[755] > futex_wait(uaddr=0x7ff558093000 flags=0x1 val=0x80000671 abs_time=0x0 bitset=0xffffffff)
[1649] > exit_robust_list(curr=0xffff8800d252a280)
head = 0x7ff55806fa20
entry = 0x7ff558093020
futex_offset = -32
pending = 0x7ff558093020
[755] > futex_wait_setup(uaddr=0x7ff558093000 val=0x80000671 flags=0x1 q=0xffff880115593dc8 hb=0xffff880115593d58)
[755] < futex_wait() return=0x0
[1649] > handle_futex_death(uaddr=0xffffffffffffffe0 curr=0xffff8800d252a280 pi=0x0)
[755] > futex_wait_queue_me(hb=0xffffc9000189fb00 q=0xffff880115593dc8 timeout=0x0)
[1649] < handle_futex_death() return=0xffffffffffffffff
[755] > schedule()
[1649]   exit_robust_list locals head=? entry=? next_entry=? pending=? limit=? pi=? pip=0x0 next_pi=? futex_offset=? rc=?
RIP: ffffffff81078553
RSP: ffff8801a2b0bcd0  EFLAGS: 00000282
RAX: 00000000ffffffff RBX: ffff88019a649f40 RCX: ffff8801a2b0bfd8
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffe0
RBP: ffff8801a2b0bce0 R08: 0000000000000000 R09: 000000018020001e
R10: ffffea000c1e6400 R11: 0000000000000004 R12: ffff8800d252a280
R13: ffff8800d252a280 R14: ffff88011df60100 R15: ffff8800d252a280
FS:  00007ff55806f740(0000) GS:ffff88031e280000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000002a337b000 CR4: 00000000001407e0
[1649] < exit_robust_list() 
[1649] < mm_release() 
[1649]  signal.send SIGCHLD to 755
[1649] > schedule()
Comment 6 Paulo Andrade 2016-01-27 18:44:09 UTC
Created attachment 8927 [details]
robust_test.stp

Updated stap script. Was run, as root with the command:

# while :; do stap --ldd robust_test.stp -c ./robust_test2; done

sometimes it triggers quickly the issue, sometimes it
takes up to 10 minutes...
Comment 7 Paulo Andrade 2016-01-29 13:10:22 UTC
Created attachment 8930 [details]
linux-kernel-test.patch

  I was thinking it could be a kernel bug, because when adding
a minor change to kernel/futex.c:exit_robust_list() I could no
longer reproduce the problem when "instrumenting" the execution
with systemtap; did not reproduce after 12+ hours, but if running
from command line, would reproduce in 15 to 30 runs.

  The kernel patch, I added to my own computer, running rhel 7.2,
was due to noticing these conditions in entry of exit_robust_list
in one of the dead locks:

head          = 0x7fb7ce4afa20
&head.list    = 0x7fb7ce4afa20
entry         = 0x7fb7ce4d3020
entry->next   = 0x0
futex_offset  = -32
pending       = 0x7fb7ce4d3020

// loop because 0x7fb7ce4d3020 != 0x7fb7ce4afa20
	while (entry != &head->list) {

// next_entry is set (again) to NULL
		rc = fetch_robust_entry(&next_entry, &entry->next, &next_pi);

// not executed, because entry == pending
		if (entry != pending)
			if (handle_futex_death((void __user *)entry + futex_offset,
						curr, pi))
				return;

// entry is set to NULL
		entry = next_entry;

*** restart loop

// loop because NULL != 0x7fb7ce4afa20
	while (entry != &head->list) {

// executed, because NULL != pending
		if (entry != pending)
			if (handle_futex_death((void __user *)entry + futex_offset,
						curr, pi))
				return;

*** since it leaves with a return on error, it never executes:

	if (pending)
		handle_futex_death((void __user *)pending + futex_offset,
				   curr, pip);

  Apparently this is a side effect of https://lkml.org/lkml/2007/9/30/55

  But I wonder, could "head->list.next->next" really be NULL
in normal conditions?
Comment 8 Paulo Andrade 2016-01-29 13:12:59 UTC
Created attachment 8933 [details]
robust_test.c

After the kernel test patch, I could only reproduce the
issue with systemtap if using the alternate version that
spawns 8 child processes, instead of just 1.
Comment 9 Paulo Andrade 2016-01-29 13:20:30 UTC
Created attachment 8934 [details]
robust_test.stp

Updated stap script, and s/robust_test2/robust_test/

  Now I can easily trigger again the issue, instrumenting
with systemtap:

0 S root     19653 19650  0  80   0 -  1584 kretpr 11:13 pts/1    00:00:00 ./robust_test
1 S root     19656 19653  0  80   0 -  1582 kretpr 11:13 pts/1    00:00:00 ./robust_test
1 Z root     19708 19653  0  80   0 -     0 kretpr 11:13 pts/1    00:00:00 [robust_test] <defunct>
1 Z root     19711 19653  0  80   0 -     0 kretpr 11:13 pts/1    00:00:00 [robust_test] <defunct>
1 S root     19712 19653  0  80   0 -  1584 kretpr 11:13 pts/1    00:00:00 ./robust_test
1 S root     19713 19653  0  80   0 -  1584 kretpr 11:13 pts/1    00:00:00 ./robust_test
1 Z root     19714 19653  0  80   0 -     0 kretpr 11:13 pts/1    00:00:00 [robust_test] <defunct>
1 S root     19715 19653  0  80   0 -  1584 kretpr 11:13 pts/1    00:00:00 ./robust_test
1 S root     19716 19653  0  80   0 -  1584 kretpr 11:13 pts/1    00:00:00 ./robust_test
1 Z root     19717 19653  0  80   0 -     0 kretpr 11:13 pts/1    00:00:00 [robust_test] <defunct>

 A few pages of the systemtap output:

[19697] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff8800c1608000 pi=0x0)
[19697] < handle_futex_death() return=0x0
[19697] < exit_robust_list() 
[19697] < mm_release() 
[19653] > schedule()
[19653] < schedule() 
[19697]  signal.send SIGCHLD to 19653
[19697] > schedule()
[19653] > syscall.wait4(upid=0x4cf1 stat_addr=0x7ffc78b0308c options=0x0 ru=0x0)
[19653] < syscall.wait4() return=0x4cf1
[19707] > syscall.set_robust_list(head=0x7f87da9a9a20 len=0x18)
[19707] < syscall.set_robust_list() return=0x0
[19707] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19653] > pthread_mutex_unlock(mutex=0x7f87da9cd000)
[19707] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19653] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9cd000)
[19707] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19707] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19653] > __pthread_mutex_unlock_full(mutex=0x7f87da9cd000 decr=0x1)
[19707] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff88020b5c3dc8 hb=0xffff88020b5c3d58)
[19707] < futex_wait() return=0x0
[19707] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff88020b5c3dc8 timeout=0x0)
[19707] > schedule()
[19653] > syscall.futex(uaddr=0x7f87da9cd000 op=0x1 val=0x1 utime=0x7f87da9a9a10 uaddr2=0x7f87da9cd000 val3=0x4cc5)
[19653] < syscall.futex() return=0x1
[19702] < schedule() 
[19653] < 0x4011a7() return=0x0
[19702] < futex_wait() 
[19653] < 0x4011a7() return=0x0
[19702] < do_futex() return=0x0
[19708] > syscall.set_robust_list(head=0x7f87da9a9a20 len=0x18)
[19702] < syscall.futex() return=0x0
[19708] < syscall.set_robust_list() return=0x0
[19653]  signal.send SIGTERM to 19701
[19702] < 0x401122() return=0x0
[19702] < 0x401122() return=0x0
[19701] < schedule() 
[19653] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19701] < futex_wait() 
[19702] > pthread_mutex_unlock(mutex=0x7f87da9cd000)
[19701] < do_futex() return=0xfffffffffffffe00
[19653] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19701] < syscall.futex() return=0xfffffffffffffe00
[19702] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9cd000)
[19701]  do_exit(code=0xf)
[19702] > __pthread_mutex_unlock_full(mutex=0x7f87da9cd000 decr=0x1)
[19701]  exit_mm(tsk=0xffff88030a4d4500)
[19653] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cf6 utime=0x0 uaddr2=0x80 val3=0x0)
[19701] > mm_release(tsk=0xffff88030a4d4500 mm=0xffff8803046e12c0)
[19653] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cf6 abs_time=0x0 bitset=0xffffffff)
[19702] > syscall.futex(uaddr=0x7f87da9cd000 op=0x1 val=0x1 utime=0x7f87da9cd020 uaddr2=0x7f87da9cd000 val3=0x0)
[19701] > exit_robust_list(curr=0xffff88030a4d4500)
  head         = 0x7f87da9a9a20
  &head.list   = 0x7f87da9a9a20
  entry        = 0x7f87da9cd020
  next         = 0x0
  futex_offset = -32
  pending      = 0x7f87da9cd020
[19653] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cf6 flags=0x1 q=0xffff8801df113dc8 hb=0xffff8801df113d58)
[19702] < syscall.futex() return=0x1
[19701] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff88030a4d4500 pi=0x0)
[19653] < futex_wait() return=0xfffffffffffffff5
[19702] < 0x4011a7() return=0x0
[19701] < handle_futex_death() return=0x0
[19701] < exit_robust_list() 
[19653] < do_futex() return=0xfffffffffffffff5
[19702] < 0x4011a7() return=0x0
[19701] < mm_release() 
[19653] < syscall.futex() return=0xfffffffffffffff5
[19653] < 0x401122() return=0x0
[19708] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19653] < 0x401122() return=0x0
[19653] > syscall.wait4(upid=0x4cf5 stat_addr=0x7ffc78b0308c options=0x0 ru=0x0)
[19653] > schedule()
[19702] > __pthread_mutex_lock(mutex=0x7f87da9d0908)
[19708] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19702] < _dl_fini() return=0x0
[19708] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19708] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19708] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff8801de85bdc8 hb=0xffff8801de85bd58)
[19708] < futex_wait() return=0x0
[19702] > pthread_mutex_unlock(mutex=0x7f87da9d0908)
[19708] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff8801de85bdc8 timeout=0x0)
[19708] > schedule()
[19702] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9d0908)
[19702] < _dl_fini() return=0x0
[19702]  do_exit(code=0x0)
[19702]  exit_mm(tsk=0xffff88030a4d0b80)
[19701]  signal.send SIGCHLD to 19653
[19702] > mm_release(tsk=0xffff88030a4d0b80 mm=0xffff8803046e3840)
[19701] > schedule()
[19653] < schedule() 
[19703] < schedule() 
[19702] > exit_robust_list(curr=0xffff88030a4d0b80)
  head         = 0x7f87da9a9a20
  &head.list   = 0x7f87da9a9a20
  entry        = 0x7f87da9cd020
  next         = 0x7f87da9a9a20
  futex_offset = -32
  pending      = 0x0
[19703] < futex_wait() 
[19702] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff88030a4d0b80 pi=0x0)
[19653] < syscall.wait4() return=0x4cf5
[19702] < handle_futex_death() return=0x0
[19703] < do_futex() return=0x0
[19702] < exit_robust_list() 
[19703] < syscall.futex() return=0x0
[19702] < mm_release() 
[19703] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19703] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19703] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff88026a2fbdc8 hb=0xffff88026a2fbd58)
[19703] < futex_wait() return=0x0
[19703] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff88026a2fbdc8 timeout=0x0)
[19703] > schedule()
[19702] > schedule()
[19702] < schedule() 
[19702]  signal.send SIGCHLD to 19653
[19702] > schedule()
[19653] > syscall.wait4(upid=0x4cf6 stat_addr=0x7ffc78b0308c options=0x0 ru=0x0)
[19653] < syscall.wait4() return=0x4cf6
[19709] > syscall.set_robust_list(head=0x7f87da9a9a20 len=0x18)
[19709] < syscall.set_robust_list() return=0x0
[19709] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19653] > pthread_mutex_unlock(mutex=0x7f87da9cd000)
[19709] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19653] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9cd000)
[19653] > __pthread_mutex_unlock_full(mutex=0x7f87da9cd000 decr=0x1)
[19709] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19709] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19653] > syscall.futex(uaddr=0x7f87da9cd000 op=0x1 val=0x1 utime=0x7f87da9a9a10 uaddr2=0x7f87da9cd000 val3=0x4cc5)
[19710] > syscall.set_robust_list(head=0x7f87da9a9a20 len=0x18)
[19710] < syscall.set_robust_list() return=0x0
[19709] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff8801de85fdc8 hb=0xffff8801de85fd58)
[19653] < syscall.futex() return=0x1
[19709] < futex_wait() return=0xfffffffffffffff5
[19653] < 0x4011a7() return=0x0
[19709] < do_futex() return=0xfffffffffffffff5
[19653] < 0x4011a7() return=0x0
[19709] < syscall.futex() return=0xfffffffffffffff5
[19709] < 0x401122() return=0x0
[19653]  signal.send SIGTERM to 19703
[19709] < 0x401122() return=0x0
[19653] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19709] > pthread_mutex_unlock(mutex=0x7f87da9cd000)
[19703] < schedule() 
[19703] < futex_wait() 
[19709] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9cd000)
[19653] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19703] < do_futex() return=0xfffffffffffffe00
[19703] < syscall.futex() return=0xfffffffffffffe00
[19709] > __pthread_mutex_unlock_full(mutex=0x7f87da9cd000 decr=0x1)
[19703]  do_exit(code=0xf)
[19653] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cfd utime=0x0 uaddr2=0x80 val3=0x0)
[19703]  exit_mm(tsk=0xffff88030a4d6780)
[19709] > syscall.futex(uaddr=0x7f87da9cd000 op=0x1 val=0x1 utime=0x7f87da9cd020 uaddr2=0x7f87da9cd000 val3=0x0)
[19653] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cfd abs_time=0x0 bitset=0xffffffff)
[19703] > mm_release(tsk=0xffff88030a4d6780 mm=0xffff8803046e5dc0)
[19709] < syscall.futex() return=0x1
[19653] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cfd flags=0x1 q=0xffff8801df113dc8 hb=0xffff8801df113d58)
[19710] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19709] < 0x4011a7() return=0x0
[19653] < futex_wait() return=0xfffffffffffffff5
[19703] > exit_robust_list(curr=0xffff88030a4d6780)
  head         = 0x7f87da9a9a20
  &head.list   = 0x7f87da9a9a20
  entry        = 0x7f87da9cd020
  next         = 0x0
  futex_offset = -32
  pending      = 0x7f87da9cd020
[19709] < 0x4011a7() return=0x0
[19653] < do_futex() return=0xfffffffffffffff5
[19653] < syscall.futex() return=0xfffffffffffffff5
[19703] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff88030a4d6780 pi=0x0)
[19703] < handle_futex_death() return=0x0
[19703] < exit_robust_list() 
[19653] < 0x401122() return=0x0
[19703] < mm_release() 
[19653] < 0x401122() return=0x0
[19710] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19653] > syscall.wait4(upid=0x4cfd stat_addr=0x7ffc78b0308c options=0x0 ru=0x0)
[19709] > __pthread_mutex_lock(mutex=0x7f87da9d0908)
[19710] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19653] > schedule()
[19709] < _dl_fini() return=0x0
[19710] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19710] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff88020b5a3dc8 hb=0xffff88020b5a3d58)
[19710] < futex_wait() return=0x0
[19709] > pthread_mutex_unlock(mutex=0x7f87da9d0908)
[19709] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9d0908)
[19710] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff88020b5a3dc8 timeout=0x0)
[19710] > schedule()
[19709] < _dl_fini() return=0x0
[19709]  do_exit(code=0x0)
[19709]  exit_mm(tsk=0xffff88030a4d2280)
[19709] > mm_release(tsk=0xffff88030a4d2280 mm=0xffff8803046e3200)
[19709] > exit_robust_list(curr=0xffff88030a4d2280)
  head         = 0x7f87da9a9a20
  &head.list   = 0x7f87da9a9a20
  entry        = 0x7f87da9cd020
  next         = 0x7f87da9a9a20
  futex_offset = -32
  pending      = 0x0
[19703]  signal.send SIGCHLD to 19653
[19709] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff88030a4d2280 pi=0x0)
[19703] > schedule()
[19709] < handle_futex_death() return=0x0
[19709] < exit_robust_list() 
[19709] < mm_release() 
[19709]  signal.send SIGCHLD to 19653
[19709] > schedule()
[19704] < schedule() 
[19653] < schedule() 
[19704] < futex_wait() 
[19704] < do_futex() return=0x0
[19704] < syscall.futex() return=0x0
[19653] < syscall.wait4() return=0x4cfd
[19704] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19704] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19704] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff88026a2ffdc8 hb=0xffff88026a2ffd58)
[19704] < futex_wait() return=0x0
[19704] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff88026a2ffdc8 timeout=0x0)
[19704] > schedule()
[19705] < schedule() 
[19705] < futex_wait() 
[19705] < do_futex() return=0x0
[19705] < syscall.futex() return=0x0
[19705] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19705] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19705] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff8801df15bdc8 hb=0xffff8801df15bd58)
[19705] < futex_wait() return=0x0
[19705] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff8801df15bdc8 timeout=0x0)
[19705] > schedule()
[19653] > syscall.wait4(upid=0x4cf7 stat_addr=0x7ffc78b0308c options=0x0 ru=0x0)
[19653] < syscall.wait4() return=0x4cf7
[19711] > syscall.set_robust_list(head=0x7f87da9a9a20 len=0x18)
[19711] < syscall.set_robust_list() return=0x0
[19711] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19711] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19711] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19711] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19711] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff88020b5a7dc8 hb=0xffff88020b5a7d58)
[19711] < futex_wait() return=0x0
[19711] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff88020b5a7dc8 timeout=0x0)
[19653] > pthread_mutex_unlock(mutex=0x7f87da9cd000)
[19711] > schedule()
[19653] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9cd000)
[19712] > syscall.set_robust_list(head=0x7f87da9a9a20 len=0x18)
[19712] < syscall.set_robust_list() return=0x0
[19653] > __pthread_mutex_unlock_full(mutex=0x7f87da9cd000 decr=0x1)
[19653] > syscall.futex(uaddr=0x7f87da9cd000 op=0x1 val=0x1 utime=0x0 uaddr2=0x7f87da9cd000 val3=0x4)
[19653] < syscall.futex() return=0x1
[19706] < schedule() 
[19653] < 0x4011a7() return=0x0
[19706] < futex_wait() 
[19706] < do_futex() return=0x0
[19653] < 0x4011a7() return=0x0
[19706] < syscall.futex() return=0x0
[19706] < 0x401122() return=0x0
[19653]  signal.send SIGTERM to 19707
[19706] < 0x401122() return=0x0
[19653] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19707] < schedule() 
[19706] > pthread_mutex_unlock(mutex=0x7f87da9cd000)
[19707] < futex_wait() 
[19706] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9cd000)
[19653] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19707] < do_futex() return=0xfffffffffffffe00
[19707] < syscall.futex() return=0xfffffffffffffe00
[19653] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cfa utime=0x0 uaddr2=0x80 val3=0x0)
[19706] > __pthread_mutex_unlock_full(mutex=0x7f87da9cd000 decr=0x1)
[19653] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cfa abs_time=0x0 bitset=0xffffffff)
[19707]  do_exit(code=0xf)
[19706] > syscall.futex(uaddr=0x7f87da9cd000 op=0x1 val=0x1 utime=0x7f87da9cd020 uaddr2=0x7f87da9cd000 val3=0x0)
[19712] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19707]  exit_mm(tsk=0xffff88030a4d3980)
[19653] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cfa flags=0x1 q=0xffff8801df113dc8 hb=0xffff8801df113d58)
[19706] < syscall.futex() return=0x1
[19707] > mm_release(tsk=0xffff88030a4d3980 mm=0xffff8803046e2580)
[19653] < futex_wait() return=0xfffffffffffffff5
[19706] < 0x4011a7() return=0x0
[19653] < do_futex() return=0xfffffffffffffff5
[19706] < 0x4011a7() return=0x0
[19653] < syscall.futex() return=0xfffffffffffffff5
[19707] > exit_robust_list(curr=0xffff88030a4d3980)
  head         = 0x7f87da9a9a20
  &head.list   = 0x7f87da9a9a20
  entry        = 0x7f87da9cd020
  next         = 0x0
  futex_offset = -32
  pending      = 0x7f87da9cd020
[19712] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19707] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff88030a4d3980 pi=0x0)
[19653] < 0x401122() return=0x0
[19707] < handle_futex_death() return=0x0
[19653] < 0x401122() return=0x0
[19707] < exit_robust_list() 
[19712] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19707] < mm_release() 
[19653] > syscall.wait4(upid=0x4cfb stat_addr=0x7ffc78b0308c options=0x0 ru=0x0)
[19712] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19653] > schedule()
[19706] > __pthread_mutex_lock(mutex=0x7f87da9d0908)
[19712] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff88020b5abdc8 hb=0xffff88020b5abd58)
[19712] < futex_wait() return=0x0
[19706] < _dl_fini() return=0x0
[19712] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff88020b5abdc8 timeout=0x0)
[19712] > schedule()
[19706] > pthread_mutex_unlock(mutex=0x7f87da9d0908)
[19706] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9d0908)
[19706] < _dl_fini() return=0x0
[19706]  do_exit(code=0x0)
[19706]  exit_mm(tsk=0xffff88030a4d2e00)
[19706] > mm_release(tsk=0xffff88030a4d2e00 mm=0xffff8803046e1900)
[19707]  signal.send SIGCHLD to 19653
[19706] > exit_robust_list(curr=0xffff88030a4d2e00)
  head         = 0x7f87da9a9a20
  &head.list   = 0x7f87da9a9a20
  entry        = 0x7f87da9cd020
  next         = 0x7f87da9a9a20
  futex_offset = -32
  pending      = 0x0
[19706] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff88030a4d2e00 pi=0x0)
[19707] > schedule()
[19706] < handle_futex_death() return=0x0
[19706] < exit_robust_list() 
[19706] < mm_release() 
[19653] < schedule() 
[19653] < syscall.wait4() return=0x4cfb
[19706]  signal.send SIGCHLD to 19653
[19706] > schedule()
[19708] < schedule() 
[19708] < futex_wait() 
[19708] < do_futex() return=0x0
[19708] < syscall.futex() return=0x0
[19708] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19708] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19708] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff8801de85bdc8 hb=0xffff8801de85bd58)
[19708] < futex_wait() return=0x0
[19708] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff8801de85bdc8 timeout=0x0)
[19708] > schedule()
[19653] > syscall.wait4(upid=0x4cfa stat_addr=0x7ffc78b0308c options=0x0 ru=0x0)
[19653] < syscall.wait4() return=0x4cfa
[19653] > pthread_mutex_unlock(mutex=0x7f87da9cd000)
[19653] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9cd000)
[19653] > __pthread_mutex_unlock_full(mutex=0x7f87da9cd000 decr=0x1)
[19653] > syscall.futex(uaddr=0x7f87da9cd000 op=0x1 val=0x1 utime=0x7f87da9a9a10 uaddr2=0x7f87da9cd000 val3=0x4cc5)
[19653] < syscall.futex() return=0x1
[19653] < 0x4011a7() return=0x0
[19653] < 0x4011a7() return=0x0
[19653]  signal.send SIGTERM to 19710
[19653] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19653] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19714] > syscall.set_robust_list(head=0x7f87da9a9a20 len=0x18)
[19653] < 0x401122() return=0x0
[19714] < syscall.set_robust_list() return=0x0
[19653] < 0x401122() return=0x0
[19653] > syscall.wait4(upid=0x4cfe stat_addr=0x7ffc78b0308c options=0x0 ru=0x0)
[19653] > schedule()
[19713] > syscall.set_robust_list(head=0x7f87da9a9a20 len=0x18)
[19713] < syscall.set_robust_list() return=0x0
[19714] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19714] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19714] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19714] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19714] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff88020c567dc8 hb=0xffff88020c567d58)
[19714] < futex_wait() return=0x0
[19714] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff88020c567dc8 timeout=0x0)
[19714] > schedule()
[19710] < schedule() 
[19710] < futex_wait() 
[19710] < do_futex() return=0x0
[19713] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19710] < syscall.futex() return=0x0
[19710]  do_exit(code=0xf)
[19710]  exit_mm(tsk=0xffff8802bba2ae00)
[19710] > mm_release(tsk=0xffff8802bba2ae00 mm=0xffff8803046e0000)
[19713] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19710] > exit_robust_list(curr=0xffff8802bba2ae00)
  head         = 0x7f87da9a9a20
  &head.list   = 0x7f87da9a9a20
  entry        = 0x7f87da9cd020
  next         = 0x7f87da9a9a20
  futex_offset = -32
  pending      = 0x7f87da9cd020
[19710] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff8802bba2ae00 pi=0x0)
[19713] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19710] < handle_futex_death() return=0x0
[19710] < exit_robust_list() 
[19710] < mm_release() 
[19713] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19713] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff8802bbae7dc8 hb=0xffff8802bbae7d58)
[19713] < futex_wait() return=0x0
[19713] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff8802bbae7dc8 timeout=0x0)
[19713] > schedule()
[19710]  signal.send SIGCHLD to 19653
[19710] > schedule()
[19653] < schedule() 
[19653] < syscall.wait4() return=0x4cfe
[19653] > pthread_mutex_unlock(mutex=0x7f87da9cd000)
[19653] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9cd000)
[19653] > __pthread_mutex_unlock_full(mutex=0x7f87da9cd000 decr=0x1)
[19653] > syscall.futex(uaddr=0x7f87da9cd000 op=0x1 val=0x1 utime=0x7f87da9a9a10 uaddr2=0x7f87da9cd000 val3=0x4cc5)
[19653] < syscall.futex() return=0x1
[19715] > syscall.set_robust_list(head=0x7f87da9a9a20 len=0x18)
[19715] < syscall.set_robust_list() return=0x0
[19653] < 0x4011a7() return=0x0
[19653] < 0x4011a7() return=0x0
[19704] < schedule() 
[19704] < futex_wait() 
[19653]  signal.send SIGTERM to 19704
[19704] < do_futex() return=0x0
[19704] < syscall.futex() return=0x0
[19653] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19704]  do_exit(code=0xf)
[19653] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19653] < 0x401122() return=0x0
[19704]  exit_mm(tsk=0xffff88030a4d5080)
[19653] < 0x401122() return=0x0
[19704] > mm_release(tsk=0xffff88030a4d5080 mm=0xffff8803046e3e80)
[19653] > syscall.wait4(upid=0x4cf8 stat_addr=0x7ffc78b0308c options=0x0 ru=0x0)
[19653] > schedule()
[19704] > exit_robust_list(curr=0xffff88030a4d5080)
  head         = 0x7f87da9a9a20
  &head.list   = 0x7f87da9a9a20
  entry        = 0x7f87da9cd020
  next         = 0x7f87da9a9a20
  futex_offset = -32
  pending      = 0x7f87da9cd020
[19715] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19704] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff88030a4d5080 pi=0x0)
[19704] < handle_futex_death() return=0x0
[19704] < exit_robust_list() 
[19704] < mm_release() 
[19715] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19715] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19715] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19715] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff88020cb3bdc8 hb=0xffff88020cb3bd58)
[19715] < futex_wait() return=0x0
[19715] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff88020cb3bdc8 timeout=0x0)
[19715] > schedule()
[19704]  signal.send SIGCHLD to 19653
[19704] > schedule()
[19653] < schedule() 
[19653] < syscall.wait4() return=0x4cf8
[19653] > pthread_mutex_unlock(mutex=0x7f87da9cd000)
[19653] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9cd000)
[19653] > __pthread_mutex_unlock_full(mutex=0x7f87da9cd000 decr=0x1)
[19653] > syscall.futex(uaddr=0x7f87da9cd000 op=0x1 val=0x1 utime=0x7f87da9a9a10 uaddr2=0x7f87da9cd000 val3=0x4cc5)
[19653] < syscall.futex() return=0x1
[19716] > syscall.set_robust_list(head=0x7f87da9a9a20 len=0x18)
[19653] < 0x4011a7() return=0x0
[19716] < syscall.set_robust_list() return=0x0
[19653] < 0x4011a7() return=0x0
[19705] < schedule() 
[19705] < futex_wait() 
[19653]  signal.send SIGTERM to 19705
[19705] < do_futex() return=0x0
[19705] < syscall.futex() return=0x0
[19653] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19705]  do_exit(code=0xf)
[19653] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19705]  exit_mm(tsk=0xffff88030a4d0000)
[19653] < 0x401122() return=0x0
[19705] > mm_release(tsk=0xffff88030a4d0000 mm=0xffff8803046e7080)
[19653] < 0x401122() return=0x0
[19653] > syscall.wait4(upid=0x4cf9 stat_addr=0x7ffc78b0308c options=0x0 ru=0x0)
[19653] > schedule()
[19705] > exit_robust_list(curr=0xffff88030a4d0000)
  head         = 0x7f87da9a9a20
  &head.list   = 0x7f87da9a9a20
  entry        = 0x7f87da9cd020
  next         = 0x7f87da9a9a20
  futex_offset = -32
  pending      = 0x7f87da9cd020
[19705] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff88030a4d0000 pi=0x0)
[19716] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19705] < handle_futex_death() return=0x0
[19705] < exit_robust_list() 
[19705] < mm_release() 
[19716] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19716] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cc5 utime=0x0 uaddr2=0x80 val3=0x0)
[19716] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cc5 abs_time=0x0 bitset=0xffffffff)
[19716] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cc5 flags=0x1 q=0xffff88020cb3fdc8 hb=0xffff88020cb3fd58)
[19716] < futex_wait() return=0x0
[19716] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff88020cb3fdc8 timeout=0x0)
[19716] > schedule()
[19705]  signal.send SIGCHLD to 19653
[19705] > schedule()
[19653] < schedule() 
[19653] < syscall.wait4() return=0x4cf9
[19653] > pthread_mutex_unlock(mutex=0x7f87da9cd000)
[19653] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9cd000)
[19653] > __pthread_mutex_unlock_full(mutex=0x7f87da9cd000 decr=0x1)
[19653] > syscall.futex(uaddr=0x7f87da9cd000 op=0x1 val=0x1 utime=0x7f87da9a9a10 uaddr2=0x7f87da9cd000 val3=0x4cc5)
[19653] < syscall.futex() return=0x1
[19653] < 0x4011a7() return=0x0
[19717] > syscall.set_robust_list(head=0x7f87da9a9a20 len=0x18)
[19653] < 0x4011a7() return=0x0
[19717] < syscall.set_robust_list() return=0x0
[19711] < schedule() 
[19653]  signal.send SIGTERM to 19714
[19711] < futex_wait() 
[19653] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19711] < do_futex() return=0x0
[19711] < syscall.futex() return=0x0
[19653] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19711] < 0x401122() return=0x0
[19653] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004cff utime=0x0 uaddr2=0x80 val3=0x0)
[19711] < 0x401122() return=0x0
[19653] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004cff abs_time=0x0 bitset=0xffffffff)
[19711] > pthread_mutex_unlock(mutex=0x7f87da9cd000)
[19653] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004cff flags=0x1 q=0xffff8801df113dc8 hb=0xffff8801df113d58)
[19711] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9cd000)
[19653] < futex_wait() return=0x0
[19653] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff8801df113dc8 timeout=0x0)
[19653] > schedule()
[19711] > __pthread_mutex_unlock_full(mutex=0x7f87da9cd000 decr=0x1)
[19711] > syscall.futex(uaddr=0x7f87da9cd000 op=0x1 val=0x1 utime=0x7f87da9cd020 uaddr2=0x7f87da9cd000 val3=0x0)
[19717] > __pthread_mutex_lock(mutex=0x7f87da9cd000)
[19711] < syscall.futex() return=0x1
[19711] < 0x4011a7() return=0x0
[19712] < schedule() 
[19711] < 0x4011a7() return=0x0
[19712] < futex_wait() 
[19712] < do_futex() return=0x0
[19717] > __pthread_mutex_lock_full(mutex=0x7f87da9cd000)
[19712] < syscall.futex() return=0x0
[19717] < 0x401122() return=0x0
[19712] > syscall.futex(uaddr=0x7f87da9cd000 op=0x0 val=0x80004d05 utime=0x0 uaddr2=0x80 val3=0x0)
[19717] < 0x401122() return=0x0
[19712] > futex_wait(uaddr=0x7f87da9cd000 flags=0x1 val=0x80004d05 abs_time=0x0 bitset=0xffffffff)
[19717] > pthread_mutex_unlock(mutex=0x7f87da9cd000)
[19712] > futex_wait_setup(uaddr=0x7f87da9cd000 val=0x80004d05 flags=0x1 q=0xffff88020b5abdc8 hb=0xffff88020b5abd58)
[19711] > __pthread_mutex_lock(mutex=0x7f87da9d0908)
[19712] < futex_wait() return=0x0
[19717] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9cd000)
[19712] > futex_wait_queue_me(hb=0xffffc90001893fc0 q=0xffff88020b5abdc8 timeout=0x0)
[19712] > schedule()
[19711] < _dl_fini() return=0x0
[19717] > __pthread_mutex_unlock_full(mutex=0x7f87da9cd000 decr=0x1)
[19717] > syscall.futex(uaddr=0x7f87da9cd000 op=0x1 val=0x1 utime=0x7f87da9cd020 uaddr2=0x7f87da9cd000 val3=0x0)
[19711] > pthread_mutex_unlock(mutex=0x7f87da9d0908)
[19717] < syscall.futex() return=0x1
[19711] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9d0908)
[19708] < schedule() 
[19708] < futex_wait() 
[19717] < 0x4011a7() return=0x0
[19708] < do_futex() return=0x0
[19711] < _dl_fini() return=0x0
[19708] < syscall.futex() return=0x0
[19717] < 0x4011a7() return=0x0
[19708] < 0x401122() return=0x0
[19708] < 0x401122() return=0x0
[19708] > pthread_mutex_unlock(mutex=0x7f87da9cd000)
[19708] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9cd000)
[19708] > __pthread_mutex_unlock_full(mutex=0x7f87da9cd000 decr=0x1)
[19717] > __pthread_mutex_lock(mutex=0x7f87da9d0908)
[19708] > syscall.futex(uaddr=0x7f87da9cd000 op=0x1 val=0x1 utime=0x7f87da9cd020 uaddr2=0x7f87da9cd000 val3=0x0)
[19711]  do_exit(code=0x0)
[19717] < _dl_fini() return=0x0
[19708] < syscall.futex() return=0x1
[19708] < 0x4011a7() return=0x0
[19711]  exit_mm(tsk=0xffff8802bba2b980)
[19708] < 0x4011a7() return=0x0
[19711] > mm_release(tsk=0xffff8802bba2b980 mm=0xffff8803046e76c0)
[19717] > pthread_mutex_unlock(mutex=0x7f87da9d0908)
[19711] > exit_robust_list(curr=0xffff8802bba2b980)
  head         = 0x7f87da9a9a20
  &head.list   = 0x7f87da9a9a20
  entry        = 0x7f87da9cd020
  next         = 0x0
  futex_offset = -32
  pending      = 0x0
[19717] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9d0908)
[19711] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff8802bba2b980 pi=0x0)
[19708] > __pthread_mutex_lock(mutex=0x7f87da9d0908)
[19717] < _dl_fini() return=0x0
[19711] < handle_futex_death() return=0x0
[19711] < exit_robust_list() 
[19708] < _dl_fini() return=0x0
[19711] < mm_release() 
[19708] > pthread_mutex_unlock(mutex=0x7f87da9d0908)
[19708] > pthread_mutex_unlock(decr=0x1 mutex=0x7f87da9d0908)
[19717]  do_exit(code=0x0)
[19708] < _dl_fini() return=0x0
[19717]  exit_mm(tsk=0xffff8802bba2dc00)
[19717] > mm_release(tsk=0xffff8802bba2dc00 mm=0xffff8803046e1900)
[19717] > exit_robust_list(curr=0xffff8802bba2dc00)
  head         = 0x7f87da9a9a20
  &head.list   = 0x7f87da9a9a20
  entry        = 0x7f87da9cd020
  next         = 0x0
  futex_offset = -32
  pending      = 0x0
[19708]  do_exit(code=0x0)
[19717] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff8802bba2dc00 pi=0x0)
[19717] < handle_futex_death() return=0x0
[19708]  exit_mm(tsk=0xffff88030a4d7300)
[19717] < exit_robust_list() 
[19717] < mm_release() 
[19708] > mm_release(tsk=0xffff88030a4d7300 mm=0xffff8803046e5780)
[19708] > exit_robust_list(curr=0xffff88030a4d7300)
  head         = 0x7f87da9a9a20
  &head.list   = 0x7f87da9a9a20
  entry        = 0x7f87da9cd020
  next         = 0x0
  futex_offset = -32
  pending      = 0x0
[19708] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff88030a4d7300 pi=0x0)
[19708] < handle_futex_death() return=0x0
[19708] < exit_robust_list() 
[19708] < mm_release() 
[19711]  signal.send SIGCHLD to 19653
[19711] > schedule()
[19717]  signal.send SIGCHLD to 19653
[19717] > schedule()
[19714] < schedule() 
[19714] < futex_wait() 
[19714] < do_futex() return=0x0
[19714] < syscall.futex() return=0x0
[19708]  signal.send SIGCHLD to 19653
[19708] > schedule()
[19714]  do_exit(code=0xf)
[19714]  exit_mm(tsk=0xffff8802bba2a280)
[19714] > mm_release(tsk=0xffff8802bba2a280 mm=0xffff8803046e12c0)
[19714] > exit_robust_list(curr=0xffff8802bba2a280)
  head         = 0x7f87da9a9a20
  &head.list   = 0x7f87da9a9a20
  entry        = 0x7f87da9cd020
  next         = 0x0
  futex_offset = -32
  pending      = 0x7f87da9cd020
[19714] > handle_futex_death(uaddr=0x7f87da9cd000 curr=0xffff8802bba2a280 pi=0x0)
[19714] < handle_futex_death() return=0x0
[19714] < exit_robust_list() 
[19714] < mm_release() 
[19714]  signal.send SIGCHLD to 19653
[19714] > schedule()

  So, I wonder if the below conditions can become true...

1. process A spawns process B and race for a robust mutex
2. process B has robust mutex locked or in process of locking it
3. process B dies
4. while cleaning up from process B death, process C is started
5. during fork, robust_list of A or C may become inconsistent
   because only one process would call pthread_mutex_consistent,
   so, while the mutex would be recovered, the robust list would
   not
6. C also dies with mutex locked causing corruption and dead lock
Comment 10 Paulo Andrade 2016-01-29 19:26:09 UTC
  As an extra note, returning to test robust_test2, but
changing MAX_CHILDREN to 5 I can again trigger a dead lock
with systemtap.

  It enters an "infinite" loop in exit_robust_list due to a
circular list (well, it leaves due to the kernel preventing
it, looping '#define ROBUST_LIST_LIMIT	2048' times):

[16445] > exit_robust_list(curr=0xffff880305c49700)
  head         = 0x7f7b510f2a20
  entry        = 0x7f7b51116020
  next         = 0x7f7b51116020
  pending      = 0x0
Comment 11 Paulo Andrade 2016-02-01 17:15:53 UTC
[trying to get more information...]

  For robust_test2.c, with MAX_CHILDREN set to 1, but
with this (pseudo) patch:

 	CHECKED(pthread_mutexattr_setrobust, &attr, PTHREAD_MUTEX_ROBUST);
+	CHECKED(pthread_mutexattr_setprotocol, &attr, PTHREAD_PRIO_INHERIT);
 
 	CHECKED(pthread_mutex_init, mutex, &attr);

*and* the kernel patch I posted earlier, I cannot reproduce
the problem in rhel 7.2.

  Below, testing with the robust_test2 patch, but no
kernel patch on "real" hardware:
---
Recent rawhide I get an assertion shortly:
$ ./robust_test2 
Done 0 iterations
Done 100 iterations
Done 200 iterations
robust_test2: ../nptl/pthread_mutex_lock.c:352: __pthread_mutex_lock_full: Assertion `INTERNAL_SYSCALL_ERRNO (e, __err) != ESRCH || !robust' failed.

Older rhel7 (> 7.0 < 7.2) kernel it dead locks very quickly.
Comment 12 Carlos O'Donell 2016-12-20 00:05:32 UTC
Even with the fix for bug 20973 I still saw an intermittent hang in tst-robust8.