This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

BZ 20822 :powerpc: race condition in __lll_unlock_elision



The initial problem reported was memory corruption in MongoDB only on Ubuntu 16.04 ppc64le(not on Ubuntu 15). The problem was not easily recreatable and debugging with many tools and creative ideas, the problem is narrowed down to lock elision in glibc. Ubuntu 16.04 has glibc 2.23 with lock elision enabled by default which made the testcase fail only on 16.04.

As stated in BZ 20822, short description is

"The update of *adapt_count after the release of the lock causes a race condition. Thread A unlocks, thread B continues and returns, destroying mutex on stack, then gets into another function, thread A writes to *adapt_count and corrupts stack.The window is very narrow and requires that the machine be in smt mode, so likely the two threads have to be on the same core in order to hit this"

Looking back the file changes in sysdeps/unix/sysv/linux/powerpc/elision-unlock.c, suspecting the commit
 https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=fadd2ad9cc36115440d50b0eae9299e65988917d which is introduced to improve
the performance.

Thinking of the following fixes.

1) Update of adapt_count before the unlock. But I have still not identified if its going to affect the performance.

2) In elision-lock.c/elision-trylock.c update adapt_count right after the lock acquire at the end of the function. In either case we have to be careful not to introduce another race condition.

Any suggestions?

--
Thanks
Rajalakshmi S


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]