This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



[RFC] Propose fix for race conditions in pthread cancellation (bz#12683)


Hi all,

I have summarized in [1] the current issues with the GLIBC pthread cancellation system,
the current GLIBC implementation, and the solution proposed by Rich Felker, along with
the adjustments required to enable it on GLIBC.

It is still heavily WIP and I'm planning to add more content, so any questions,
comments, or advice are welcome.

The GLIBC adjustment to the proposed solution is in fact my current work to
rewrite the pthread cancellation subsystem [2]. My code still needs a *lot* of cleanup,
but initial results are promising. It builds on both powerpc64 and x86_64
(it won't build on other platforms, basically because I rewrote the way cancelable
syscalls are done).

All current NPTL test cases pass except:

FAIL: nptl/tst-cancel-wrappers
FAIL: nptl/tst-cancel20
FAIL: nptl/tst-cancel21-static
FAIL: nptl/tst-cancel4
FAIL: nptl/tst-cancel5
FAIL: nptl/tst-cancelx20
FAIL: nptl/tst-cancelx21
FAIL: nptl/tst-cancelx4
FAIL: nptl/tst-cancelx5
FAIL: nptl/tst-detach1

The 'nptl/tst-cancel-wrappers' failure is expected since I got rid of the
enable_asynccancel/disable_asynccancel functions, but the others are due to the fact that
cancellation now *will not* be acted upon in one important case:

* syscall is blocked but with some side effects already having taken place (for
  instance partial read/write/send/etc.)

This is the case for tst-cancel[4|5], which check for cancelable write and send:
the way the tests are coded, the IP address the kernel reports to the signal handler
is *after* the syscall, indicating a partial read/send. Similar cases occur for
tst-cancel[20|21], where the pipe read is interrupted after the syscall has already
returned. I'm still checking nptl/tst-detach1.
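
To make the check above concrete, here is a minimal sketch of the decision as I
understand it. It is not the code from [2]: the struct, the function names, and
the half-open range convention (the end address being the one right after the
syscall instruction) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one cancelable syscall stub.  'start' is the
   address where the stub begins, 'end' the address immediately after the
   syscall instruction.  These are illustrative names, not glibc symbols.  */
struct syscall_region
{
  uintptr_t start;
  uintptr_t end;
};

/* Return true if the interrupted IP lies inside [start, end), i.e. the
   syscall has not yet completed, so acting on the cancellation cannot
   discard side effects.  An IP at or past 'end' means the syscall already
   returned (possibly with a partial result), so cancellation is not acted
   upon at this point.  */
bool
cancel_may_act (const struct syscall_region *r, uintptr_t pc)
{
  return pc >= r->start && pc < r->end;
}

/* Small usage example with made-up addresses.  */
void
cancel_may_act_example (void)
{
  const struct syscall_region r = { 0x1000, 0x1010 };
  assert (cancel_may_act (&r, 0x1004));   /* before/at the syscall */
  assert (!cancel_may_act (&r, 0x1010));  /* after the syscall */
}
```

In the failing tests the IP seen by the handler corresponds to the second case,
so under this rule the thread continues and the caller sees the partial result.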

Anyway, I would now like comments on the proposed solution, and on whether the
behavior behind the new failures should be disallowed or whether the test cases
should be adjusted instead.

I also note that this new implementation shows correct behavior on the test cases
reported in the bug and replicated on bugzilla: the first one does not show leaked
file descriptors and the second correctly hangs.

[1] https://sourceware.org/glibc/wiki/Release/2.21/bz12683
[2] https://github.com/zatrazz/glibc/commits/master-bz12683

