This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug nptl/12683] New: Race conditions in pthread cancellation


http://sourceware.org/bugzilla/show_bug.cgi?id=12683

           Summary: Race conditions in pthread cancellation
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: critical
          Priority: P2
         Component: nptl
        AssignedTo: drepper.fsp@gmail.com
        ReportedBy: bugdal@aerifal.cx


Created attachment 5676
  --> http://sourceware.org/bugzilla/attachment.cgi?id=5676
Demonstration of file descriptor leak due to problem 1

The current approach to implementing pthread cancellation points is to enable
asynchronous cancellation prior to making the syscall, and to restore the
previous cancellation type once the syscall returns. I've asked around and heard
conflicting answers as to whether this violates the requirements in POSIX (I
believe it does). Either way, from a quality-of-implementation standpoint this
approach is very undesirable due to at least two problems, the latter of which
is very serious:

1. Cancellation can act after the syscall has returned from kernelspace, but
before userspace saves the return value. This results in a resource leak if the
syscall allocated a resource, and there is no way to patch over it with
cancellation handlers. Even if the syscall did not allocate a resource, it may
have had an effect (like consuming data from a socket/pipe/terminal buffer)
which the application will never see.

2. If a signal is handled while the thread is blocked at a cancellable syscall,
the entire signal handler runs with asynchronous cancellation enabled. This
could be extremely dangerous, since the signal handler may call functions which
are async-signal-safe but not async-cancel-safe. Even worse, the signal handler
may call functions which are not even async-signal-safe (like stdio) if it
knows the interrupted code could only be using async-signal-safe functions.
Having a thread asynchronously terminated while it is modifying such functions'
internal data structures could lead to serious program malfunction.
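
To make the two windows concrete, here is a minimal sketch of the pattern
described above. The wrapper name cancellable_read_sketch is hypothetical and
this is not glibc's actual code; it only mirrors the sequence "enable
asynchronous cancellation, make the syscall, restore the previous type".

```c
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper, for illustration only (not taken from glibc). */
ssize_t cancellable_read_sketch(int fd, void *buf, size_t len)
{
    int oldtype, ignored;

    /* Enable asynchronous cancellation just before the syscall. */
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &oldtype);

    /* Problem 2: a signal handler that runs while the thread is blocked
     * here executes entirely with asynchronous cancellation enabled. */
    ssize_t ret = read(fd, buf, len);

    /* Problem 1: a cancellation request acted upon at this point, after
     * the kernel has returned but before the caller sees ret, loses the
     * result (data already consumed, or a newly allocated resource). */

    /* Restore the previous cancellation type. */
    pthread_setcanceltype(oldtype, &ignored);
    return ret;
}
```

For a call such as open or accept the lost return value would be a file
descriptor, which is the leak described in problem 1.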

I am attaching simple programs which demonstrate both issues.

The solution to problem 2 is making the thread's current execution context
(e.g. stack pointer) at syscall time part of the cancellability state, so that
cancellation requests received while the cancellation point is interrupted by a
signal handler can identify that the thread is not presently in the cancellable
context.
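
The check this implies can be sketched as a pure decision function. All names
here (struct cancel_state, cp_sp, should_act_on_cancel) are hypothetical, and
how the interrupted stack pointer is obtained from the signal context is
architecture-specific and deliberately left out; this is a sketch under those
assumptions, not the code of any particular implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-thread cancellability state. */
struct cancel_state {
    bool cancel_pending;   /* a cancellation request has been made */
    bool cancel_disabled;  /* PTHREAD_CANCEL_DISABLE is in effect */
    uintptr_t cp_sp;       /* stack pointer recorded when entering a
                              cancellable syscall; 0 when not in one */
};

/* Decide whether a cancellation request that interrupted the thread may
 * be acted upon now. interrupted_sp is the stack pointer of the code that
 * was interrupted by the request. If a signal handler is running on top of
 * the blocked syscall, its stack pointer differs from cp_sp, so the
 * request is left pending instead of being acted upon inside the handler. */
bool should_act_on_cancel(const struct cancel_state *st,
                          uintptr_t interrupted_sp)
{
    if (!st->cancel_pending || st->cancel_disabled)
        return false;
    if (st->cp_sp == 0)
        return false;                 /* not at a cancellation point */
    return interrupted_sp == st->cp_sp;
}
```

With cp_sp recorded as 0x7000, a request that interrupts code running at
0x7000 is honored, while one that interrupts a signal handler running at
0x6f00 is not.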

The solution to problem 1 is making successful return from kernelspace and
exiting the cancellable state an atomic operation. While at first this seems
impossible without kernel support, I have a working implementation in musl
(http://www.etalabs.net/musl) which solves both problems.
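
This report does not spell out the mechanism, so the following is only one
conceivable way to get that atomicity, stated as an assumption: the
cancellation handler looks at the interrupted instruction pointer and acts
only if the thread has not yet gotten past the syscall instruction. The names
struct cp_region and syscall_not_yet_returned are hypothetical, and the sketch
assumes the platform reports an interrupted, uncompleted syscall with the
instruction pointer still inside the marked region.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the code region [begin, end) that starts at
 * the pending-cancellation check and ends just past the syscall
 * instruction of a cancellable syscall stub. */
struct cp_region {
    uintptr_t begin;
    uintptr_t end;
};

/* Return true if the interrupted instruction pointer lies inside the
 * region, meaning the kernel has not yet handed a result back to
 * userspace and cancelling cannot lose a return value. Once the
 * instruction pointer is at or past end, the syscall has returned, so the
 * caller must receive the result and the request stays pending. */
bool syscall_not_yet_returned(const struct cp_region *r,
                              uintptr_t interrupted_ip)
{
    return interrupted_ip >= r->begin && interrupted_ip < r->end;
}
```

Because the decision depends on a single instruction boundary, there is no
point at which the syscall has completed but cancellation can still discard
its result.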


