POSIX thread cancellation

POSIX thread cancellation aims to provide a method for terminating a thread of execution.

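The POSIX thread cancellation API includes 6 functions: pthread_cancel, pthread_setcancelstate, pthread_setcanceltype, pthread_testcancel, pthread_cleanup_push, and pthread_cleanup_pop.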

A thread can set its cancel state and cancel type, push and pop cleanup handlers that restore a consistent state if the thread is cancelled, cancel other threads, and test for its own pending cancellation.
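The calls described above can be sketched together in one small program. This is a minimal illustration, not code taken from glibc: the names release_resources, worker, and cancel_demo are hypothetical, and the cleanup handler only sets a flag where a real one would release locks or memory.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int cleanup_ran;   /* set by the cleanup handler */
static atomic_int worker_ready;  /* set once the handler is pushed */

/* Hypothetical cleanup handler: a real one would release resources. */
static void release_resources(void *arg)
{
    (void)arg;
    atomic_store(&cleanup_ran, 1);
}

static void *worker(void *arg)
{
    int old;

    (void)arg;
    /* These are the defaults for a new thread; set here for illustration. */
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &old);
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &old);

    pthread_cleanup_push(release_resources, NULL);
    atomic_store(&worker_ready, 1);
    for (;;) {
        /* Explicit cancellation point: a pending request is acted on here. */
        pthread_testcancel();
        sched_yield();
    }
    pthread_cleanup_pop(0);  /* not reached; pairs lexically with the push */
    return NULL;
}

/* Returns 1 if the worker was cancelled and its cleanup handler ran. */
int cancel_demo(void)
{
    pthread_t tid;
    void *result = NULL;

    atomic_store(&cleanup_ran, 0);
    atomic_store(&worker_ready, 0);
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    while (!atomic_load(&worker_ready))
        sched_yield();
    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED && atomic_load(&cleanup_ran) == 1;
}
```

Joining a cancelled thread yields PTHREAD_CANCELED as its exit value, which is how the caller can tell cancellation apart from a normal return.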

The API seems relatively simple, but as with all concurrent APIs there is a considerable level of complexity that is not entirely clear at first.

This document aims to describe the more complex behaviour of the GNU implementation when using gcc and glibc to build your application.

Can I use cancellation?

Using cancellation requires that the application and all libraries be aware that cancellation is being used and act accordingly.

This means the following:

There is the additional implementation dependent requirement for C++ in gcc and glibc:

The reason for this is more obscure. The cancellation support in glibc is provided by the unwinding machinery in the compiler. Destructors are implicitly noexcept, so the compiler expects nothing to throw from them. Because cancellation is implemented as a form of exception, an attempt to unwind beyond a destructor terminates the process, since no exception is allowed to leave the destructor. This can't be worked around: the compiler may use noexcept to change the generated code and unwind tables in ways that make it impossible to cancel from that region, i.e. not enough information is recorded.

As you can see, cancellation requires significant coordination between the application and libraries it uses. It is still a useful feature, but may not be readily useful for common off-the-shelf C or C++ libraries.

Asynchronous cancellation safety

The purpose of asynchronous cancellation is to allow purely computational threads to be interrupted. Asynchronous cancellation is not intended for any other purpose.

When asynchronous cancellation is active, POSIX states that only three functions are safe to call: pthread_cancel, pthread_setcancelstate, and pthread_setcanceltype.

In summary, you can cancel yourself, disable cancellation, or move from asynchronous cancellation to deferred cancellation. All of these steps move you out of asynchronous cancellation.

The additional implementation details are as follows:

Asynchronous cancellation is a special-case feature used by very few applications. You are unlikely to have problems if the code that runs with asynchronous cancellation enabled carries out only computation.
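As a rough sketch of the intended use, the thread below enables asynchronous cancellation and then performs only arithmetic, with no function calls inside the region. The names compute and async_cancel_demo are hypothetical, and the example assumes the platform can unwind from, or at least terminate a thread in, the middle of such a loop, as glibc on common architectures does.

```c
#include <pthread.h>
#include <stddef.h>

static volatile unsigned long counter;  /* work done by the compute thread */

static void *compute(void *arg)
{
    volatile unsigned long *c = arg;
    int old_type;

    /* From here on a cancellation request may be acted on at any point. */
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old_type);

    for (;;)
        (*c)++;  /* pure computation: no locks, no allocation, no calls */

    return NULL;  /* not reached */
}

/* Returns 1 if the computing thread was cancelled. */
int async_cancel_demo(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, compute, (void *)&counter) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED;
}
```

The loop has no cancellation point, so under deferred cancellation this thread could never be cancelled; that is the situation asynchronous cancellation exists for.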

Deferred cancellation and signals

A common pitfall when using deferred cancellation and signals is to fail to realize the compositional issues with both of these features.

If you enable deferred cancellation and receive an asynchronous signal, and the signal handler calls a function that is a cancellation point, the result is the semantic equivalent of having enabled asynchronous cancellation. For example, calling stat in a signal handler is allowed, but if you do this in a signal handler that interrupts a deferred cancellation region, it will cause the cancellation to be acted upon immediately. The cancellation will then attempt to run cleanup handlers in asynchronous-signal context, which is a problem if those cleanup handlers are not asynchronous-signal safe.

Again this underscores the need to coordinate the entire application and libraries if cancellation is to be supported safely.

One way to make cancellation easier to use is to ignore deferred cancellation in a signal handler, and delay the handling until the signal handler returns.

Cancellation and C++

As discussed earlier, C++ destructors are noexcept regions from which cancellation cannot be started. This means that cancellation must be deferred until a later point. In practice this is not enforced in the GNU runtimes, and enabling cancellation in C++ requires the actions noted earlier in this document, e.g. no destructor may call a function that is a cancellation point.

Harmonizing cancellation in C and C++

The biggest problem in the glibc cancellation implementation is the various interactions with C++.

The known issues are:

Overcoming these issues is not impossible and we list here one way to do this.

In glibc we already have what are called nocancel entry points for many functions that would otherwise be cancellation points (syscall wrappers in particular). Calling one of these functions is the equivalent of disabling cancellation, calling the function, and then enabling cancellation again, but without paying the cost of two additional function calls and their state manipulation.
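The equivalence described above can be written out in portable code. The wrapper below is a hypothetical helper, not a glibc interface; the internal nocancel entry points themselves are not part of the public API. It restores the previous state instead of unconditionally enabling cancellation, so it also behaves correctly for callers that had cancellation disabled already.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical helper: behaves like read, but is never a cancellation
   point for the calling thread. */
ssize_t read_without_cancellation(int fd, void *buf, size_t count)
{
    int old_state, ignored;
    ssize_t n;

    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
    n = read(fd, buf, count);  /* pending requests are not acted on here */
    pthread_setcancelstate(old_state, &ignored);
    return n;
}
```

The two pthread_setcancelstate calls are exactly the overhead that a nocancel entry point avoids.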

If all C++ code were compiled so that calls to cancellation point functions (all 247 possible functions) went to their non-cancellation enabled entry points, then all C++ code would be free of cancellation points without any real additional cost. This would allow C++ application developers to use pthread_testcancel to add a cancellation point at specific places in their C++ code where it is valid, e.g. outside of destructors and noexcept regions. The call to pthread_testcancel would also prevent the compiler from optimizing away the unwind tables needed to unwind from the function calling the routine. Care would still need to be taken when calling other C libraries because they may contain calls to cancellation points, but this is no different from the normal inter-library coordination required to enable cancellation. At the very least these changes would ensure that C++ code would be able to use C library functions safely without introducing cancellation points.
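The effect of this scheme can be approximated with the existing API: keep cancellation disabled while working and enable it only around an explicit pthread_testcancel at a point known to be safe. The following is a sketch in C under that assumption; the names safe_cancellation_point, worker, and explicit_point_demo are hypothetical, and nanosleep stands in for any library call that is a cancellation point.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <time.h>

static atomic_int units_started;
static atomic_int units_finished;

/* The only place where this thread can be cancelled. */
static void safe_cancellation_point(void)
{
    int old_state, ignored;

    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &old_state);
    pthread_testcancel();
    pthread_setcancelstate(old_state, &ignored);
}

static void *worker(void *arg)
{
    struct timespec pause = { 0, 1000000 };  /* 1 ms */
    int ignored;

    (void)arg;
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &ignored);
    for (;;) {
        atomic_fetch_add(&units_started, 1);
        /* nanosleep is a cancellation point, but cancellation is disabled,
           so a unit of work is never abandoned half way. */
        nanosleep(&pause, NULL);
        atomic_fetch_add(&units_finished, 1);
        safe_cancellation_point();
    }
    return NULL;  /* not reached */
}

/* Returns 1 if the worker was cancelled between units of work. */
int explicit_point_demo(void)
{
    pthread_t tid;
    void *result = NULL;

    atomic_store(&units_started, 0);
    atomic_store(&units_finished, 0);
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    while (atomic_load(&units_started) == 0)
        sched_yield();
    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED &&
           atomic_load(&units_started) == atomic_load(&units_finished);
}
```

Because the request is only acted on inside safe_cancellation_point, every unit of work that starts also finishes, which is the property a destructor or noexcept region needs.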

A simpler alternative would be to add a thread attribute in glibc which indicates that cancellation is disabled. The C++ library would use the glibc thread attribute and disable cancellation for all C++ started threads. Then, as a final resort for adding cancellation where needed, the developer could add calls to pthread_testcancel which would ignore the thread attribute and cause a cancellation point. Developers could use pthread_testcancel when they know it to be safe to throw an exception.

Compiling a distribution

Compiling the distribution with the ability to interoperate with C++ and cancellation means that we need unwind tables enabled everywhere. There is no technical reason not to compile your entire distribution with unwind tables to support interoperability with C++ and POSIX thread cancellation. These unwind tables should at a minimum support synchronous exceptions.

Lastly, calling a function that is a cancellation point in an asynchronous signal handler with deferred cancellation is the equivalent of acting upon asynchronous cancellation. In practice this requires everything that can be interrupted by the signal handler to be compiled with -fasynchronous-unwind-tables, it requires the cleanup handlers to be async-signal safe, and this is almost an impossible requirement to meet in a distribution. Therefore deferred cancellation handling from a signal handler should be considered undefined behaviour.

To summarize the suggested options for compiling a distribution:

None: Cancellation (last edited 2017-08-29 14:35:13 by CarlosODonell)