This is the mail archive of the cygwin mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: pthread_kill: signals remain pending after target thread exits


> From: Corinna Vinschen [corinna-cygwin@cygwin.com]
> Sent: Wednesday, October 21, 2015 4:48 AM
> Subject: Re: pthread_kill: signals remain pending after target thread exits
...
> > On Sep 11 18:11, John Carey wrote:
> > There seems to be a problem with pthread_kill: a pending signal
> > targeting a particular thread prevents other threads from receiving
> > signals sharing the same signal number--even after the original target
> > thread exits and is joined.
...
> The important thing here is to get rid of the pending signal.

Yes, I agree that is the most important thing.

> > In my view it would be desirable if:
> >
> >   - Pending signals targeting a particular thread would not outlast
> >   that thread.
> 
> Since you looked into the code anyway, do you have an idea how to
> implement that?  For a start, do you have a simple testcase, only
> the bare code needed to reproduce the issue?

I've attached a test case that I *think* gets into the right spot, at least for 64-bit Cygwin 2.0.4.  That is, it hangs trying to receive the signal, instead of terminating.  (This test passes (terminates) in 32-bit Cygwin 1.7.9 and 64-bit Ubuntu 14.04.3 LTS.)

As to a fix: sorry, but though I looked at the code, I am not sufficiently confident to suggest a specific change.  I think that the internal signal handling thread has exclusive access to the pending signal collection, which is one difficulty.  And I'm not sure how the race is resolved between something trying to use the cygtls and the cygtls being destroyed.  At a guess, there are at least two general approaches to a fix:

1. Somehow prevent new signals from being sent to the terminating thread, then notify the internal signal handling thread of the need to purge pending signals targeting the doomed thread, then delay cygtls destruction until confirmation that that purge is complete.

2. In the pending signal representation, replace the direct cygtls address with a pointer to some small reference-counted object associated with the cygtls.  That small object could live on for a while, even after the original cygtls has been destroyed and its memory reused for a new cygtls, so that the signal processing thread can take its time purging references.  But there has to be some way to atomically do two things: 1) check whether this small object still points to a valid cygtls, and 2) if it does, delay destruction of that cygtls until some task has been performed (such as processing a signal).  Perhaps this small object could contain an invalidation flag and some synchronization objects (mutex, condition variable, etc.) in addition to the raw cygtls pointer.

> >   - Multiple pending signals targeting different threads could
> >   coexist, even if they shared the same signal number.  This happens
> >   on Linux (Ubuntu 14.04.3), where I can generate two signals for two
> >   different threads, then sleep for a bit in each target thread, and
> >   finally have each thread receive its signal with sigwait()--neither
> >   signal is lost during the sleeping period.
> 
> That requires to extend the handling for pending signals.  That's
> a rather bigger task...

Yeah.  It's nice if threads don't interfere with each other, but this part would indeed be harder to change.

-- John Carey
/* Copyright (c) 2015, Electric Cloud, Inc.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 *   - Redistributions of source code must retain the above copyright
 *     notice, this list of conditions and the following disclaimer.
 *
 *   - Redistributions in binary form must reproduce the above copyright
 *     notice, this list of conditions and the following disclaimer in the
 *     documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
 * HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

/* This test program demonstates a Cygwin bug in which a signal sent
 * to a particular thread remains pending after the thread terminates.
 */

#include <unistd.h>
#include <pthread.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int fully_up = 0;
static int allow_exit = 0;
static int about_to_sigwait = 0;
static int signal_received = -1;

static void check_syscall(char const *context, int result)
{
    if (result == -1) {
        fprintf(stderr, "%s: %s\n", context, strerror(errno));
        exit(EXIT_FAILURE);
    }
}

static void check_threadcall(char const *context, int error_number)
{
    if (error_number) {
        fprintf(stderr, "%s: %s\n", context, strerror(error_number));
        exit(EXIT_FAILURE);
    }
}

static void *thread1(void *arg)
{
    check_threadcall("pthread_mutex_lock",
            pthread_mutex_lock(&mutex));

    fully_up = 1;

    check_threadcall("pthread_cond_broadcast",
            pthread_cond_broadcast(&cond));

    while (! allow_exit) {
        pthread_cond_wait(&cond, &mutex);
    }

    check_threadcall("pthread_mutex_unlock",
            pthread_mutex_unlock(&mutex));
    return NULL;
}

static void *thread2(void *arg)
{
    sigset_t set;
    int sig;

    check_syscall("sigemptyset",
            sigemptyset(&set));

    check_syscall("sigaddset",
            sigaddset(&set, SIGTERM));

    check_threadcall("pthread_mutex_lock",
            pthread_mutex_lock(&mutex));

    about_to_sigwait = 1;

    check_threadcall("pthread_cond_broadcast",
            pthread_cond_broadcast(&cond));

    check_threadcall("pthread_mutex_unlock",
            pthread_mutex_unlock(&mutex));

    /* Try to receive a signal. */

    sig = 0;
    check_threadcall("sigwait",
            sigwait(&set, &sig));

    check_threadcall("pthread_mutex_lock",
            pthread_mutex_lock(&mutex));

    signal_received = sig;

    check_threadcall("pthread_cond_broadcast",
            pthread_cond_broadcast(&cond));

    check_threadcall("pthread_mutex_unlock",
            pthread_mutex_unlock(&mutex));

    return NULL;
}

int main(int argc, char **argv)
{
    sigset_t set;
    pthread_t tid1, tid2;
    int sig;

    /* Block SIGTERM for this thread and all threads it creates. */

    check_syscall("sigemptyset",
            sigemptyset(&set));

    check_syscall("sigaddset",
            sigaddset(&set, SIGTERM));

    check_threadcall("pthread_sigmask",
            pthread_sigmask(SIG_BLOCK, &set, NULL));

    check_threadcall("pthread_create",
            pthread_create(&tid1, NULL, thread1, NULL));

    /* Just be sure, wait until a known point in the other thread. */

    check_threadcall("pthread_mutex_lock",
            pthread_mutex_lock(&mutex));

    while (! fully_up) {
        pthread_cond_wait(&cond, &mutex);
    }

    /* Send a signal specifically to that other thread
     * (as opposed to the process as a whole). */

    check_threadcall("pthread_kill",
            pthread_kill(tid1, SIGTERM));

    /* Allow the other thread to terminate. */

    allow_exit = 1;

    check_threadcall("pthread_cond_broadcast",
            pthread_cond_broadcast(&cond));

    check_threadcall("pthread_mutex_unlock",
            pthread_mutex_unlock(&mutex));

    /* Join the other thread. */

    check_threadcall("pthread_join",
            pthread_join(tid1, NULL));

    /* At this point the other thread is gone, but because of the bug
     * there is still a SIGTERM that cannot be delivered unless and
     * until a new thread obtains the same cygtls instance. */

    /* Start a new signal-handling thread. */

    check_threadcall("pthread_create",
            pthread_create(&tid2, NULL, thread2, NULL));

    check_threadcall("pthread_mutex_lock",
            pthread_mutex_lock(&mutex));

    while (! about_to_sigwait) {
        pthread_cond_wait(&cond, &mutex);
    }

    check_threadcall("pthread_mutex_unlock",
            pthread_mutex_unlock(&mutex));

    /* Unfortunately, we have no way to be sure that the
     * other thread really is blocked in sigwait.  Sleep
     * a second to give it time to get into sigwait. */

    sleep(1);

    /* Send the same signal to the process as a whole;
     * if things were working, the new signal handling
     * thread would receive it inside of sigwait().
     *
     * We avoid raise(sig) because POSIX says it means
     * pthread_kill(pthread_self(), sig), meaning that
     * this thread would have to receive the signal. */

    check_syscall("kill",
            kill(getpid(), SIGTERM));

    printf("starting to await signal\n");
    fflush(stdout);

    /* Wait for that thread to receive that signal. */

    check_threadcall("pthread_mutex_lock",
            pthread_mutex_lock(&mutex));

    while ((sig = signal_received) == -1) {
        pthread_cond_wait(&cond, &mutex);
    }

    check_threadcall("pthread_mutex_unlock",
            pthread_mutex_unlock(&mutex));

    printf("received signal: %d (\"%s\")\n", sig, strsignal(sig));
    fflush(stdout);

    /* Join the other thread. */

    check_threadcall("pthread_join",
            pthread_join(tid2, NULL));

    return 0;
}
--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]