SIGKILL may no longer work after many SIGCONT/SIGSTOP signals

Christian Franke Christian.Franke@t-online.de
Tue Nov 12 09:53:58 GMT 2024


Found with 'stress-ng --cpu-sched' from current stress-ng upstream HEAD:

Testcase (attached):

$ gcc -O2 -o manysignals manysignals.c

$ ./manysignals
fork() = 1833
...
fork() = 1848
...
kill(1833, 17)
...
kill(1848, 17)
kill(1833, 9)
...
kill(1848, 9)
waitpid(1833, ., 0)


Run this in second terminal:

$ watch "ps | sed -n '1p;/manysignals/{/sed/d;p}'"

If 'S' appears in the first column, the child processes have likely 
reached the final SIGSTOP state. This takes some time. The parent 
process may still hang in the first waitpid() but should not.

If the parent process is aborted with ^C, child processes may be stopped 
or left behind. Occasionally a child process is left behind that cannot 
be terminated from Cygwin (kill -9).

Tested with ancient (i7-2600K) and more recent (i7-14700K) CPU :-)


Unrelated to the above, but related to 'stress-ng --cpu-sched' which 
uses sched_get/setscheduler():

- sched_getscheduler() always returns SCHED_FIFO. As far as I understand 
Linux sched(7), this is a real-time policy without time slicing: a 
thread runs until it blocks, yields, or is preempted by a thread of 
higher priority. The time-sliced SCHED_RR would possibly be a more 
reasonable value. Unfortunately SCHED_OTHER cannot be used because it 
would require ignoring the priority.

- sched_setscheduler() always fails with ENOSYS. IMO it should allow 
setting 'param->sched_priority' if 'policy' is equal to the value 
returned by sched_getscheduler().

-- 
Regards,
Christian

-------------- next part --------------
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

static void xkill(pid_t pid, int sig)
{
  printf("kill(%d, %d)\n", (int)pid, sig);
  int ret = kill(pid, sig);
  if (ret)
    perror("kill");
}

int main()
{
  // number of child processes
  const int nprocs = 16;
  // number of SIGSTOP+SIGCONT, ..., SIGSTOP+SIGCONT, SIGSTOP.
  const int nstopcont = 10;

  pid_t pids[nprocs];
  for (int p = 0; p < nprocs; p++) {
    pid_t pid = fork();
    if (pid == (pid_t)-1) {
      perror("fork"); return 1;
    }
    if (pid == 0) {
      cpu_set_t cpus; CPU_ZERO(&cpus);
      CPU_SET(0, &cpus);
      if (sched_setaffinity(getpid(), sizeof(cpus), &cpus))
        perror("setaffinity");

      for (;;)
        sched_yield();
    }

    printf("fork() = %d\n", (int)pid);
    pids[p] = pid;
  }
  sleep(1);

  for (int i = 0; ; ) {
    for (int p = 0; p < nprocs; p++)
      xkill(pids[p], SIGSTOP);
    if (++i >= nstopcont)
      break;
    for (int p = 0; p < nprocs; p++)
      xkill(pids[p], SIGCONT);
  }

  for (int p = 0; p < nprocs; p++)
    xkill(pids[p], SIGKILL);

  for (int p = 0; p < nprocs; p++) {
    pid_t pid = pids[p];
    printf("waitpid(%d, ., 0)\n", (int)pid); fflush(stdout);
    int status;
    pid_t ret = waitpid(pid, &status, 0);
    if (ret == (pid_t)-1)
      perror("waitpid");
  }
  return 0;
}

